Yep - you evidently didn't read mine, as 165k lines of C++ isn't exactly a small...

sanxiyn · on Feb 10, 2015

There are many ways to measure benchmark size. One important measure is the amount of code that accounts for 99% of execution time. If your codebase is a million lines but your hotspot is a thousand lines, the benchmark result is sensitive to optimization quirks, and in some sense the benchmark is small.

More on this idea here: http://blog.pyston.org/2014/12/05/python-benchmark-sizes/
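
To make the "99% of execution time" measure concrete, here is a rough sketch of how one might compute it from profile data. The function name `hot_code_size`, the shape of the `samples` dict, and the numbers in `profile` are all made up for illustration; they are not taken from the linked post or from any real profiler output.

```python
def hot_code_size(samples, coverage=0.99):
    """Return how many lines of code are needed to cover `coverage` of total time.

    `samples` is a hypothetical mapping: function name -> (lines_of_code, seconds_spent).
    """
    total = sum(seconds for _, seconds in samples.values())
    if total <= 0:
        return 0
    covered = 0.0
    lines = 0
    # Take the hottest functions first until the coverage target is reached.
    for loc, seconds in sorted(samples.values(), key=lambda s: s[1], reverse=True):
        if covered >= coverage * total:
            break
        covered += seconds
        lines += loc
    return lines


# Assumed profile: a million-line codebase with a thousand-line hotspot.
profile = {
    "inner_loop": (1_000, 99.5),
    "everything_else": (999_000, 0.5),
}
print(hot_code_size(profile))  # prints 1000
```

By this measure the million-line codebase above counts as a thousand-line benchmark, which is the sense in which it is "small".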