I'd say we have about 60% library code and 40% application code internally, so m...

paddy_m · 2025-06-01T17:21:10 1748798470

which implementation of `std()` did you go with?

I was once writing unit tests against histograms. That code is super finicky, and I couldn't get the pandas and polars numbers to tie out. I wasn't too concerned with the exact output for my application, just that the number of buckets was the same and that they were roughly the same size. Just bumping to a new version of numpy would break tests because of floating point rounding errors. I'm sure there are much more robust numerical testing approaches I could have used.
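For illustration, a rough sketch of the looser check described above: same number of buckets, roughly the same counts. The helper name `histograms_roughly_match` and the 5% tolerance are made up for this example, not something from the original tests.

```python
import numpy as np

def histograms_roughly_match(counts_a, counts_b, rel_tol=0.05):
    """Hypothetical helper: True if both histograms have the same number of
    buckets and each bucket count differs by at most rel_tol of the total."""
    a = np.asarray(counts_a, dtype=np.float64)
    b = np.asarray(counts_b, dtype=np.float64)
    if a.shape != b.shape:
        return False  # different number of buckets
    total = max(a.sum(), 1.0)
    return bool(np.all(np.abs(a - b) <= rel_tol * total))

# Deterministic sample data, plus a copy with a tiny relative perturbation
# standing in for rounding differences between libraries or versions.
data = np.linspace(0.0, 1.0, 1000) ** 2
counts_a, _ = np.histogram(data, bins=10)
counts_b, _ = np.histogram(data * (1 + 1e-12), bins=10)

assert histograms_roughly_match(counts_a, counts_b)
```

The tolerance is expressed relative to the total count so that a value hopping across a bucket edge doesn't fail the test on its own.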

SatvikBeri · 2025-06-01T19:24:56 1748805896

Yeah this kind of stuff drove us crazy. Numpy uses things like SIMD everywhere with no option to turn it off, which is good for performance but makes life really hard when you want consistently reproducible results across different machines.

Before switching to Julia we eventually standardized on `numpy.std` with `ddof=1`, plus some error tolerances in our tests.
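A minimal sketch of what that kind of setup can look like. The wrapper name `sample_std` and the `rtol` value are assumptions for this example, not our actual code. The `ddof` matters because `numpy.std` defaults to `ddof=0` (population) while pandas' `std()` defaults to `ddof=1` (sample).

```python
import math
import numpy as np

def sample_std(values):
    """Hypothetical wrapper: sample standard deviation (ddof=1) in float64."""
    return float(np.std(np.asarray(values, dtype=np.float64), ddof=1))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Mean is 5 and the squared deviations sum to 32, so the sample variance
# is 32 / 7 and the population variance is 32 / 8.
expected_sample = math.sqrt(32 / 7)
expected_population = math.sqrt(32 / 8)

# Compare with a relative tolerance rather than exact equality.
np.testing.assert_allclose(sample_std(data), expected_sample, rtol=1e-12)
np.testing.assert_allclose(np.std(data), expected_population, rtol=1e-12)
```

How tight the tolerance can be depends on the data and on how much variation you see across machines, so treat `1e-12` as a placeholder.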