> And performance of awk is crap. [...] I had replaced all the awk scripts in python and everything is a lot faster.
My experience points exactly the other way: for data-processing tasks, especially streaming ones, even Gawk is a lot faster than Python (pre-3.11), and apparently I’m not the only one[1]. If you’re not satisfied with Gawk’s performance, though, try Nawk[2] or, even better, Mawk[3]. (And stick to POSIX to ensure your code works in all of them.)
Do you know of any performance comparisons vs. PyPy? I find it works extremely well as a drop-in replacement for CPython when only the built-in modules are needed, which should generally hold for awk-like use cases. Yet some brief searching doesn't seem to yield any numbers.
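For anyone who wants to produce such numbers, here is a minimal sketch of the kind of stdlib-only script I mean. It is only an illustration: the function name `sum_by_key`, the column choices, and the workload itself (sum the second field grouped by the first) are my own assumptions, not anything measured in this thread. Because it needs nothing outside the standard library, the same file should run unchanged under both CPython and PyPy.

```python
import sys
from collections import defaultdict


def sum_by_key(lines, key_col=0, val_col=1):
    """Sum one whitespace-separated column grouped by another.

    Rough equivalent of:
        awk '{ s[$1] += $2 } END { for (k in s) print k, s[k] }'
    Unlike awk, lines with too few fields are skipped here (awk would
    treat the missing field as 0), and a non-numeric value raises
    ValueError instead of counting as 0.
    """
    totals = defaultdict(float)
    for line in lines:
        fields = line.split()  # splits on runs of whitespace, like awk's default FS
        if len(fields) <= max(key_col, val_col):
            continue
        totals[fields[key_col]] += float(fields[val_col])
    return dict(totals)


if __name__ == "__main__":
    # Stream from stdin so the whole input never has to fit in memory.
    for key, total in sum_by_key(sys.stdin).items():
        print(key, total)
```

Feeding the same large input file to this script under each interpreter and to the awk one-liner from the docstring, each wrapped in `time`, would give a first rough comparison.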
You'd have to share the code to show how you are doing it. If you are using an awk alternative, you would be comparing against pandas or PyPy.
I will do a comparison as soon as I am free.
[1] https://brenocon.com/blog/2009/09/dont-mawk-awk-the-fastest-...
[2] https://github.com/onetrueawk/awk
[3] https://invisible-island.net/mawk/