
That's mawk. I'm talking about the implementation that post calls "nawk", and either way, I mean orders of magnitude - I care about a 10-100+x difference in speed, not a 1.1-5x one. Awk and Python fall in roughly the same performance tier for that kind of code.

Also: "I have since found large datasets where mawk is buggy and gives the wrong result. nawk seems safe." makes me uneasy, as does the fact that it was unmaintained for a while.



Afaict, mawk's maintenance seems to be a bit up in the air: the original maintainer basically disappeared years ago and hasn't blessed any successor, so the Debian-patched version became the de facto current version, since at least it staved off bitrot. Recently Thomas Dickey unilaterally picked up maintenance of a new upstream version, starting from the Debian-patched version, but he hasn't managed to convince the Debian mawk maintainer to accept it as the new upstream (somewhat testy thread here: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=554167). I'm personally a little more comfortable with something actively maintained like gawk, despite the speed differences.


Right. I usually use (n)awk because it's the default on OpenBSD, but have to admit gawk's artificial-filesystem-based networking support is pretty cool.



