
On the other hand, both awk and sed quickly spiral out of control if you need to do anything nontrivial that spans newlines.

If the unit of input in this kind of stream processing system doesn't match the problem domain exactly, things get very difficult very quickly.



Awk isn't so bad if you're clever about RS, but sed sucks. A tragic gap in the Plan 9 legacy has been structural regular expressions, which deal with these situations adroitly.


(RS is the record separator; it just defaults to newline. Set it to something else and you can handle multi-line patterns in awk.)
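For anyone who hasn't played with RS, here is a rough Python sketch of the same idea as awk's paragraph mode (RS set to the empty string, so blank lines separate records). It is only an analogy, not how awk is implemented; the `records` helper and the sample data are made up for illustration.

```python
import re

def records(text, sep=r"\n\s*\n"):
    """Split text into records on blank lines, roughly like awk with RS=""."""
    # Hypothetical helper: each record may span several lines.
    return [r for r in re.split(sep, text.strip()) if r]

# Made-up sample: two records, each spanning two lines.
sample = "name: a\nsize: 1\n\nname: b\nsize: 2\n"

for rec in records(sample):
    # Within a record, each line plays the role of an awk field.
    fields = dict(line.split(": ", 1) for line in rec.split("\n"))
    print(fields["name"], fields["size"])
```

Once the unit of input is the whole record, the per-record logic stays as simple as the per-line case.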


My problem was that the records really were purely newline-delimited, but I needed to process them using information from their context in the stream.
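A minimal sketch of that shape of problem, in Python rather than awk: every record is still a single line, but each one is handed over together with its neighbors. The `with_neighbors` name and the sample lines are hypothetical, and a real case might need a wider window or running state instead.

```python
from itertools import chain, tee

def with_neighbors(lines):
    """Yield (previous, current, following) per line; None at the edges."""
    prev_it, cur_it, next_it = tee(lines, 3)
    prev_it = chain([None], prev_it)   # lags one line behind
    next_it = chain(next_it, [None])   # pad the end
    next(next_it, None)                # advance so it runs one line ahead
    return zip(prev_it, cur_it, next_it)

# Made-up input: report lines that repeat the line before them.
log = ["start", "retry", "retry", "done"]
for prev, cur, nxt in with_neighbors(log):
    if cur == prev:
        print("repeated:", cur)
```

This streams over any iterable of lines, so it works on a file object as well as a list.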


Fair enough. That's beyond the common cases awk addresses. At that point, I just switch to Lua. (I forget if you're a Python or Ruby guy.)



