
This is a great talk, although maybe a little bit too precious (but this is common to a lot of [security] conference talks so I don't hold it against her). Pushing people towards using more formal methods to generate and accept protocols is always good.

That said, I thought the argument against length fields was somewhat weak. But maybe I'm misunderstanding the context. The question at the end was not answered satisfactorily. A protocol with a length field is most certainly deterministic, and if you go the other way and use a delimiter, escaping/encoding is the only way you're going to work with arbitrary user data. I would argue the length field is miles better. If someone injects bytes into your stream that match the protocol, your recognizer isn't going to save you, just as it won't save you when someone rewrites the length field and things blow up.
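To make the trade-off concrete, here is a rough sketch of both framing styles in Python. Everything in it is an assumption for illustration: the 4-byte big-endian length prefix, the newline delimiter, the backslash escape, and the function names are hypothetical and not taken from the talk.

```python
import struct

DELIM = b"\n"   # assumed frame delimiter
ESC = b"\\"     # assumed escape byte

def frame_with_length(payload: bytes) -> bytes:
    # The payload is copied verbatim; only the 4-byte prefix is added.
    return struct.pack(">I", len(payload)) + payload

def unframe_with_length(data: bytes) -> bytes:
    if len(data) < 4:
        raise ValueError("truncated header")
    (declared,) = struct.unpack(">I", data[:4])
    body = data[4:4 + declared]
    if len(body) != declared:
        raise ValueError("body shorter than declared length")
    return body

def frame_with_delimiter(payload: bytes) -> bytes:
    # Escape the escape byte first, then the delimiter, so that
    # arbitrary user data can never terminate the frame early.
    escaped = payload.replace(ESC, ESC + ESC).replace(DELIM, ESC + b"n")
    return escaped + DELIM

def unframe_with_delimiter(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i:i + 1]
        if byte == DELIM:
            return bytes(out)          # unescaped delimiter ends the frame
        if byte == ESC:
            nxt = data[i + 1:i + 2]
            if nxt == ESC:
                out += ESC
            elif nxt == b"n":
                out += DELIM
            else:
                raise ValueError("invalid escape sequence")
            i += 2
        else:
            out += byte
            i += 1
    raise ValueError("missing delimiter")
```

Both round-trip a payload that contains the delimiter and the escape byte, but the delimiter version has to inspect and rewrite every byte, which is where escaping bugs tend to creep in.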

What this talk seems to argue for is making the language simpler (e.g. context-free), so you can validate that the semantics are working as intended, but transferring arbitrary blobs of data is always going to be an issue as long as people enter random musings in text boxes that are rendered by Turing complete software. :-)



As I understood it, the argument against using an unbounded length field is that it makes the language context-sensitive for the recognizer. When processing some inner payload of a data packet you need to carry around the state of the outer context-sensitive protocol layers to make sure your inputs are well formed.

The fact that it is trivial to maliciously craft the length field makes it cheap for the attacker to try to exhaust receiver memory, overflow buffers or make a DDoS attack more effective. If you use a delimiter, the attacker has to at least spend the required bandwidth to try to exhaust resources.

I suspect that if your protocol specification bounds the length field to some finite amount, then your language can be classified as a regular language for verification purposes, just with a FSM branch for each possible value of the length field.
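A toy illustration of that idea, assuming a made-up format of one length byte followed by exactly that many payload bytes, with the length capped at a hypothetical MAX_LEN. The recognizer below is a plain transition table with no counter. It shares the "bytes still needed" states between lengths, so it is the collapsed form of "one branch per length value" rather than a literal copy of it.

```python
MAX_LEN = 3  # hypothetical bound on the length field

def build_dfa(max_len: int) -> dict:
    """Build a transition table for: <len byte> <len payload bytes>."""
    transitions = {}
    # From the start state, the length byte selects a branch.
    for byte in range(256):
        if byte == 0:
            transitions[("start", byte)] = "accept"
        elif byte <= max_len:
            transitions[("start", byte)] = ("need", byte)
        # Lengths above the bound have no transition, so they are rejected.
    # Each ("need", k) state consumes any byte and moves one step closer.
    for k in range(1, max_len + 1):
        target = "accept" if k == 1 else ("need", k - 1)
        for byte in range(256):
            transitions[(("need", k), byte)] = target
    return transitions

def accepts(transitions: dict, data: bytes) -> bool:
    state = "start"
    for byte in data:
        state = transitions.get((state, byte))
        if state is None:
            return False  # no transition: reject (includes trailing bytes)
    return state == "accept"

DFA = build_dfa(MAX_LEN)
```

The number of states is MAX_LEN + 2, so it grows with the bound but stays finite, which is what would put the bounded version in the regular class. With an unbounded length there is no such finite table.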


> The fact that it is trivial to maliciously craft the length field makes it cheap for the attacker to try to exhaust receiver memory, overflow buffers or make a DDoS attack more effective. If you use a delimiter, the attacker has to at least spend the required bandwidth to try to exhaust resources.

A length field doesn't mean that you have to pre-allocate that amount of memory. Never do that! Robust implementations use the length field only as a hint, as a hidden delimiter, and allocate memory as the data comes in.
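Roughly what that looks like in practice, as a sketch only: the 4-byte big-endian header, the chunk size, the cap, and the name read_frame are all assumptions made up for this example. The buffer grows only as bytes actually arrive, so a forged length costs the receiver nothing until the sender pays for the bandwidth.

```python
import io
import struct

MAX_PAYLOAD = 1 << 20  # hypothetical protocol cap (1 MiB)
CHUNK = 4096           # hypothetical read size

def read_frame(stream, max_payload: int = MAX_PAYLOAD) -> bytes:
    """Read one length-prefixed frame without trusting the declared length."""
    header = stream.read(4)
    if len(header) != 4:
        raise ValueError("truncated header")
    (declared,) = struct.unpack(">I", header)
    if declared > max_payload:
        raise ValueError("declared length exceeds cap")

    buf = bytearray()  # starts empty; never pre-sized from `declared`
    while len(buf) < declared:
        # The declared length only tells us where to stop reading.
        chunk = stream.read(min(CHUNK, declared - len(buf)))
        if not chunk:
            raise ValueError("stream ended before declared length")
        buf += chunk
    return bytes(buf)
```

For example, read_frame(io.BytesIO(struct.pack(">I", 5) + b"hello")) returns b"hello", while a header claiming 0xFFFFFFFF bytes is rejected before any payload memory is allocated. The cap is an extra on top of what the comment describes; without it, the incremental allocation alone still forces the attacker to send the data.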

That said, she does have a point. Though escaping is fraught with dangers, too (remember PHP in the early days? magic quotes, ugh).


Aha, that was it. I was assuming you'd bound the length. :-)



