This is a great talk, although maybe a little bit too precious (but this is comm...

flipgimble · on April 9, 2013

As I understood it, the arguments against using an unbounded length field is that it makes the language recognizing it context-sensative. When processing some inner payload of a data packet you need to carry around the state of outer context-sensative protocol layers to make sure your inputs are well formed.

The fact that it is trivial to maliciously craft the length field makes it cheap for the attacker to try to exhaust receiver memory, overflow buffers or make DDoS attack more effective. If you use a delimiter, the attacker has to at least spend the required bandwidth to try to exhaust resources.

I suspect that if your protocol specification bounds the length field to some finite amount, then your language can be classified as a regular language for verification purposes, just with a FSM branch for each possible value of the length field.

wladimir · on April 11, 2013

The fact that it is trivial to maliciously craft the length field makes it cheap for the attacker to try to exhaust receiver memory, overflow buffers or make DDoS attack more effective. If you use a delimiter, the attacker has to at least spend the required bandwidth to try to exhaust resources.

A length field doesn't mean that you have to pre-allocate that amount of memory. Never do that! Robust implementations use the length field only as a hint, as an hidden delimiter, and allocate memory as the data comes in.

That said, she does have a point. Though escaping is fraught with dangers, too (remember PHP in the beginnings? magic quotes, ugh).

ezy · on April 9, 2013

Aha, that was it. I was assuming you'd bound the length. :-)