
Protocol buffers don't always let you tell an unset field apart from a "default" value, because certain types return a protocol-defined default when they are absent. For example, a `bool` which is not present will return `false` instead of, say, a nil type.

Required fields were a feature of proto2. In proto3 syntax (latest), the required field concept was dropped because it caused issues with protocol evolution and was easy to misuse.

In essence, because of the backward and forward compatibility guarantees supported by protobuf, a "required" field must be required for the entire lifecycle of the protocol.

For these reasons and others, protobuf takes a stance where unrecognized fields are not necessarily errors. If it took a strict stance and failed in this condition, the presence of new fields in an "evolved" protocol would be an error which would break forward compatibility; old clients would not be able to communicate with new servers, and vice versa.

Protobuf guarantees that new fields will not break forward or backward compatibility.

This is why tolerating unknown fields when parsing protobuf is a feature, working as intended, not a flaw. In some language SDKs for protobuf I believe you can customize this behavior, but it really isn't a good idea.
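To make the "unknown fields are skipped, not rejected" point concrete, here is a rough sketch of a hand-rolled reader for the protobuf wire format. It is not the official library; `parse_known`, `OLD_SCHEMA`, and the field names are made up for illustration, and it only handles the basic wire types.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_known(buf, known_fields):
    """Return {name: value} for fields listed in known_fields; skip the rest."""
    out = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:    # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:    # length-delimited (strings, bytes, sub-messages)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:    # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        name = known_fields.get(field_number)
        if name is not None:    # unknown field numbers are consumed and ignored
            out[name] = value
    return out


# Hypothetical "old client" schema that only knows field 1.
OLD_SCHEMA = {1: "user_id"}

# A "new server" message: field 1 = 150 (varint), field 2 = "hi" (string).
NEW_MESSAGE = bytes([0x08, 0x96, 0x01, 0x12, 0x02, 0x68, 0x69])

print(parse_known(NEW_MESSAGE, OLD_SCHEMA))
```

The old reader still gets `user_id` and simply steps over field 2, which is the forward-compatibility behavior described above. The same mechanism is what lets someone change a field number in transit and have the receiver quietly ignore it.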

This is also why the app authors might want to consider a hash instead. Tampering with the payload would break the hash, and without the schema, the person tampering would not necessarily know where the hash was situated in the payload to fix it, or even that one is present at all. The complexity of recalculating the hash (assuming they find it) vastly multiplies the attacker's burden at little cost to the application; adding a few rounds and a salt, for instance, would make this kind of attack significantly harder to pull off.

It's not perfect security, but it would certainly be better.
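Roughly what I have in mind, as a sketch only: a salted, iterated SHA-256 over the payload bytes. The function names, the salt, and the round count are all assumptions of mine, not anything the app actually does. Since the salt would ship inside the app, a determined reverse engineer could still recover it.

```python
import hashlib
import hmac


def payload_digest(payload: bytes, salt: bytes, rounds: int = 1000) -> bytes:
    """Salted SHA-256, re-hashed `rounds` times in total (illustrative scheme)."""
    digest = hashlib.sha256(salt + payload).digest()
    for _ in range(rounds - 1):
        digest = hashlib.sha256(salt + digest).digest()
    return digest


def payload_is_intact(payload: bytes, salt: bytes, expected: bytes,
                      rounds: int = 1000) -> bool:
    """Recompute the digest and compare in constant time."""
    return hmac.compare_digest(payload_digest(payload, salt, rounds), expected)


# Hypothetical values: the server would compute `tag` and embed it in the response.
salt = b"app-embedded-salt"
original = b"\x0a\x03ads"
tag = payload_digest(original, salt)

print(payload_is_intact(original, salt, tag))       # untouched payload
print(payload_is_intact(b"\x0a\x03xxx", salt, tag)) # payload modified in transit
```

Someone stripping or renaming a field in the payload would have to find the tag, work out the salt and round count, and recompute it, which is the extra burden described above.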



Well, for the field in question, I imagine that it should be easy to distinguish meaningful content from whatever it defaults to when the field is missing.


In this case, since he mentions it's a "tree" of data (he means a "message"), it would be a sub-object that would become an initialized default. So there would be an "object" there, but it would have no "ads" array in it, or whatnot.

Protobuf does this so you can do `deep.dotted.paths` and you won't get null exceptions (probably a side effect of starting partly in Java). The leaf fields end up as empty strings, `0`, `false`, or an empty array for repeated fields.

It's a neat trick to get it to ignore a field, just not a "flaw." It's actually a compatibility feature in disguise.

(So it might be pretty hard to detect, versus the potentially-legitimate case of just not having any ads to show.)
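A plain-Python analogy of that default behavior, using dataclasses rather than real generated protobuf code. `Response`, `AdSection`, and the field names are invented for the example. Note that real generated code can expose presence checks for sub-message fields, which this sketch deliberately ignores.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AdSection:
    # Leaf fields default to empty values, mirroring proto3 defaults.
    title: str = ""
    ads: List[str] = field(default_factory=list)


@dataclass
class Response:
    ok: bool = False
    # A missing sub-message reads as an initialized default, never None.
    ad_section: AdSection = field(default_factory=AdSection)


# Case 1: the ad_section field was dropped from the payload entirely.
stripped = Response(ok=True)

# Case 2: the server really sent a section with zero ads.
legit_empty = Response(ok=True, ad_section=AdSection(ads=[]))

# Deep dotted access works in both cases without any null checks.
print(stripped.ad_section.ads)
print(legit_empty.ad_section.ads)
print(stripped.ad_section.ads == legit_empty.ad_section.ads)
```

Code that only reads `response.ad_section.ads` sees the same empty list either way, which is why the tampered case is hard to tell apart from "no ads today".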


The article actually shows a screenshot of the structure they are cutting off.

I would think that a dummy object would be trivial to detect.


It would be like receiving `{ads: []}`. How do you distinguish between the case of having no ads to show, and someone tampering with the data?

Is it ever reasonable to assume you have a properly decoded empty array because a user tampered with it, instead of that being what the server gave back to you?

If you have to choose between (a) the app shutting down or (b) the user not seeing ads because the ad array is empty, you are probably going to pick (b).


Anyway, none of this would be a problem if the TLS tunnel broke because the cert failed to validate against an issuer pin shipped with the app.
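For anyone unfamiliar with pinning, the core check is small. This is a sketch under assumptions: the helper names are mine, how you obtain the DER-encoded chain from the TLS stack is platform-specific and left out, and it hashes the whole certificate for simplicity, whereas many implementations pin a hash of the public key instead.

```python
import hashlib
import hmac
from typing import Iterable


def fingerprint(der_cert: bytes) -> str:
    """Hex SHA-256 fingerprint of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def chain_matches_pin(der_chain: Iterable[bytes], pinned: Iterable[str]) -> bool:
    """True if any certificate in the presented chain matches a shipped pin."""
    pins = list(pinned)
    return any(
        hmac.compare_digest(fingerprint(cert), pin)
        for cert in der_chain
        for pin in pins
    )


# Placeholder bytes standing in for real DER certificates.
real_issuer = b"real-issuer-cert"
proxy_issuer = b"interception-proxy-cert"
leaf = b"leaf-cert"

PINNED = [fingerprint(real_issuer)]  # would be baked into the app at build time

print(chain_matches_pin([leaf, real_issuer], PINNED))   # genuine chain
print(chain_matches_pin([leaf, proxy_issuer], PINNED))  # chain re-signed by a proxy
```

If the check fails, the app would refuse to talk over that connection, so a locally installed proxy CA never gets to see or rewrite the protobuf payload in the first place.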


Maybe the reason they don’t do this is that many enterprise networks intercept and re-sign TLS.


Good point, hadn't thought of that



