by muldvarp on 12/25/23, 12:25 AM
I did research on parser differentials for my bachelor's thesis. My initial hope was that I would find a few mismatches for formats without a formal specification. I found mismatches for _every single_ pair of parsers I looked at and that included formats with formal specifications. My personal takeaway was "If you use one parser for validation and another parser for evaluation, you're fucked. No exceptions."
by SAI_Peregrinus on 12/24/23, 11:35 PM
As the article mentions, Postel's Law is likely to create vulnerabilities. It makes individual systems more robust, but the whole becomes fragile.
by Turing_Machine on 12/25/23, 12:36 AM
> Well, these browsers "helpfully" fix the URL to change backslashes into regular forward slashes, I suppose because people sometimes type in URLs and get their forward and back slashes confused.
More likely because Windows has historically used \ rather than the / that's standard in Unixish systems. Windows people are used to typing \, so it's indeed somewhat helpful for the browser to accept either (e.g., in file:// URLs).
by cjbprime on 12/25/23, 2:43 AM
Odd that the article doesn't use the more standard term "parser differential", with "differential fuzzing" as the fuzzing community's method for finding those.
by ngneer on 12/25/23, 2:09 AM
by nayuki on 12/25/23, 3:12 PM
by dandanua on 12/25/23, 10:07 AM
I guess if we add all the problems in IT that were caused by bugs and poor designs of parsers/serializations, e.g. SQL injections, XSS, null byte vulns etc., we get billions of human hours in damages.
What should be instead is an absolutely clear serialization format into a byte string of ANY data structure that must processed by two different programs.
Parsers are programs, they should "parse" bytes, not strings, like we humans do.
by conartist6 on 12/25/23, 12:54 PM
If BABLR succeeds in creating a shared instruction set for defining parsers, you'd just have portable parser grammars running on compatible parser VMs
by o11c on 12/24/23, 10:55 PM
Usually? a result of the parser not having a machine-readable specification.
For parsing proper, `bison --xml` is useful if you're allergic to code-generation. I don't have an equivalent for lexing.
by RicoElectrico on 12/25/23, 11:33 AM
Honestly we should have a name for such class of bugs. It's not an "I didn't know" kind of mistake. Every person sufficiently intelligent to program should figure out by themselves that having 2 parser implementations can cause various undesired consequences.
by sylware on 12/23/23, 10:52 AM
Usually, some not verified and cleaned enough external input text managed to get into some complex and often brain damaged text parser (printf,sql,etc).