by substation13 on 4/5/23, 8:05 AM with 4 comments
Often, these formats are not a good fit for the problem at hand.
However, no one seems to be writing parsers for their own custom formats. If you did, it would certainly get a few strange looks in code review!
But why is this? Why is working with JSON etc. still so much easier than writing quick parsers?
Why haven't common parsing techniques been streamlined to the point where this is the easiest path?
by DemocracyFTW2 on 4/5/23, 1:41 PM
Personally I too think parsing should be easier in this day and age; I believe Raku (Perl 6) has made meaningful strides in that direction. Other than that, I feel parsing is somewhat over- and lexing is somewhat underrated, if anything. In my experience lexing is really the step you want most to get data out of a byte sequence, and I agree that it should and could be much easier. FWIW JavaScript's regexes recently gained the sticky ('y') flag, which is ultra-beneficial for lexing. Not sure why that bit took so long.
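Roughly, sticky-flag lexing looks like this: setting `lastIndex` before each `exec` forces the match to start exactly at the cursor, so tokens must be contiguous. (A minimal sketch; the token names and rules here are just illustrative.)

```typescript
type Token = { kind: string; text: string };

// Illustrative token rules; each regex carries the sticky "y" flag.
const rules: [string, RegExp][] = [
  ["number", /\d+/y],
  ["ident", /[A-Za-z_]\w*/y],
  ["op", /[+\-*\/=]/y],
  ["ws", /\s+/y],
];

function lex(input: string): Token[] {
  const tokens: Token[] = [];
  let pos = 0;
  while (pos < input.length) {
    let matched = false;
    for (const [kind, re] of rules) {
      re.lastIndex = pos; // sticky: match must begin exactly here
      const m = re.exec(input);
      if (m) {
        if (kind !== "ws") tokens.push({ kind, text: m[0] });
        pos = re.lastIndex; // advance the cursor past the match
        matched = true;
        break;
      }
    }
    if (!matched) throw new Error(`unexpected character at position ${pos}`);
  }
  return tokens;
}
```

Without `y`, a plain `exec` would happily skip ahead to the next match, silently swallowing garbage between tokens; the sticky flag is what makes regexes usable as a lexer.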
by surprisetalk on 4/5/23, 1:26 PM
People use JSON because they need to send information over a wire, and JSON serializers are abundant. I agree that this is problematic for a bunch of different reasons:
[1] https://taylor.town/json-considered-harmful
But note that this problem has been solved many, many times:
[2] https://en.wikipedia.org/wiki/Comparison_of_data-serializati...
Formats like MessagePack and Cap'n Proto have a lot of nice properties.
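For a taste of why MessagePack is so compact: its spec packs small values into one or two bytes. Here's a hand-rolled sketch of the two smallest encodings (positive fixint and fixstr); in practice you'd use an off-the-shelf msgpack library rather than this.

```typescript
// Encode a small positive integer or short string per MessagePack's
// "positive fixint" (0x00-0x7f) and "fixstr" (0xa0 | length) formats.
// Everything else is out of scope for this sketch.
function packSmall(value: number | string): number[] {
  if (typeof value === "number") {
    if (!Number.isInteger(value) || value < 0 || value > 0x7f)
      throw new Error("only positive fixint (0..127) in this sketch");
    return [value]; // positive fixint: the byte IS the value
  }
  const bytes = Array.from(new TextEncoder().encode(value));
  if (bytes.length > 31)
    throw new Error("only fixstr (<= 31 bytes) in this sketch");
  return [0xa0 | bytes.length, ...bytes]; // tag byte, then raw UTF-8
}
```

So the integer 42 is a single byte, and a two-character string is three bytes, versus JSON's quoted, escaped, delimiter-heavy text.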
Writing parsers is not easy. And it's especially not easy when you have a custom format that different people want to do different things with.
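To illustrate the point: even a "quick" parser for a trivial custom key=value format takes real care around errors and edge cases. A minimal hypothetical sketch in TypeScript (the format and helper names are made up for illustration):

```typescript
// A parse result either succeeds with a value and the remaining input,
// or fails; a Parser is just a function over the input string.
type Result<T> = { ok: true; value: T; rest: string } | { ok: false };
type Parser<T> = (input: string) => Result<T>;

// Build a parser that matches a regex at the start of the input.
const regex = (re: RegExp): Parser<string> => (input) => {
  const m = re.exec(input);
  return m && m.index === 0
    ? { ok: true, value: m[0], rest: input.slice(m[0].length) }
    : { ok: false };
};

// Parse lines like "name=value" into a record, with per-line errors.
function parsePairs(input: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const line of input.split("\n")) {
    if (line.trim() === "") continue; // tolerate blank lines
    const key = regex(/[A-Za-z_]\w*/)(line);
    if (!key.ok) throw new Error(`bad key in line: ${line}`);
    const eq = regex(/=/)(key.rest);
    if (!eq.ok) throw new Error(`expected '=' in line: ${line}`);
    out[key.value] = eq.rest;
  }
  return out;
}
```

And this still ignores escaping, duplicate keys, line/column reporting, and every other thing real users will throw at it, which is exactly why hand-rolled formats get strange looks in code review.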
---
Btw, I've tried out pretty much every parsing library in Rust, Typescript, and Haskell.
Elm's parsing library is the only one I enjoy using:
[3] https://package.elm-lang.org/packages/elm/parser/latest
I think nearley.js has a cool interface but poor execution:
by pestatije on 4/5/23, 8:47 AM