from Hacker News

JC – JSONifies the output of many CLI tools

by pgl on 11/3/22, 7:55 AM with 132 comments

by asicsp on 11/3/22, 9:08 AM
See also:
* "Bringing the Unix philosophy to the 21st century (2019)" (https://blog.kellybrazil.com/2019/11/26/bringing-the-unix-ph...) - https://news.ycombinator.com/item?id=28266193 238 points | Aug 22, 2021 | 146 comments
* "Tips on adding JSON output to your CLI app" (https://blog.kellybrazil.com/2021/12/03/tips-on-adding-json-...) - https://news.ycombinator.com/item?id=29435786 183 points | 11 months ago | 110 comments
by ducktective on 11/3/22, 8:48 AM
Then pipe it into jq [1] to query parameters or build up a formatted string.
Or pipe it into rq [2] to convert the format to yaml, toml etc.
[1]: https://stedolan.github.io/jq/tutorial/ [2]: https://github.com/dflemstr/rq#format-support-status
by spinningslate on 11/3/22, 9:03 AM
Didn't know about this, the HN dividend pays out again!
When wrestling with sed/awk in trying to parse results of a shell command, I've often thought that a shell-standard, structured outpout would be very handy. Powershell[0] has this, but it's a binary format - so not human-readable. I want something in the middle: human- and machine-readable. Without either having to do parsing gymnastics.
jc isn't quite that shell standard, but looks like it goes a long way towards it. And, of course, when JSON falls out of fashion and is replaced by <whatever>, `*c` can emerge to fill the gap. Nice.
[0]: https://learn.microsoft.com/en-us/powershell/
by vesinisa on 11/3/22, 9:15 AM
Seems to be part of the Arch repository: https://archlinux.org/packages/community/any/jc/
Why does this site recommend using "paru", "aura" or "yay" to install it on Arch? I have been using Arch for a decade or so but have never even heard of such tools. They don't even have pages in the Arch wiki, and only yay ("Pacman wrapper and AUR helper written in go") is available via the standard repository.
Begs the question: what is so wrong with plain pacman?
EDIT: Okay so seems they were previously on AUR and once accepted to community repository they just forgot to stop recommending an AUR wrapper for installing: https://github.com/kellyjonbrazil/jc/commit/f2dd7b8815edc92e...
EDIT2: Created a PR with the GitHub.dev editor .. Absolutely blown away by how easy it was! Feels like the future of development.. https://github.com/kellyjonbrazil/jc/pull/310
by dmoura on 11/3/22, 11:07 AM
This is great!
I am the author of SPyQL [1]. Combining JC with SPyQL you can easily query the json output and run python commands on top of it from the command-line :-) You can do aggregations and so forth in a much simpler and intuitive way than with jq.
I just wrote a blogpost [2] that illustrates it. It is more focused on CSV, but the commands would be the same if you were working with JSON.
[1] https://github.com/dcmoura/spyql [2] https://danielcmoura.com/blog/2022/spyql-cell-towers/
by garfieldnate on 11/6/22, 9:45 AM
One of the features of Julia that I really like and wish would be adopted elsewhere is that the stringification methods take a context object that can indicate the string is to be presented in: plaintext, ansii-colored terminal, json, HTML, LaTeX, GraphViz, etc. It would have been amazing to have this kind of flexibility considered in the Unix tools.
That said, I would argue that JSONLines is a better universal output format when you're dealing with pipelines. If the output is one giant JSON array, then you have to wait for a long-running program to finish completely before the output can be passed on to the next long-running program. If you output one JSON line at a time as you process your input, then the next program in the pipeline can get started on processing already without waiting for the first to finish completely.
by enriquto on 11/3/22, 8:41 AM
In the opposite direction, if you want to recover a human-editable text stream from json data, you run gron.
by pgl on 11/3/22, 8:00 AM
Command line output as JSON. Very handy and has parsers for tons of common utilities, written in Python.
Blog post with examples here: https://blog.kellybrazil.com/2020/08/30/parsing-command-outp...
by anecdotal1 on 11/3/22, 12:01 PM
Can you trust it? Cli tool output is not exactly stable. I thought that's why libxo exists?
https://github.com/Juniper/libxo
by rob74 on 11/3/22, 9:14 AM
Great idea, but sounds like a maintenance nightmare to me. Not only that many users will complain that their favorite CLI tool isn't supported, but also a new release of any of the supported CLI tools might break the support without any kind of warning, as I don't think changes to the (human-readable) output are considered major changes.
by HyperSane on 11/3/22, 6:05 PM
The greatest thing about powershell is that all commands returns structured output like this.
by otikik on 11/3/22, 11:47 AM
This is a good step in the right direction, which IMHO should be that all those cli tools should have a —-json flag.
by nixcraft on 11/3/22, 9:00 AM
Last year I looked for JSON output with the dig command on Linux but found the yaml option while reading the man page. It was handy. I always wondered why yaml/JSON output is not standard with Linux and Unix utilities for scripting needs. Anyway:
```
     dig +yaml google.com
```
by khiqxj on 11/3/22, 3:32 PM
There are at least 3 ways this can create bugs:
```
  - It has to parse output of commands which may or may not be intended to be parsed and may or may not have a predictable format. The only way to overcome this is if this program becomes one of the Big Four "UN*X command output -> data" converters

  - It casts things to "float/int"

  - Depending on who made this library, the output itself may not be strict / predictable. Perhaps it will output JSON with two different key names in two different scenarios.
```
And don't forget that any of these issues will still come up even if they are accommodated for, due to versions of programs changing without the author of this tool knowing.
But yeah basic things having intrinsic shortcomings is a given, when you're using UN*X.
From the nested article:
> Had JSON been around when I was born in the 1970’s Ken Thompson and Dennis Ritchie may very well have embraced it as a recommended output format to help programs “do one thing well” in a pipeline.
They had S-expressions and plenty more options. They also could have just made a format as you can tell with thousands of ad-hoc trendy new formats like YAML and TOML being spewed out every recent year now that programmers discovered data structures.
by fabianthomas on 11/5/22, 10:24 AM
I'm really concerned about performance/latency with the json approach. I couldn't find anyone discussing that yet completely.
Even when using jq (written in C) my quick tests show that parsing json is really slow compared to parsing with simple unix tools like awk. I suspect that to come from the fact that the parser has to check the full json output first in order to print a result, while awk does not care about syntax of the output.
I compared two shell scripts both printing the ifindex of some network device (that is the integer in the first column) 10 times.
Using awk and head combined gives me 0,068s total time measured with the time command.
Using ip with the -j flag together with jq gives 0,411s.
Therefore the awk approach is 6 times faster. And here I used a binary (ip) that already supports json output and doesn't even need the mentioned jc.
While this whole test setup is somewhat arbitrary I experienced similar results in the past when writing shell scripts for, e.g., my panel. Reach out to me if you are interested in my test setup.
by uvesten on 11/3/22, 8:49 AM
As someone who uses `jq` almost daily, this looks like a great tool I didn’t know I needed. Thanks for the tip!
by mg on 11/3/22, 9:59 AM
```
    dig example.com | jc --dig
```
Seems a bit redundant. Maybe it should be the other way round?
```
    jc dig example.com
```
Similar to how you do
```
    time dig example.com
```
by pastage on 11/3/22, 10:21 AM
I Love using this in streams of data, but there is a lot to be said about the pit falls. Some of them; first you might want to filter with grep first and that leads to missing metadata, second error handling is a good thing but people tend to ignore errors and then not handle them. This is basically what filebeat/logstash from Elastic does, which is a beast at parsing (and impossible to use from command line).
The power of plain text pipes is that you do not interpret them and that makes them fast, that is usefull because you handle both 100 bytes, 1MB and 1TB as input. You choose what you parse keeping it simple, fast and usually error free. This tool miss the, fast, simple and human readable part of debugging pipes. Which is fine!
by naikrovek on 11/3/22, 9:31 PM
This seems very fragile, to me, without support from the application whose output is converted to JSON.
minor updates to command-line tools can and do subtly alter the textual output of the tool, and the outputs of these tools are not standardized.
This is a step towards "objects passing messages" as originally conceived by Alan Kay, if my incomplete understanding of what he's said is correct, and that's a good thing, I think. Objects passing messages around is a very solid model for computing, to me. Note that I am stupid and don't understand much, if I'm honest.
by synergy20 on 11/3/22, 12:12 PM
to make this great tool truly universal, it has to be written in c instead of python these days, then provide python|javascript|etc bindings if possible.
I'd like to use it on embedded systems, where python is too large to fit. this tool can be widely deployed just like awk|sed|etc but it has to be in C for that.
by codedokode on 11/3/22, 12:31 PM
I think that JSON is a bad choice here.
It is obvious that CLI commands should produce machine-readable output because they are often used in scripts, and accept machine-readable input as well. Using arbitrary text output was a mistake because it is difficult to parse, especially when spaces and non-ASCII characters are present.
A good choice would be a format that is easily parsed by programs but still readable by the user. JSON is a bad choice here because it is hard to read.
In my opinion, something formatted with pipes, quotes and spaces would be better:
```
    eth0:
      ip: 127.15.34.23
      flags: BROADCAST|UNICAST
      mtu: 1500
      name: """"Gigabit" by Network Interfaces Inc."""
```
Note that the format I have proposed here is machine-readable, somewhat human-readable and somewhat parseable by line-oriented tools like grep. Therefore there might be no need for switches to choose output format. It is also relatively easy to produce without any libraries.
Regarding idea to output data in /proc or /sys in JSON format, I think this is wrong as well. This would mean that reading data about multiple processes would require lot of formatting and parsing JSON. Instead or parsing /proc and /sys directly, applications should use libraries distributed with kernel, and reading the data directly should be discouraged. Because currently /proc and /sys are just a kind of undocumented API.
Also, I wanted to note that I dislike jq utility. Instead of using JSONPath it uses some proprietary query format that I constantly fail to remember.
by Too on 11/5/22, 10:46 AM
While I can appreciate the immediate value this tool brings for some applications.
At the same time it’s an epitome of everything that is wrong with current software landscape. Instead of fixing the deficiencies in upstream, once and for all, we just keep piling more and more layers on top.
Programs having structural data and APIs inside, that get translated into human representation - only to be re-parsed again into structural form. What could possibly go wrong?
If you are already in any programming environment, many of the tools already have better built in APIs. I mean who needs an “ls” or timestamp parser. Just use os.listdir or equivalent. As someone previously pointed out in this thread, the ls parser is in fact already broken, unsurprisingly. Mixing tools made for interactive use in automation is never a good idea.
The Unix philosophy sounds romantic in theory, but need structural data, throughout, to work reliably in practice. Kids, go with the underlying apis unless your tool has structured output.
by pessimizer on 11/3/22, 5:06 PM
I don't know why I never considered this sort of option to powershell out bash. My only problem with it is that it's in python (also why I don't really use jq), and that it's not something that just sets aliases behind the scenes.
If this were written in a performant language, if it simply aliased (i.e. invisibly) all common cli commands to a wrapper which would obviate the need for all of the text processing between steps in command pipelines, if it were versioned and I could include the version number in scripts, and finally if I could run versioned scripts through it to compile them into standard bash scripts (a big ask), I'd give it a 3 month test starting today. There'd be nothing to lose.
Just putting that out there for people who like to rewrite things in Rust. A slightly different version of this concept could allow for nearly friction-free adoption.
by esskay on 11/3/22, 9:00 AM
Been using this a while to pull out a bunch of server stats for a monitoring dashboard and it's been great. I just wish there were more supported services. The total mess of different outputs and config types for Linux packages is extremely annoying to deal with.
by wooptoo on 11/3/22, 11:56 AM
Unrelated tool but also very useful in combination with JQ: gron
https://github.com/tomnomnom/gron
by hinkley on 11/3/22, 5:14 PM
I discovered process.send() in Node a couple years ago and it made the decision to fork a child process a lot easier. No need to sanitize command line output when you can do direct IPC over a connection that uses JSON under the hood.
The itch I can’t seem to scratch is how to run tasks in parallel and have logs that are legible to coworkers. We do JSON formatted logs in production and I’m wondering if something like this would help solve that set of problems.
by wodenokoto on 11/4/22, 12:26 PM
Something like this is what I've been thinking is the path forward to non-posix shells and a way to get away from cumbersome foot-gunny bash.
Nushell that hit the front page earlier this week seemed to me to be limited by "compatible" apps, but wrapping all the big ones in a json converter superficially seems like a great solution to me.
by menjaprunes on 11/3/22, 8:57 AM
I stumbled upon this few weeks ago to parse the output of nmcli and now it is in my toolset for scripting
I love it
by fimdomeio on 11/3/22, 11:42 AM
There's a clash of names between this jc and autojump (https://github.com/wting/autojump) jc (jump to child)
by friendzis on 11/3/22, 10:19 AM
Sounds a lot like Powershell, which returns rich objects instead of a text blob
by exabrial on 11/3/22, 3:45 PM
Have a parseable output is great. What would be more incredible is to have a parseable output with a schema definition and/or formal grammar of some sort.
by psadri on 11/3/22, 9:31 PM
Is anyone aware of something similar but for the args? A database that maps a command to a schema for all it’s possible cli arguments?
by sindoc on 11/3/22, 10:20 AM
OMG, I was looking for exactly something like this. I’d like to create immersion between Logseq and CLI. This can very useful.
by visarga on 11/3/22, 8:56 AM
love it, json could be the best way to pipe data between processes because it is text, but structured
by deafpolygon on 11/3/22, 11:07 AM
Bringing a powershell feature to the Unix CLI.
by berkes on 11/3/22, 12:00 PM
A slightly related pet-peeve: I don't like it when "random" commands "squat" the two-letter domain, or worse, the one-letter domain: t, jq, jc etc.
In my perfect world (which, obviously doesn't exist), commands from tools "in the wild" are at least three letters long. With historical exceptions for gnutools: preferably they'd take the three-letter space, but two-letters (cd, ls, rm etc) is fine.
Two letter space outside of gnutools, is then reserved for my aliases. If jsonquery is too long to type, AND I lack autocomplete, then an alias is easy and fast to make. alias jq=jsonquery.
In the case of this tool, it will conflict with a specialised alias I have: `alias jc=javac -Werror`. Easy to solve by me with another alias, but a practical example of why I dislike tools "squatting" the two letter namespace.