A better tool than jq for just extracting data, antonmedv's fx (github.com/antonmedv)
76 points by gigatexal on June 6, 2023 | 19 comments


This is cool -- I like not having to learn jq's bespoke syntax for finding and filtering. But, at a glance it looks like it just shells out to whatever Node.js it finds, so it's not exactly portable or self-contained.

What I really want is a version of this type of tool that's truly a single binary, that implements JSONPath, or some spec'd alternative that I can invest in learning, knowing it's not just a fad.


If you want to explore fully specified alternatives, there is also the RumbleDB engine (www.rumbledb.org), which implements JSONiq, like Xidel. It can be downloaded as a single jar file, and used with a simple java command: queries can be run as single commands (possibly reading from stdin, like jq), or on the shell, or in Jupyter notebooks.

RumbleDB is free, with no restriction on commercial use, and open source. It is an academic project, the product of now 6 years of work by 20+ ETH Zurich BSc and MSc students in their projects and theses.

You will find a video tutorial given at the Declarative Amsterdam conference last year here: https://www.youtube.com/watch?v=3YkLXQVyN2o (link edited, it was wrong)

RumbleDB works with small JSON files but also with several terabytes of JSON. It also works with many other formats; for example, it allows validating data and then storing it in a more efficient binary format (like Parquet).

It works on a laptop by spreading the computations on all the cores, but also in the cloud (it was tested on a cluster of 64 Amazon EMR machines manipulating billions of objects).

It can read and write data to and from a laptop's local drive, but also to and from data lakes (S3, HDFS...).

Feature-wise, JSONiq can do everything SQL can do (projection, selection, grouping, sorting, joins, etc), while also supporting heterogeneous and nested datasets seamlessly (normalization, denormalization, navigation, etc). 95% of it is directly based on a W3C standard including many of its builtin functions.
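To make the operations listed above concrete, here is a rough sketch in plain Python. It is not JSONiq syntax and not RumbleDB's API; the sample records and variable names are invented purely for illustration.

```python
from collections import defaultdict

# Invented sample data: heterogeneous (the third record has no "tags")
# and nested (each record carries an "address" object).
records = [
    {"name": "Ada", "address": {"city": "Zurich"}, "tags": ["a", "b"]},
    {"name": "Bob", "address": {"city": "Bern"}, "tags": ["b"]},
    {"name": "Cy", "address": {"city": "Zurich"}},
]

# Selection + projection: names of the people located in Zurich.
zurich_names = [r["name"] for r in records if r["address"]["city"] == "Zurich"]

# Grouping + sorting: names per city, with cities in alphabetical order.
by_city = defaultdict(list)
for r in records:
    by_city[r["address"]["city"]].append(r["name"])
grouped = sorted(by_city.items())

# Flattening (unnesting): one (name, tag) row per tag.
# A record without a "tags" field simply contributes no rows.
name_tag_rows = [(r["name"], t) for r in records for t in r.get("tags", [])]

print(zurich_names)
print(grouped)
print(name_tag_rows)
```

A declarative query language expresses the same steps as a single query rather than as separate loops, which is the appeal being described.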

RumbleDB also has a Machine Learning library.


You could try Xidel[1]. It supports JSON, XML and HTML using XPath/XQuery 3.1

It has some extensions to the standard that are pretty nice (JSONiq, CSS selectors, html “template” matching), but you can limit it to just standard XPath/XQuery if you like.

I recommend getting the nightly 0.9.9 build if you give it a try; the stable 0.9.8 version is pretty old, and I’ve had no issues with 0.9.9.

1. https://www.videlibri.de/xidel.html


So I've heard this a lot. Am I the only one who found jq to be fairly sensible and intuitive?

I'm not saying I don't need to pull up the docs once or twice for regex or leafpaths (or was it leaf_paths, lol), but overall I can't think of something that needs to be pulled into the standard install more. It's near perfect.


yes, you are the only one. that's part of what makes you special!

[j/k. i like it too, but i'm sure you're still special. :D]


Weird sensationalized title to just show an alternative.

I expected a blog post comparing both tools.


I feel like the animated gif does a great job explaining how it works. If you're at all familiar with jq, it's immediately obvious what this tool does.


Seeing what a tool does isn't a comparison. A blog post would hopefully be insightful and point out non-obvious things, like how jq is arguably abandonware. https://github.com/jqlang/jq/issues/2305


Looks like the jq repo just got some new maintainers and has had ≈20 commits over the last few days.


I don't see how jq being abandonware is relevant to a different tool with a different feature set.


See, this is exactly what a blog post could explain. It's relevant because people who need a tool that hasn't been abandoned are looking for an alternative. The reducers in fx could be a replacement for jq in many use cases, and their feature sets overlap quite a lot.


Apples vs oranges.

`jq` is a CLI processor for JSON (you can filter values etc), while `fx` is a JSON viewer (with some `reducers` functionality).


jq's syntax is complex, so there are many alternatives to it. Here's another one:

https://github.com/prashanth-hegde/jpath


apologies for the clickbait title. i'm dumb like that sometimes.

fx is dope but i think zq from the folks behind the zed language might even be nicer: https://www.brimdata.io/blog/introducing-zq/


What a heck of a great blog post. I learned a lot about jq from how he described the computational model, and I’m keen to try zq now, thanks!


Aww shucks. You’re welcome!


Sometimes, for extremely simple uses, when I don't want the hassle of selecting a Docker container that has jq, I'll just use a Python one-liner. That way I don't have to learn extra syntax, and I can use Python's list comprehensions, etc.:

  <<< '{"a":"b", "c":"d"}' python3 -c 'import json,sys;print(json.loads(sys.stdin.read())["c"])'

This also works with a Python script with more lines if I need to add complexity or readability, or any feature, API calls, etc.
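
A minimal sketch of what the multi-line variant might look like, assuming the same input shape as the one-liner. The `extract` helper, its dotted-path argument, and the file name in the usage comment are hypothetical names made up for this example, not part of any tool in this thread.

```python
import json
import sys


def extract(doc, path):
    """Walk a dotted path such as "c" or "items.0.name" through parsed JSON."""
    current = doc
    for key in path.split("."):
        if isinstance(current, list):
            current = current[int(key)]  # list segments are numeric indexes
        else:
            current = current[key]       # dict segments are plain keys
    return current


def main(argv, stream):
    text = stream.read()
    if not text.strip():
        return  # nothing was piped in, so there is nothing to extract
    path = argv[1] if len(argv) > 1 else "c"
    print(extract(json.loads(text), path))


# Usage (hypothetical file name):
#   python3 extract.py c <<< '{"a":"b", "c":"d"}'
if __name__ == "__main__" and not sys.stdin.isatty():
    main(sys.argv, sys.stdin)
```

Keeping the lookup in its own function leaves room for the extra complexity mentioned above (error handling, API calls) without touching the stdin plumbing.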


I’ve done this too. I almost went as far as creating my own version of fx with it, but then decided that, as with AWK, if I wanted to do something really fancy I’d just write a script or use a << EOF heredoc and put the code inline.


This is the first non-jq tool I've seen that actually excites me enough to try it

I'm curious though, would you be able to add a transformation with this into a pipeline? It'd be cool if it had a way to convert to jq syntax if not



