by diqi on 1/2/24, 6:21 PM with 36 comments
by jbellis on 1/2/24, 10:26 PM
I'm the lead author of JVector, which scales linearly to at least 32 cores and may be the only graph-based vector index designed around nonblocking data structures (as opposed to using locks for thread safety): https://github.com/jbellis/jvector/
JVector looks to be about 2x as fast at indexing as Lantern, ingesting the Sift1M dataset in under 25s on a 32-core AWS box (m6i.16xl), compared to 50s for Lantern in the article.
(JVector is based on DiskANN, not HNSW, but the configuration parameters are similar -- both are configured with graph degree and search width.)
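To illustrate the lock-vs-nonblocking distinction, here is a minimal sketch of a bounded neighbor list updated with compare-and-set rather than a lock. It is not JVector's code: the class `NonblockingNeighbors`, its methods, and the `maxDegree` parameter (standing in for "graph degree") are hypothetical names made up for this example, and a real index would also need diversity pruning and duplicate handling beyond what is shown.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch, NOT taken from JVector: one node's neighbor list,
// kept sorted by distance and capped at maxDegree entries.
final class NonblockingNeighbors {
    // Immutable (node id, distance) pair.
    static final class Neighbor {
        final int id;
        final float distance;
        Neighbor(int id, float distance) {
            this.id = id;
            this.distance = distance;
        }
    }

    private final int maxDegree; // assumed analogue of "graph degree"
    // Writers never mutate a published array; they swap in a new one.
    private final AtomicReference<Neighbor[]> current =
            new AtomicReference<>(new Neighbor[0]);

    NonblockingNeighbors(int maxDegree) {
        this.maxDegree = maxDegree;
    }

    // Insert a candidate without taking a lock: build a new array from the
    // current one, then publish it with compareAndSet. If another thread
    // published first, the CAS fails and we retry against the newer array.
    void insert(int id, float distance) {
        while (true) {
            Neighbor[] old = current.get();
            for (Neighbor n : old) {
                if (n.id == id) {
                    return; // already present, nothing to do
                }
            }
            Neighbor[] merged = Arrays.copyOf(old, old.length + 1);
            merged[old.length] = new Neighbor(id, distance);
            Arrays.sort(merged, (a, b) -> Float.compare(a.distance, b.distance));
            // Keep only the closest maxDegree neighbors.
            Neighbor[] next = merged.length > maxDegree
                    ? Arrays.copyOf(merged, maxDegree)
                    : merged;
            if (current.compareAndSet(old, next)) {
                return;
            }
        }
    }

    // Readers see a consistent snapshot and never block writers.
    int[] snapshotIds() {
        Neighbor[] snap = current.get();
        int[] ids = new int[snap.length];
        for (int i = 0; i < snap.length; i++) {
            ids[i] = snap[i].id;
        }
        return ids;
    }
}
```

With this pattern, a search thread calling `snapshotIds()` never waits on an indexing thread calling `insert(...)`; the cost is that a writer may redo its work when it loses a race.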
by nerfborpit on 1/2/24, 10:58 PM
I agree that USearch is fast, but it feels pretty dishonest to take credit for someone else's work. Like maybe at least honestly profile what's going on with USearch vs pgvector (and with which pgvector settings?), and write something interesting about it?
The last time I tried Lantern, it would segfault whenever I tried to do anything non-trivial with it, and it was incredibly unsafe in how it handled memory. Hopefully that's at least fixed, but Lantern has so many red flags.
by lettergram on 1/2/24, 9:10 PM
by mattashii on 1/2/24, 9:06 PM
by levkk on 1/2/24, 8:24 PM
by justinclift on 1/2/24, 10:31 PM
by TuringNYC on 1/2/24, 9:15 PM
by netcraft on 1/2/24, 9:12 PM
Still, very impressive