from Hacker News
Embedding Quantization: 25-45x retrieval speedup, 32x or 4x less memory usage
by cubie on 3/22/24, 4:00 PM with 0 comments