from Hacker News

Embedding Quantization: 25-45x retrieval speedup, 32x or 4x less memory usage

by cubie on 3/22/24, 4:00 PM with 0 comments