by softwaredoug on 5/5/25, 7:13 PM with 2 comments
by jbellis on 5/8/25, 3:24 PM
> Next, my SPLADE implementation in Elasticsearch is oversimplified. If you scroll back up to get_splade_embedding, we extract non-zero elements from vec_np (the SPLADE tokens) but discard their associated weights. This is a missed opportunity. The SPLADE papers use these weights for scoring matches.
Yes, exactly, that is the whole point of SPLADE.
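To make that concrete, here is a rough sketch in plain Python of what the weights buy you. The tokens, weights, and function names are all made up for illustration; this is not the article's get_splade_embedding code, and it is not how Lucene or Elasticsearch score internally. It only assumes the usual sparse dot product: sum the query weight times the document weight over the tokens both share.

```python
def sparse_dot(query_weights, doc_weights):
    """Weighted score: sum of query_weight * doc_weight over shared tokens."""
    # Iterate over the smaller mapping so lookups hit the larger one.
    if len(query_weights) > len(doc_weights):
        query_weights, doc_weights = doc_weights, query_weights
    return sum(
        weight * doc_weights[token]
        for token, weight in query_weights.items()
        if token in doc_weights
    )


def token_overlap(query_weights, doc_weights):
    """Weightless score: just count the shared tokens."""
    return len(query_weights.keys() & doc_weights.keys())


# Hypothetical SPLADE-style expansions (token -> weight), invented for this example.
query = {"splade": 2.0, "sparse": 1.0, "retrieval": 0.5}
doc_a = {"splade": 1.5, "bm25": 0.25}
doc_b = {"sparse": 0.5, "retrieval": 0.5}

for name, doc in (("doc_a", doc_a), ("doc_b", doc_b)):
    print(name, token_overlap(query, doc), sparse_dot(query, doc))
```

With these toy numbers, doc_b wins on token overlap alone (2 shared tokens vs 1), but doc_a wins once the weights are used (3.0 vs 0.75). That ranking flip is what you lose by throwing the weights away.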
It's probably easier to demonstrate if you drop down a level to Lucene; I don't think you will be able to do it easily with Elasticsearch.
Tangentially, I haven't looked closely at SPLATE, which tries to marry SPLADE and ColBERT, but it's an interesting idea. https://arxiv.org/html/2404.13950v1