from Hacker News

Show HN: ColBERT Build from Sentence Transformers

by raphaelty on 11/18/23, 11:14 AM with 18 comments

  • by ramoz on 11/18/23, 12:59 PM

    Anecdote: neural-cherche seems useful, as I have analysts creating positive & negative feedback data (basically thumbs-up/down signals) that we will use to fine-tune retrieval models.

    Am I right to assume that not much effort is required to make this work for similar models (e.g., BGE)?
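
    (A minimal sketch of how such thumbs-up/down feedback could be reshaped into the (query, positive, negative) triplets that triplet-based fine-tuning expects. The feedback rows and field layout here are made up for illustration and are not neural-cherche's API.)

      from collections import defaultdict
      from itertools import product

      feedback = [
          # (query, document, signal) rows collected from analyst feedback
          ("heart attack symptoms", "Myocardial infarction signs include ...", +1),
          ("heart attack symptoms", "Cardiac arrest is the abrupt loss ...", -1),
      ]

      # Group each query's thumbs-up and thumbs-down documents.
      by_query = defaultdict(lambda: {"pos": [], "neg": []})
      for query, document, signal in feedback:
          by_query[query]["pos" if signal > 0 else "neg"].append(document)

      # Emit one triplet per (positive, negative) combination for each query.
      triplets = [
          (query, positive, negative)
          for query, docs in by_query.items()
          for positive, negative in product(docs["pos"], docs["neg"])
      ]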

  • by tinyhouse on 11/18/23, 12:12 PM

    Looks cool. A couple of questions:

    1. Does it support fine-tuning with different losses? For example, a loss where you don't need to provide negatives and the other examples in the batch are used as negatives.

    2. Can you share inference-speed info? I know ColBERT should be slow, since it creates many embeddings per passage.
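
    (On question 1: the loss described is the in-batch-negatives idea behind sentence-transformers' MultipleNegativesRankingLoss. Whether neural-cherche supports it is exactly what is being asked; the sketch below only shows the loss itself, assuming one positive document per query in the batch.)

      import torch
      import torch.nn.functional as F

      def in_batch_negatives_loss(query_emb, doc_emb, scale=20.0):
          """query_emb, doc_emb: (batch, dim); doc_emb[i] is the positive for query_emb[i]."""
          query_emb = F.normalize(query_emb, dim=-1)
          doc_emb = F.normalize(doc_emb, dim=-1)
          scores = scale * (query_emb @ doc_emb.T)        # (batch, batch) cosine similarities
          labels = torch.arange(scores.size(0), device=scores.device)
          return F.cross_entropy(scores, labels)          # diagonal = positives, rest = negatives
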
  • by kamranjon on 11/18/23, 6:02 PM

    What sort of high-level, user-facing feature could you build with this?

  • by espadrine on 11/18/23, 12:15 PM

    I like the inclusion of both positive and negative examples!

    Do you have advice on how to measure the quality of the fine-tuning beyond watching the loss drop?
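
    (One common answer, sketched under the assumption of a held-out set of labelled (query, relevant document) pairs: track a ranking metric such as recall@k instead of the training loss. `score` stands in for whatever scoring function the fine-tuned model exposes; it is not a neural-cherche name.)

      def recall_at_k(eval_pairs, corpus, score, k=10):
          """eval_pairs: [(query, relevant_doc)]; corpus: every candidate document."""
          hits = 0
          for query, relevant in eval_pairs:
              ranked = sorted(corpus, key=lambda doc: score(query, doc), reverse=True)
              hits += relevant in ranked[:k]
          return hits / len(eval_pairs)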

  • by barefeg on 11/18/23, 1:03 PM

    Do you need to have the same number of positives and negatives? Is there any meaning to pairing a particular positive with a particular negative in the triplet?
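
    (For context: in a standard margin-style triplet loss, each (query, positive, negative) triple contributes one term that compares exactly that positive against exactly that negative, so the pairing is what the loss is defined over. Whether neural-cherche interprets triplets this way is the open question; this is only the generic formulation.)

      import torch

      def triplet_margin_loss(pos_scores, neg_scores, margin=1.0):
          """pos_scores[i], neg_scores[i]: model scores for the i-th triplet's documents."""
          return torch.clamp(margin - pos_scores + neg_scores, min=0).mean()
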
  • by vorticalbox on 11/18/23, 12:26 PM

    Is a negative document one that doesn't match the query?
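
    (Typically yes: a document labelled or assumed not relevant to the query. In practice, "hard" negatives are often mined by taking a weaker retriever's top results and dropping the known positives. A sketch using the rank_bm25 package with placeholder data:)

      from rank_bm25 import BM25Okapi

      corpus = ["first document ...", "second document ...", "third document ..."]
      relevant = {"example query": {"first document ..."}}   # known positives per query

      bm25 = BM25Okapi([doc.split() for doc in corpus])

      query = "example query"
      candidates = bm25.get_top_n(query.split(), corpus, n=3)
      negatives = [doc for doc in candidates if doc not in relevant[query]]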