by pps on 12/30/23, 1:07 PM with 42 comments
by minimaxir on 12/30/23, 7:46 PM
1) There are two primary ways to have models generate embeddings: implicitly, by mean-pooling the last hidden state of an LLM, since it has to learn how to map text into a distinct latent space anyway to work correctly (e.g. DistilBERT; see the first sketch after this list), or explicitly, with a model trained to generate embeddings directly using something like triplet loss to incentivise learning similarity/dissimilarity. Popular text-embedding models like BAAI/bge-large-en-v1.5 tend to use the latter approach.
2) The famous word2vec examples, e.g. king - man + woman ≈ queen, only work because word2vec is a shallow network and the model learns the word embeddings directly, instead of them being emergent. The latent space still maps such words closely, as shown in this demo, but there isn't any reliable algebraic intuition: you can get close with vector arithmetic, but no cigar (see the second sketch below).
3) DistilBERT is pretty old (2019) and based on a 2018 model trained on Wikipedia and books, so there will be significant text drift, and it misses out on newer modeling techniques and larger, more robust training datasets. I do not recommend using it for production applications nowadays.
4) There is an under-discussed opportunity for dimensionality reduction techniques like PCA (which this demo uses to get the data into 3D; see the last sketch below) to improve both signal-to-noise and distinctiveness. I am working on a blog post about a new technique to handle dimensionality reduction for text embeddings better, which may have interesting and profound usability implications.
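A minimal sketch of the mean-pooling approach from point 1, assuming Hugging Face transformers and the distilbert-base-uncased checkpoint; masking before averaging keeps padding tokens from diluting the embedding:

    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    model = AutoModel.from_pretrained("distilbert-base-uncased")

    def embed(texts):
        inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state  # (batch, seq_len, 768)
        # Zero out padding positions before averaging over the sequence dimension.
        mask = inputs["attention_mask"].unsqueeze(-1).float()
        return (hidden * mask).sum(dim=1) / mask.sum(dim=1)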
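A sketch of the analogy arithmetic from point 2, assuming gensim and its downloadable word2vec-google-news-300 vectors (a large download); queen typically tops the list, but only approximately, hence close-but-no-cigar:

    import gensim.downloader as api

    # KeyedVectors for the classic Google News word2vec model.
    vectors = api.load("word2vec-google-news-300")

    # king - man + woman: the analogy only works this well because word2vec
    # learns word embeddings directly in a shallow network.
    print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))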
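And the PCA step from point 4 (projecting embeddings into 3D the way the demo does) is a few lines with scikit-learn; the random matrix here is a stand-in for real text embeddings:

    import numpy as np
    from sklearn.decomposition import PCA

    embeddings = np.random.randn(1000, 768)  # placeholder for real text embeddings
    coords_3d = PCA(n_components=3).fit_transform(embeddings)  # shape (1000, 3)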
by tikimcfee on 12/30/23, 4:50 PM
I have this, except you can see every single word in any dictionary at once in space; it renders individual glyphs. It can show an entire dictionary of words - definitions and roots - and let you fly around in them. It's fun. I built a sample that "plays" a sentence and its definitions: GitHub.com/tikimcfee/LookAtThat

The more I see stuff like this, the more I want to complete it. It's heartening to see so many people fascinated by seeing words... I just wish I knew where to find these people to, like, befriend and get better. I'm getting the feeling I just kind of exist between worlds of lofty ideas and worlds where incredibly smart people stick around other incredibly smart people.
by wrsh07 on 12/30/23, 4:25 PM
E.g. what is the real distance between two vectors (in the original embedding space, not the 3D projection)? That should be easy to compute.
Similarly: what do I get from summing two vectors and what are some nearby vectors?
Maybe just generally: what are some nearby vectors?
Without any additional context, it's just a point cloud with a couple of randomly labeled elements.
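A minimal numpy sketch of the lookups being asked for, assuming the embeddings are available as a matrix with one row per word (the function names are illustrative):

    import numpy as np

    def cosine_distance(a, b):
        # 0 means same direction, 2 means opposite direction.
        return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

    def nearest_words(query, vectors, words, k=5):
        # Rank every row of `vectors` by cosine similarity to `query`;
        # `query` can itself be the sum of two word vectors.
        sims = vectors @ query / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(query))
        return [words[i] for i in np.argsort(-sims)[:k]]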
by granawkins on 12/30/23, 11:48 PM
I hadn't planned to keep building this, but if I do, what should I add or change?
by chaxor on 12/30/23, 7:42 PM
by kvakkefly on 12/30/23, 3:03 PM
by thom on 12/30/23, 2:46 PM
by pamelafox on 12/30/23, 9:59 PM
by tetris11 on 12/30/23, 4:51 PM
by eurekin on 12/30/23, 2:31 PM
> man woman king queen ruler force powerful care
and I couldn't reliably determine the position of any of them.
by smrtinsert on 12/30/23, 6:13 PM
by larodi on 12/30/23, 2:46 PM
by cuttysnark on 12/30/23, 7:50 PM