from Hacker News

Show HN: Semantic Search on Wikipedia

by jonas_kgomo on 4/29/23, 11:12 PM with 0 comments

This is a semantic search engine that understands the meaning of words and can return results that match the user's intent, even if the words used in the search query are not an exact match for the text in the articles.

To achieve this, we used Cohere's 100M embeddings on Wikipedia, which allowed us to map each article to a high-dimensional vector space [786] where similar articles are cosine close to each other.

We then built a simple web app that allows users to search for articles using natural language queries and returns the top results, along with a 2D visualization of the articles using UMAP.

https://txt.cohere.com/embedding-archives-wikipedia