by freedmand on 4/23/23, 10:44 PM with 1 comment
by freedmand on 4/24/23, 4:10 AM
Semantra is a Python CLI that analyzes the documents you specify (text/PDF files), splits them into chunks of text, embeds each chunk, and launches a local web server for querying them interactively by semantic meaning. It is highly configurable and supports different embedding models, from Hugging Face transformer models (e.g. ones from SentenceTransformers) to OpenAI's embedding model (text-embedding-ada-002).
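To make the chunk, embed, and query flow concrete, here is a minimal sketch in plain Python. It is not Semantra's actual code: `chunk_words`, `toy_embed`, and `search` are hypothetical names, and the word-count "embedding" is a stand-in that only matches exact words, whereas a real embedding model returns dense vectors that capture meaning.

```python
import math
import re
from collections import Counter

def chunk_words(text, size=8):
    """Split text into consecutive chunks of `size` words (hypothetical helper)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def toy_embed(text):
    """Stand-in embedding: lowercase word counts instead of a dense model vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(query, chunks, top_k=2):
    """Score every chunk against the query and return the best (score, chunk) pairs."""
    q = toy_embed(query)
    scored = [(cosine(q, toy_embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:top_k]

document = (
    "The committee approved the budget for road repairs. "
    "Residents raised concerns about water quality in the river. "
    "The library will extend its opening hours next month."
)
chunks = chunk_words(document, size=8)
results = search("water quality", chunks)
for score, chunk in results:
    print(f"{score:.2f}  {chunk}")
```

Swapping `toy_embed` for a call to a real embedding model is the only change needed to turn this keyword match into a semantic one; the chunking and ranking steps stay the same.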
The web app lets you iteratively refine queries by tagging results and by adding or subtracting additional queries, which makes for very expressive and precise searches. It also presents "explanations" for search results by highlighting the sub-regions within each result that most closely match the query.
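One plausible way to picture adding and subtracting queries, and the highlighted "explanations", is the sketch below. It is an assumption for illustration, not Semantra's implementation: queries are combined as a weighted sum of vectors (negative weights subtract), and the explanation is the short word window that scores highest against the query. The names `combine` and `best_window` are hypothetical, and the word-count embedding again stands in for a real model.

```python
import math
import re
from collections import defaultdict

def toy_embed(text):
    """Stand-in embedding: word counts as a sparse vector (not a real model)."""
    vec = defaultdict(float)
    for word in re.findall(r"[a-z]+", text.lower()):
        vec[word] += 1.0
    return vec

def combine(weighted_queries):
    """Sum query vectors scaled by their weights; a negative weight subtracts a query."""
    total = defaultdict(float)
    for text, weight in weighted_queries:
        for key, value in toy_embed(text).items():
            total[key] += weight * value
    return total

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def best_window(query_vec, text, width=2):
    """Return the `width`-word span of `text` that is most similar to the query."""
    words = text.split()
    spans = [" ".join(words[i:i + width]) for i in range(max(1, len(words) - width + 1))]
    return max(spans, key=lambda span: cosine(query_vec, toy_embed(span)))

chunks = [
    "python snake habitat in tropical forests",
    "python programming language tutorial for beginners",
]

# "python" minus "snake": push results away from the animal sense of the word.
query = combine([("python", 1.0), ("snake", -1.0)])
ranked = sorted(chunks, key=lambda c: cosine(query, toy_embed(c)), reverse=True)

# Highlight the sub-region of the top result that best matches a follow-up query.
highlight = best_window(combine([("programming language", 1.0)]), ranked[0])
print(ranked[0])
print(highlight)
```

With these inputs, subtracting "snake" drops the first chunk's score to zero, so the programming chunk ranks first.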
Take it for a spin! `pipx install semantra`
---
Detailed instructions, a tutorial, a guide, and concept documentation are available at the repo: https://github.com/freedmand/semantra
Twitter thread with demo videos/content: https://twitter.com/dylfreed/status/1650268405881085952