from Hacker News

PrivateGPT

by antouank on 5/21/23, 8:40 PM with 142 comments

by davidy123 on 5/22/23, 10:19 AM

Granted, I'm not coming from the Python world, but I have tried many of these projects, and very few of them install out of the box. They usually end with some incompatibility and files scattered all over the place, leading to future nightmares.

  ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
  sentry-sdk 1.22.2 requires urllib3<2.0.0, but you have urllib3 2.0.2 which is incompatible

Just for fun, here's the result of python -m pip install -r ./requirements.txt for tortoise-tts:

…many many lines

          raise ValueError("%r is not a directory" % (package_path,))
      ValueError: 'build/py3k/scipy' is not a directory
      Converting to Python3 via 2to3...

…

  /tmp/pip-install-hkb_4lh7/scipy_088b20410aca4f0cbcddeac86ac7b7b1/build/py3k/scipy/signal/fir_filter_design.py
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  error: metadata-generation-failed

I'm not asking for support, just saying that if people really want to make something 'easy' they'd use Docker. I gather there are better Python package managers, but I hear that's a bit of a mess too.

Someone is thinking "this is part of learning the language," but I think it's just bad design.

by j_shi on 5/22/23, 10:38 AM
Self-hosted + self-trained LLMs are probably the future for enterprise.
While consumers are happy to get their data mined to avoid paying, businesses are the opposite: willing to pay a lot to avoid feeding data to MSFT/GOOG/META.
They may give assurances on data protection (even here, the GitHub Copilot TOS has sketchy language around saving derived data), but they can't get around the fundamental problem that their products need user interactions to work well.
So it seems that with BigTechLLM there's an inherent tension between product competitiveness and data privacy, which makes them incompatible with enterprise.
Biz ideas along these lines:
- Help enterprises set up, train, maintain own customized LLMs
- Security, compliance, monitoring tools
- Help AI startups get compliant with enterprise security
- Fine tuning service
by simonw on 5/21/23, 9:33 PM
I'm always interested in seeing the prompt that drives these kinds of tools.
In this case it appears to be using RetrievalQA from LangChain, which I think is this prompt here: https://github.com/hwchase17/langchain/blob/v0.0.176/langcha...
```
    Use the following pieces of context to answer the question at the end. If you don't
    know the answer, just say that you don't know, don't try to make up an answer.

    {context}

    Question: {question}
    Helpful Answer:
```
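To make the two placeholders concrete, here is a minimal sketch of how a template like this gets filled in. It uses plain `str.format` rather than LangChain's own classes; `build_prompt`, the blank-line separator between chunks, and the sample chunks are all made up for illustration.

```python
# The template text quoted above; {context} and {question} are its two slots.
PROMPT_TEMPLATE = """Use the following pieces of context to answer the question at the end. If you don't
know the answer, just say that you don't know, don't try to make up an answer.

{context}

Question: {question}
Helpful Answer:"""


def build_prompt(chunks, question):
    """Join retrieved chunks and substitute them into the template (hypothetical helper)."""
    # Assumption: chunks are separated by a blank line; the real tool may join them differently.
    context = "\n\n".join(chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)


prompt = build_prompt(
    ["The invoice is due on June 1.", "Late fees are 2% per month."],  # made-up chunks
    "When is the invoice due?",
)
print(prompt)
```

The model only ever sees this final string, so whatever the retrieval step puts into `{context}` is all it has to work with.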
by skykooler on 5/22/23, 1:23 AM
"System requirements" section should really mention what amount of RAM or VRAM is needed for inference.
by hodanli on 5/22/23, 5:29 PM
These are similar projects I've come across:
- [GitHub - e-johnstonn/BriefGPT: Locally hosted tool that connects documents to LLMs for summarization and querying, with a simple GUI.](https://github.com/e-johnstonn/BriefGPT)
- [GitHub - go-skynet/LocalAI: Self-hosted, community-driven, local OpenAI-compatible API. Drop-in replacement for OpenAI running LLMs on consumer-grade hardware. No GPU required. LocalAI is a RESTful API to run ggml compatible models: llama.cpp, alpaca.cpp, gpt4all.cpp, rwkv.cpp, whisper.cpp, vicuna, koala, gpt4all-j, cerebras and many others!](https://github.com/go-skynet/LocalAI)
- [GitHub - paulpierre/RasaGPT: RasaGPT is the first headless LLM chatbot platform built on top of Rasa and Langchain. Built w/ Rasa, FastAPI, Langchain, LlamaIndex, SQLModel, pgvector, ngrok, telegram](https://github.com/paulpierre/RasaGPT)
- [GitHub - imartinez/privateGPT: Interact privately with your documents using the power of GPT, 100% privately, no data leaks](https://github.com/imartinez/privateGPT)
- [GitHub - reworkd/AgentGPT: Assemble, configure, and deploy autonomous AI Agents in your browser.](https://github.com/reworkd/AgentGPT)
- [GitHub - deepset-ai/haystack: Haystack is an open source NLP framework to interact with your data using Transformer models and LLMs (GPT-4, ChatGPT and alike). Haystack offers production-ready tools to quickly build complex question answering, semantic search, text generation applications, and more.](https://github.com/deepset-ai/haystack)
- [PocketLLM « ThirdAi](https://www.thirdai.com/pocketllm/)
- [GitHub - imClumsyPanda/langchain-ChatGLM: langchain-ChatGLM, local knowledge based ChatGLM with langchain ｜基于本地知识库的 ChatGLM 问答](https://github.com/imClumsyPanda/langchain-ChatGLM)
by monkeydust on 5/22/23, 10:44 AM
Got this working locally. It badly needs GPU support (I have a 3090, so come on!); there is a workaround, but I expect proper support will come pretty soon. This video was a useful walkthrough, especially on using a different model and upping the CPU threads. https://www.youtube.com/watch?v=A3F5riM5BNE
by thefourthchime on 5/22/23, 3:49 AM
I tried this on my M2 MacBook with 16 GB of RAM but got:
"ggml_new_tensor_impl: not enough space in the context's memory pool (needed 18296202768, available 18217606000)"
by aldarisbm on 5/22/23, 1:20 AM
One quick plug
I want to have the memory part of LangChain down: vector store + local database + client to chat with an LLM (the gpt4all model can be swapped for the OpenAI API just by switching the base URL)
https://github.com/aldarisbm/memory
It's still got a ways to go; if someone wants to help, let me know :)
by kordlessagain on 5/22/23, 12:09 AM
Working on something similar that uses keyterm extraction for traversal of topics and fragments, without using Langchain. It's not designed to be private, however: https://github.com/FeatureBaseDB/DocGPT/tree/main
by Wronnay on 5/22/23, 8:26 AM
Wow. I keep a personal Wiki, Journal and use plain text accounting...
This project could help me create a personal AI which answers any questions about my life, finances or knowledge...
by lysp on 5/22/23, 8:49 AM
Quick how-to/demo:
https://www.youtube.com/watch?v=A3F5riM5BNE
Also has a suggestion of a few alternative models to use.
by daitangio on 5/22/23, 7:19 AM
Hi, very interesting... what are the memory/disk requirements to run it? Would 16 GB of RAM be enough? I suggest adding these requirements to the README.
by zestyping on 5/23/23, 6:52 PM
Would someone do me the kindness of explaining (a little more) how this works?
It looks like you can ask a question and the model will use its combined knowledge of all your documents to figure out the answer. It looks like it isn't fine-tuned or trained on all the documents, is that right? How is each document turned into an embedding, and then how does the model figure out which documents to consult to answer the question?
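For what it's worth, the general retrieval pattern behind tools of this kind (an assumption about the approach, not this project's actual code) can be sketched in a few lines: turn each chunk and the question into a vector, rank chunks by cosine similarity to the question, and hand only the top few to the model. The toy `embed` below is just a bag-of-words count, standing in for a real embedding model; all names and sample chunks are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector. Real tools use a neural embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question, chunks, k=2):
    """Return the k chunks whose vectors are most similar to the question's vector."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


chunks = [  # made-up document chunks
    "Rent for the apartment is 1200 euros per month.",
    "The cat was adopted in March.",
    "Groceries cost about 300 euros per month.",
]
best = top_k("How much is the rent per month?", chunks, k=1)
print(best)
```

Nothing is trained or fine-tuned in this pattern; the documents only reach the model as text pasted into the prompt at question time.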
by behnamoh on 5/22/23, 1:14 AM
When you split a document into chunks, doesn't some crucial information get cut in half? In that case, you'd probably lose that information in the context if it was immediately followed by irrelevant information that reduces the cosine similarity. Is there a "smarter" way to feed documents as context to LLMs?
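One common mitigation (not necessarily what this project does) is to let consecutive chunks overlap, so a sentence cut at one boundary still appears whole in the neighbouring chunk. A minimal sketch with character windows; `chunk_with_overlap` and the sizes are arbitrary choices for illustration, and real splitters usually work on tokens or sentences instead.

```python
def chunk_with_overlap(text, size=40, overlap=10):
    """Split text into windows of `size` characters; each window starts with
    the last `overlap` characters of the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    if not text:
        return []
    step = size - overlap
    chunks = []
    start = 0
    while True:
        chunks.append(text[start:start + size])
        if start + size >= len(text):  # this window reached the end of the text
            break
        start += step
    return chunks


text = "The meeting is on Friday. Bring the signed contract and two copies."  # made-up sample
chunks = chunk_with_overlap(text, size=40, overlap=10)
for c in chunks:
    print(repr(c))
```

Overlap trades some duplicated text (and so some extra storage and context space) for a lower chance of splitting a fact across two chunks.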
by divan on 5/22/23, 9:45 AM
This will still hallucinate, right?
Projects like this for use with your own document datasets are invaluable, but everything I've tried so far hallucinates, so it's not practical. What's the state of the art for LLMs without hallucination at the moment?
by debbiedowner on 5/22/23, 5:07 PM
This is a shortcut/workaround to transforming the private docs into a prompt:answer dataset and fine-tuning, right?
What would be the difference in user experience or information retrieval performance between the two?
My impression is it saves work on the dataset transformation and compute for fine tuning, so it must be less performant. Is there a reason to prefer the strategy here other than ease of setup?
by superbiome on 5/22/23, 10:38 AM
Does something like this exist for local code repos? (Excuse my ignorance since the space is moving faster than light.)
by amelius on 5/22/23, 10:07 AM
With so many LLM options out there, how do we keep track of which ones are good?
by rolisz on 5/22/23, 6:10 AM
For some reason, downloading the model they suggest keeps failing. I tried to download it in Firefox and Edge. I'm using Windows, if that matters. Anyone else seeing similar issues?
by sinandrei91 on 5/22/23, 7:35 AM
Is there a benchmark for retrieval from multiple ft documents? I tried the LangchainQA with Pinecone and wasn't impressed with the search result when using it on my Zotero library.
by amelius on 5/22/23, 10:09 AM
How many tokens/second on an average machine?
by jaimehrubiks on 5/22/23, 4:52 AM
If you select a gpt4all model like GPT-J, can this be used commercially, or is there another dependency that limits the license?
by Havoc on 5/22/23, 12:16 PM
Would this work better with something like llama or an instruction-following model like alpaca?
by bohlenlabs on 5/22/23, 8:20 AM
So many good links here, thanks to the OP for sharing, and to all commenters as well!
by seydor on 5/22/23, 8:55 AM
Does this only work with llamaCPP? I.e., can't I use GPU models with this?
by ChocoluvH on 5/22/23, 8:02 AM
I'm always wondering about the pros/cons of Chroma and Qdrant. Can someone tell me?
by keeptrying on 5/22/23, 6:24 AM
This is the future.
by yosito on 5/22/23, 5:57 AM
> Put any and all your files into the source_documents directory
Why? Why can't I define any directory (my existing Obsidian vault, for example) as the source directory?
by udev4096 on 5/22/23, 8:12 AM
I posted it 9 days ago and somehow this one gets the attention. The same freaking post. Unbelievable
https://news.ycombinator.com/item?id=35914810