from Hacker News

Stanford researchers: 45% of GPT-4 responses to medical queries hallucinate

by panabee on 2/24/24, 5:40 PM with 4 comments

  • by RoyTyrell on 2/24/24, 6:37 PM

    RAG is definitely a solution that can bridge the gap, but it's not yet good enough for domains where lives and money depend on getting the information right, such as medicine or law. Perhaps in the future a more sophisticated RAG "system" will make LLMs a truly good human-computer interface, especially for the technically illiterate, but right now it's still early days.
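
    (For context, since the comment assumes familiarity with the term: RAG, retrieval-augmented generation, retrieves documents relevant to a query and hands them to the model alongside the question. Below is a minimal sketch of that pipeline; the toy corpus, the keyword-overlap retriever, and the call_llm placeholder are all invented for illustration and do not come from the paper or any particular library.)

        from typing import List

        CORPUS = [
            "Metformin is a first-line medication for type 2 diabetes.",
            "Ibuprofen is a nonsteroidal anti-inflammatory drug (NSAID).",
            "Amoxicillin is a penicillin-class antibiotic.",
        ]

        def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
            # Toy retriever: rank documents by keyword overlap with the query.
            q_terms = set(query.lower().split())
            return sorted(corpus, key=lambda d: -len(q_terms & set(d.lower().split())))[:k]

        def build_prompt(query: str, passages: List[str]) -> str:
            # Ask the model to answer only from the retrieved passages.
            context = "\n".join(f"- {p}" for p in passages)
            return ("Answer the question using only the sources below. "
                    "If they do not contain the answer, say so.\n"
                    f"Sources:\n{context}\n\nQuestion: {query}\nAnswer:")

        def call_llm(prompt: str) -> str:
            # Placeholder standing in for whatever model API is used (GPT-4, Gemini, Claude, ...).
            return f"[model response to a {len(prompt)}-character prompt]"

        def answer(query: str) -> str:
            return call_llm(build_prompt(query, retrieve(query, CORPUS)))

        print(answer("What class of drug is ibuprofen?"))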
  • by panabee on 2/24/24, 7:09 PM

    full tweet, since it had to be truncated for submission:

    Does #RAG/web search solve #LLM hallucinations?

    We find that even with RAG, 45% of responses by #GPT4 to medical queries are not fully supported by retrieved URLs. The problem is much worse for GPT-4 w/o RAG, #Gemini and #Claude arxiv.org/pdf/2402.02008…

    RAG ≠ faithful to source
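
    (To make the tweet's measurement concrete, the question being asked is: is each statement in the model's response actually supported by the retrieved URLs? Below is a deliberately crude word-overlap sketch of that idea; the paper's actual protocol is certainly more careful, and the threshold, example source text, and example response here are all invented.)

        def unsupported_sentences(response: str, source_text: str, threshold: float = 0.5):
            # Flag sentences whose word overlap with the source falls below the threshold.
            source_terms = set(source_text.lower().split())
            flagged = []
            for sentence in response.split("."):
                terms = set(sentence.lower().split())
                if not terms:
                    continue
                support = len(terms & source_terms) / len(terms)
                if support < threshold:
                    flagged.append(sentence.strip())
            return flagged

        source = "Metformin is a first-line medication for type 2 diabetes."
        response = ("Metformin is a first-line medication for type 2 diabetes. "
                    "It also cures insomnia.")
        print(unsupported_sentences(response, source))  # ['It also cures insomnia']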

  • by chrisjj on 2/24/24, 5:59 PM

    Misleading title.

    Not fully supported != hallucination.