by therealmocker on 7/10/24, 1:48 PM with 7 comments
What types of problems have you successfully solved with LLMs? What are some common pitfalls or areas where they tend to underperform?
by PaulHoule on 7/10/24, 2:11 PM
For classification, clustering, and both text and image retrieval, they are often a drop-in replacement for other ways of doing things, and most of these models are not crazy large, so you can run them on an ordinary computer.
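A minimal sketch of that kind of drop-in use, assuming the sentence-transformers library; the model name, labels, and example texts here are illustrative, not from the comment:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# A small embedding model (~80 MB) that runs fine on a laptop CPU.
model = SentenceTransformer("all-MiniLM-L6-v2")

# A handful of labeled examples per class stands in for training data.
examples = {
    "bug report": ["the app crashes on startup", "login button does nothing"],
    "feature request": ["please add dark mode", "could you export to CSV"],
}

# One centroid vector per class, averaged over that class's examples.
centroids = {
    label: np.mean(model.encode(texts), axis=0)
    for label, texts in examples.items()
}

def classify(text: str) -> str:
    """Return the label whose centroid is most cosine-similar to the text."""
    v = model.encode(text)
    return max(
        centroids,
        key=lambda label: np.dot(v, centroids[label])
        / (np.linalg.norm(v) * np.linalg.norm(centroids[label])),
    )

print(classify("it freezes whenever I open settings"))  # -> "bug report"
```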
As for chatbots, you should note they have superhuman recall in some sense but a limited ability to generalize or "reason". I have been asking Microsoft's Copilot for help with a maintenance programming project and I am amazed at its ability to explain unusual but highly repetitive code fragments like the ones generated by the Babel compiler. Explaining what a program does by looking at the code is a difficult problem that LLMs cannot do reliably if they haven't seen very similar code before, but there are many idioms used in application code that they have seen before, and for those they are helpful.
by muzani on 7/11/24, 10:02 PM
They're very good at needle-in-a-haystack problems: reading through logs, deciphering vague error messages, navigating an overcomplicated screen for a specific thing. Especially for something like Android dev, where about 90% of the stack trace is just garbage and the error messages say nothing like the real problem.
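A hedged sketch of that triage loop, assuming the OpenAI Python client; the model name and prompt are illustrative, and any chat-completion API works the same way:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The raw, mostly-garbage Android stack trace to triage.
stack_trace = open("crash.txt").read()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[
        {
            "role": "system",
            "content": "You triage Android stack traces. Ignore framework "
                       "noise; point at the app-level frame and likely cause.",
        },
        {"role": "user", "content": stack_trace},
    ],
)
print(response.choices[0].message.content)
```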
They're very good at stochastic searches. So drafting outlines for a paper, logos, creative brainstorming.
They're bad with numbers.
They hallucinate, so you don't really want them in a situation where you can't prove whether they're correct. You can use them for medical diagnosis, but only if you double-check what they give you. They regress to the average of what they were trained on, so if you're trying to code something, they hand you old tech stacks.
Basically, you don't want them for things you have no experience with and can't verify.
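A minimal sketch of that verify-before-trusting loop for generated code; generate_code() is a hypothetical stand-in for an LLM call, returning a canned answer here so the sketch runs:

```python
def generate_code(prompt: str) -> str:
    # Stand-in for an LLM call; returns what a model might plausibly emit.
    return (
        "import re\n"
        "def slugify(s):\n"
        "    return re.sub(r'\\s+', '-', s.strip().lower())\n"
    )

def passes_checks(source: str) -> bool:
    """Exec the generated source and test it; sandbox this in real use."""
    namespace: dict = {}
    try:
        exec(source, namespace)
        slugify = namespace["slugify"]
        assert slugify("Hello World") == "hello-world"
        assert slugify("  a  b ") == "a-b"
        return True
    except Exception:
        return False

source = generate_code("Write slugify(s): lowercase s, hyphenate whitespace.")
if not passes_checks(source):
    raise RuntimeError("model output failed verification; don't ship it")
```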
by Yawrehto on 7/12/24, 10:22 PM
by gabelschlager on 7/13/24, 5:04 PM
They underperform at anything that requires reasoning.
by bjourne on 7/11/24, 10:47 AM