by yehosef on 3/10/24, 1:55 PM with 62 comments
by layer8 on 3/10/24, 3:13 PM
That didn’t go down so well in the past.
by hiAndrewQuinn on 3/10/24, 2:37 PM
by JimDabell on 3/10/24, 2:45 PM
Edit: anotheryou found the thread here:
https://twitter.com/yoheinakajima/status/1762718034761072653
by Terretta on 3/10/24, 3:55 PM
For text, "finish your thought and answer" has been implemented for a while, in LLMs in IDEs that offer completions for # code comments, for example.
One of the faster implementations is in the new Zed editor. Open the Assistant pane with your OpenAI GPT-4 key, and once you're into the conversation, it will offer auto-completions of your own prompt to it, before you submit.
Often these autocompletes finish the question and then contain the answer, like an impatient listener mentally finishing your sentence so they can say what they think. This is without having submitted the question to the chat interface.
Note that as Zed has implemented this, the realtime "finish your thought for you" mode uses a dumber, faster model, but as your context builds, it gets the interruption right more often.
You can also start your next prompt while it's unspooling the last one.
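(This is not Zed's actual implementation, but the general "finish your thought, maybe with the answer" pattern can be sketched with a single completion call on the unfinished prompt; the model name and system prompt below are placeholders, with a small fast model standing in for the low-latency suggester.)

    # Sketch of the "finish your thought (and maybe answer)" pattern described
    # above, NOT Zed's actual implementation. Model name and prompt are assumptions.
    from openai import OpenAI

    client = OpenAI()

    def suggest_completion(unfinished_prompt: str) -> str:
        out = client.chat.completions.create(
            model="gpt-3.5-turbo",            # the "dumber, faster" model
            messages=[
                {"role": "system",
                 "content": "Continue the user's unfinished message. If the "
                            "question is already clear, you may answer it too."},
                {"role": "user", "content": unfinished_prompt},
            ],
            max_tokens=60,
        )
        return out.choices[0].message.content

    # e.g. suggest_completion("What's the time complexity of quicksort in the")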
by GistNoesis on 3/10/24, 3:29 PM
It's quite standard nowadays to add an extra special token and then fine-tune an LLM to learn how to use it appropriately, by providing a small dataset (1k to 50k examples) of conversations with interruptions (for example: "user: Xylophone went to the stadium with <interruptToken> Let me stop you right now, are you really referring to Xylophone? </interruptToken> ok thanks for correcting me, it's not Xylophone it's Xander, damn autocorrect!").
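A rough sketch of that setup, assuming a Hugging Face causal LM (gpt2 as a stand-in); the token names and example text just mirror the illustration above:

    # Sketch: register interrupt markers as special tokens and fine-tune on
    # a small dataset of interrupted conversations. Names are illustrative.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # Register the interruption markers as special tokens so BPE never splits
    # them, then resize the embedding matrix to match the new vocabulary size.
    tokenizer.add_special_tokens(
        {"additional_special_tokens": ["<interruptToken>", "</interruptToken>"]}
    )
    model.resize_token_embeddings(len(tokenizer))

    # One example out of the small (1k-50k example) interruption dataset.
    example = (
        "user: Xylophone went to the stadium with "
        "<interruptToken> Let me stop you right now, are you really referring "
        "to Xylophone? </interruptToken> "
        "ok thanks for correcting me, it's not Xylophone it's Xander!"
    )
    ids = tokenizer(example, return_tensors="pt").input_ids
    # Standard causal-LM fine-tuning step: transformers shifts the labels
    # internally, so passing labels=input_ids gives the next-token loss.
    loss = model(ids, labels=ids).loss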
llama.cpp has the opposite: an interactive mode where you, as a human, can interrupt the conversation the LLM is currently generating. But if you interrupt it badly it can send the conversation off the rails.
One problem that results from using tokens is that the user usually isn't inputting tokens but characters, so you must only process once the characters have stabilized into tokens (for example at word boundaries, if your tokeniser has a preprocessing step that splits on spaces before doing the byte-pair encoding). (If you want to process each character on the fly it gets really tricky, because even if at inference time you can rewrite the last token in your KV cache, you must somehow create a fine-tuning dataset that properly teaches the model how to interject based on these partial tokens.)
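A minimal sketch of the "wait until the characters have stabilized into tokens" idea, using a trailing word boundary as the (purely illustrative) stability check:

    # Sketch: only re-run the model once the user's partial input has stabilized
    # into whole tokens, approximated here by a word boundary (trailing space).
    # The heuristic is illustrative; a real tokenizer may need a subtler check.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")

    def stable_prefix(text: str) -> str:
        """Return the part of the input whose tokenization can no longer change."""
        if text.endswith(" "):
            return text                              # the last word is complete
        cut = text.rfind(" ")
        return text[: cut + 1] if cut != -1 else ""  # drop the unfinished word

    buffer = "user: Xylophone went to the stad"      # keystrokes so far
    prefix = stable_prefix(buffer)                   # "user: Xylophone went to the "
    if prefix:
        ids = tokenizer(prefix).input_ids            # safe to feed / cache these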
by compressedgas on 3/10/24, 2:43 PM
by a2128 on 3/10/24, 3:29 PM
My implementation wasn't really interrupting, it was only figuring out when to respond vs when to let someone else in the group respond, but you could use the same idea to figure out when to interrupt.
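A minimal version of that respond-or-stay-quiet gate might look like the sketch below; the model, prompt, and YES/NO protocol are assumptions, not details from the implementation described above:

    # Sketch of a respond-vs-wait gate for a group chat bot. Everything here
    # (model name, prompt, threshold) is an assumption for illustration.
    from openai import OpenAI

    client = OpenAI()

    def should_respond(history: list[str]) -> bool:
        judgement = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system",
                 "content": "You decide whether the bot should reply to the last "
                            "message in a group chat. Answer YES or NO only."},
                {"role": "user", "content": "\n".join(history)},
            ],
            max_tokens=1,
        )
        return judgement.choices[0].message.content.strip().upper().startswith("Y")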
by sk11001 on 3/10/24, 2:26 PM
by anotheryou on 3/10/24, 3:18 PM
by proc0 on 3/10/24, 3:16 PM
by sandspar on 3/10/24, 4:18 PM
Also I could see something like this working on cash ATMs, coupled with eye tracking: "That guy behind you is watching you type your PIN: would you like to stop typing it before you complete it?"
Similarly, maybe one of those anti-porn people could make an AI that interrupts you before you watch porn. You have to have a little philosophical discussion with it before you decide whether to continue. It could also work on fridges. FridgeBot: "Are you sure you'd like to eat that cheesecake?" Maybe we could add it to guns too, why not.
by littlestymaar on 3/10/24, 3:15 PM
Commercial AI will also never be able to pass the Turing test, because it will never tell you to shut the fuck up or ragequit like a human would when you're being obnoxious enough. It's not a technical limitation; it just aligns very poorly with the interests of the overlord.
Or maybe Mistral will do it, because having no particular consideration for customers is something we French people know how to do very well.
by deadbabe on 3/10/24, 3:02 PM
It seems for people to perceive it as true AI they must send off some prompt, watch it think deep while a loader spins, and then read a response.
by intellectronica on 3/10/24, 3:02 PM
by colanderman on 3/10/24, 2:45 PM
2. Constantly predict a few tokens ahead.
3. When the predicted text includes the computer's prompt, respond with that, without waiting for the user to push enter.
Probably also
4. Stop engineering the initial instructions for such obsequious behavior.
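A minimal sketch of steps 2 and 3, assuming a Hugging Face causal LM and a chat format with an explicit assistant turn marker; the marker string, model, and lookahead length are all made up:

    # Sketch of steps 2-3: keep speculatively extending the user's unfinished
    # message and jump in once the model predicts the assistant's turn marker.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    ASSISTANT_MARKER = "\nassistant:"          # hypothetical turn delimiter
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    def maybe_interrupt(partial_user_text: str, lookahead: int = 8) -> str | None:
        prompt = "user: " + partial_user_text
        ids = tokenizer(prompt, return_tensors="pt").input_ids
        with torch.no_grad():
            out = model.generate(ids, max_new_tokens=lookahead, do_sample=False)
        continuation = tokenizer.decode(out[0, ids.shape[1]:])
        if ASSISTANT_MARKER in continuation:
            # The model thinks the user's turn is over: answer without
            # waiting for the Enter key.
            return continuation.split(ASSISTANT_MARKER, 1)[1]
        return None                            # keep listening to keystrokes

    # Called on every keystroke (or every few), e.g.:
    # reply = maybe_interrupt("What's the capital of Fr")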
by catchnear4321 on 3/10/24, 2:42 PM
to be useful, it would need something to interrupt, and instruction on what warrants an interruption.
by nicklecompte on 3/10/24, 3:53 PM
But LLMs don't have any agenda whatsoever - they are not capable of having goals or motivations. So why are they interrupting? Are they reading your mind and understanding your goals before you even finish typing them? It's hard to see an LLM having a coherent way to interrupt based purely on a probabilistic view of language.
It would be very annoying if a human constantly interrupted you because they were "aligned with your agenda" and thought they were being helpful. LLMs would probably be much worse, even if they were able to reliably infer what you wanted. For an LLM to be useful, you kind of have to coax it along and filter out a lot of empty verbiage - it seems downright counterproductive to have that verbiage blasted at you by a chatbot that interrupts your typing.
I could see LLMs interrupting if you are typing something clearly false or against TOS. But that would require an LLM which reliably understands when things are clearly false or against TOS, and hence requires a solution to jailbreaking... so in 2024 I think it would just be an incredibly annoying chatbot. In general I think any interruption behavior would be artificially programmed to make the LLM seem "realistic," and it won't work.
by mlsu on 3/10/24, 2:59 PM
The way a human interjects is that you have a parallel thought chain going, along with the conversation, as it's happening in real time. In this parallel chain, you are planning ahead. What point am I going to make once we are past this point of conversation? What is the implication of what is being discussed here? (You also are thinking about what the other person is thinking; you are developing a mental model of their thought process).
An LLM does not have any of this architecturally; it just has the text itself. Any planning that people claim to do with Llama et al. is really just "pseudo" planning, not the fundamental planning we're talking about here. I suspect it will be a while yet before we have "natural" interjection from LLMs.
When it does come, however, it will be extremely exciting. Because it will mean that we have cracked planning and made the AI far more agentic than it is now. I would love to be proven wrong.
by lulznews on 3/11/24, 3:15 AM
by cqqxo4zV46cp on 3/10/24, 2:32 PM