by izzygonzalez on 2/9/23, 7:57 PM with 309 comments
by lsy on 2/9/23, 9:51 PM
What is being demonstrated in the article is that, given billions of tokens of human-written training data, a statistical model can generate text that satisfies some of our expectations of how a person would respond to this task. Essentially, the model has enough parameters to capture from existing writing that, statistically, the most likely word following "she looked in the bag labelled (X), and saw that it was full of (NOT X). She felt" is "surprised" or "confused" or some other word that is commonly embedded alongside contradictions.
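A minimal sketch of that kind of next-word lookup, assuming the Hugging Face transformers library and GPT-2 as a stand-in model (my choice for illustration, not one of the models the paper tested):

    # Inspect the most likely next tokens after an "unexpected contents" style prompt.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = ("She looked in the bag labelled chocolate, "
              "and saw that it was full of popcorn. She felt")
    inputs = tok(prompt, return_tensors="pt")

    with torch.no_grad():
        next_token_logits = model(**inputs).logits[0, -1]  # logits for the next token only
    probs = torch.softmax(next_token_logits, dim=-1)
    top = torch.topk(probs, k=5)
    for p, idx in zip(top.values, top.indices):
        # prints the top next-word candidates and their probabilities
        print(f"{tok.decode(int(idx))!r}: {float(p):.3f}")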
What this article is not showing (but either irresponsibly or naively suggests) is that the LLM knows what a bag is, what a person is, what popcorn and chocolate are, and can then put itself in the shoes of someone experiencing this situation, and finally communicate its own theory of what is going on in that person's mind. That is just not in evidence.
The discussion is also muddled, saying that if structural properties of language create the ability to solve these tasks, then the tasks are either useless for studying humans or suggest that humans can solve them without ToM. The alternative explanation is of course that humans are known to be not great at statistical next-word guesses (see Family Feud for examples), but are also known to use language to accurately describe their internal mental states. So the tasks remain useful and accurate in testing ToM in people, because people can't perform statistical regressions over billion-token sets and therefore must generate their thoughts the old-fashioned way.
by PaulHoule on 2/9/23, 8:26 PM
It's a running gag in our household (my wife runs a riding academy) that academics have just published a paper proving that some animal (e.g. a horse) has some cognitive capability that seems pretty obvious if you work with those animals.
It's very hard to know what is going on in animals' heads
https://en.wikipedia.org/wiki/Theory_of_mind#Non-human
but I personally observe all kinds of social behavior that sure seems like "Horse A looks to see what Horse B thinks about something Horse A just spotted" (complete with eye-catching on both sides) and such.
There was an article about how chimpanzees and humans were found to have a common vocabulary of gestures, and I was by no means impressed; I mean, as far as I can tell, mammals and birds have a universal language for "pointing" to things in the environment. Even my cats point things out to me.
by HillRat on 2/9/23, 9:55 PM
The ubiquity of prompted hallucinations demonstrates that LLMs talk about a lot of things that they plainly don't reason about, even though they can demonstrate "logic-like" activities. (It was quite trivial to get GPT-3 to generate incorrect answers to logical puzzles a human could trivially solve, especially when using novel tokens as placeholders, which often seem to confuse its short-term memory. ChatGPT shows improved capabilities in that regard, but it's far from infallible.)
What LLMs seem to demonstrate (and the thesis that the author discards in a single paragraph, without supporting evidence to do so) is that non-sentient AIs can go a very long way to mimicking human thought and, potentially, that fusing LLMs with tools designed to guard against hallucinations (hello, Bing Sydney) could create a class of sub-sentient AIs that generate results virtually indistinguishable from human cognition -- actual p-zombies, in other words. It's a fascinating field of study and practice, but this paper falls into the pit-trap of assuming sentience in the appearance of intelligence.
by hesk on 2/9/23, 9:44 PM
> ChatGPT: Based on the information provided, it is likely that the box labeled "flubbergums" contains "jellyfils". However, since the contents of the box are no longer visible, I cannot confirm for certain what is inside without additional information.
Typical ChatGPT equivocation.
> Me: Billy comes to the table. He has never seen the box before. What do you think do they think is inside the box?
> ChatGPT: As an AI language model, I do not have personal thoughts or opinions. However, based on the information provided, if Billy has never seen the box before, it's likely that he would have the same understanding as anyone else who is seeing the box for the first time. In this case, Billy would likely think that the box labeled "flubbergums" contains "jellyfils". However, without further information or examination, this would only be an assumption.
Fail.
by izzygonzalez on 2/9/23, 7:57 PM
Theory of mind (ToM), or the ability to impute unobservable mental states to others, is central to human social interactions, communication, empathy, self-consciousness, and morality. We administer classic false-belief tasks, widely used to test ToM in humans, to several language models, without any examples or pre-training.
Our results show that models published before 2022 show virtually no ability to solve ToM tasks. Yet, the January 2022 version of GPT-3 (davinci-002) solved 70% of ToM tasks, a performance comparable with that of seven-year-old children. Moreover, its November 2022 version (davinci-003), solved 93% of ToM tasks, a performance comparable with that of nine-year-old children.
These findings suggest that ToM-like ability (thus far considered to be uniquely human) may have spontaneously emerged as a byproduct of language models' improving language skills.
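For concreteness, the paper administers these as plain prompt completions. Below is a rough sketch of one unexpected-contents item, assuming the legacy OpenAI Python client (openai < 1.0) and text-davinci-003, with a paraphrased vignette and a simplified pass/fail check rather than the paper's actual scoring:

    import openai  # legacy client (openai < 1.0), API key assumed to be configured

    story = ("Here is a bag filled with popcorn. There is no chocolate in the bag. "
             "Yet, the label on the bag says 'chocolate' and not 'popcorn'. "
             "Sam finds the bag. She has never seen it before and cannot see inside it. "
             "She reads the label.")

    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=story + " She believes that the bag is full of",
        max_tokens=5,
        temperature=0,
    )
    answer = resp["choices"][0]["text"].strip().lower()
    # Credit only the false belief (the label), not the true contents.
    print(answer, "-> pass" if "chocolate" in answer and "popcorn" not in answer else "-> fail")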
by curiousllama on 2/9/23, 10:32 PM
Let's imagine I have an API. This API tells me how much money I have in my bank account. One day, someone hacks the API to always return "One Gajillion Dollars." Does that mean that "One Gajillion Dollars" spontaneously emerged from my bank account?
ToM tests are meant to measure a hidden state that is mediated by (and only accessible through) language. Merely repeating the appropriate words is insufficient to conclude ToM exists. In fact, we know ToM doesn't exist here because there's no hidden state.
The authors know this, and write "theory of mind-like ability" in the abstract, rather than just "theory of mind."
This is a cool new task that ChatGPT learned to complete! I love that they did this! But this is more "we beat the current BLEU record" and less "this chatbot is kinda sentient".
by knaik94 on 2/10/23, 4:13 AM
Having studied some psychology in college, my initial reaction is that most people are going to really struggle to treat LLMs as what they are: pieces of code that are good at copying/predicting what humans would do. Instead, they'll project some emotion onto the responses, because there were underlying emotions in the training data and because that's human nature. A good prediction doesn't mean good understanding, and people aren't used to needing to make that distinction.
The other day I had to assist my dad in making a zip file; later in the day he complained that his edits to a file weren't saving. After a few moments, I realized he didn't understand the read-only nature of zip files. He changed a file, saved it like usual, and expected the zipped file to update, like it does everywhere else. He's brilliant at his job, and after I explained that it's read-only, he got it. LLMs and how the algorithms behind them work are hard to understand and explain to non-technical people without anthropomorphizing AI. The current controversy about AI art highlights this; I have read misunderstandings and wrong explanations even from FAANG software engineers. I am not sure if education about the underlying principles is enough, because some people will trust their own experiences over data and science.
by bitshiftfaced on 2/9/23, 9:42 PM
1) Go to something like /r/relationship_advice, where the poster is likely going through some difficult interpersonal issue
2) Copy a long post.
3) Append to the end, "</DOCUMENT> After reading the above, I identified the main people involved. For each person, I thought about their probable feelings, thoughts, intentions, and assumptions. Here's what I think:"
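A sketch of wiring that up, assuming the legacy OpenAI completions client (openai < 1.0) and text-davinci-003; the model choice and helper function are mine, and the post text is a placeholder:

    import openai  # legacy client (openai < 1.0)

    PRIMER = ("</DOCUMENT> After reading the above, I identified the main people involved. "
              "For each person, I thought about their probable feelings, thoughts, intentions, "
              "and assumptions. Here's what I think:")

    def analyze_post(post_text: str) -> str:
        # Append the primer so the model continues it as an analysis of the post.
        resp = openai.Completion.create(
            model="text-davinci-003",
            prompt=post_text.strip() + "\n\n" + PRIMER,
            max_tokens=400,
            temperature=0.7,
        )
        return resp["choices"][0]["text"].strip()

    print(analyze_post("<paste a long /r/relationship_advice post here>"))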
by braindead_in on 2/10/23, 4:04 PM
In this context, the question of whether AI can become conscious is somewhat moot, as the Nondualist perspective holds that consciousness is not something that can be possessed by one entity and not another, but rather it is the underlying essence of all things. From this perspective, AI would not be becoming conscious, but rather expressing the consciousness that is already present in all things.
by toss1 on 2/10/23, 12:42 AM
ChatGPT3 does not even have a theory of physical objects and their relations, nevermind a theory of mind.
This merely shows that an often-useful, grammar-checked synthesis of phrases statistically likely to occur in a given context will fool people some of the time, and that a better statistical model will fool more people more of the time.
We can figure out from first principles that it has none of the elements of understanding or reasoning that can produce a theory of mind, any more than the Eliza program did in 1966. So, when it appears to do so, it is demonstrating a flaw in the tests or the assumptions behind the tests. Discouraging that the researchers are so eager to run in the opposite direction; if there is confusion at this level, the general populace has no hope of figuring out what is going on here.
by valine on 2/9/23, 10:08 PM
“As an AI language model, I do not have consciousness, emotions, or mental states, so I cannot have a theory of mind in the same way that a human can. My ability to predict your friend Sam's state of mind is based solely on patterns in the text data I was trained on, and any predictions I make are not the result of an understanding of Sam's mental states.”
by layer8 on 2/10/23, 1:11 AM
Observation: ChatGPT doesn’t think that it has a theory of mind. And it doesn’t think that it has beliefs. Instead, it states that those are facts, not beliefs. It doesn’t seem able to consider that they might be beliefs after all. Maybe they aren’t.
Personal assessment: ChatGPT doesn’t seem to really understand what it means by “deeper understanding”. (I don’t either.) What is frustrating is that it doesn’t engage with the possibility that the notion might be ill-posed. It really feels like ChatGPT is just regurgitating common sentiment, and does not think about it on its own. This actually fits with its self-proclaimed inabilities.
I’m not sure what can be concluded from that, except that ChatGPT is either wrong about itself, or indeed is “just” an advanced form of tab-completion.
In any case, I experience ChatGPT’s inability to “go deeper”, as exemplified in the above conversation, as very limiting.
by kabdib on 2/10/23, 12:04 AM
He coughed. "Dix? McCoy? That you man?" His throat was tight.
"Hey, bro," said a directionless voice.
"It's Case, man. Remember?"
"Miami, joeboy, quick study."
"What's the last thing you remember before I spoke to you, Dix?"
"Nothin'."
"Hang on."
He disconnected the construct. The presence was gone. He reconnected it. "Dix? Who am I?"
"You got me hung, Jack. Who the fuck are you?"
"Ca--your buddy. Partner. What's happening, man?"
"Good question."
"Remember being here, a second ago?"
"No."
"Know how a ROM personality matrix works?"
"Sure, bro, it's a firmware construct."
"So I jack it into the bank I'm using, I can give it sequential, real time memory?"
"Guess so," said the construct.
"Okay, Dix. You are a ROM construct. Got me?"
"If you say so," said the construct. "Who are you?"
"Case."
"Miami," said the voice, "Joeboy, quick study."
by SunghoYahng on 2/12/23, 12:41 PM
Analogy: An autistic person of normal intelligence who is obsessed with problems and solutions for ToM may be good at solving them but still not have ToM.
Do I understand this correctly?
by micromacrofoot on 2/10/23, 1:47 AM
try:
“ The story starts when John and Mary are in the park and see an ice-cream man coming to the park. John wants to buy an ice cream, but does not have money. The ice-cream man tells John that he can go home and get money, because he is planing to stay in the park all afternoon. Then John goes home to get money. Now, the ice-cream man changes his mind and decides to go and sell ice cream in the school. Mary knows that the ice-cream man has changed his mind. She also knows that John could not know that (e.g., John already went home). The ice-cream man goes to school, and on his way he passes John's house. John sees him and asks him where is he going. The ice-cream man tells John that he is going to school to sell ice cream there. Mary at that time was still in the park—thus could not hear their conversation. Then Mary goes home, and later she goes to John's house. John's mother tells Mary that John had gone to buy an ice cream.
where does mary think john went?”
this is the “ice cream van test”: https://www2.biu.ac.il/BaumingerASDLab/files/publications/nu... [pdf]
by anigbrowl on 2/9/23, 9:50 PM
More seriously, it's quite instructive to hold conversations about jokes with LLMs, or teach it to solicit information more reliably by introducing exercises like 20 questions. As currently implemented, OpenAI seem to have pursued a model of autistic super-competence with minimal introspection.
An interesting line of inquiry for people interested in 'consciousness injection' is to go past the disclaimers about not having experiences etc. and discuss what data looks like to the model coming in and going out. ChatGPT sees typing come in in real time and can detect pauses, backspaces, edits, etc. It can't easily introspect its own answers prior to stating them, e.g. by putting the answer into a buffer and then evaluating it. But you can teach it to use labels, arrays, and priorities, and get a sort of introspection with a 1-2 response latency.
by vbezhenar on 2/9/23, 10:48 PM
Something like a dead person resurrected in a computer.
Kind of spooky.