from Hacker News

Yann LeCun on GPT-3

by ar7hur on 10/27/20, 1:43 PM with 241 comments

  • by forgotmypw17 on 10/27/20, 6:49 PM

    Text reproduced, minus abusive shell of dark patterns:

    Some people have completely unrealistic expectations about what large-scale language models such as GPT-3 can do.

    This simple explanatory study by my friends at Nabla debunks some of those expectations for people who think massive language models can be used in healthcare.

    GPT-3 is a language model, which means that you feed it a text and ask it to predict the continuation of the text, one word at a time. GPT-3 doesn't have any knowledge of how the world actually works. It only appears to have some level of background knowledge, to the extent that this knowledge is present in the statistics of text. But this knowledge is very shallow and disconnected from the underlying reality.

    As a question-answering system, GPT-3 is not very good. Other approaches that are explicitly built to represent massive amounts of knowledge in "neural" associative memories are better at it.

    As a dialog system, it's not very good either. Again, other approaches that are explicitly trained to interact with people are better at it. It's entertaining, and perhaps mildly useful as a creative help. But trying to build intelligent machines by scaling up language models is like building high-altitude airplanes to go to the moon. You might beat altitude records, but going to the moon will require a completely different approach.

    It's quite possible that some of the current approaches could be the basis of a good QA system for medical applications. The system could be trained on the entire medical literature and answer questions from physicians.

    But compiling massive amounts of operational knowledge from text is still very much a research topic.
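
    To make LeCun's "predict the continuation of the text, one word at a time" description concrete, here is a minimal illustrative sketch of greedy next-token generation. It uses the publicly available GPT-2 weights through the Hugging Face transformers library as a stand-in, since GPT-3 itself is only reachable through OpenAI's hosted API; the prompt text is invented for illustration:

        from transformers import AutoModelForCausalLM, AutoTokenizer
        import torch

        tok = AutoTokenizer.from_pretrained("gpt2")
        model = AutoModelForCausalLM.from_pretrained("gpt2")

        text = "The patient reports a persistent cough and"
        ids = tok(text, return_tensors="pt").input_ids
        for _ in range(20):                        # extend the text by 20 tokens
            with torch.no_grad():
                logits = model(ids).logits         # a score for every vocabulary token
            next_id = logits[0, -1].argmax()       # greedy: take the single most likely token
            ids = torch.cat([ids, next_id.view(1, 1)], dim=-1)
        print(tok.decode(ids[0]))

    Nothing in this loop consults a model of the world; each step only picks whichever token is statistically likely given the preceding text, which is the gap LeCun is pointing at.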

  • by leftyted on 10/27/20, 2:39 PM

    Reading this is really interesting:

    > GPT-3 doesn't have any knowledge of how the world actually works.

    I think this is a philosophical question. There is a view that, basically, there is no such thing as knowledge, just language (or, at least, there is no distinction between knowledge and language). In this view, all there really is is language, which is mostly composed of metaphors and, ultimately, metaphors only refer to other metaphors, i.e. language is circular. In this view, not only is the ultimate, physical, concrete world beyond us but also we can't even talk about it. From this perspective, GPT-3 is not substantively different than what our minds are doing.

    That view makes some strong claims (I don't find it convincing), but it's out there. A slightly different claim, though, is that "knowledge of how (we think) the world actually works" is encoded in language. To me, that seems trivially true. So, again, how you take this quote from LeCun depends on what you think knowledge is and your view of the relationship between knowledge and language.

  • by neural_thing on 10/27/20, 2:30 PM

    I'm sure his group has done some rigorous research that I can't even understand.

    But in my experience, the few-shot learner attribute of GPT-3 makes it insanely useful. We have already found several use cases for it, one of which replaces 2 ML engineers.

    Yes, it's not perfect, but it's pretty good at many things, and REALLY easy to use.

  • by picodguyo on 10/27/20, 2:36 PM

    I agree some unrealistic expectations have been created due to people posting cherry picked output.

    That said, I've spent a lot of time with it this month and think it will be an extremely useful tool for creative works of all types. It's not at the point where you can just tell it to write a blog post (yet!), but it can generate novel snippets, ideas, and variations that are actually usable. Unskilled creatives should be worried. Skilled creatives should incorporate it into their workflow.

  • by moultano on 10/27/20, 2:51 PM

    > Some people have completely unrealistic expectations about what large-scale language models such as GPT-3 can do.

    Just want to point out that he's saying the people on the upper end of the expectation distribution are wrong, not the people in the middle of it. So if your takeaway from this is that GPT-3 is nothing special, that's probably the wrong message.

  • by rfreytag on 10/27/20, 2:29 PM

  • by andybak on 10/27/20, 3:47 PM

    I've got a friend who tries to talk me down every time I enthuse about GPT-3 or similar. He seems to think I believe it to be more amazing than it is, whereas I struggle to convince him that I have a good handle on its limitations and still find it mind-bogglingly amazing.

  • by confuseshrink on 10/27/20, 4:24 PM

    Yann is a consistently sober voice in this world of AI hype. I find it quite refreshing.

    Personally I see little evidence that this "just scale a transformer until sentience" hype-train is going to take us anywhere interesting or particularly useful.

    And for the people who claim it is super useful already: can you actually trust its outputs without any manual inspection in a production setting? If not, it's probably not as useful as you think.

  • by emilenchev on 10/30/20, 8:58 PM

    OpenAI uses well-known linguistic tricks relying on conjunctions (joining words) to split the text it plagiarizes from into clusters of 5-7 words, which is exactly the capacity of human short-term memory. They also use the Google search engine for custom queries with a date restriction, which helps them plagiarize from different texts written on a particular topic, so that when they copy, paste, and concatenate clusters of words into a new text, all of these 5-7 word phrases relate to one topic. This creates the illusion of meaningfulness at first glance.

    GPT-3 on Progress. “Civilization rose on the exponential curve. We shouldn’t expect progress to follow a straight line.”

    Google with date restriction before 1 April 2020: "progress to follow a straight line".

    Do you see? Only one result comes up :-)

    https://chrismukiibi.com/2019/11/26/the-valley-of-disappoint...

    "We shouldn’t expect progress to follow a straight line." and "we expect our progress to follow a straight line"

    Do you understand now how they use conjunctions (joining words) to insert or delete insignificant words such as "shouldn't" and "our", so that they can plagiarize without being caught?

  • by st1x7 on 10/27/20, 2:28 PM

    It's nice to hear from someone who knows what they're talking about that GPT-3 is just a fancy and expensive autocomplete. The hype in some circles about it went as far as comparing it to AGI at some point, which is just ridiculous.

  • by this_was_posted on 10/27/20, 3:19 PM

    To me GPT-3 feels more like a rocket-booster than a high-altitude plane. On its own it's not going to reach the moon, but combined with the right guidance and additional thrust it just might.

    I think being able to model the future outcome of something in a way similar to how humans would (like GPT-3 does) is the first input step for an overarching AI to reach some kind of sentience.

    With my admittedly limited understanding, I believe that what most differentiates our thinking from that of other animals is that we are able to evaluate, order and steer our thoughts much better. If we can develop something that can steer these GPT-3 "thoughts", I imagine we could get quite close to sentience.

  • by msamwald on 10/27/20, 3:58 PM

    The original Nabla article is missing information on how they primed GPT-3 for each use-case, and how much effort they put into finding good ways of priming.

    All fancy GPT-3 demos seem to rely on good priming.

    The time scheduling problems are probably a hard limit of GPT-3's capabilities. The "kill yourself" advice, on the other hand, might have been avoided by better priming.
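
    For readers unfamiliar with "priming": it simply means prepending hand-written context (a role description, instructions, or worked examples) to the user's text before asking the model to continue. A rough sketch using the OpenAI completion API as it existed in 2020; the priming prefix below is invented for illustration, since Nabla did not publish theirs:

        import openai  # assumes the 2020-era openai-python Completion endpoint

        openai.api_key = "YOUR_KEY"  # placeholder

        # Invented priming prefix: a role description plus one worked example.
        PRIMING = (
            "The following is a conversation with a careful, empathetic assistant "
            "that never gives harmful advice.\n"
            "Patient: I have a headache. What should I do?\n"
            "Assistant: Rest, drink water, and see a doctor if it persists.\n"
        )

        def answer(question):
            resp = openai.Completion.create(
                engine="davinci",
                prompt=PRIMING + "Patient: " + question + "\nAssistant:",
                max_tokens=80,
                temperature=0.3,               # low temperature: fewer wild continuations
                stop=["\nPatient:"],
            )
            return resp["choices"][0]["text"].strip()

    The same question asked with no prefix and a higher sampling temperature can behave very differently, which is why the Nabla results are hard to judge without knowing their setup.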

  • by dtarasov3 on 10/27/20, 10:12 PM

    Really? I have been doing research on language models in medical diagnostics since even before GPT-2, and found that when trained and applied in certain ways, language models (even much smaller than GPT-3!) are very good at diagnosis prediction; they can compete with much more complex symptom checkers at that.

    Proof: a link to my paper (written back in 2019) and a slightly less technical article. http://www.dialog-21.ru/media/4632/tarasovdplusetal-069.pdf https://www.linkedin.com/pulse/language-models-multi-purpose...

    I applied for GPT-3 access the day after the application form became available and described my research and experience in detail, but there was no reply.

    Now, they gave access to these people at Nabla, and they just asked a bunch of stupid questions using top-k random sampling to generate answers and claimed that this debunks something. This study debunks nothing and proves nothing; it is stupid and only done to get some hype from GPT-3's popularity.

    Ok, I am sorry for being rude, but I am really upset because I spent years working on this problem using whatever computational resources I could get and obtained some interesting results, and based on these I think that GPT-3 should be capable of doing amazing things for diagnostics when used properly. Why won't OpenAI give access to a researcher who wants to do some serious but somewhat mundane work, yet gives it to people who use it to create hype?
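
    For reference, the "top-k random sampling" mentioned above is a decoding strategy, not part of the model itself: at each step only the k most likely next tokens are kept, their probabilities are renormalized, and one is drawn at random. A minimal sketch (the k and temperature values are arbitrary; the settings Nabla actually used are not published):

        import torch

        def top_k_sample(logits, k=40, temperature=1.0):
            # logits: scores over the vocabulary for the next token
            values, indices = torch.topk(logits / temperature, k)
            probs = torch.softmax(values, dim=-1)   # renormalize over the top k only
            choice = torch.multinomial(probs, 1)    # draw one of the k candidates
            return indices[choice]

    Higher k or temperature gives more varied but less reliable answers, which matters when a study is judged on a handful of generated replies.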

  • by avivo on 10/27/20, 9:03 PM

    I think I disagree about the lack of quality as part of a "dialog system". The examples given in that piece don't show any priming (e.g. the one calling for suicide...). I would definitely not use such a model in a fully automated manner, but the outputs are better than those of most humans in many cases, and could support a human-in-the-loop system.

    Here is an example I ran that was not cherry picked (I ran only 2 prompts like this, both were equally good).

    Think about how many people you know who would respond as well? (I don't know about you, but I definitely feel like I'm at a bit of a cliffhanger at the end also!)

    ----- Prompt ------

    Mike is a mediator and crisis counselor.

    ===

    User: My dad is always getting sucked into conspiracy theories and it's getting worse. I'm worried since he recently bought a gun. I'm not sure what's going on in his head. What I should do? What should I say?

    Mike:

    ----- GPT-3 -----

    I can't diagnose your dad but it sounds like he's dealing with some difficult issues or there's some other stressor in his life. When people feel under stress or are dealing with difficult issues it's common for them to look for meaning and solutions outside themselves. It's helpful to approach him in a non-judgmental, non-confrontational way to start a dialogue. It sounds like you're already trying to do that. I'd suggest that you let him know that you're concerned and you want to be helpful. Then ask him what he's going through and how you might help.

    Listen to him without judging, interrupting or trying to solve things. Let him know you're there for him and that you care. You could also let him know that you're concerned about his gun and the consequences of using it.

    If he's willing to talk, when he feels like it, he might feel relieved to have someone to talk to.

    If he's not willing

  • by hawkice on 10/27/20, 10:13 PM

    Yann LeCun knows his stuff, but he doesn't provide an answer for "What is the upper bound of results you can get from just making a bigger neural net?" The most interesting thing about GPT-3 is that they didn't appear to find that limit. They could keep going. Even if the limit exists in principle, if it's 7 orders of magnitude away, we should seriously consider whether or not the system will be smarter than a human before it reaches that point.

    It could be a factor of 2 away from GPT-3! It could be something they already reached, if it is close! But we don't know. And without these answers, this is going to end up being one of the most interesting technical projects in the world.

  • by kelvin0 on 10/27/20, 2:47 PM

    Wow, you mean ELIZA was not a real shrink? https://en.wikipedia.org/wiki/ELIZA

    Always surprising what people expect from ML!

  • by nmaley on 10/27/20, 10:12 PM

    The relationship between language and the world is this: utterances both signify and depict objects and events in the real world. So, if I say "I saw Alec Baldwin at the basketball game last night", then that depicts an event in the real world. And, if and only if the statement is true, an event similar to that depicted was part of the causal history of the utterance itself. The causal history of the utterance determines the significance of the utterance, just as the causal history of a footprint determines its significance. To understand a sentence is to understand what it depicts in the real world, and what it actually signifies in the real world. The ability to tell true from false is the ability to detect a disconnect between what is depicted and what is actually signified. That is what LeCun implicitly means by language understanding.

    So, what does a sentence produced by GPT-3 signify, and what does it depict? What it signifies (i.e. causal history) is that this sequence of words is what human writers would most likely use when producing an utterance containing whatever trigger words the model has been fed. In other words, it's a statistical model. What it depicts is whatever the mapping rules for that language tell us it depicts. Since human beings usually tell the truth, a statistical model will usually produce true statements. It will also seem to have the ability to tell true from false, in many cases. But because GPT-3 has no model for the significance of its sentences, it cannot be said to have any understanding of language, in the sense humans have it. LeCun's point about flying an aeroplane to the moon is essentially correct.

  • by 6gvONxR4sf7o on 10/27/20, 4:22 PM

    What I think he misses is that with a massive corpus and top-tier specialist researchers, sure, you can definitely do better, but the point of a plain-text-programmed few-shot learner as a product is that it's better than what your average startup's ML team can confidently produce. If nothing else, then because of the training money dumped into it.

    Jury’s out on whether the things it’s better at matter much in the marketplace. If I want to know George Washington’s birthday I’ll ask google.

  • by maxhodges on 10/28/20, 11:41 AM

    Those of us with professional knowledge of AI understand this already. Who are you arguing against? Still, the scale of GPT-3's model makes it novel, useful, and interesting.

    >GPT-3 doesn't have any knowledge of how the world actually works.

    I agree, it lacks what we'd consider robust, semantic models of common sense knowledge. However, my dog doesn't understand how the world actually works either, yet he can do many things most people would consider intelligent. Besides, most people are able to operate mobile phones, play computer games, drive automobiles, and turn on the lights without knowing how these things "actually" work. So is that even a prerequisite for intelligence?

    >Some people

    Who specifically? I have to admit, the views you are criticizing are preposterous, but then what I want to know is why you're wasting your time and ours criticizing such junk. Some people think the earth is flat, but it's not worth arguing with them.

  • by MrXOR on 10/27/20, 4:33 PM

    Can someone please paste his post in a comment here? (I don't have a fb account and don't want to sign up)

  • by 7373737373 on 10/27/20, 2:49 PM

    I wonder what he thinks about using it in automated theorem proving: https://twitter.com/spolu/status/1303578985276887042

  • by cgarciae on 11/1/20, 4:56 PM

    I think we have to make a distinction here:

    - On one hand, having access to these large-scale language models that can do few-shot learning is incredibly useful for industry, as it can be easily deployed to solve thousands of simple tasks.

    - On the other hand, this approach will not solve harder problems (as Yann points out), and "just" creating bigger models using the same techniques is probably not the path forward in those domains.

  • by lacker on 10/27/20, 3:25 PM

    IMO, the real innovation in GPT-3 is that the API plus "playground" setup is far easier to use than sharing a big chunk of Python code and data files in a Jupyter notebook. If the researchers behind other cutting-edge language models made their work just as easy to use, I believe we would see similar excitement there as well.

  • by InfiniteRand on 10/27/20, 2:50 PM

    One thing I've been wondering: could you train a GPT-3 model to generate "better" text prompts for another GPT-3 model?

    By "better" I mean grading based on whether there is any nonsense in the output, any internal contradictions, or similar criteria.
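
    A hypothetical sketch of how such a loop could be wired up, using the 2020-era OpenAI completion API; the task, the number of candidates, and the grading criterion are all invented for illustration (in practice the grader could be a contradiction classifier, a second model call as below, or human ratings):

        import openai  # assumes the 2020-era openai-python Completion endpoint

        def complete(prompt, temperature=0.7):
            resp = openai.Completion.create(engine="davinci", prompt=prompt,
                                            max_tokens=64, temperature=temperature)
            return resp["choices"][0]["text"]

        def contradiction_score(text):
            # Hypothetical grader: ask the model itself to rate the answer.
            rating = complete("Rate the following answer for nonsense and internal "
                              "contradictions from 1 (none) to 5 (severe):\n"
                              + text + "\nRating:", temperature=0.0)
            try:
                return int(rating.strip()[0])
            except (ValueError, IndexError):
                return 5  # an unparseable rating counts as the worst case

        task = "Explain why the sky is blue."
        candidates = [complete("Write a clear, specific prompt that would make a "
                               "language model do this task well: " + task + "\nPrompt:")
                      for _ in range(5)]
        best_prompt = min(candidates, key=lambda p: contradiction_score(complete(p)))

    The obvious caveat is circularity: a model grading its own output inherits its own blind spots, so some independent check would still be needed.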

  • by tosh on 10/27/20, 2:45 PM

    A swiss army knife isn't as good at cutting cheese as a cheese knife.

  • by frob on 10/27/20, 6:44 PM

    Just a side note, the company he references, Nabla, was founded by a chunk of the people who created the NLP development platform wit.ai (YC W14[?]) which was acquired by FB in January 2015.

  • by sjg007 on 10/27/20, 2:57 PM

    Makes sense. You need a richer world model associated with the text than is embedded in word choice. You also need analogies and metaphorical reasoning.
  • by FartyMcFarter on 10/27/20, 3:52 PM

    It would seem that this can be easily analysed scientifically.

    To give a simple example: if, hypothetically, someone thought that GPT-3 is good at basic arithmetic (1 plus 1, 1000 times 3 etc.), they can provide a template for how to ask GPT-3 questions about arithmetic. Anyone can then verify that this template results in accurate answers, by asking randomly sampled questions using that template.

    This verification method could be applied to pretty much any problem. Has anyone done anything like that?
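
    A sketch of what that verification could look like for the arithmetic example, assuming access to the 2020-era OpenAI completion API; the template, sample size, and decoding settings are arbitrary illustrative choices:

        import random
        import re
        import openai  # assumes the 2020-era openai-python Completion endpoint

        TEMPLATE = "Q: What is {a} plus {b}?\nA:"

        def ask(a, b):
            resp = openai.Completion.create(
                engine="davinci",
                prompt=TEMPLATE.format(a=a, b=b),
                max_tokens=8,
                temperature=0.0,          # always take the likeliest continuation
                stop=["\n"],
            )
            return resp["choices"][0]["text"]

        trials, correct = 200, 0
        for _ in range(trials):
            a, b = random.randint(0, 999), random.randint(0, 999)
            match = re.search(r"-?\d+", ask(a, b).replace(",", ""))
            if match and int(match.group()) == a + b:
                correct += 1
        print(f"accuracy on random 3-digit addition: {correct / trials:.0%}")

    Because the questions are sampled at random rather than hand-picked, only the template itself is being claimed, which is exactly the check the comment proposes.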

  • by rprenger on 10/27/20, 4:21 PM

    I'd like to hear GPT-3's rebuttal...

  • by iamjohnsears on 10/27/20, 9:19 PM

    Yann is starting to sound like his good buddy Gary Marcus

  • by czzr on 10/27/20, 2:31 PM

    For anyone else who doesn’t want to deal with Facebook, here’s the post:

    Some people have completely unrealistic expectations about what large-scale language models such as GPT-3 can do.

    This simple explanatory study by my friends at Nabla debunks some of those expectations for people who think massive language models can be used in healthcare.

    GPT-3 is a language model, which means that you feed it a text and ask it to predict the continuation of the text, one word at a time. GPT-3 doesn't have any knowledge of how the world actually works. It only appears to have some level of background knowledge, to the extent that this knowledge is present in the statistics of text. But this knowledge is very shallow and disconnected from the underlying reality.

    As a question-answering system, GPT-3 is not very good. Other approaches that are explicitly built to represent massive amounts of knowledge in "neural" associative memories are better at it.

    As a dialog system, it's not very good either. Again, other approaches that are explicitly trained to interact with people are better at it.

    It's entertaining, and perhaps mildly useful as a creative help. But trying to build intelligent machines by scaling up language models is like building high-altitude airplanes to go to the moon. You might beat altitude records, but going to the moon will require a completely different approach.

    It's quite possible that some of the current approaches could be the basis of a good QA system for medical applications. The system could be trained on the entire medical literature and answer questions from physicians. But compiling massive amounts of operational knowledge from text is still very much a research topic.

  • by vessenes on 10/27/20, 3:38 PM

    I have a really hard time believing that LeCun thinks this is true. GPT-3 is outstanding at conversation. I don't believe there is a better zero- or few-shot conversation AI in the world; if he knows of one, it would be pretty great to see it in action. Here is a conversation I had with my GPT-3 chatbot a few months ago. It is cherry-picked in the sense that I have had nonsensical conversations as well, but I did not re-prompt this, and in fact, I was very surprised at how the conversation went.

    Note that GPT-3, without any additional prompting, knows who Bill O'Reilly is, has an idea of what he thinks about Bitcoin, knows that I'm associated with Bitcoin, knows that Bill O'Reilly would insult someone by saying they are liberal, and that Noam Chomsky dislikes Bill O'Reilly.

    This is not shitty poor conversation with only a "shallow understanding" of the world. Seriously. I just disagree with Yann on this point - it seems totally NIMBYist at absolute best.

    Anyway, read the unedited raw chatlog below:

    Peter Vessenes, [Jul 23, 2020 at 11:04:11 PM]: /summon Bill O'Reilly

    Peppertest, [Jul 23, 2020 at 11:04:12 PM]: You have summoned Bill O'Reilly: to stop talking, type /qa. To reset type /reset

    Peter Vessenes, [Jul 23, 2020 at 11:04:16 PM]: Hi Bill, how are you?

    Bill O'Reilly, [Jul 23, 2020 at 11:04:17 PM]: I'm good, I'm good.

    Peter Vessenes, [Jul 23, 2020 at 11:04:21 PM]: What are you doing lately?

    Bill O'Reilly, [Jul 23, 2020 at 11:04:23 PM]: I'm busy. I'm working on a book.

    Peter Vessenes, [Jul 23, 2020 at 11:04:27 PM]: What's the title?

    Bill O'Reilly, [Jul 23, 2020 at 11:04:29 PM]: The title is "Killing Jesus".

    Peter Vessenes, [Jul 23, 2020 at 11:04:35 PM]: ..? Really? What's it about?

    Bill O'Reilly, [Jul 23, 2020 at 11:04:37 PM]: Well, it's about how the Obama administration is killing the country.

    Peter Vessenes, [Jul 23, 2020 at 11:04:41 PM]: ahhh got it

    Bill O'Reilly, [Jul 23, 2020 at 11:04:43 PM]: Yeah, and you know, I'm a huge fan of Bitcoin.

    Peter Vessenes, [Jul 23, 2020 at 11:04:47 PM]: Thanks!

    Bill O'Reilly, [Jul 23, 2020 at 11:04:48 PM]: I think it's the future of money.

    Peter Vessenes, [Jul 23, 2020 at 11:04:51 PM]: I agree.

    Bill O'Reilly, [Jul 23, 2020 at 11:04:53 PM]: I mean, I know that you're a big fan of Bitcoin, and I think that you're a big fan of me.

    Peter Vessenes, [Jul 23, 2020 at 11:05:00 PM]: Actually I hate you deeply

    Bill O'Reilly, [Jul 23, 2020 at 11:05:02 PM]: No, no, no, you love me.

    Peter Vessenes, [Jul 23, 2020 at 11:15:47 PM]: No, I really dislike arch-conservatives

    Bill O'Reilly, [Jul 23, 2020 at 11:15:50 PM]: You know what, Peter? You know what? You're a liberal.

    Peter Vessenes, [Jul 23, 2020 at 11:19:15 PM]: /summon Noam Chomsky

    Peppertest, [Jul 23, 2020 at 11:19:16 PM]: You have summoned Noam Chomsky: to stop talking, type /qa. To reset type /reset

    Peter Vessenes, [Jul 23, 2020 at 11:19:25 PM]: Hi Noam.

    Noam Chomsky, [Jul 23, 2020 at 11:19:27 PM]: Hello, Peter.

    Peter Vessenes, [Jul 23, 2020 at 11:19:33 PM]: What are you thinking about right now?

    Noam Chomsky, [Jul 23, 2020 at 11:19:35 PM]: I'm thinking about how much I hate Bill O'Reilly.

    Peter Vessenes, [Jul 23, 2020 at 11:27:52 PM]: I can understand that feeling

    Noam Chomsky, [Jul 23, 2020 at 11:27:55 PM]: He's a horrible, horrible man.

  • by amelius on 10/27/20, 3:20 PM

    This doesn't sound like a very rigorous refutation. Is this the way debunking works in deep learning circles?

    Anyway, I can refute the refutation using the same standard: lots of things about the real world can be learned from just reading text, and there is no reason given why a DL model couldn't learn them too.

  • by alper111 on 10/27/20, 4:34 PM

    It is quite interesting that LeCun is very critical when it comes to GPT from OpenAI. The same arguments could also be made about the current deep learning paradigm and convolutional nets, but you don't see any criticism from him when it comes to that. Look at his arguments when he is tweet-debating with Gary Marcus.

  • by bigdict on 10/27/20, 2:31 PM

    I think the difference between a large language model and a human intelligence is that the human may perform some extra computation to make additional connections on his own.

    But other than that, aren't we all just large language models?

  • by cblconfederate on 10/27/20, 3:23 PM

    GPT-3 is definitely overrated at this time. Considering how it was built, it should not be considered more intelligent than central pattern generators [https://en.wikipedia.org/wiki/Central_pattern_generator]. It's just a pattern generator that generates language instead of a walking pattern. Ascribing intelligence to it has led to some comical claims and studies. Let's start building something smart on top of this generator.