from Hacker News

LLMs having capabilities they were not trained for – explained

by SleekEagle on 5/23/23, 2:59 PM with 1 comment

  • by sharemywin on 5/23/23, 3:11 PM

    "After crossing an apparent critical value (between 10^10 and 10^11 effective parameters), the model seems to be catapulted into a new regime in which it can complete this (fairly complicated) task with drastically increased accuracy."

    "The important part about this phenomenon is that we, a priori, do not know that this will happen in advance or even at what scale it might happen."