from Hacker News

Pre-Training GPT-4.5 [video]

by waynenilsen on 4/11/25, 4:57 PM with 2 comments

  • by waynenilsen on 4/11/25, 4:57 PM

    The only reason I'm sharing this is that there's a gem at the end. From the transcript:

    44:26 ...its responses. But it's incredible; it is incredible. Related to that, and sort of a last question: in some sense this whole effort, which was hugely expensive in terms of people and time and dollars and everything else, was an experiment to further validate that the scaling laws keep going, and why. And it turns out they do, and they probably keep going for a long time. I accept scaling laws like I accept quantum mechanics or something, but I still don't know why. Why should that be a property of the universe? So why are scaling laws a property of the universe?

    If you want, I can take a stab. Well, the fact that more compression will lead to more intelligence has a very strong philosophical grounding. So the question is: why does training bigger models for longer give you more compression? There are a lot of theories here. The one I like is that the relevant concepts are sparse in the data of the world, and in particular it's a power law, so that the hundredth most important concept appears in one out of a hundred documents, or whatever. So there are long tails.
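
    To make the sparsity claim concrete: if the k-th most important concept appears in one out of every k documents, the wait until you first see it is geometric with mean k, so rarer concepts cost linearly more data just to encounter once. A quick sanity check in Python (the independent per-document appearance model is my own simplification, not something stated in the talk):

        import numpy as np

        rng = np.random.default_rng(0)

        # Under the transcript's example, the k-th most important concept
        # appears in ~1/k of documents, so the number of documents until the
        # first sighting is geometric with mean k.
        for k in (10, 100, 1000):
            draws = rng.geometric(1.0 / k, size=2000)  # docs until first sighting
            print(f"concept rank {k:>4}: ~{draws.mean():.0f} docs to see it once")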

    Does that mean that if we make a perfect dataset and figure out very data-efficient algorithms, we can go home? It means that there are potentially exponential compute wins on the table for being very sophisticated about your choice of data. But basically, when you just scoop up data passively, you're going to require 10x-ing your compute and your data to get the next constant number of things in that tail. And that tail keeps going; it's long. You can keep mining it, although, as you alluded to, you can probably do a lot better.
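
    The "10x for the next constant slice of the tail" arithmetic also falls out of a 1/k law: the total frequency mass of concepts ranked between K and 10K is about ln(10) regardless of which decade you pick, so each tenfold increase in passively scooped data unlocks a tail slice of roughly equal total weight. That constant return per decade is exactly the log-linear shape of the scaling curves. A back-of-envelope check (again assuming the 1/k rank-frequency law implied by the speaker's example):

        import math

        def decade_mass(lo: int) -> float:
            """Total frequency mass of concepts ranked lo..10*lo - 1,
            assuming the k-th concept has (unnormalized) frequency 1/k."""
            return sum(1.0 / k for k in range(lo, 10 * lo))

        for lo in (10, 100, 1000, 10000):
            print(f"ranks {lo:>5}..{10 * lo - 1:>6}: mass = {decade_mass(lo):.3f}")
        print(f"(compare: ln 10 = {math.log(10):.3f})")
        # Every decade of the tail carries roughly the same mass, so each 10x
        # of data buys a roughly constant-value slice of new concepts.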

    I think that's a good place to leave it. Thank you guys very much, that was fun. Yeah, thank you.