by Heidaradar on 6/5/24, 2:49 AM with 50 comments
Furthermore, what do you think they're going to do to make it as "safe" as possible? It's funny that OpenAI didn't release GPT-2 to the public immediately because of safety worries, but has since been releasing models without the same care for safety, and I imagine this will continue with GPT-5.
[0] https://www.zdnet.com/article/openai-is-training-gpt-4s-successor-here-are-3-big-upgrades-to-expect-from-gpt-5/
[1] https://openai.com/index/openai-board-forms-safety-and-security-committee/
by ramblerman on 6/5/24, 11:12 AM
- a steady increment of GPT-n+1 every 6 months for marketing purposes.
- each will improve on the last by smaller and smaller margins.
- hallucinations won't be fixed anytime soon.
- We will hit a bit of a winter: the hype was huge, but as with self-driving cars, the devil is in the details. The general public will realize these things are essentially just giving us averages.
- A big market will emerge around "authenticity" and "verified texts" as the internet continues to get flooded with AI-generated content.
by randomtoast on 6/5/24, 7:54 AM
Since the start of their partnership in 2019, OpenAI has primarily utilized Microsoft's Azure data centers for training its models. In 2023, Microsoft acquired approximately 150,000 H100 GPUs. [1]
The initial version of GPT-4 ran on a cluster of A100 GPUs. It is likely that GPT-5 will run on the newly acquired H100 GPUs, and it is plausible that GPT-4 Turbo and GPT-4o also utilize this infrastructure. The inference speed of GPT-5 should not be significantly slower than that of GPT-4 to ensure it remains practical for most applications.
Assuming the H100 is 4.6 times faster at inference than the A100 [2], this gives us a lower bound for performance expectations. I anticipate GPT-5 will be at least five times larger in terms of model parameters. Given that both the A100 and H100 top out at 80 GB of memory, it is unlikely we will see a single gigantic model. Instead, we can expect an increase in the number of experts. If GPT-4 operates as a mixture of experts with 8x220 billion parameters, then GPT-5 might scale up to something like 40x220 billion parameters. However, the exact release date, safety measures, and benchmark performance of GPT-5 remain uncertain.
[1]: https://www.tomshardware.com/tech-industry/nvidia-ai-and-hpc...
[2]: https://nvidia.github.io/TensorRT-LLM/blogs/H100vsA100.html
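A quick back-of-envelope sketch of that scaling argument. The 4.6x speedup, the 8x220B layout, and the "inference cost scales roughly with parameters" simplification are all rumors or assumptions from the comment above, not confirmed figures:

    # Back-of-envelope check of the scaling argument above.
    # All inputs are rumors or assumptions, not confirmed figures.
    H100_SPEEDUP = 4.6            # H100 vs A100 LLM inference, per [2]
    GPT4_EXPERTS = 8              # rumored GPT-4 mixture-of-experts layout
    PARAMS_PER_EXPERT = 220e9     # rumored parameters per expert

    gpt4_total = GPT4_EXPERTS * PARAMS_PER_EXPERT
    # If inference cost scales roughly with parameters, 4.6x faster hardware
    # buys a ~4.6x larger model at about the same latency.
    same_latency_budget = gpt4_total * H100_SPEEDUP

    print(f"GPT-4 (rumored): {gpt4_total / 1e12:.2f}T parameters")
    print(f"Same-latency budget on H100: ~{same_latency_budget / 1e12:.1f}T parameters")
    print(f"= ~{same_latency_budget / PARAMS_PER_EXPERT:.0f} experts of 220B "
          f"(the comment rounds this up to 40x220B)")

That works out to roughly 37 experts of 220B, so the 40x220B figure is in the same ballpark.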
by wkat4242 on 6/5/24, 12:39 PM
If you mean the hallucinations, I don't think that will ever really be solved. I think people just have to learn that LLMs are not divine oracles that are always correct. They're just like their training data, generated by flawed humans who are often wrong or outright lying.
Garbage in, garbage out.
Not saying that AI isn't useful. But expecting what is basically a "human simulator" not to inherit humanity's flaws is a bit disingenuous.
by WheelsAtLarge on 6/5/24, 3:30 AM
by jankovicsandras on 6/5/24, 10:39 AM
GPT-4 is not just an LLM but a complex software system that has LLM(s) at its core, plus other components like RAG, a toxicity filter, an apologizing mechanism, expert systems, etc. "GPT-4" is a product name / marketing name. For OpenAI this would be logical for performance and business reasons, and it also explains how they can tune it, the apparent secrecy about the architecture, and so on.
It's also logical to make small, incremental changes to this system instead of building whatever GPT-5 would mean from the ground up. So I expect "GPT-5" is also just a marketing name for a slightly better black-box (to us) system and product line.
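To make the hypothesis concrete, here is a minimal sketch of the kind of layered system the comment describes. Every component is a stub, and the names (retrieve, core_llm, is_toxic, apologize) are illustrative guesses, not OpenAI's actual architecture:

    # Hypothetical layered pipeline: an LLM core wrapped by other components.
    # All stubs are illustrative; this is not OpenAI's design.

    def retrieve(prompt: str) -> str:
        """Stand-in for a RAG step that fetches supporting context."""
        return "retrieved context for: " + prompt

    def core_llm(prompt: str, context: str) -> str:
        """Stand-in for the LLM(s) at the core of the product."""
        return f"model answer to {prompt!r} using {context!r}"

    def is_toxic(text: str) -> bool:
        """Stand-in for a toxicity filter wrapped around the model."""
        return "forbidden" in text.lower()

    def apologize(prompt: str) -> str:
        """Stand-in for the 'apologizing mechanism' mentioned above."""
        return "I'm sorry, but I can't help with that."

    def answer(prompt: str) -> str:
        context = retrieve(prompt)
        draft = core_llm(prompt, context)
        return apologize(prompt) if is_toxic(draft) else draft

    print(answer("What will GPT-5 look like?"))

Under this view, swapping out any one layer (the filter, the retrieval index, the core model) would look from the outside like the whole "GPT-4" product changing, which is the comment's point about tuning and secrecy.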
by dkobia on 6/5/24, 2:12 PM
Basically the same trap as CPUs in the '90s and early 2000s, where the naming convention had to change to reflect the fact that clock speeds couldn't keep doubling every two years.
by thiago_fm on 6/5/24, 10:02 AM
I also believe they will delay the release of GPT-5 as much as possible, the reason being that it will be underwhelming (at least compared to the GPT-3.5 hype). They will possibly release it close to some new Google release (Google being their main competitor).
They are the main driver of a bubble that has greatly benefited Microsoft, NVidia, and the other hyperscalers, and if they release the model and reveal that we're in the "diminishing returns" phase, it will crash a big part of the industry, not to mention NVidia.
Companies are buying H100s and investing in expensive AI talent because they believe progress will be quick; if progress stalls for LLMs, there'll be a huge drop in sales and CAPEX in this industry.
There are still many up-and-coming projects that rely on NVidia hardware for training, like Tesla's Autopilot and others, but the bulk of the investment in H100s in recent years has been mostly because of LLMs.
Also, all the new AI talent will move on to do something new, and hopefully we will have more discoveries and potential uses, but we're definitely at peak LLMs.
(ps: just my opinion)
by razodactyl on 6/6/24, 8:40 AM
The later iterations are heavily censored, so the public was given a bit of a transition period before things got too chaotic.
I'm sure there were many other reasons the authors themselves weren't aware of at the time, such as the inundation of AI-generated content skewing the quality of further training.
Of course, this is a roundabout explanation; there's always more detail that can be added, and I'd rather be objective. There's always a financial motive for companies too, so take that into consideration. The hype definitely played into their marketing.
by diego_sandoval on 6/5/24, 8:18 AM
From a product perspective, going back to unimodality after trying GPT-4o would be awkward, so there are reasons for them to go fully multimodal, but I'm not fully educated about the trade-offs from a technical perspective.
by russiancapybara on 6/5/24, 2:50 PM
by stormfather on 6/5/24, 12:51 PM
by ilaksh on 6/5/24, 11:22 AM
by tetris11 on 6/5/24, 2:58 PM
0: a type of evolved sea-slug
1: by capturing it and torturing it
by Vuizur on 6/5/24, 9:35 PM
I guess it will be this year; some guy working at OpenAI already posted "4+1=5" on Twitter, which is suggestive.
by kromem on 6/5/24, 12:21 PM
It will at first glance be a small step, but over the next 12mo after release, it will turn out to have been a giant leap.
It will be safe when being observed.
by throwaway211 on 6/5/24, 12:09 PM
by _davide_ on 6/5/24, 1:38 PM
by treprinum on 6/5/24, 10:35 AM