by picozeta on 4/16/23, 7:01 PM with 6 comments
At the moment ChatGPT and other instruction-tuned models show what's possible with modern LLMs. Although most SOTA models need cloud compute for inference, it's reasonable to assume they might run on beefy standard desktops in the next 5 to 10 years (IMHO).
Now, historically we had:
- thin-client mainframe architecture (1970s - 1980s)
- fat-client "home computers" (1980s - 2010s)
- thin-client SaaS software platforms (2010s - mid-2020s)
- fat-client LLM inference engines (mid-2020s - ?), sketched below
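To make that last item concrete, here's a minimal sketch of what "fat-client" local inference could look like, assuming the Hugging Face transformers and accelerate packages are installed; the model name is a placeholder for any open checkpoint that fits in desktop RAM/VRAM:

    # Minimal local ("fat-client") inference sketch, assuming the transformers
    # and accelerate packages; the model name is a stand-in for any open model.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "some-open-7b-model"  # hypothetical checkpoint that fits on a beefy desktop
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")  # split across local GPU/CPU

    prompt = "Explain thin clients vs. fat clients in one paragraph."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=120)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Everything stays on the local machine, which is the whole point of the fat-client model.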
In particular, I think there will be a lot of ethical questions and legal work for companies selling LLMs as SaaS. Out of fear of "recommending stuff against the status quo", those hosted models might end up inferior to "open" (unconstrained) models, which might only be an option for private persons (at first). Just my 2 cents, what do you think?
by brucethemoose2 on 4/16/23, 8:26 PM
Which was in the pipe years before the LLM craze. It's reasonable to assume AMD is on a similar trajectory, making their future CPUs GPU/NPU heavy. Pair that with lots of RAM (and hopefully wider buses), and you have a respectable LLM client.
This might be a reasonable frontier for smartphone performance too, depending on how much DRAM continues to shrink. But maybe not, since mobile apps love their subscriptions and MTX, which rely on doing stuff in the cloud... otherwise why would you subscribe?
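Rough back-of-envelope on why RAM and bus width matter so much, assuming token generation is memory-bandwidth bound (each new token roughly re-reads all the weights); the numbers are purely illustrative:

    # Assumes decoding is memory-bandwidth bound: tokens/sec ~ bandwidth / model size.
    model_size_gb = 4.0      # e.g. a ~7B model quantized to ~4 bits (illustrative)
    bandwidth_gb_s = 100.0   # e.g. dual-channel DDR5-class desktop memory (illustrative)
    print(f"~{bandwidth_gb_s / model_size_gb:.0f} tokens/sec")  # wider buses raise this directly

On a phone-class memory bus the same model would run several times slower, which is part of why the cloud still looks attractive there.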
by zamnos on 4/16/23, 9:39 PM
What am I referring to with "private output"? I'm referring to what we know is coming: easy at-home production of porn deepfakes. Some aren't going to mind using AWS to produce their weird fantasies (not kink shaming; we're all into some weird stuff, but not everyone is into their cloud provider potentially having access to it). Others are going to want that produced privately at home, on their own hardware.
by detrites on 4/16/23, 7:54 PM
by dtagames on 4/16/23, 8:31 PM