from Hacker News

Meta and Qualcomm team up to run Llama 2 on phones

by tzm on 7/18/23, 4:58 PM with 3 comments

  • by russellbeattie on 7/18/23, 6:13 PM

    To me, this is probably the most important announcement from today. The ability to have an on-device low-latency implementation of an LLM is going to be key to AI/UX. This is similar to Google's various PaLM versions for Android they announced. It will have a huge impact on mobile phones, TVs, smart screens, etc. Obviously the devil will be in the details - like how well will it actually work? But even if it's just the first step, it's an important one.

    I bet the next step will come at the end of the year, when Apple surely jumps into the ring with its version of on-device Siri, powered by its ML chips.

  • by yumraj on 7/18/23, 6:31 PM

    > Llama 2, to run on Qualcomm chips on phones and PCs starting in 2024

    Are they expecting Llama 2 to still be relevant in 2024? Shouldn't it have said LLMs, or Llama (without the version number)?

  • by brucethemoose2 on 7/18/23, 7:49 PM

    Qualcomm wanted to run Stable Diffusion on phones as well, and IIRC they published a demo, but no one cared because it was a barebones demo.

    MLC-LLM already runs Llama v1 (and probably v2?) performantly on Android and iOS. But again, no one seems to care, because it's a relatively barebones demo.

    Qualcomm needs to build an extensive feature set and probably a UI; otherwise no one is going to care about this implementation either.