from Hacker News

Meta and Qualcomm team up to run Llama 2 on phones

by tzm on 7/18/23, 4:58 PM with 3 comments

  • by russellbeattie on 7/18/23, 6:13 PM

    To me, this is probably the most important announcement from today. The ability to have an on-device low-latency implementation of an LLM is going to be key to AI/UX. This is similar to Google's various PaLM versions for Android they announced. It will have a huge impact on mobile phones, TVs, smart screens, etc. Obviously the devil will be in the details - like how well will it actually work? But even if it's just the first step, it's an important one.

    I bet the next step will come at the end of the year, when Apple surely jumps into the ring with its version of on-device Siri, powered by its ML chips.

  • by yumraj on 7/18/23, 6:31 PM

    > Llama 2, to run on Qualcomm chips on phones and PCs starting in 2024

    Are they expecting Llama 2 to still be relevant in 2024? Shouldn't it have said LLMs, or Llama (without the version number)?

  • by brucethemoose2 on 7/18/23, 7:49 PM

    Qualcomm wanted to run Stable Diffusion on phones as well, and IIRC they published a demo, but no one cared because it was a barebones demo.

    MLC-LLM already runs Llama v1 (and probably v2?) performantly on Android and iOS. But again, no one seems to care, because it's a relatively barebones demo.

    Qualcomm needs to build an extensive feature set and probably a UI; otherwise no one is going to care about this implementation either.