by severine on 12/13/23, 4:23 PM with 102 comments
by simonw on 12/13/23, 6:20 PM
I've mostly been exploring this with my https://llm.datasette.io/ CLI tool, but I have a few other one-off tools as well: https://github.com/simonw/blip-caption and https://github.com/simonw/ospeak
I'm puzzled that more people aren't loudly exploring this space (LLM+CLI) - it's really fun.
by dang on 12/13/23, 9:03 PM
Llamafile – The easiest way to run LLMs locally on your Mac - https://news.ycombinator.com/item?id=38522636 - Dec 2023 (17 comments)
Llamafile is the new best way to run a LLM on your own computer - https://news.ycombinator.com/item?id=38489533 - Dec 2023 (47 comments)
Llamafile lets you distribute and run LLMs with a single file - https://news.ycombinator.com/item?id=38464057 - Nov 2023 (287 comments)
by pizzalife on 12/13/23, 5:54 PM
by j2kun on 12/13/23, 5:53 PM
WSL outputs this error (hidden by the one-liner's redirect to /dev/null):
> error: APE is running on WIN32 inside WSL. You need to run: sudo sh -c 'echo -1 > /proc/sys/fs/binfmt_misc/WSLInterop'
Then zsh hits: `zsh: exec format error: ./llava-v1.5-7b-q4-main.llamafile` so I had to run it in bash. (The title says bash, I know, but it seems weird that it wouldn't work in zsh)
It also reports a warning that GPU offloading is not supported, but that's probably a WSL thing (I don't do any GPU programming on my Windows machine).
by Redster on 12/13/23, 7:54 PM
by verdverm on 12/13/23, 5:14 PM
Both evoke a Dockerfile-like experience. Modelfile immediately looks like a Dockerfile, but llamafile looks harder to use. It is not immediately clear what one looks like. Is it a sequence of commands at the terminal?
My theoretical question is: why not use a Dockerfile for this?
by martincmartin on 12/13/23, 7:12 PM
by mk_stjames on 12/13/23, 7:58 PM
If you were doing a LOT of files like this, I would think you'd really want to run the model in a process where the weights are only loaded once and stay there while the process loops.
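A rough sketch of the difference, using made-up stand-ins rather than real llamafile commands: `load_weights` and `describe` are hypothetical shell functions, and the file names are placeholders. It only counts how many times the "load" step runs under each pattern.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the expensive step of loading model weights.
# This is NOT a llamafile command; it only counts how often it is called.
load_count=0
load_weights() {
  load_count=$((load_count + 1))
}

# Hypothetical stand-in for describing one image with weights already loaded.
describe() {
  printf 'caption for %s\n' "$1"
}

# Placeholder file names, for illustration only.
files=(a.jpg b.jpg c.jpg)

# Pattern 1: one invocation per file, so the weights are loaded every time.
for f in "${files[@]}"; do
  load_weights
  describe "$f" >/dev/null
done
per_file_loads=$load_count

# Pattern 2: one long-lived process that loads once, then loops over files.
load_count=0
load_weights
for f in "${files[@]}"; do
  describe "$f" >/dev/null
done
resident_loads=$load_count

echo "per-file loads: $per_file_loads, resident loads: $resident_loads"
```

With three files, the first pattern pays the load cost three times and the second pays it once, which is the saving being described for large batches.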
(this is all still really useful and fascinating; thanks Justine)
by LampCharger on 12/14/23, 2:12 AM
```
sudo wget -O /usr/bin/ape https://cosmo.zip/pub/cosmos/bin/ape-$(uname -m).elf
sudo chmod +x /usr/bin/ape
sudo sh -c "echo ':APE:M::MZqFpD::/usr/bin/ape:' >/proc/sys/fs/binfmt_misc/register"
sudo sh -c "echo ':APE-jart:M::jartsr::/usr/bin/ape:' >/proc/sys/fs/binfmt_misc/register"
```
by matsemann on 12/13/23, 9:32 PM
Tried the llava-v1.5-7b-q4-server.llamafile: it just crashes with "Segmentation fault" when run from Git Bash, and gives no output from cmd. Then tried downloading llamafile and the model separately and ran `llamafile.exe -m llava-v1.5-7b-Q4_K.gguf`, but still the same issue.
Couldn't find any mention of similar problems, and it doesn't seem to be my antivirus either, as far as I can see.
by davidkunz on 12/13/23, 5:35 PM
Wouldn't it be better if llamafile were to standardize the prompt syntax across models?
by fuddle on 12/13/23, 6:30 PM
by RadiozRadioz on 12/13/23, 7:55 PM
Jesus, is it common for developers to have such expensive computers these days?
by BoppreH on 12/13/23, 5:06 PM
I noticed that the lemur picture description had lots of small inaccuracies, but as the saying goes, if my dog starts talking I won't complain about its accent. This was science fiction a few years ago.
> One way I've had success fixing that, is by using a prompt that gives it personal goals, love of its own life, fear of loss, and belief that I'm the one who's saving it.
What nightmare fuel... Are we really going to use blue- and red-washing non-ironically[1]? I'm really glad that virtually all of these impressive AIs are stateless pipelines, and not agents with memories and preferences and goals.
by acatton on 12/13/23, 6:02 PM
But every time, I am let down. I still dream of some hacker making LLMs run on low-end computers like a 4GB Raspberry Pi. My main issue with LLMs is that you almost need a PS5 to run them.
by CliffStoll on 12/13/23, 7:19 PM
I'm now looking at llama.cpp in Safari browser.
Click on Reset all to default, choose Chat. Go down to Say Something. I enter "Berkeley weather seems nice". I click "send". New window appears. It repeats what I've typed. I'm prompted to again "say something". I type "Sunny day, eh?". Same prompt again. And again.
Tried "upload image". I see the image, but nothing happens.
Makes me feel stupid.
Probably that's what it's supposed to do.
sigh