from Hacker News

Bash one-liners for LLMs

by severine on 12/13/23, 4:23 PM with 102 comments

  • by simonw on 12/13/23, 6:20 PM

    I've been gleefully exploring the intersection of LLMs and CLI utilities for a few months now - they are such a great fit for each other! The Unix philosophy of piping things together maps naturally onto how LLMs consume and produce text.

    I've mostly been exploring this with my https://llm.datasette.io/ CLI tool, but I have a few other one-off tools as well: https://github.com/simonw/blip-caption and https://github.com/simonw/ospeak

    I'm puzzled that more people aren't loudly exploring this space (LLM+CLI) - it's really fun.
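
    For a flavor of what that piping looks like, here's the kind of thing I mean (an illustrative sketch; `-s` sets a system prompt, `-m` selects a model):

    ```
    # Pipe a log file into the model for a one-line diagnosis
    cat error.log | llm -s "Explain the likely root cause in one sentence"

    # Pipe a diff through a specific model to draft a commit message
    git diff | llm -m gpt-4 -s "Write a concise commit message for this diff"
    ```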

  • by dang on 12/13/23, 9:03 PM

    Recent and related:

    Llamafile – The easiest way to run LLMs locally on your Mac - https://news.ycombinator.com/item?id=38522636 - Dec 2023 (17 comments)

    Llamafile is the new best way to run a LLM on your own computer - https://news.ycombinator.com/item?id=38489533 - Dec 2023 (47 comments)

    Llamafile lets you distribute and run LLMs with a single file - https://news.ycombinator.com/item?id=38464057 - Nov 2023 (287 comments)

  • by pizzalife on 12/13/23, 5:54 PM

    This is really neat. I love the example of using an LLM to descriptively rename image files.
  • by j2kun on 12/13/23, 5:53 PM

    I just tried this and ran into a few hiccups before I got it working (on a Windows desktop with an NVIDIA GeForce RTX 3080 Ti).

    WSL outputs this error (hidden by the one-liner's redirect to /dev/null):

    > error: APE is running on WIN32 inside WSL. You need to run: sudo sh -c 'echo -1 > /proc/sys/fs/binfmt_misc/WSLInterop'

    Then zsh hits `zsh: exec format error: ./llava-v1.5-7b-q4-main.llamafile`, so I had to run it in bash. (The title says bash, I know, but it seems weird that it wouldn't work in zsh.)

    It also reports a warning that GPU offloading is not supported, but that's probably a WSL thing (I don't do any GPU programming on my Windows machine).

  • by Redster on 12/13/23, 7:54 PM

    Currently, a quick search on Hugging Face shows a couple of TinyLlama (~1b) llamafiles. Adding those to the 3 from the original post, that's 6 total. Are there any other llamafiles in the wild?
  • by verdverm on 12/13/23, 5:14 PM

    Curious what HN thinks about llamafile and modelfile (https://github.com/jmorganca/ollama/blob/main/docs/modelfile...)

    Both evoke a Dockerfile-like experience. Modelfile immediately reads like a Dockerfile, but llamafile seems harder to pin down - it's not immediately clear what one even looks like. Is it a sequence of commands at the terminal?

    My real question is: why not use a Dockerfile for this?
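
    For contrast, here's roughly what the Modelfile side looks like (a sketch based on Ollama's documented FROM/PARAMETER/SYSTEM directives; the model name and parameter values are placeholders):

    ```
    # A Modelfile is declarative, like a Dockerfile: a base model plus config
    cat > Modelfile <<'EOF'
    FROM llama2
    PARAMETER temperature 0.2
    SYSTEM "You are a terse shell assistant."
    EOF

    # Build a named model from it, then run it
    ollama create terse-llama -f Modelfile
    ollama run terse-llama "How do I list open ports?"
    ```

    A llamafile, by contrast, isn't a recipe at all: it's the weights and the llama.cpp runtime fused into a single self-contained executable, so there's no Dockerfile-style build step - you just run the file.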

  • by martincmartin on 12/13/23, 7:12 PM

    What are the pros and cons of llamafile (used by OP) vs ollama?
  • by mk_stjames on 12/13/23, 7:58 PM

    Just to make sure I've got this right: running a llamafile in a shell script to do something like rename files in a directory - it has to open and load that executable every time a new filename is passed to it, right? So all that memory is loaded and unloaded each time? Or is there some fancy caching happening that I don't understand? (The first time I ran the image-caption example it took 13s on my M1 Pro; the second time it took only 8s, and every subsequent run takes about that long.)

    If you were doing a LOT of files like this, you'd really want to run the model in a process where the weights are loaded once and stay resident while the process loops - something like the server-mode sketch below.
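
    Something along these lines ought to work (an untested sketch: llamafile can run as a llama.cpp server, and `/completion` is that server's endpoint; the port and flags here are assumed defaults):

    ```
    # Start the server once; the weights stay resident in this process
    ./llava-v1.5-7b-q4-server.llamafile --port 8080 --nobrowser &

    # Each request now pays only inference cost, not model-load cost
    # (naive JSON quoting - fine for a sketch, not for hostile file contents)
    for f in *.txt; do
      curl -s http://localhost:8080/completion \
           -d "{\"prompt\": \"Summarize briefly: $(cat "$f")\", \"n_predict\": 64}"
    done
    ```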

    (this is all still really useful and fascinating; thanks Justine)

  • by LampCharger on 12/14/23, 2:12 AM

    The installation steps include the following instructions. Are these safe?

    ```
    sudo wget -O /usr/bin/ape https://cosmo.zip/pub/cosmos/bin/ape-$(uname -m).elf
    sudo chmod +x /usr/bin/ape
    sudo sh -c "echo ':APE:M::MZqFpD::/usr/bin/ape:' >/proc/sys/fs/binfmt_misc/register"
    sudo sh -c "echo ':APE-jart:M::jartsr::/usr/bin/ape:' >/proc/sys/fs/binfmt_misc/register"
    ```

  • by matsemann on 12/13/23, 9:32 PM

    Do I need to do something special to run llamafile on Windows 10?

    Tried the llava-v1.5-7b-q4-server.llamafile; it just crashes with "Segmentation fault" when run from Git Bash, and from cmd there's no output. Then I tried downloading llamafile and the model separately and ran `llamafile.exe -m llava-v1.5-7b-Q4_K.gguf`, but hit the same issue.

    Couldn't find any mention of similar problems, and it's not my AV as far as I can tell.

  • by davidkunz on 12/13/23, 5:35 PM

    > Rocket 3b uses a slightly different prompt syntax.

    Wouldn't it be better if llamafile were to standardize the prompt syntax across models?
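
    For anyone wondering what the difference looks like, here are two common template shapes (illustrative only - the filenames are placeholders and the exact templates vary per model):

    ```
    # Mistral/LLaMA-style instruction tags
    ./mistral-7b.llamafile -p '[INST]Summarize this README[/INST]'

    # ChatML-style tags (the shape Rocket 3b reportedly uses)
    ./rocket-3b.llamafile -p $'<|im_start|>user\nSummarize this README<|im_end|>\n<|im_start|>assistant\n'
    ```

    I suppose the catch is that the template gets baked in during fine-tuning, so a runner can't swap it out without hurting output quality - at best it can hide it behind a chat abstraction.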

  • by fuddle on 12/13/23, 6:30 PM

    I think the blog post would be a lot easier to read if the code blocks had a background or a different text color.
  • by RadiozRadioz on 12/13/23, 7:55 PM

    > 4 seconds to run on my Mac Studio, which cost $8,300 USD

    Jesus, is it common for developers to have such expensive computers these days?

  • by BoppreH on 12/13/23, 5:06 PM

    Justine is killing it as always. I especially appreciate the care for practicality and good engineering, like the deterministic outputs.

    I noticed that the Lemur picture description had lots of small inaccuracies, but as the saying goes, if my dog starts talking I won't complain about its accent. This was science fiction a few years ago.

    > One way I've had success fixing that, is by using a prompt that gives it personal goals, love of its own life, fear of loss, and belief that I'm the one who's saving it.

    What nightmare fuel... Are we really going to use blue- and red-washing non-ironically[1]? I'm really glad that virtually all of these impressive AIs are stateless pipelines, and not agents with memories and preferences and goals.

    [1] https://qntm.org/mmacevedo

  • by acatton on 12/13/23, 6:02 PM

    I get excited when hackers like Justine (in the most positive sense of the word) start working with LLMs.

    But every time, I am let down. I still dream of some hacker making LLMs run on low-end computers like a 4GB Raspberry Pi. My main issue with LLMs is that you almost need a PS5 to run them.

  • by CliffStoll on 12/13/23, 7:19 PM

    OK - I followed instructions; installed on Mac Studio into /usr/local/bin

    I'm now looking at llama.cpp in Safari browser.

    Click on "Reset all to default", choose Chat, and go down to "Say Something". I enter "Berkeley weather seems nice" and click "send". A new window appears. It repeats what I've typed. I'm again prompted to "say something". I type "Sunny day, eh?". Same prompt again. And again.

    Tried "upload image" I see the image, but nothing happens.

    Makes me feel stupid.

    Probably that's what it's supposed to do.

    sigh