from Hacker News

Show HN: Checksum.sh - verify every install script

by gavinuhma on 10/28/22, 6:38 PM with 75 comments

The pattern of downloading and executing installation scripts without verifying them has bothered me for a while.

I started messing around with a way to verify the checksum of scripts before I execute them. I've found it a really useful tool for installing things like Rust or Deno.

It's written entirely as a shell script, and it's easy to read and understand what's happening.

I hope it may be useful to someone else!

  • by dundarious on 10/28/22, 7:50 PM

    There are two big problems with the use of `echo $s` in bash/POSIX sh:

    1. Never use echo to output untrusted content as the first argument

    Let's say `s='-e 1\n2'`, then `echo $s` will output:

    > 1

    > 2

    Instead of:

    > -e 1\n2

    Always use printf if you want to start output with untrusted content, e.g., `printf %s\\n "$s"`.

    2. Never use unquoted variable expansion when trying to exactly reproduce contents of the variable

    Similarly, unquoted variable expansion re-tokenizes the contents and will not preserve spaces appropriately. Say `s='"a<space><space>b"'` (where each <space> is a literal ' ', HN seems to be collapsing 2 spaces down to 1), then `echo $s` will output:

    > "a<space>b"

    Instead of:

    > "a<space><space>b"

    You can get the latter with `echo "$s"` but use `printf %s\\n "$s"` to fix both issues.

    PS: If you fail to use quoted expansion with printf, for example like so, `printf %s\\n $s`, then you'll notice the problem right away, as it will effectively turn that into `for i in $s ; do printf %s\\n "$i" ; done`. That's actually a very useful feature of printf if you know to use it.

    Edit: These problems exist for bash/POSIX sh at least. Perhaps you're using a shell that works differently, like zsh, because otherwise issue 2 would probably have led to some checksum fails for you already.
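
    A minimal sketch of both points, assuming a POSIX-ish shell. `emit` is a hypothetical helper name used only for illustration, not something taken from checksum.sh:

```shell
# emit is a hypothetical helper: print the argument byte-for-byte plus a newline.
emit() {
  printf '%s\n' "$1"
}

s='-e 1\n2'
emit "$s"          # prints: -e 1\n2  (no option parsing, no escape processing)

t='a  b'           # two spaces between a and b
emit "$t"          # prints: a  b  (both spaces survive)

# Unquoted expansion splits $t into words, and printf reuses the
# format once per word, so each word lands on its own line.
printf '%s\n' $t
```

    The last line is the "for loop" behaviour described in the PS: one output line per word.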

  • by woodruffw on 10/29/22, 12:04 AM

    I think this is a worthy cause, but maybe a little misguided: the problem with "curl-piping" isn't so much the fact that you're throwing a random shell script into your shell, but the fact that you're downloading arbitrary code in a way that's disconnected from the normal integrity/authenticity guarantees of a package manager.

    In other words: you can be confident in the bootstrapping script you've just downloaded because it passed its checksum, but that script is just going to download more binaries from the Internet.

  • by kazinator on 10/29/22, 12:43 AM

    This function is flawed, containing unquoted variable interpolations:

      s=$(curl -fsSL $1)
      ...
      c=$(echo $s | shasum | awk '{print $1}')
    
    This means the checksum is being calculated on a whitespace-mangled version of the data that is pulled down from the web.

    It appears to work because the author calculated the checksums with the same script and is just validating that they are not changing.

    In other words, it's possible to make whitespace changes such that the hash won't change.

    Here are two scripts: a harmless one and a malicious one, which produce the same whitespace-ignorant SHA256:

      $ foo='# this is a comment
      > # rm -rf /'
    
      $ echo $foo | sha256sum 
      8b87547d4d214038b153ce57d929be4c835b7690c930c1e83a25fc1509390cf9  -
    
      $ foo='# this is a comment #
      > rm -rf /'
      $ echo $foo | sha256sum 
      8b87547d4d214038b153ce57d929be4c835b7690c930c1e83a25fc1509390cf9  -
    
    The first foo contains two comments. The "rm -rf /" command is commented out. The second foo moves the hash mark of the second comment into the previous line, uncommenting the command.

    (I know about GNU Coreutils' safeguard in rm against removing / recursively, by the way.)
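
    For comparison, a rough sketch of the same two strings hashed with and without quoting. `digest` is a hypothetical helper, and `sha256sum` from GNU Coreutils is assumed to be available:

```shell
# digest is a hypothetical helper: SHA-256 of stdin, hex digest only.
digest() {
  sha256sum | awk '{print $1}'
}

harmless='# this is a comment
# rm -rf /'
malicious='# this is a comment #
rm -rf /'

# Unquoted: both collapse to the same single line before hashing.
[ "$(echo $harmless | digest)" = "$(echo $malicious | digest)" ] && echo "unquoted: same digest"

# Quoted: the newline position is preserved, so the digests differ.
[ "$(printf '%s\n' "$harmless" | digest)" != "$(printf '%s\n' "$malicious" | digest)" ] && echo "quoted: different digests"
```

    Even when quoted, `$(...)` strips trailing newlines from the captured download, so piping the download straight into the hash tool, or hashing a saved file, is the safer way to match a published digest.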

  • by NovemberWhiskey on 10/28/22, 10:56 PM

    I don't know; what's the threat model here?

    If the script is deliberately malicious as originally published, then the publisher will provide a valid checksum; so it doesn't help.

    If the script source is subverted by an attacker, then it only helps if the attacker doesn't also have the means to change the published checksum too.

    If an attacker can modify the site which publishes the URL for the script and the checksum, they can modify both at the same time.

  • by neeh0 on 10/28/22, 9:59 PM

    I wrote hundreds of those checks in scripts, makefiles, CI and whatever else. After I found Nix (and NixOS) it's ridiculous not to use it. Use it.
  • by Arnavion on 10/29/22, 3:39 AM

    >I've found it a really useful tool for installing things like Rust or Deno.

    For Rust you can ignore sh.rustup.rs and just download and set up rustup manually.

        CARGO_HOME="${CARGO_HOME:-$HOME/.cargo}"
        mkdir -p "$CARGO_HOME/bin"
        curl -Lo "$CARGO_HOME/bin/rustup" 'https://static.rust-lang.org/rustup/dist/x86_64-unknown-linux-gnu/rustup-init'
        chmod +x "$CARGO_HOME/bin/rustup"
        hash -r
        rustup set auto-self-update disable
        rustup set profile minimal
        rustup default stable
        rustup update --force
        rustup self update # Create hardlinks under $CARGO_HOME/bin/
  • by nerdponx on 10/28/22, 7:32 PM

    Why not use the -c option? Especially if you're using Bash or Zsh, which have "here-strings":

        checksum() {
          hash="$1"
          file="$2"
          sha256sum -c <<< "${hash}  ${file}"
        }
    
    Or if you need to use a POSIX-ish shell:

        checksum() {
          hash="$1"
          file="$2"
          printf '%s  %s\n' "$hash" "$file" | sha256sum -c
        }
    
    Of course you can add a `--binary` option (uses '%s *%s' instead of '%s  %s'), options to use different hash functions, etc.

    I also think it's weird to use `alias` inside a function, instead of just using a parameter to store the name of the program to execute.
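
    Building on the `-c` idea, here is one way the check could gate execution of an already-downloaded file. This is a sketch, not the checksum.sh implementation: `run_verified` is a hypothetical name, and GNU-style `sha256sum -c` is assumed.

```shell
# run_verified is a hypothetical name: execute FILE with sh only if its
# SHA-256 matches EXPECTED. Assumes GNU-style `sha256sum -c`.
run_verified() {
  expected="$1"
  file="$2"
  # Two spaces between digest and name is the text-mode checksum line format.
  if printf '%s  %s\n' "$expected" "$file" | sha256sum -c >/dev/null 2>&1; then
    sh "$file"
  else
    echo "checksum mismatch: $file" >&2
    return 1
  fi
}
```

    Usage would look something like `curl -fsSL "$url" -o install.sh && run_verified "$digest" install.sh`, where `$url` and `$digest` are placeholders you supply. Hashing the saved file sidesteps the quoting problems that come with holding the script in a variable.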

  • by muterad_murilax on 10/29/22, 9:53 AM

    OP may want to take a look at "Shell script best practices" (https://news.ycombinator.com/item?id=33354286) submitted two days ago. :)
  • by charcircuit on 10/29/22, 3:04 AM

    This just shifts the trust to the checksum. How do you know you downloaded the right checksum? Checksum the checksum?

    Whatever you are doing to protect sending the checksum can also be used for protecting the script itself.

  • by koolba on 10/28/22, 7:43 PM

    Just remember that any script that fetches anything else remotely would still pass the checksum as only the initial script is checked.
  • by remram on 10/29/22, 3:27 AM

    An idea might be to get the checksum from the URL, for example:

        checksum https://sh.rustup.rs/#8327fa6ce106d2a387fdd02cb95c3607193c2edb | sh
    
    Otherwise I don't understand why your script is loaded as a function rather than run as a script.
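
    Splitting such a URL needs nothing beyond POSIX parameter expansion. A sketch, with `split_pinned_url` as a hypothetical helper and a placeholder URL and digest:

```shell
# split_pinned_url is a hypothetical helper: split "URL#DIGEST" into the
# shell variables `url` and `expected`. Fails if there is no '#'.
split_pinned_url() {
  pinned="$1"
  case "$pinned" in
    *'#'*) ;;
    *) echo "no digest in URL: $pinned" >&2; return 1 ;;
  esac
  url=${pinned%#*}        # drop the shortest suffix starting at a '#'
  expected=${pinned##*#}  # keep only what follows the last '#'
}

# Placeholder values, not a real script or digest.
split_pinned_url 'https://example.invalid/install.sh#abc123'
printf '%s\n' "$url" "$expected"
```

    The fragment is never sent to the server, so the digest stays a purely client-side annotation.
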
  • by ithkuil on 10/28/22, 7:25 PM

    Awesome. I made something similar in https://github.com/mkmik/runck

    But I didn't buy a fancy domain name :-)

  • by throwawaaarrgh on 10/28/22, 8:39 PM

    If we kept a mirrored or distributed decentralized network of just cryptographic hashes, that might solve a huge number of problems around distributing files securely.
  • by Genghis_9000 on 10/29/22, 12:33 AM

    and where do we get the checksum? HN has a hall monitor mentality issue
  • by thewataccount on 10/28/22, 8:12 PM

    Serious question - What is the benefit of verifying a hash? Are we really worried about file integrity? Why don't people use GPG?

    The hash only verifies file integrity, and that the content of the URL doesn't switch the script later. But keep in mind that in most scenarios, an attacker would also just change the hash listed too (they're usually on the same website). This only mitigates one very specific attack.

    Why don't we use GPG here? That way we can verify ownership and file integrity with, at minimum, TOFU, plus optional manual verification. If we're going through the work of adding a wrapper and all that, we may as well, no?

    This has the benefit that you only need to import the owner's cert once; all future changes have the same cert. Hashes, by contrast, are different every time, so you have to trust the source of the hash every time it changes. With GPG you at the very least have TOFU with certs, and at best can have better assurance of the initial download too.

    EDIT: Just want to clarify - I'm openly asking why the "developer community" is going the direction of hashes for script verification vs GPG signatures.

    I don't mean to diminish your project, your project looks fun, and does make verifying hashes easier :)
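
    To make the comparison concrete, a signature-gated wrapper might look roughly like this. It is only a sketch: `run_if_signed` is a hypothetical name, and it assumes the publisher ships a detached signature file next to the script and that their public key was imported earlier (the TOFU step).

```shell
# run_if_signed is a hypothetical name: run SCRIPT only if SIG is a valid
# detached signature over it, made by a key already in the local keyring.
run_if_signed() {
  sig="$1"
  script="$2"
  if gpg --verify "$sig" "$script" 2>/dev/null; then
    sh "$script"
  else
    echo "signature check failed: $script" >&2
    return 1
  fi
}
```

    Note that `gpg --verify` accepts a good signature from any key in the keyring, so pinning one specific publisher would need an extra check on the signing key's fingerprint.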

  • by dontbenebby on 10/28/22, 8:58 PM

    >The pattern of downloading and executing installation scripts without verifying them has bothered me for a while.

    Thanks for sharing this work OP! I didn't see a license mentioned -- did you intend this to go into the public domain? I like how you set up a cool domain name and did some sick graphics, but I'm not sure how I can legally use your code in the future.

    That being said, I appreciate the work you put into this project.

    I'm not going to list off specific examples, but MANY open source projects serve either PGP keys or hashes in the clear. Or they serve just hashes over HTTPS and now you have a trust issue.

    Or, in one case, my favorite -- they had lovingly listed out the MD5 sum for the program... but they served both that checksum, and the code itself... over HTTPS.

    Now, to be fair, HTTPS does provide an integrity check, so there's a benefit beyond privacy or whatever but... this is a RAMPANT problem in the open source community.

    I ran into it mostly when trying to find esoteric security tools when I was attempting OSCP and interviewing around for penetration testing roles.

    I got the sense rapidly shifting from "I was so scared of the CFAA I did an entire master's thesis on the design of censorship circumvention tools" to "Oh gee, I used to be such a narcissist, demanding a highfalutin salary when I couldn't even fire up Metasploit to wipe a server."

    (The implication being that some folks abused their access when my powers were weak, and now, in time for spooky season, it's time to lean in to letting people take whatever drug they want if they feel scared -- reality scares me too some days.)

  • by orf on 10/28/22, 9:20 PM

    I feel like bash/sh should have this built in