from Hacker News

“You meant to install ripgrep”

by micouay on 10/17/22, 3:53 PM with 159 comments

  • by burntsushi on 10/17/22, 4:01 PM

    Hah! TIL. I had no idea someone did this. But it's smart. I should have thought of it!

    (I'm the author of ripgrep.)

  • by woodruffw on 10/17/22, 5:26 PM

    The classic version of this in the Ruby ecosystem is "bundle"[1], which helpfully installs `bundler` for you. Of the 6.7 million downloads it has, I'm probably in there a dozen or so times.

    [1]: https://rubygems.org/gems/bundle

  • by richdodd on 10/18/22, 11:04 AM

    Hi - author of `rg` here :'). I've transferred it over to BurntSushi, which will give people a bit more assurance that `rg` won't become malware in the future.

    I also squatted `memap` and `memap2` for the same reasons.

    I wonder if there is an algorithmic way to decide when two crate names are 'near' each other. Then, if you added a crate with `cargo add` and there is another similarly-named crate with much higher usage, a warning could be emitted.

    *EDIT* I know there's already https://en.wikipedia.org/wiki/Levenshtein_distance, but I wonder if there is a better measure that looks at e.g. keyboard layouts and likely typos. I'm sure there will have been research done on this.
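
    A minimal sketch of the warning idea above, using plain Levenshtein distance rather than a keyboard-aware measure. The function `likely_intended`, the `min_ratio` threshold, and any download counts passed to it are hypothetical and chosen for illustration only; nothing here reflects how `cargo add` or crates.io actually behave.

    ```rust
    /// Classic Levenshtein edit distance (insertions, deletions, substitutions),
    /// computed with a single rolling row.
    fn levenshtein(a: &str, b: &str) -> usize {
        let a: Vec<char> = a.chars().collect();
        let b: Vec<char> = b.chars().collect();
        // row[j] holds the distance between the first i chars of `a`
        // and the first j chars of `b`.
        let mut row: Vec<usize> = (0..=b.len()).collect();
        for i in 1..=a.len() {
            let mut prev_diag = row[0]; // value for (i - 1, j - 1)
            row[0] = i;
            for j in 1..=b.len() {
                let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
                let next = (prev_diag + cost) // substitution or match
                    .min(row[j] + 1) // deletion from `a`
                    .min(row[j - 1] + 1); // insertion into `a`
                prev_diag = row[j];
                row[j] = next;
            }
        }
        row[b.len()]
    }

    /// Hypothetical check: if some other crate is within `max_distance` edits
    /// of `requested` and has at least `min_ratio` times its downloads,
    /// return the most-downloaded such crate so a warning can name it.
    fn likely_intended<'a>(
        requested: &str,
        requested_downloads: u64,
        popular: &[(&'a str, u64)],
        max_distance: usize,
        min_ratio: u64,
    ) -> Option<&'a str> {
        popular
            .iter()
            .filter(|(name, downloads)| {
                *name != requested
                    && *downloads >= requested_downloads.saturating_mul(min_ratio)
                    && levenshtein(requested, name) <= max_distance
            })
            .max_by_key(|(_, downloads)| *downloads)
            .map(|(name, _)| *name)
    }
    ```

    With made-up download counts, `likely_intended("memap", 10, &[("memmap", 1_000_000)], 1, 100)` would flag `memmap`, presumably the crate that `memap` is a typo of. Note that `rg` and `ripgrep` are five edits apart, so a pure edit-distance check would not catch that case; abbreviations need a different mechanism than typos.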

  • by samatman on 10/17/22, 6:38 PM

    If a package manager is starting from zero, and wants to have a privileged namespace such that a short name has a canonical value, it would make sense for those packages to be able to include a list of strings which the package should also reserve.

    That way "ripgrep" could include "rg", searching cargo for "rg" brings back "ripgrep", not a second package named "rg", and an install could tell the user the correct name for any attempt to install it.

    This also covers typo-squats, so there would be no need for packages like "memap".

    Obviously this represents a low-effort vector for massive squatting, so maintainers would need to be responsible for preventing that, and could add some typos themselves, being the ones which see the request for the mis-typed packages.
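
    The reserved-names idea can be sketched as a toy in-memory registry. Everything here is an assumption made for illustration: the `Registry` type, the `publish` and `lookup` methods, and the `MAX_RESERVED` cap (one possible guard against the squatting concern) are hypothetical and do not describe any existing package manager.

    ```rust
    use std::collections::HashMap;

    /// Hypothetical cap on how many extra names one package may reserve.
    const MAX_RESERVED: usize = 5;

    #[derive(Debug, PartialEq)]
    enum Lookup<'a> {
        /// The name is a real package.
        Package(&'a str),
        /// The name is reserved; the payload is the package the user likely wants.
        ReservedFor(&'a str),
        NotFound,
    }

    #[derive(Default)]
    struct Registry {
        /// Every claimed name (canonical or reserved) mapped to its canonical package.
        owners: HashMap<String, String>,
    }

    impl Registry {
        /// Claim `name` plus its reserved strings, failing if any is already taken.
        fn publish(&mut self, name: &str, reserved: &[&str]) -> Result<(), String> {
            if reserved.len() > MAX_RESERVED {
                return Err(format!("at most {} reserved names allowed", MAX_RESERVED));
            }
            // Check everything first so a failed publish claims nothing.
            for candidate in std::iter::once(&name).chain(reserved.iter()) {
                if let Some(owner) = self.owners.get(*candidate) {
                    return Err(format!("`{}` is already claimed by `{}`", candidate, owner));
                }
            }
            for candidate in std::iter::once(&name).chain(reserved.iter()) {
                self.owners.insert(candidate.to_string(), name.to_string());
            }
            Ok(())
        }

        /// Resolve a name typed by a user to a package, a redirect hint, or nothing.
        fn lookup(&self, name: &str) -> Lookup<'_> {
            match self.owners.get(name) {
                Some(owner) if owner.as_str() == name => Lookup::Package(owner.as_str()),
                Some(owner) => Lookup::ReservedFor(owner.as_str()),
                None => Lookup::NotFound,
            }
        }
    }
    ```

    In this sketch, after `publish("ripgrep", &["rg"])`, a lookup of `rg` yields `ReservedFor("ripgrep")`, which an install command could turn into a "did you mean" message, and a later attempt to publish a separate `rg` package is rejected.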

  • by mherdeg on 10/17/22, 6:44 PM

    Wow, it's been years now since I typed "sl" at a terminal and got an ASCII steam locomotive.
  • by micouay on 10/17/22, 3:53 PM

    `rg` has only one version, and one line of code:

        println!("You meant to install ripgrep: type `cargo uninstall rg` followed by `cargo install ripgrep`");
  • by jfk13 on 10/17/22, 5:52 PM

    Huh - the same author also has https://crates.io/crates/memap and memap2, which explicitly say that they're "squatting to prevent a malicious typo package".

    Not sure how to feel about this... on an individual-package level, it seems a sensible enough idea, but if it becomes a widespread practice, the namespace could get really cluttered.

  • by filereaper on 10/17/22, 7:06 PM

    Can't tell you how often I've run: `pip install aws`

    This installs a library by some authors not affiliated with AWS.

    Instead of: `pip install awscli`

    Which is what you expect.

  • by noswi on 10/17/22, 5:23 PM

    What does one do if they wish to see the actual contents of this crate? The web interface I'm looking at contains no hints at peeking inside, not even direct archive download links, nothing.

    I can't believe that a good way to see what's inside is to make a rust project, add the crate and then go searching around the local filesystem.

  • by remram on 10/17/22, 5:22 PM

    Similar in the Python world: https://pypi.org/project/sklearn/

    This one just depends on the correct `scikit-learn` package though.

  • by fregante on 10/18/22, 3:33 AM

    Back when npm didn’t have any “similar name” restrictions I did the same for some popular packages. My redirects also helped me a couple of times when I wasn't sure whether a package name had a dash or not.

    https://github.com/fregante/npm-helpful-typosquatting

    Here’s what it looks like: https://www.npmjs.com/package/webext

  • by typon on 10/17/22, 8:22 PM

    Neovim + Telescope + ripgrep. It's taken 30 years, but we finally have the perfect code navigation solution.
  • by rpigab on 10/18/22, 8:48 AM

    I never run cargo install or any other package manager download commands without checking the website of the package manager for the right name first, ensuring the author is right, and the commit/update history looks right.

    I love Python, but pip/PyPI and imports always felt weird to me because of namespaces, package names, special "import ... as" forms, etc. Maybe this is a bias because I started using them when I was younger; now that I'm more experienced, I already know how to use most package managers.

    BTW Ripgrep is awesome, I'm learning Rust and it's an inspiration to me, thanks burntsushi!

  • by chlorion on 10/18/22, 12:09 AM

    It would be nice if crates supported being signed with GPG or minisign or whatever.

    I can imagine for example, importing keys from only the authors that I think I can trust, and passing a flag to cargo that only allows using those packages for cargo install or cargo add.

    In this case I think just checking the top-level crate's signature (and not its dependencies) would be enough to mitigate a lot of issues, including typosquatting.

  • by ashishbijlani on 10/18/22, 5:03 AM

    Tools like Packj[1] that check for typosquatting and install packages under a sandbox can help in avoiding accidental installation of malicious packages. Disclaimer: I’m one of the devs.

    1. https://github.com/ossillate-inc/packj

  • by underyx on 10/17/22, 5:29 PM

    I maintain a Python package that parks names like this. There's a Python library called pypi-parker[0] that makes it really easy to do this via CI.

    [0]: https://pypi.org/project/pypi-parker/

  • by worewood on 10/17/22, 7:39 PM

    Why not just add ripgrep as a dependency, effectively making it an alias of the original package?
  • by thombles on 10/17/22, 9:55 PM

    To my knowledge this is the only wrong crate I've ever installed due to my own error. It's... not a good feeling to read this message, even though the author turned out to be doing me a favour. :)
  • by benreesman on 10/17/22, 8:48 PM

    In spite of some tense conversations with the author I am still a rg super fan: it’s fantastic, reliable, performant, well-maintained software and I would recommend it to anyone. It’s best in class.
  • by seanw444 on 10/17/22, 5:04 PM

    Is there no way to have a package mirror, or alias or something? Unless I'm missing a joke or something, this seems like an easily solvable problem.
  • by bmn__ on 10/18/22, 6:40 AM

  • by debacle on 10/17/22, 7:09 PM

    What is ripgrep?

    Edit: Because I'm on a Zoom call that will never end.

    "ripgrep is a line-oriented search tool that recursively searches the current directory for a regex pattern. By default, ripgrep will respect gitignore rules and automatically skip hidden files/directories and binary files."

    https://github.com/BurntSushi/ripgrep

  • by c7DJTLrn on 10/17/22, 7:19 PM

    Crates should be namespaced by user. This is a disaster waiting to happen.
  • by totorovirus on 10/18/22, 5:16 AM

    Why no alias linking or clone? Is it technically impossible?
  • by low_tech_punk on 10/18/22, 3:23 AM

    What is in the mysterious `rg` crate? There is no doc.
  • by secondcoming on 10/17/22, 8:53 PM

    Is it faster than Silver Searcher (ag)?
  • by notorandit on 10/18/22, 6:30 AM

    Life is too short to spend time just to know what ripgrep is/does. I mean, yes, you all look so cool and I look so dumb. But, c'mon, is this software so complex that its description doesn't fit in 42 words? It's like having a shop with no sign and no window. Everyone is saying it's great and worth shopping at. But still no sign. Ridiculous.