from Hacker News

Lolbench: automagically and empirically discovering Rust performance regressions

by anp on 10/1/18, 6:21 PM with 40 comments

  • by MikeHolman on 10/2/18, 12:41 AM

    Do you have any plans to better distinguish between noise and regressions? I run a similar performance-testing infrastructure for Chakra, and I found that comparing against the previous run makes the results noisy. That means more manual review of results, which gets old fast.

    What I do now is run a script that averages results from the preceding 10 runs and compares that to the average of the following 5 runs to see if the regression is consistent or anomalous. If the regression is consistent, then the script automatically files a bug in our tracker.

    There is still some noise in the results, but it cuts down on those one-off issues.
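
    A minimal sketch of that windowed comparison, assuming results are kept as a chronological list of per-run timings (the function name and the 5% threshold here are illustrative choices, not part of the Chakra setup):

    ```python
    def is_consistent_regression(timings, threshold=1.05):
        """True when the mean of the 5 most recent runs is consistently
        slower than the mean of the 10 runs preceding them."""
        if len(timings) < 15:
            return False  # not enough history to smooth out one-off noise
        before = sum(timings[-15:-5]) / 10  # average of the preceding 10 runs
        after = sum(timings[-5:]) / 5       # average of the following 5 runs
        return after > before * threshold   # e.g. auto-file a bug if True
    ```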

  • by chriswarbo on 10/2/18, 10:38 AM

    For those wanting to do similar tracking of benchmarks across commits, I've found Airspeed Velocity to be quite nice ( https://readthedocs.org/projects/asv ). It allows (but doesn't require) benchmarks to be kept separate from the project's repo, can track different configurations separately (e.g. alternative compilers, dependencies, flags, etc.), keeps results from different machines separated, generates JSON data and HTML reports, performs step detection to find regressions, etc.

    It was intended for use with Python (virtualenv or anaconda), but I created a plugin ( http://chriswarbo.net/projects/nixos/asv_benchmarking.html ) which allows using Nix instead, so we can provide any commands/tools/build-products we like in the benchmarking environment (so far I've used it successfully with projects written in Racket and Haskell).
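
    For reference, an asv benchmark is just a Python callable whose name starts with a recognized prefix such as time_; a minimal sketch, with an illustrative file name and workload:

    ```python
    # benchmarks/bench_sort.py -- file name and workload are illustrative
    import random

    class TimeSuite:
        """asv times any method named time_*; setup() runs before each one."""

        def setup(self):
            rng = random.Random(42)  # fixed seed keeps runs comparable
            self.data = [rng.random() for _ in range(10_000)]

        def time_sorted(self):
            sorted(self.data)  # the operation tracked across commits
    ```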

  • by anp on 10/1/18, 7:33 PM

    hi! author here, if you want to ask questions or (nicely pls) let me know where I've made mistakes!

  • by valarauca1 on 10/1/18, 8:04 PM

    How do you determine baseline load of the test machine in order to qualify the correctness of the benchmark?

    Assuming the compiling and testing are done in the cloud, how do you ensure the target platform (processor) doesn't change, and that you aren't subjected to neighbors stealing RAM bandwidth or CPU cache resources from your VM and impacting the results?

  • by panic on 10/2/18, 3:15 AM

    The "More Like Rocket Science Rule of Software Engineering" has been WebKit policy for a while: https://web.archive.org/web/20061011203328/http://webkit.org... (now at https://webkit.org/performance/).
  • by habitue on 10/1/18, 11:06 PM

    This project looks awesome, but as a complete aside:

    How long do we expect it to take before "automagically" completely replaces "automatically" in English?

    I'm guessing there's less than a decade to go now.

  • by hsivonen on 10/1/18, 8:08 PM

    Very nice!

    Do you track opt_level=2 (the Firefox Rust opt level) in addition to the default opt_level=3?
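
    (For anyone wanting to reproduce that locally: opt_level is a per-profile Cargo setting, so a sketch like the following would do it; this is not how lolbench itself configures its builds.)

    ```toml
    # Cargo.toml -- illustrative only
    [profile.bench]
    opt-level = 2   # Firefox's Rust opt level; Cargo's default here is 3
    ```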

  • by thsowers on 10/1/18, 9:44 PM

    This is really cool; I love the project and the writeup! I regularly use nightly (I work with Rocket) and I had always wondered about this. Thank you!

  • by Twirrim on 10/1/18, 11:15 PM

    Can I suggest putting https://github.com/anp/lolbench/issues/1 into the README.md file, so people can easily see where to look for some TODO items?

  • by awake on 10/1/18, 7:44 PM

    Is there an equivalent project for Java?