from Hacker News

All my favorite tracing tools

by trishume on 12/5/23, 10:56 PM with 40 comments

  • by crdavidson on 12/6/23, 9:00 AM

    I wrote Spall, one of the lightweight profilers mentioned in the post. I loved the author's blog post on implicit in-order forests. It was neat to see someone else's take on trees for big traces, and it pushed me to go way bigger than I was originally planning!

    Thankfully, Eytzinger-ordered 4-ary trees work totally fine at 165+ fps, even at 3+ billion functions, but I like to read back through that post once in a while just in case I hit that perf wall someday.
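
    For readers unfamiliar with the layout, here is a minimal sketch of an Eytzinger-ordered (breadth-first, pointer-free) 4-ary tree. It is not Spall's actual data structure: the choice of storing the maximum event duration per subtree, and the names build_max_tree and range_max, are assumptions made purely for illustration.

```python
from typing import List, Tuple

ARITY = 4  # each internal node has four children


def children(i: int) -> range:
    """Indices of node i's children in the breadth-first (Eytzinger) array."""
    return range(ARITY * i + 1, ARITY * i + ARITY + 1)


def parent(i: int) -> int:
    """Index of node i's parent (only meaningful for i > 0)."""
    return (i - 1) // ARITY


def build_max_tree(durations: List[int]) -> Tuple[List[int], int]:
    """Build an implicit 4-ary tree whose nodes hold the max duration below them.

    Returns the flat array and the padded leaf count (a power of four).
    """
    leaves = 1
    while leaves < len(durations):
        leaves *= ARITY
    first_leaf = (leaves - 1) // (ARITY - 1)  # number of internal nodes
    tree = [0] * (first_leaf + leaves)
    tree[first_leaf:first_leaf + len(durations)] = durations
    # Fill internal nodes bottom-up; children always sit at higher indices.
    for i in range(first_leaf - 1, -1, -1):
        tree[i] = max(tree[c] for c in children(i))
    return tree, leaves


def range_max(tree: List[int], leaves: int, lo: int, hi: int,
              node: int = 0, node_lo: int = 0, node_hi: int = -1) -> int:
    """Max duration among events [lo, hi), descending only where needed."""
    if node_hi < 0:
        node_hi = leaves
    if hi <= node_lo or node_hi <= lo:
        return 0  # no overlap with this subtree
    if lo <= node_lo and node_hi <= hi:
        return tree[node]  # subtree fully covered: use the stored summary
    span = (node_hi - node_lo) // ARITY
    return max(
        range_max(tree, leaves, lo, hi, c,
                  node_lo + k * span, node_lo + (k + 1) * span)
        for k, c in enumerate(children(node))
    )


durations = [5, 3, 9, 1, 7, 2]
tree, leaves = build_max_tree(durations)
longest_in_view = range_max(tree, leaves, 3, 6)
```

    The appeal of this layout is that parent and child positions are pure index arithmetic, so a zoomed-out view can read one summary per subtree instead of touching every event.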

    Working on timestamp delta-compression at the moment to pack events into much smaller spaces, and hopefully get to 10 billion in 128 GB RAM sometime soon (at least for native builds of Spall).
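
    As a rough illustration of why delta-compression helps, here is a minimal sketch that stores each timestamp as the difference from the previous one, written as a variable-length integer. The LEB128-style byte format and the function names are assumptions for this example, not Spall's actual encoding.

```python
from typing import Iterable, List


def encode_timestamps(timestamps: Iterable[int]) -> bytes:
    """Encode non-decreasing timestamps as varint-encoded deltas."""
    out = bytearray()
    prev = 0
    for ts in timestamps:
        delta = ts - prev
        if delta < 0:
            raise ValueError("timestamps must be non-decreasing")
        prev = ts
        # Emit 7 bits per byte, low bits first; the high bit means "more follows".
        while True:
            byte = delta & 0x7F
            delta >>= 7
            if delta:
                out.append(byte | 0x80)
            else:
                out.append(byte)
                break
    return bytes(out)


def decode_timestamps(data: bytes) -> List[int]:
    """Invert encode_timestamps by summing the decoded deltas."""
    timestamps: List[int] = []
    prev = 0
    delta = 0
    shift = 0
    for byte in data:
        delta |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7  # continuation bit set: keep accumulating
        else:
            prev += delta
            timestamps.append(prev)
            delta = 0
            shift = 0
    return timestamps


# Hypothetical event times; consecutive events are close together.
events = [1_700_000_000_000, 1_700_000_000_120,
          1_700_000_000_125, 1_700_000_000_900]
packed = encode_timestamps(events)
```

    Because consecutive events in a trace tend to be close together, most deltas fit in one or two bytes rather than a fixed eight.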

    Thanks for the kick to keep on pushing!

  • by criddell on 12/6/23, 10:41 AM

    If you work on Windows applications, check out Event Tracing for Windows (ETW). The best place to start is Bruce Dawson’s blog:

    https://randomascii.wordpress.com/2015/09/24/etw-central/

  • by Veserv on 12/6/23, 6:45 AM

    A pretty good overview of open source solutions in the space.

    It misses one of the most useful areas for tracing, though: time travel debugging. There are a number of interesting solutions there that take advantage of hardware trace, instrumentation, and deterministic replay. Even better is full visualization integration, so you can zoom in from a multi-minute trace onto a suspicious 200 ns function, double-click it, and step back to that exact point in your program with memory fully reconstructed as of that time, ready to debug from there.

  • by jeffrallen on 12/6/23, 7:17 AM

    The author mentions dtrace in passing. If you're into "load bearing rants", check out bcantrill's recent rant on bpftrace silently losing events and why dtrace won't do that.

  • by kqr on 12/6/23, 9:12 AM

    > I wanted to correlate packets with userspace events from a Python program, so I used a fun trick: Find a syscall which has an early-exit error path and bindings in most languages, and then trace calls to that which have specific arguments which produce an error.

    Wow. This is some great engineering. Obviously that's what you'd do, but I'd never think of it in a thousand years!
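
    A minimal sketch of what the quoted trick could look like from the Python side. The choice of close() as the syscall, the MARKER_FD_BASE value, and the emit_marker name are all assumptions for illustration; the post may well have used a different syscall.

```python
import errno
import os

# Hypothetical marker range: file descriptors this large are never open in
# practice, so close() fails fast with EBADF and has no side effects.
MARKER_FD_BASE = 2_000_000_000


def emit_marker(tag: int) -> int:
    """Issue a deliberately failing close() that a syscall tracer can spot.

    The tag is smuggled through the fd argument. Returns the errno observed.
    """
    if not 0 <= tag < 1000:
        raise ValueError("tag must be in [0, 1000)")
    try:
        os.close(MARKER_FD_BASE + tag)
    except OSError as exc:
        return exc.errno
    raise RuntimeError("marker fd was unexpectedly open")


# Mark an interesting point in the program with tag 7.
observed = emit_marker(7)
```

    A tracer attached to the close syscall can then filter on file descriptors at or above MARKER_FD_BASE to recover the tags, with timestamps from the same clock as the rest of the kernel-side trace.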

  • by ElijahLynn on 12/7/23, 5:34 PM

    What a great way to recruit! The ending pitch to join Tristan at Anthropic, if I were competent enough in this area, is very alluring! Tristan does a great job covering the content about the types of things one would be working on.

    p.s. I think the blog post could use more screengrabs of the traces. Great first pass at it though, and screengrabs can be added over time!

  • by felixrieseberg on 12/6/23, 7:47 PM

    I wish the industry had a better answer for deterministically profiling the execution cost of JavaScript. Attempts were made in Chromium by hooking into Linux perf, but that change has since been removed.

    If anyone has tips on how to trace JavaScript (not just profile by time, but deterministically measure its cost in CI), I'd love to hear them!

  • by zubairq on 12/6/23, 11:03 AM

    Some great tools in here, thanks!