from Hacker News

Go Optimization Guide

by jedeusus on 3/31/25, 8:29 PM with 157 comments

  • by nopurpose on 3/31/25, 10:34 PM

    Every perf guide recommends minimizing allocations to reduce GC times, but if you look at a pprof profile of a Go app, the GC mark phase is what takes time, not GC sweep. GC mark always starts with known live roots (goroutine stacks, globals, etc.) and traverses references from there, colouring every pointer. To minimize GC time it is best to avoid _long-living_ allocations. Short-lived allocations, those which the GC mark phase will never reach, have an almost negligible effect on GC times.

    Allocations of any kind have an effect on triggering GC earlier, but in real apps it is almost hopeless to avoid GC, except for very carefully written programs with no dependencies, and if GC happens, then reducing GC mark times gives a bigger bang for the buck.
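
    A rough way to see the point about mark work is to time a forced collection with and without a large reachable heap. This is only a sketch: `node`, `buildList`, `listLen` and `timeGC` are made-up names, `runtime.GC()` times a whole cycle rather than the mark phase alone, and the numbers will vary by machine and Go version.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// node is a hypothetical pointer-carrying type; the GC must trace every live one.
type node struct {
	next *node
	val  int
}

// buildList allocates n linked nodes and returns the head,
// so all n nodes stay reachable for as long as the head is held.
func buildList(n int) *node {
	var head *node
	for i := 0; i < n; i++ {
		head = &node{next: head, val: i}
	}
	return head
}

// listLen walks the list and counts its nodes.
func listLen(head *node) int {
	n := 0
	for ; head != nil; head = head.next {
		n++
	}
	return n
}

// timeGC forces a full collection and reports how long the call took.
func timeGC() time.Duration {
	start := time.Now()
	runtime.GC()
	return time.Since(start)
}

func main() {
	// Short-lived garbage: nothing stays reachable, so there is little to mark.
	for i := 0; i < 10; i++ {
		_ = buildList(100_000)
	}
	fmt.Println("GC with only dead objects:", timeGC())

	// Long-lived data: one million reachable nodes have to be traced.
	live := buildList(1_000_000)
	fmt.Println("GC with 1M live nodes:", timeGC())
	fmt.Println("live nodes:", listLen(live)) // keeps the list reachable until here
}
```

    On most machines the second timing should be noticeably larger, but treat that as something to measure rather than a guarantee.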

  • by stouset on 4/1/25, 6:46 AM

    Checking out the first example—object pools—I was initially blown away that this is not only possible but it produces no warnings of any kind:

        pool := sync.Pool{
            New: func() any { return 42 },
        }
    
        a := pool.Get()
    
        pool.Put("hello")
        pool.Put(struct{}{})
    
        b := pool.Get()
        c := pool.Get()
        d := pool.Get()
    
        fmt.Println(a, b, c, d)
    
    Of course, the answer is that this API existed before generics so it just takes and returns `any` (née `interface{}`). It just feels as though golang might be strongly typed in principle, but in practice there are APIs left and right that escape out of the type system and lose all of the actual benefits of having it in the first place.

    Is a type system all that helpful if you have to keep turning it off any time you want to do something even slightly interesting?

    Also I can't help but notice that there's no API to reset values to some initialized default. Shouldn't there be some sort of (perhaps optional) `Clear` callback that resets values back to a sane default, rather than forcing every caller to remember to do so themselves?

  • by kevmo314 on 3/31/25, 11:35 PM

    Zero-copy is totally underrated. Like the site alludes to, Go's interfaces make it reasonably accessible to write zero-copy code but it still needs some careful crafting. The payoff is great though, I've often been surprised by how much time is spent allocating and shuffling memory around.
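
    The simplest form of this in Go is handing out sub-slices instead of copies. A small sketch, with `firstLineView` and `firstLineCopy` as made-up helper names, showing that the view aliases the original buffer while the copy does not:

```go
package main

import (
	"bytes"
	"fmt"
)

// firstLineView returns the first line of data as a sub-slice.
// No bytes are copied; the result aliases data's backing array.
func firstLineView(data []byte) []byte {
	if i := bytes.IndexByte(data, '\n'); i >= 0 {
		return data[:i]
	}
	return data
}

// firstLineCopy returns an independent copy of the first line.
func firstLineCopy(data []byte) []byte {
	return append([]byte(nil), firstLineView(data)...)
}

func main() {
	data := []byte("GET / HTTP/1.1\nHost: example\n")

	view := firstLineView(data)
	cp := firstLineCopy(data)

	data[0] = 'P' // mutate the original buffer

	fmt.Println(string(view)) // PET / HTTP/1.1 (shares memory with data)
	fmt.Println(string(cp))   // GET / HTTP/1.1 (independent copy)
}
```

    The aliasing is also the "careful crafting" part: a view is only valid while nobody reuses or overwrites the underlying buffer.
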
  • by roundup on 3/31/25, 10:32 PM

    Additionally...

    - https://go101.org/optimizations/101.html

    - https://github.com/uber-go/guide

    I wish this content existed as a model context protocol (MCP) tool to connect to my IDE along w/ local LLM.

    After 6 months of switching between different language projects, it's challenging to remember all the important things.

  • by donatj on 4/1/25, 11:15 AM

    Unpopular opinion maybe, but sync.Pool is so sharp, dangerous and leaky that I'd avoid using it unless it's your absolute last option. And even then, maybe consider a second server first.
  • by jrockway on 3/31/25, 11:49 PM

    GOMEMLIMIT has saved me a number of times. In containerized production, it's nice, because sometimes jobs are ephemeral and don't even do enough allocations to hit the memory limit, so you don't spend any time in GC. But it's saved me the most times in CI where golangci-lint or govulncheck can't complete without running out of memory on a kind-of-large CI machine. Set GOMEMLIMIT and it eventually completes. (I switched to nogo, though, so at least golangci-lint isn't a problem anymore.)
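
    The same knob is available from code via `runtime/debug` (Go 1.19+). A minimal sketch, assuming the "no time in GC" behaviour comes from pairing the limit with GOGC=off, which the comment does not spell out; `configureGC` and the 512 MiB figure are arbitrary choices for illustration.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// configureGC sets a soft memory limit, turns off the percentage-based
// GC trigger, and returns the limit that is now in effect.
func configureGC(limitBytes int64) int64 {
	debug.SetMemoryLimit(limitBytes) // same knob as the GOMEMLIMIT env var
	debug.SetGCPercent(-1)           // same as GOGC=off
	return debug.SetMemoryLimit(-1)  // a negative input only reads the current limit
}

func main() {
	const limit = 512 << 20 // 512 MiB, an arbitrary value for illustration
	fmt.Println("memory limit in bytes:", configureGC(limit))
}
```

    With this setup a job that never approaches the limit should not trigger a collection on its own, while a job that does approach it collects more aggressively instead of being OOM-killed outright. The limit is soft, so it is not a hard cap.
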
  • by dennis-tra on 4/1/25, 12:15 PM

    Can someone explain to me why the compiler can’t do struct-field-alignment? This feels like something that can easily be automated.
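
    For context, the gc compiler currently lays fields out in declaration order, so the ordering you write is the ordering you get. A quick sketch of the difference, using two made-up struct types with identical fields:

```go
package main

import (
	"fmt"
	"unsafe"
)

// padded places a bool on either side of an int64,
// which forces padding after each bool.
type padded struct {
	a bool
	b int64
	c bool
}

// packed holds the same fields ordered largest-first.
type packed struct {
	b int64
	a bool
	c bool
}

// sizes reports the in-memory size of each layout.
func sizes() (uintptr, uintptr) {
	return unsafe.Sizeof(padded{}), unsafe.Sizeof(packed{})
}

func main() {
	p, q := sizes()
	fmt.Println("padded:", p, "bytes") // 24 on typical 64-bit platforms
	fmt.Println("packed:", q, "bytes") // 16 on typical 64-bit platforms
}
```
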
  • by parhamn on 3/31/25, 10:19 PM

    Noticed the object pooling doc, which had me wondering: are there any plans to make packages like `sync` generic?
  • by __turbobrew__ on 4/2/25, 6:28 AM

    Calling mmap “zero copy” is generous. I guess we gloss over the whole page fault thing, or the fact that performance is heavily dependent on how much memory pressure the process is under.

    This is the same n00b trap that derailed the llama.cpp project last year because people don’t understand how memory maps and paging work, and the tradeoffs.

  • by inadequatespace on 4/2/25, 3:46 PM

    Why doesn’t the compiler pack structs for you if it’s as easy as shuffling around based on type?
  • by neillyons on 4/1/25, 5:33 AM

    Curious to know what people are building where you need to optimise like this? eg Struct Field Alignment https://goperf.dev/01-common-patterns/fields-alignment/#avoi...
  • by EdwardDiego on 4/1/25, 2:56 AM

    Huh, this surprises me about Golang, didn't realise it was so similar to C with struct alignment. https://goperf.dev/01-common-patterns/fields-alignment/#why-...
  • by jensneuse on 3/31/25, 11:06 PM

    You can often fool yourself by using sync.Pool. pprof looks great because no allocs in benchmarks but memory usage goes through the roof. It's important to measure real world benefits, if any, and not just synthetic benchmarks.
  • by nikolayasdf123 on 4/1/25, 2:21 AM

    nicely organised. I feel like this could grow into a community-driven, current state-of-the-art collection of optimisation tips for Go. just need to let people edit/comment with their input easily (preferably in-place). I see there is a github repo, but my bet is people would not actively add their input/suggestions/research there, it is hidden too far from the content/website itself
  • by kunley on 4/1/25, 8:44 AM

    "Although the struct Data contains a [1024]int array, which is 4 KB (assuming int is 4 bytes on the architecture used)"

    Huh, what?

    I mean, who uses a 32-bit architecture by default?

  • by _345 on 4/1/25, 4:10 AM

    Anyone know of a resource like this but for Python 3?
  • by nikolayasdf123 on 4/1/25, 2:18 AM

    nice article. good to see statements backed up by Benchmarks right there
  • by ljm on 3/31/25, 10:56 PM

    You're not really writing 'Go' anymore when you're optimising it; it defeats the point of the language as a simple but powerful interface over networked services.