from Hacker News

Ask HN: What are some of the most elegant codebases in your favorite language?

by debanjan16 on 6/17/23, 2:51 PM with 177 comments

Your favorite language can be anything - Lisp, Python, C, Haskell, etc.

Which codebases are the most elegant ones written in your favorite language that newcomers to the language can learn something from?

  • by w10-1 on 6/17/23, 8:15 PM

    I'd nominate Java itself.

    In the decades Java sources have been available, any of the kazillion junior programmers could debug-step into java.* and see exactly what's happening -- progressive disclosure at its finest for building expertise.

    Most other languages have a hard boundary, so the roots of language are a matter of conceptual documentation and experience. Java also offers that fly-over knowledge, but when problems arise, programmers love knowing exactly and seeing directly.

    Personally, in a pinch I prefer the actual to the elegant.

  • by cejast on 6/17/23, 4:01 PM

    Go and its standard library. ‘Elegant’ is pretty subjective, but it’s a treasure trove of learning how things work under the hood. I’ve learned loads about networking, compression, encryption and more by browsing through it, because the code is super easy to read and understand.
  • by zh3 on 6/17/23, 8:10 PM

    RT-11 on the PDP-11 (all in PDP-11 assembler). Back in the early 80's you could rebuild it in several different ways, depending on whether you wanted it "single job" (as it was back then) or not.

    As an example, here's the PDP-11 assembler to convert a binary number to decimal ASCII (no idea who wrote it, but they certainly made every word count):-

      CNV10:  MOV     R0,-(SP)     ;Subroutine to convert Binary # in R0
              CLR     R0           ;to Decimal ASCII by repetitive
      1$:     INC     R0           ;subtraction. The remainder for each
              SUB     #10.,@SP     ;radix is made into ASCII and pushed
              BGE     1$           ;on the stack, then the routine calls
              ADD     #72,@SP      ;itself. The code at 2$ pops the ASCII
              DEC     R0           ;digits off the stack and into the out-
              BEQ     2$           ;put buffer, eventually returning to
              CALL    CNV10        ;the calling program. This is a VERY
      2$:     MOVB    (SP)+,(R1)+  ;useful routine, is short and is
              RETURN               ;memory efficient.
    
    Courtesy bitsavers.org (http://www.bitsavers.org/pdf/dec/pdp11/rt11/v5.6_Aug91/AA-PD...)
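
For readers who don't read PDP-11 assembler, here is a rough Python sketch of the technique the routine's comments describe: divide by ten through repeated subtraction, turn each remainder into an ASCII digit, and recurse so the most significant digit is written first. The names `cnv10` and `to_decimal` are my own, the bytearray stands in for the output buffer that R1 points to, and the negative-input check is an addition the original does not have.

```python
def cnv10(value: int, out: bytearray) -> None:
    """Append the decimal ASCII digits of a non-negative integer to `out`."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    quotient = 0
    # Repeated subtraction of 10, like the INC/SUB/BGE loop: the loop
    # overshoots by one step and leaves a remainder in the range -10..-1.
    while True:
        quotient += 1
        value -= 10
        if value < 0:
            break
    # Octal 72 is ord("0") + 10, so one addition undoes the overshoot
    # and converts the remainder to its ASCII digit.
    digit = value + 0o72
    quotient -= 1
    if quotient != 0:
        cnv10(quotient, out)  # emit the higher-order digits first
    out.append(digit)         # then this digit, as the recursion unwinds


def to_decimal(value: int) -> str:
    """Convenience wrapper (hypothetical) that returns the digits as a str."""
    buffer = bytearray()
    cnv10(value, buffer)
    return buffer.decode("ascii")
```

Calling `to_decimal(123)` walks three levels of recursion, one per digit, which is where the assembler version uses the hardware stack.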
  • by bb86754 on 6/17/23, 4:07 PM

    So many, I like reading how other people write code.

    R - sf package is a clean example of functional OOP

    Python - pytudes (Peter Norvig’s notebooks)

    Haskell - Elm compiler. I could mostly understand what’s going on even though I barely know any Haskell.

    Ruby - Sequel is really nice.

    Rust - Ripgrep

    Pretty much any F# codebase is super readable too.

  • by fastneutron on 6/17/23, 10:38 PM

    Shewchuk’s Triangle code for Delaunay mesh generation is some of the most elegant C-code I’ve ever seen [1]. It even won the SIAM Wilkinson Prize for numerical software at the time it was written. It’s written in a literate style, and builds up from the most fundamental arithmetic operations and data structures, all the way to one of the most efficient mesh generators of its kind. He has a few other codebases written in a similar style [2, 3]. It’s very different from what you see in the wild, but it’s been great for understanding what’s happening under the hood for complex algorithms like mesh generation.

    1. https://www.cs.cmu.edu/~quake/triangle.html

    2. https://github.com/ctlee/Stellar

    3. https://www.cs.cmu.edu/~quake/robust.html

  • by ttkciar on 6/17/23, 5:19 PM

    Working with Perl, two things spoiled me for other languages: JSON and DBI/DBD.

    In Perl, everything serializes to JSON without fuss. It can be "lossy"; objects without an explicit JSON serialization method and coderefs can't be deserialized, but serializations at least have placeholders for them.

    Compare to Python's json module, which doesn't try very hard to serialize things and throws exceptions every time it runs across something new. It's very frustrating to use.
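
A minimal sketch of how to get similar "lossy but never fails" behaviour from Python's standard `json` module, using the `default=` hook of `json.dumps`. The `Point` class, the `placeholder` function, and the placeholder string format are made up for illustration; this only approximates the Perl behaviour described above and is not a description of what any Perl module emits.

```python
import json


class Point:
    """A plain object with no JSON serialization method of its own."""

    def __init__(self, x, y):
        self.x, self.y = x, y


def placeholder(obj):
    """Fallback for json.dumps: describe whatever could not be serialized."""
    return f"<unserializable {type(obj).__name__}>"


data = {"point": Point(1, 2), "callback": lambda: None, "n": 3}

# Without a fallback, the first unknown type raises TypeError.
try:
    json.dumps(data)
    strict_failed = False
except TypeError:
    strict_failed = True

# With default=, unknown values are replaced by a placeholder string.
lossy = json.dumps(data, default=placeholder)
```

The result cannot be deserialized back into the original objects, which matches the trade-off the comment describes.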

    Perl's DBI provides a universal API for all databases, with implementation-specific details in an underlying DBD module (which provides both the glue behind the DBI abstraction and programmer access to features specific to different database systems).

    Compare to Python, where you need to import a different module and use a different API for every different kind of database. As I increasingly use Python for a living, I frequently wish they'd follow Perl's example with this.
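
For context, Python does define a shared database interface, PEP 249 (DB-API 2.0), although each driver is still imported separately and details such as the parameter placeholder style vary between drivers. The sketch below uses only the standard-library `sqlite3` driver; the `nominations` table, its rows, and the function names are invented for illustration.

```python
import sqlite3


def fetch_names(conn, min_votes):
    """Query using calls PEP 249 specifies: cursor(), execute(), fetchall(), close()."""
    cur = conn.cursor()
    try:
        # "?" is sqlite3's placeholder style; other drivers may use %s or :name.
        cur.execute(
            "SELECT name FROM nominations WHERE votes >= ? ORDER BY name",
            (min_votes,),
        )
        return [row[0] for row in cur.fetchall()]
    finally:
        cur.close()


def demo():
    # The connect() arguments are driver-specific; the calls after it are not.
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE nominations (name TEXT, votes INTEGER)")
    cur.executemany(
        "INSERT INTO nominations VALUES (?, ?)",
        [("redis", 3), ("lua", 2), ("sqlite", 1)],  # made-up illustration data
    )
    conn.commit()
    cur.close()
    try:
        return fetch_names(conn, 2)
    finally:
        conn.close()
```

Porting `fetch_names` to another driver would mostly mean changing the placeholder syntax and any SQL dialect differences, which is the kind of friction the comment is pointing at.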

  • by phkahler on 6/17/23, 6:46 PM

    C++ this file covers all the math for working with NURBS curves and surfaces:

    https://github.com/solvespace/solvespace/blob/master/src/srf...

    There is a lot more in other files - triangulation, booleans, creation - but the core math functions are there in very readable form.

  • by akkartik on 6/17/23, 10:56 PM

  • by mepian on 6/17/23, 4:23 PM

    For Lisp I would recommend Edi Weitz's CL-PPCRE: https://edicl.github.io/cl-ppcre/

    Pretty much anything else from Edi Weitz is also great.

    Mezzano is quite elegant in my opinion, especially for an operating system. For example, this is the USB mass storage driver: https://github.com/froggey/Mezzano/blob/master/drivers/usb/m...

  • by Waterluvian on 6/17/23, 6:37 PM

    Back when I was starting out, it was nice that Backbone had such clean, well-commented code: https://github.com/jashkenas/backbone/blob/master/backbone.j...
  • by osener on 6/18/23, 5:58 AM

    I really like the Pandoc codebase [0]. It is a document converter written in Haskell.

    Reading its source code a decade ago was a turning point for me. Prior to that, I always felt an insurmountable gap between my toy codebases and real projects. All that open source software written in C++ etc. looked so unapproachable that I felt like I could not write production-ready software.

    Pandoc, however, was written in a language I didn’t know and did something very complicated very thoroughly, yet remained accessible. It was very nicely laid out and I could easily follow how it constructs its internal representation of documents and converts between them. I think this made me catch the functional programming bug for the next decade, which let me build way bigger things than I had any right to, without getting crushed underneath all the complexity.

    Putting together something in Java or even contributing to OOP Python codebases was still an exercise in frustration; no matter how much better I thought I was getting at programming, I would feel stupid trying to wrap my head around those abstractions and hierarchies. Somehow FP just clicked for me and made me see how I could start from a simple library call and little by little build the complete program.

    Today I am comfortable with all kinds of paradigms and levels of abstraction, but I definitely owe a lot to Pandoc for showing me I was smart enough to understand and modify real world software I did not build myself.

    [0] https://github.com/jgm/pandoc

  • by wewxjfq on 6/17/23, 4:20 PM

    At the beginning of the year I was rewriting a SPA and looking for ideas on how to structure a web app. One project I looked at was GitHub Desktop and I think it has very clean code for an app.

    https://github.com/desktop/desktop

  • by jpe90 on 6/17/23, 9:21 PM

    I've really enjoyed reading (and playing with) kons-9's codebase in Common Lisp. It lets you do exploratory 3D modeling via the program's GUI or the repl. The source is very neatly structured and readable. I don't see many CL projects that do 3D rendering, so it's nice to see!

    Quick demo:

    https://www.youtube.com/watch?v=i0CwhEDAXB0

    Repo:

    https://github.com/kaveh808/kons-9

  • by freedomben on 6/17/23, 5:49 PM

    For C, I've really enjoyed reading the Ruby source code. There are good and bad spots, but overall especially if you know Ruby the language, the source is both entertaining and enlightening.
  • by btkostner on 6/18/23, 6:16 AM

    I’m surprised that nobody has posted about Elixir yet. I nominate the excellently written Phoenix library. Not only is the code well organized and easy to find, the documentation is expansive and right next to the code.

    https://github.com/phoenixframework/phoenix

  • by blt on 6/17/23, 4:24 PM

    Redis is a popular answer for C, and I agree.
  • by seneca on 6/17/23, 3:58 PM

    I don't know if elegant is the word, but I often refer to the HashiCorp repos for Go, and specifically admire Mitchell's own. Very clean, well-written code most of the time.
  • by nrdvana on 6/17/23, 6:24 PM

    C++

    An old one, but the FTGL library (renders truetype fonts in old-school OpenGL in half a dozen different ways: texture-per-letter, texture-per-word, 2D polygons, 3D polygons, etc) is the best example of C++ inheritance I've ever run across. The code formatting isn't my favorite, and comments are sparse, but the hierarchy of objects subclassed for all the different rendering modes is just about perfect. https://sourceforge.net/p/ftgl/code/HEAD/tree/trunk/

    Perl

    The Mojolicious / Mojo toolkit. Great minimalist API and great documentation and clean code all around. https://metacpan.org/release/SRI/Mojolicious-9.33/view/lib/M...

  • by jnordwick on 6/17/23, 10:18 PM

    Arthur Whitney's famous first J interpreter.

    https://www.jsoftware.com/ioj/iojATW.htm

    The first version of KDB, the timeseries database, was written by Arthur in C to bootstrap the K interpreter that was used to write KDB. It was 26 files, named a.c to z.c, none larger than a single page of C written with notepad.exe.

  • by mo_42 on 6/17/23, 10:11 PM

    C++ isn't my favorite programming language anymore.

    I guess the Doom 3 source code is an elegant codebase. Surprisingly, they used C++ features quite sparsely.

    I think that’s nice because in programming it’s about getting the job done and not about using all sorts of features for other reasons.

  • by valenterry on 6/17/23, 5:53 PM

    fs2 (reactive streaming, https://github.com/typelevel/fs2) written in Scala. It shows how nicely things can compose in a typesafe way if the language supports it.

    And then, the opposite is Monix (https://monix.io/) also written in Scala. It's also about reactive streaming and the API is great, but the internal code is ugly because it sacrifices readability/composability for performance.

  • by agjmills on 6/17/23, 3:49 PM

    PHP and Laravel
  • by culi on 6/17/23, 6:00 PM

    mentioned in this thread:

    C

    * Lua https://www.lua.org/source/

    * Redis https://github.com/redis/redis (mentioned twice)

    * Ruby https://github.com/ruby/ruby

    * SQLite https://sqlite.org/src/dir?ci=trunk

    C++

    * Botan https://github.com/randombit/botan

    * ClickHouse https://github.com/ClickHouse/ClickHouse

    Go

    * Go's standard library https://cs.opensource.google/go/go

    * HashiCorp repos, particularly those by Mitchell https://github.com/hashicorp

    F#

    * jet.com repos https://github.com/jet?language=f%23

    Haskell

    * Elm compiler https://github.com/elm/compiler

    Lisp

    * CL-PPCRE https://github.com/edicl/cl-ppcre/

    * Mezzano USB driver https://github.com/froggey/Mezzano/blob/master/drivers/usb/m...

    PHP

    * Carbon https://github.com/briannesbitt/Carbon

    * Laravel https://github.com/laravel/laravel

    Python

    * pytudes (notebooks) https://github.com/norvig/pytudes

    R

    * sf (ex of functional OOP) https://github.com/r-spatial/sf

    Ruby

    * Sequel https://github.com/jeremyevans/sequel/

    * Sidekiq https://github.com/sidekiq/sidekiq

    Rust

    * ripgrep https://github.com/BurntSushi/ripgrep

    Scala

    * fs2 (ex of good type safety) https://github.com/typelevel/fs2

    * monix (ex of ugly code with great performance) https://github.com/monix/monix

    TypeScript

    * GitHub Desktop (ex of SPA) https://github.com/desktop/desktop

    ---

    thanks to: cejast, bb86754, wewxjfq, antonyt, bit, diego_moita, agjmills, freedomben, Contortion, valenterry, DethNinja, mberning, seneca

  • by diego_moita on 6/17/23, 4:14 PM

    C -> Lua, Sqlite
  • by crop_rotation on 6/17/23, 9:15 PM

    Guava -> In Java

    Golang Stdlib -> Golang

    redis -> C

    anything by armin ronacher -> Python

  • by SirensOfTitan on 6/17/23, 5:13 PM

    Anyone have any good clojure examples?
  • by neonsunset on 6/17/23, 10:32 PM

    [C#] Bitwarden server repository is probably one of the cleanest (non-novel) solution architectures I have seen so far. I always point people to it as learning material for structuring code. It is not the most minimalistic, but I feel like it strikes a very good balance and does not blindly follow all the """fancy""" OOP patterns people should never use anyway.

    https://github.com/bitwarden/server

    Otherwise, I can see why people are burned: average Java/C# codebases look abysmal and are written without an understanding of when (not) to use heaps of mediator/factory/adapter/provider classes.

  • by syntheweave on 6/18/23, 2:19 AM

    The various flavors of Forth all qualify. They are great examples of how to bootstrap towards a powerful abstraction, starting from a relatively simple context. The 80% of the 80/20 rule here comes from "writing a Forth" being an engaging learning experience.

    For anyone who would like to do that, I recommend looking up one of the old standards (FIG-FORTH, FORTH-79, FORTH-83) and implementing the words in them with whatever tools and languages you like. You will hit a point where your early assumptions about the implementation are wrong. Keep going, get it to the point where you are capable of using the metaprogramming words.
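
As a taste of what "implementing the words" can look like, here is a very small Forth-like interpreter sketched in Python. It is a toy under stated assumptions, not an implementation of any of the standards named above: colon definitions are stored as token lists instead of being compiled, only integers are supported, and there is no control flow or metaprogramming. The function `make_forth` and its internal names are my own.

```python
def make_forth():
    """Return a run(source) function backed by a fresh stack and dictionary."""
    stack = []
    words = {}  # the dictionary: word name -> Python callable or list of tokens

    def binary(fn):
        """Wrap a two-argument function as a word that pops two, pushes one."""
        def word():
            b = stack.pop()
            a = stack.pop()
            stack.append(fn(a, b))
        return word

    def swap():
        stack[-1], stack[-2] = stack[-2], stack[-1]

    words.update({
        "+": binary(lambda a, b: a + b),
        "-": binary(lambda a, b: a - b),
        "*": binary(lambda a, b: a * b),
        "dup": lambda: stack.append(stack[-1]),
        "drop": lambda: stack.pop(),
        "swap": swap,
    })

    def execute(token):
        entry = words.get(token.lower())
        if entry is None:
            stack.append(int(token))   # not a known word: treat as a number
        elif callable(entry):
            entry()                    # primitive word
        else:
            for inner in entry:        # user-defined word: run its body
                execute(inner)

    def run(source):
        tokens = iter(source.split())
        for token in tokens:
            if token == ":":           # start of a colon definition
                name = next(tokens).lower()
                body = []
                for inner in tokens:   # consume tokens up to the closing ";"
                    if inner == ";":
                        break
                    body.append(inner)
                words[name] = body
            else:
                execute(token)
        return list(stack)

    return run
```

For example, `make_forth()(": square dup * ; 7 square")` defines a new word in terms of two primitives and then uses it. The early assumptions baked into this sketch (token lists, no compile state) are exactly the kind that break once you try to add words like `IF` or `DOES>`.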

  • by jessewmc on 6/17/23, 11:09 PM

    https://github.com/lua/lua

    Everything is nicely documented and pretty easy to read.

    It's a great companion if you want to learn more about how languages are implemented.

  • by smackeyacky on 6/18/23, 2:49 AM

    There were 3 different libraries that impressed me:

    1. The SunOS headers. Couldn't see the code but the number of header files was small enough in the 4.x Sun libraries that you could read them all. Mostly the documentation was terse but information dense.

    2. The TeX sources. Knuth's literate programming seems like such a missed opportunity.

    3. ParcPlace VisualWorks. Due to the nature of Smalltalk you could read all the source code, learn object patterns directly from the people who invented it. A great learning experience when transitioning from C and so much better than the cfront based C++ hack.

  • by karmakaze on 6/17/23, 5:17 PM

    What I find to be elegant in a codebase is where the abstractions or seams that separate subdomains/concerns are so well chosen that each is handled mostly in one place without a lot of ceremony or machinery. I would call these 'well factored' for factors which are domain-specific.

    Each language offers different means that lend themselves to this shaping, but I like being able to think of this as something to look for when reading and strive for when writing, irrespective of the language at hand.

  • by johncalvinyoung on 6/18/23, 4:40 AM

    Ruby - Devise. A beautifully functional gem that also makes customization of its magic really easy. Also ActiveRecord, as it's just a delight to read.
  • by hiyer on 6/18/23, 4:30 AM

    Peter Norvig's Advent of Code solutions in Python. e.g., for 2022 - https://colab.research.google.com/github/norvig/pytudes/blob...
  • by revskill on 6/19/23, 12:56 PM

    I found the Zig implementation of JSON parsing interesting. The code is free from hidden control flow!

    https://github.com/hanabi1224/Programming-Language-Benchmark...

  • by arrjayh on 6/17/23, 9:19 PM

    I nominate SDL2 and DPDK. Spent a lot of time combing through these code bases, found them to be excellent.
  • by hn_throwaway_99 on 6/17/23, 9:39 PM

    Back when I first was learning about GraphQL in 2016, I remember thinking the reference implementation in JavaScript was extremely clear, easy to read, and well organized. It had a "wow, I wish all the code I read was this good" feel to it for me.
  • by mikewarot on 6/18/23, 2:37 AM

    The TurboVision library that came with Turbo Pascal was awesome. Borland really had their shit together back in the day. Then they got greedy and overpriced their product, and killed the community.
  • by spektom on 6/19/23, 6:39 PM

    I think FreeBSD code in C is very well organized:

    https://github.com/freebsd/freebsd-src

  • by bgia on 6/17/23, 5:47 PM

    C++: ClickHouse. It is very readable and easy to navigate. Call stacks that you pick up in logs are very descriptive, and you can easily tell what went wrong.
  • by waterpowder on 6/17/23, 5:08 PM

    gorilla.bas
  • by mberning on 6/17/23, 4:46 PM

    Ruby - Sidekiq
  • by personjerry on 6/17/23, 10:09 PM

    Not sure if it's still the case but about 6 years ago Facebook's folly C++ library was something I'd point to for my junior engineers to get a sense of "good" C++ https://github.com/facebook/folly
  • by DethNinja on 6/17/23, 5:17 PM

    C++ - Botan

    C - Redis

  • by MorganGallant on 6/18/23, 3:57 PM

    Not my favourite language, but LevelDB (written in C++) is really great.
  • by rkagerer on 6/17/23, 7:57 PM

    Open Hardware Monitor (OHM), in C#
  • by szundi on 6/17/23, 5:36 PM

    Swing in Java obviously
  • by dadarecit on 6/17/23, 7:26 PM

    Envoy proxy for C++
  • by NicoJuicy on 6/17/23, 4:20 PM

    My own (C#). It's not open source yet; I'm monetizing it first.

    Full DDD, it was a refactor during COVID-19 for my ecommerce.

    It powers https://belgianbrewed.com

    Copy paste from: https://news.ycombinator.com/item?id=35257225

    > But I believe the project is much cleaner and frankly easier to understand than all other projects I've encountered of this size. I'm using DDD, so DDD knowledge is a requirement to navigate this in a breeze :) :

    - https://snipboard.io/D03VWg.jpg - General overview of the architecture. Small fyi: Connectors => Autogenerated nugets to call the api's

    - https://snipboard.io/9M24hB.jpg - Sample of Modules + Presentation layer

    - https://snipboard.io/ybp6EH.jpg - Example of Specifications related to catalog ( = products )

    - https://snipboard.io/lE9vcK.jpg - How specifications are translated to the infrastructure ( here I'm using EF, so I'm using Expressions a lot), but plain old SQL is also supported. A query is basically a list of AND/OR Specifications where a hierarchy is possible. This would translate to "(QUERY 1 ) AND ((QUERY 2) AND (QUERY 3))" in the Infrastructure layer.

    - https://snipboard.io/7rVBpk.jpg - In general, I have 2 List methods (one for paged queries and one for non-paged queries)
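
The AND/OR hierarchy of specifications described in the list above can be sketched roughly as follows. This is a generic illustration of the Specification pattern in Python, not the author's C# implementation: the `Spec` class, the product fields, and the sample data are all invented, and a real version would translate the tree into EF expressions or SQL instead of evaluating it in memory.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Spec:
    """A labelled predicate that can be combined with & (AND) and | (OR)."""

    label: str
    test: Callable[[dict], bool]

    def __and__(self, other: "Spec") -> "Spec":
        return Spec(
            f"({self.label}) AND ({other.label})",
            lambda item: self.test(item) and other.test(item),
        )

    def __or__(self, other: "Spec") -> "Spec":
        return Spec(
            f"({self.label}) OR ({other.label})",
            lambda item: self.test(item) or other.test(item),
        )

    def is_satisfied_by(self, item: dict) -> bool:
        return self.test(item)


# Hypothetical catalog specifications.
in_stock = Spec("QUERY 1", lambda p: p["stock"] > 0)
belgian = Spec("QUERY 2", lambda p: p["country"] == "BE")
cheap = Spec("QUERY 3", lambda p: p["price"] < 5)

# Nesting the combinations produces the hierarchy.
query = in_stock & (belgian & cheap)

products = [  # made-up illustration data
    {"name": "tripel", "stock": 4, "country": "BE", "price": 3.5},
    {"name": "stout", "stock": 0, "country": "BE", "price": 3.0},
    {"name": "lager", "stock": 9, "country": "DE", "price": 2.0},
]
matches = [p["name"] for p in products if query.is_satisfied_by(p)]
```

Here `query.label` carries the same shape as the "(QUERY 1) AND ((QUERY 2) AND (QUERY 3))" string mentioned above, which is the part an infrastructure layer would turn into a real database query.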

    Additional fyi: Is V2, so has some legacy code. Uses my own STS. Has 2 gateways ( the ShopGateway that is used to develop new sites and the BackendGateway for the Backend). Enduser frontend is in MVC for SEO purpose, Customer backend is in Angular ( SPA). The basket is a NoSql implementation on top of SQL server.

    The enduser frontend supports a hierarchy of themes ( so it's insanely flexible to create variations of pages for other clients).

    There are more projects involved outside of this solution, e.g. nuget repos usable across solutions (JWT, Specifications, ...) and "plugins" for a standalone project that is installed for end-users for syncing local data. So it's +101 projects :)

    Edit: my specification implementation has been open-sourced here: https://github.com/NicoJuicy/Specification-Pattern-CSharp

    Someone was curious to see it.