from Hacker News

Ask HN: What are some good resources on building a relational database?

by ashwin110 on 7/14/24, 10:20 PM with 14 comments

I'm hoping to build a simple relational database as a side project, focusing mainly on learning how the internals and the algorithms work; I don't plan to ever publish it as a product of any sort.

So far I have the CMU Advanced DB course (https://15721.courses.cs.cmu.edu/spring2024/) and the Database Internals book.

While I'm learning a lot about how databases work, I have no clue how to start writing my own. Are there any resources for building a relational database? I've only found some for KV stores. Ideally something less intimidating to get started with than having to read the SQLite code.

  • by pvg on 7/14/24, 11:34 PM

    Comes up a fair bit in Ask HN - https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

    You can widen the search a bit by taking out 'implementation' and trying some other terms like 'book', 'internals', etc.

  • by avinassh on 7/15/24, 1:41 PM

    You may want to check out https://cstack.github.io/db_tutorial, which walks through writing an SQLite clone from scratch in C.

    I know you asked about an RDBMS, but may I introduce you to a structured path for building a KV store, which can be a foundation for an RDBMS? My project is set up in TDD fashion: you start with simple functions, make the tests pass, and the difficulty level goes up. When all the tests pass, you will have written a persistent key-value store.

    https://github.com/avinassh/py-caskdb
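To make "persistent key-value store" concrete, here is a minimal sketch of one common approach: an append-only log file plus an in-memory index of where each value lives. This is not the py-caskdb code or its API. The class name `TinyLogStore`, the record layout (two little-endian lengths, then key bytes, then value bytes), and the `demo` helper are all assumptions made up for illustration.

```python
import os
import struct
import tempfile

# Assumed record layout: key length and value length as unsigned 32-bit ints.
HEADER = struct.Struct("<II")


class TinyLogStore:
    """Hypothetical append-only key-value store with an in-memory index."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # key -> (file offset of the value, value length)
        open(path, "ab").close()  # create the log file if it does not exist
        self._rebuild_index()

    def _rebuild_index(self):
        # Scan the whole log; later records for a key replace earlier ones.
        with open(self.path, "rb") as f:
            while True:
                header = f.read(HEADER.size)
                if len(header) < HEADER.size:
                    break
                key_len, value_len = HEADER.unpack(header)
                key = f.read(key_len).decode("utf-8")
                self.index[key] = (f.tell(), value_len)
                f.seek(value_len, os.SEEK_CUR)  # skip over the value bytes

    def set(self, key, value):
        k, v = key.encode("utf-8"), value.encode("utf-8")
        with open(self.path, "ab") as f:
            start = f.seek(0, os.SEEK_END)  # where this record begins
            f.write(HEADER.pack(len(k), len(v)) + k + v)
        self.index[key] = (start + HEADER.size + len(k), len(v))

    def get(self, key):
        if key not in self.index:
            return None
        offset, value_len = self.index[key]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(value_len).decode("utf-8")


def demo():
    # Write a key twice, reopen the file, and read back the latest value.
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "data.log")
        store = TinyLogStore(path)
        store.set("name", "ada")
        store.set("name", "grace")
        reopened = TinyLogStore(path)
        return reopened.get("name"), reopened.get("missing")
```

The sketch skips checksums, deletes, and compaction of stale records, all of which a real store would need. A relational layer could sit on top by encoding a table name and primary key into the key and a serialized row into the value.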

  • by jasfi on 7/15/24, 9:59 AM

    https://www.youtube.com/@CMUDatabaseGroup

    They publish their latest course videos every year, during the year. Andy Pavlo is highly knowledgeable about the field.

  • by gerdesj on 7/14/24, 11:39 PM

    "This course is a comprehensive study of the internals of modern database management systems. It will cover the core concepts and fundamentals of the components that are used in large-scale analytical systems (OLAP). The class will stress both efficiency and correctness of the implementation of these ideas. The course is appropriate for graduate students in software systems and for advanced undergraduates with dirty systems programming skills."

    That class drops a few buzzwords in its advert: OLAP and dirty!

    What sort of "simple" RDBMS are you envisioning that is different from the current lot?

  • by DemocracyFTW2 on 7/15/24, 2:55 AM

    I'd have a look at DuckDB as well. It looks like they're doing a great job, with useful, practical and successful innovation and a ton of interesting design decisions that set it apart. I hear it's built on top of SQLite, is that right? They must have a fair amount of code of their own regardless.

    Then there are also some projects that have tried to port or re-create SQLite in Rust.

  • by apavlo on 7/15/24, 4:21 PM

    You want our Intro DB Systems course, not the Advanced one:

    https://15445.courses.cs.cmu.edu

    Lectures start next month. Or you can watch previous years. Learn to walk before you run.

  • by smitty1e on 7/15/24, 1:34 AM

    I was going to suggest the SQLite source code.

    One could probably go quite a ways in bare Python with lists of dataclasses and pickles, never mind the performance.

    That's your backend.

    Then you might find some prior art in the way of a SQL parser for a front end.
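The "lists of dataclasses and pickles" backend suggested above could look roughly like the sketch below. Everything named here (`User`, `Table`, `insert`, `select`, `commit`, `demo`) is hypothetical and chosen for illustration. One deliberate deviation: rows are pickled as plain dicts rather than as dataclass instances, so the file can be loaded without pickle needing to import the row class.

```python
import os
import pickle
import tempfile
from dataclasses import asdict, dataclass


@dataclass
class User:  # hypothetical row type; one dataclass per table
    id: int
    name: str
    age: int


class Table:
    """Hypothetical table: a list of dataclass rows persisted with pickle."""

    def __init__(self, row_type, path):
        self.row_type = row_type
        self.path = path
        self.rows = []
        if os.path.exists(path):
            with open(path, "rb") as f:
                # Rows are stored as dicts and rebuilt into dataclasses.
                self.rows = [row_type(**d) for d in pickle.load(f)]

    def insert(self, row):
        self.rows.append(row)

    def select(self, where=lambda row: True):
        # Full scan with a Python predicate standing in for a WHERE clause.
        return [row for row in self.rows if where(row)]

    def commit(self):
        # Rewrite the whole file; no attempt at incremental or atomic writes.
        with open(self.path, "wb") as f:
            pickle.dump([asdict(row) for row in self.rows], f)


def demo():
    # Insert two rows, persist them, reload, and filter on age.
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "users.pickle")
        users = Table(User, path)
        users.insert(User(1, "ada", 36))
        users.insert(User(2, "bob", 17))
        users.commit()
        reopened = Table(User, path)
        return [u.name for u in reopened.select(lambda u: u.age >= 18)]
```

A SQL front end would then translate a parsed `SELECT ... WHERE ...` into a call like `select(...)` with a generated predicate. Note that pickle files should only be loaded from trusted sources.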

  • by philonoist on 7/15/24, 1:23 AM

    Back in the day, the hardcore enthusiasts used to recommend C. J. Date.

  • by joshbochu on 7/14/24, 11:26 PM