from Hacker News

You probably don't need query builders

by mattrighetti on 1/21/25, 9:47 AM with 152 comments

  • by hyperpape on 1/25/25, 9:21 PM

    The recommended approach is to generate SQL that looks like:

        SELECT \* FROM users
        WHERE id = $1
            AND ($2 IS NULL OR username = $2)
            AND ($3 IS NULL OR age > $3)
            AND ($4 IS NULL OR age < $4)
    
    
    It's worth noting that this approach has significant dangers for execution performance--it creates a significant chance that you'll get a query plan that doesn't match your actual query. See: https://use-the-index-luke.com/sql/where-clause/obfuscation/... for some related material.
  • by dagss on 1/26/25, 7:49 AM

    At least for MSSQL: Never do this (before learning about query caches). Or at least, if you do, add (option recompile) to the query.

    For each combination of parameters to search for you may want to use a different index.

    But... the query plans are cached by query string lookup!

    So it is imperative that your search string looks different for each query plan/index being used.

    The code suggested here will pick a more or less random index (the one optimized for the parameters of the first execution) and stick with it for remaining executions, leading to bad queries for combinations of non-null that doesn't match the first query.

    You could just add a comment inside the string that was different depending on what parameters are null, but that is no less complex than just generating the query.

    PS: Of course there are situations where it fits, like if your strategy is to always use the same index to do the main scan and then filter away results from it based on postprocessing filters. Just make sure to understand this issue.

  • by orf on 1/25/25, 9:12 PM

    All of these are simple, almost unrealistic queries. Show me how to handle optional joins in the filter.

    > My naive-self in the past used to create a fancy custom deserializer function that transformed 11,22,33,44 from a String into a Vec<i64> and that is useless work that could have easily been handled by the database.

    Great, now the database has no idea what the cardinality of the IN clause is and has to generate a sub-optimal plan, because it could be 1 or it could be 10000.

    The same for a lot of the other examples.

  • by scott_w on 1/25/25, 9:10 PM

    I don’t get the point of this article. Just reading the samples, I strongly dislike this query builder because it looks flaky and difficult to parse by eye. And the examples get worse and worse.

    This isn’t an argument against query builders, that just seems like an argument to make your query builder easier to use and understand. I wouldn’t argue against programming languages by picking bad C++ libraries.

  • by bvrmn on 1/26/25, 8:23 AM

    It seems article shows the opposite argument. SQL builders are useful not to write fragile raw sql ridden with noisy filter patterns with repeated numbered placeholders which could be easily broken on refactoring. Also it's impossible to compose queries with abstracted parts.

    Shameless plug: https://github.com/baverman/sqlbind

  • by hinkley on 1/25/25, 9:20 PM

    Eventually people will have enough of Little Bobby Tables and url spoofing and then query engines won’t allow string concatenation at all.

    The only alternative I know of is to make a query engine that exactly emulates the String Interpolation syntax of the host language and can detect string concatenation in the inputs.

    But the problem with non-builders is always going to be GraphQL and advanced search boxes, where there are any of a couple dozen possible parameters and you either build one query that returns * for every unused clause or you have a factorial number of possible queries. If you don’t use a builder then Bobby always shows up. He even shows up sometimes with a builder.

  • by aswerty on 1/26/25, 9:02 AM

    I see a lot of push back against this approach. And since it is something I've been experimenting with recently, this is pretty interesting stuff. Clearly it has issues with query planning getting messed up, which is not something I had been aware of since my DB size I've been experimenting with is still only in the 10s of thousands of rows. But...

    Using raw SQL file addresses:

      1. Very difficult for devs to expose SQL injection vulnerabilities because you need to use parameters.
    
      2. Having all available filtering dimensions on a query makes it very clear what the type of filtering is for that particular query.
    
      3. Easy debugging where you can just throw your query into an SQL client and play around with the parameters.
    
      4. Very clear what the total query footprint of you application is (e.g. files all neatly listed in a dir).
    
      5. Super readable and editable.
    
      6. Code for running the SQL is pretty much: here is my query, here are my params, execute.
    
      7. Etc?
    
    So the amount of good you can get our of this approach is very high IMO.

    So an open question to anybody who is more familiar with DBs (and postgres in particular) than myself. Is there a reliable way to address the issue with this approach to querying that you all are flagging as problematic here. Because beyond the query planning issues, raw SQL files (with no building/templating) just seems to me like such a better approach to developing a db access layer.

  • by lmm on 1/26/25, 1:38 AM

        CASE
            WHEN $2 BETWEEN 0 AND 100 AND $1 > 0
                THEN (($1 - 1) * $2)
            ELSE
                50
        END
    
    What a wonderful, maintainable language for expressing logic in /s. Perfect for my COBOL on Cogs application.

    The problem with SQL has never been that it's impossible to put logic in it. The problem is that it's a classic Turing Tarpit.

  • by nixpulvis on 1/25/25, 9:03 PM

    `push_bind` covers a good deal of the concerns for a query builder, while letting us think in SQL instead of translating.

    That said, an ORM like ActiveRecord also handles joins across related tables, and helps avoid N+1 queries, while still writing consistent access to fields.

    I find myself missing ActiveRecord frequently. I know SeaORM aims to address this space, but I don't think it's there yet.

  • by dgan on 1/25/25, 9:02 PM

    Well. Query builders are composable. You can create a builder with partial query, and reuse in many queries. With sql strings, you either have to copy paste the string, or to define sql functions. It's a trade off!
  • by from-nibly on 1/25/25, 10:23 PM

    SQL isn't composable. It would be great if it was, but it isn't. So we can use query builders or write our own, but we're going to have to compose queries at some point.
  • by maximilianroos on 1/26/25, 12:40 AM

    SQL is terrible at allowing this sort of transformation.

    One benefit of PRQL [disclaimer: maintainer] is that it's simple to add additional logic — just add a line filtering the result:

      from users
      derive [full_name = name || ' ' || surname]
      filter id == 42           # conditionally added only if needed
      filter username == param  # again, only if the param is present
      take 50
  • by janlugt on 1/25/25, 9:21 PM

    Shameless plug, you can use something like pg_named_args[0] to at least have named instead of numbered arguments in your queries.

    [0] https://github.com/tandemdrive/pg_named_args

  • by oksurewhynot on 1/25/25, 11:10 PM

    I use SQlAlchemy and just generate a pydantic model that specifies which fields are allowed and what kind of filtering or sorting is allowed on them. Bonus is the resulting generated typescript client and use of the same pydantic model on the endpoint basically make this a validation issue instead of a query building issue.
  • by andybak on 1/25/25, 10:19 PM

    I assumed this meant "graphical query builders" (and who exactly is defending those!)

    Is this term Rust specific or have I slept through another change in terminology (like the day I woke up to find developers were suddenly "SWE"s)?

  • by sebazzz on 1/25/25, 9:13 PM

    Using the OR approach can actually cause some headaches. It can cause SQL Server to make an suboptimal plan for the other queries which have the same query text but due to the parameters behave completely different.
  • by nemothekid on 1/26/25, 5:50 AM

    The use of `push_bind` here is strange to me. The idomatic way would be do something like:

        let mut builder = Query::select();
    
    
    then you could (optionally) add clauses like so:

        builder.and_where(Expr::col("id").eq("A"))
    
    it shouldn't matter if a where clause exists or not, the builder should figure that out for you.

    If you are going to treat your QueryBuilder as glorified StringBuilder, then of course you won't see the value of a QueryBuilder.

  • by mojuba on 1/25/25, 9:13 PM

    You probably don't. For the same reason you don't need a builder for writing Rust programs. You just write Rust programs.
  • by davidwparker on 1/26/25, 2:46 AM

    Meta - anyone else not seeing a scrollbar on the blog? Chrome on OSX.
  • by evantbyrne on 1/26/25, 1:32 AM

    The lack of expressiveness in query builders that the author refers to in their first post as a motivation for ditching them is an easily solvable problem. It seems like most ORMs have easily solvable design issues though, and I would definitely agree that you should ditch tools that get in your way. What I've been doing is sporadically working on an _experimental_ Golang ORM called Trance, which solved this by allowing parameterized SQL anywhere in the builder through the use of interfaces. e.g.,

        trance.Query[Account].Filter("foo", "=", trance.Sql("...", trance.Param("bar"))
  • by hn_throwaway_99 on 1/25/25, 10:12 PM

    There was an essay a couple years ago that really convinced me to not use query builders, https://gajus.medium.com/stop-using-knex-js-and-earn-30-bf41... , and from that I switched to using Slonik (by the author of that blog post). There were some growing pains as the API was updated over the years, especially to support strong typing in the response, but now the API is quite stable and I love that project.
  • by lukaslalinsky on 1/26/25, 12:32 PM

    It's pretty much impossible not to end up with a lot of repeated spaghetti code, if you are doing anything beyond a really trivial single user app.

    Even for simple stuff, like each user only having permission to see parts of the database, it's essential to have a systematic way of filtering that is composable.

    I'm not a fan of ORMs and I actually like SQL and yet have been using sqlalchemy expression language (the low level part of sqlalchemy) for many many years and i wouldn't really go to SQL strings.

  • by econ on 1/26/25, 3:26 AM

    With only 4 optional Params you can just have 15 queries. Heh

    I remember back when everything was someone's idea and others would both compliment it and improve it. Now it is like things are unchangable holy scripture. Just let `Null < 42 or Null > 42 or name = Null` all be true. What is the big deal? I can barely wrap my head around joins, the extra luggage really isn't welcome. Just have some ugly pollyfills for a decade or so. All will be fine.

  • by Merad on 1/26/25, 4:12 PM

    It seems to me a big part of the problem is that the "query builder" in TFA is little more than a string builder. In the .Net world I've used SqlKata [0] and been very pleased with it. It allows you to easily dynamically build and compose queries.

    0: https://sqlkata.com/

  • by andix on 1/25/25, 10:17 PM

    I completely disagree. I love .NET Entity Framework Core. It's possible to build queries in code with a SQL-like syntax and a lot of helpers. But it's also possible to provide raw SQL to the query builder. And the top notch feature: You can combine both methods into a single query.

    Everything has it's place though. Query builders and ORMs require some effort to keep in sync with the database schema. Sometimes it's worth the effort, sometimes not.

  • by Tainnor on 1/26/25, 7:14 AM

    I don't know Rust well, is this what's known as a query builder in Rust? That's weird to me, because in other typed languages that I know, query builders are typically typesafe and don't just concatenate strings (see e.g. jOOQ for the JVM).
  • by riiii on 1/26/25, 8:42 AM

    You don't need them until you do. And when you do, you might first think that you can just hack your way around this minor inconvenience.

    Then you'll eventually learn why the road to hell is paved with good intentions.

  • by hk1337 on 1/26/25, 5:46 PM

    This is weird. When you say “query builder” I’m thinking of something associated with an ORM so it already knows the table specifics and you don’t have to initialize it with “SELECT * FROM table”.
  • by pipeline_peak on 1/26/25, 3:01 PM

    With code like this, there’s nothing in place to prevent injections.

    Where I work we use Veracode scans regularly. Trusted 3rd party query builders are necessary to prevent them.

  • by 1270018080 on 1/25/25, 10:06 PM

    Yeah I'm just going to stick with query builders.
  • by PaulHoule on 1/26/25, 7:23 PM

    The carping about if statements really gets to me.

    I mean, I get it, structures like

      if(X) {
        if(Y) {} else {
          if(Z) {
            return;
          } else {}
      ...
    
    will drive anybody crazy. For a query builder though, you should write something table driven where for instance you have a hash that maps query names to either functions or objects

       variables = { "age": where_age, "username": where_username, ... }
    
    these could be parameterized functions, e.g.

       where_username = (operator, quantity) => where("username", "text", operator, quantity)
    
    or you could have some object like

       {field_name: username, field_type: "text"}
    
    and then, say loop over the get variables so,

       username:gt
    
    gets broken into "username" and "gt" functions, and the where_username function gets these as arguments in the operator and quantity fields. Easy-peasy, wins at code golf if that's what you're after. Your "field" can be a subselect statement if you want to ask questions like "how pictures are in this photo gallery?"

    This is the kind of code that Lisp wizards wrote in the 1980s, and there's no reason you can't write it now in the many languages which contain "Lisp, the good parts."

  • by pkstn on 1/26/25, 2:11 PM

    Definitely not, if you use modern db like MongoDB :D
  • by sanderjd on 1/26/25, 2:09 AM

    Yeah of course you don't need query builders. But maybe you want them?
  • by gaeb69 on 1/27/25, 8:32 PM

    Beautifully designed blog.
  • by peteforde on 1/26/25, 8:39 AM

    A strong reminder that you'd have to yank ActiveRecord from my cold, dead hands.
  • by cookiengineer on 1/26/25, 6:59 AM

    Ah yes, the SQL injection cycle begins anew. A solved vulnerability for decades, only for the new generation of junior devs to ignore wisdom of the old generation again and introduce it anew.

    Don't ever do this. Query builders exist to sanitize inputs in a failsafe manner. SQL has so many pitfalls that tools like sqlmap [1] exist for a reason. You will never be able to catch all encoding schemes in a regex approach to filter unsanitized input.

    The examples in the blog can be exploited with a simple id set to "1 or 1=1;--" and is literally the very first web exploitation technique that is taught in highschool-level CTFs.

    sqlx can mitigate a lot of problems at compile time, but sanitization is completely ignored in the post, and should at least be mentioned. If you recommend to juniors that they don't need a query builder, tell them at least why they existed in the first place.

    [1] https://github.com/sqlmapproject/sqlmap