from Hacker News

MongoDB Releases Queryable Encryption Preview

by andrewbarba on 6/7/22, 1:09 PM with 67 comments

  • by SkyPuncher on 6/7/22, 2:50 PM

    This is a really neat technology, but I don't understand its use case. I've worked in HealthTech and currently work in the compliance space. I'm skeptical of Mongo's claims (and their familiarity with compliance laws). It kind of feels like a solution in search of a problem.

    "In use" implies that you have a need to process that data. It doesn't matter whether the end client is submitting queries in plain text (protected in transit) or with this fancy encryption; the client (or server) still needs to be authorized to query that data. Translating queries from plain text to an encrypted form does not add protections from a compliance perspective.

  • by dandraper on 6/7/22, 3:18 PM

    This feature is a result of MongoDB's acquisition of Aroki. It looks like a good product but we actually beat them to it with https://cipherstash.com/activestash

    CipherStash works with any database and also supports range queries and sorting/ordering. We do it in the application layer. It only supports Ruby so far, but C#, Java, Python, and Rust are in the works.

  • by throwaway2016a on 6/7/22, 5:27 PM

    Help me understand this...

    It says it will support prefix search, substring search, and the like. Can anyone point me in the right direction on what the algorithm may be here? I don't get how you could do those things without making the encryption less secure and/or decrypting every record on the fly.

    Another interesting use case I found that isn't mentioned here is sort. I've had customers ask me to be able to sort the results by PII and we tell them... no, we can't do that because the field is encrypted.

  • by bincyber on 6/7/22, 10:40 PM

    This is really neat. Recently I explored similar functionality for relational databases and only got as far as implementing column-level encryption [0] in this Go library [1], but without support for querying the encrypted data. HashiCorp Vault's transit secrets engine supports Convergent Encryption [2] which provides limited ability to query the encrypted data, but I haven't yet experimented with it. If anyone is doing something like this in production, would love to hear about your experience.

    [0]: https://en.wikipedia.org/wiki/Column_Level_Encryption

    [1]: https://github.com/bincyber/go-sqlcrypter

    [2]: https://www.vaultproject.io/docs/secrets/transit#convergent-...
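
    To make "limited ability to query" concrete, here is a minimal sketch of the general idea behind deterministic schemes: a keyed hash (HMAC) stored next to the ciphertext acts as an equality-only index. This is not Vault's convergent encryption and not MongoDB's scheme; `INDEX_KEY`, `equality_token`, `rows`, and `find_by_email` are hypothetical names made up for illustration.

```python
import hashlib
import hmac

# Hypothetical key for illustration only; a real deployment would fetch it from a KMS.
INDEX_KEY = b"demo-index-key-not-for-production"


def equality_token(value: str, key: bytes = INDEX_KEY) -> str:
    """Deterministic keyed hash: equal inputs always give equal tokens."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Toy "table": the server stores only the token (the real ciphertext is omitted here).
rows = [
    {"id": 1, "email_token": equality_token("alice@example.com")},
    {"id": 2, "email_token": equality_token("bob@example.com")},
    {"id": 3, "email_token": equality_token("alice@example.com")},
]


def find_by_email(email: str) -> list[int]:
    """Client computes the token locally; the server only compares opaque strings."""
    token = equality_token(email)
    return [r["id"] for r in rows if hmac.compare_digest(r["email_token"], token)]


matches = find_by_email("alice@example.com")
```

    The trade-off is visible in the data itself: rows 1 and 3 carry the same token, so anyone who can read the table learns which records share a value even without the key.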

  • by eknkc on 6/7/22, 2:43 PM

    I didn't know this was a thing. The article mentions it can do equality, range, prefix, suffix and substring queries. Does this mean that the encryption scheme creates sortable 1:1 mapped results after encryption? Kind of like a shift cipher?

  • by GTP on 6/7/22, 4:40 PM

    The question is: is the full query encrypted too, or just the values that are considered sensitive? I remember a piece of research from some years ago showing that an attacker who can still see the SQL code can recover the content of the database by looking at the queries and the responses and "putting the pieces together". Now, if the goal was to get the exact values inside the database (think of employees' wages), it still required observing a very large number of queries, but if you were interested in getting a reasonable interval for each value, then the number of queries needed became small enough to be doable in practice.

    Unfortunately I don't seem to be able to find this again, but a quick search turned up two papers that say that just encrypting your DB isn't enough: [0], [1]. In particular, [1] doesn't seem to go into the details of how you could recover the data, but mentions that many operations as performed by "normal" databases leak information if performed over encrypted data. Maybe someone who is more familiar with Queryable Encryption can comment on this?

    [0] https://www.cs.cornell.edu/~shmat/shmat_hotos17.pdf

    [1] https://www.microsoft.com/en-us/research/wp-content/uploads/...

  • by winrid on 6/8/22, 6:44 AM

    Neat. Did they fix their blog's pagination yet? If you hit next enough times you may or may not be able to take down the site, don't ask me how I know.

    (their pagination is implemented just by increasing the limit parameter).

  • by api on 6/7/22, 3:00 PM

    Is this actually possible? Couldn't you make many repeated queries and slowly decrypt the text by e.g. slowly narrowing the range?
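
    The narrowing idea can be sketched as a plain binary search. This assumes an attacker who can issue arbitrary range queries and see whether a given record matches; whether the product's threat model allows that is not stated in the thread. `make_range_oracle` and `recover_value` are hypothetical helpers that simulate the attack, not any real API.

```python
def make_range_oracle(secret: int):
    """Simulate a store that answers 'is the hidden value < bound?' and counts queries."""
    calls = {"n": 0}

    def is_less_than(bound: int) -> bool:
        calls["n"] += 1
        return secret < bound

    return is_less_than, calls


def recover_value(is_less_than, lo: int, hi: int) -> int:
    """Binary search for the hidden value in [lo, hi] using only range answers."""
    while lo < hi:
        mid = (lo + hi + 1) // 2  # upper midpoint, so the interval always shrinks
        if is_less_than(mid):
            hi = mid - 1  # value is strictly below mid
        else:
            lo = mid  # value is mid or above
    return lo


# Hypothetical salary hidden somewhere in the range 0..1,000,000.
oracle, calls = make_range_oracle(73_500)
recovered = recover_value(oracle, 0, 1_000_000)
```

    Each query halves the candidate interval, so a range of a million values needs at most 20 queries. Stopping early yields the "reasonable interval" kind of leak with even fewer.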

  • by rafaelturk on 6/7/22, 2:32 PM

    This looks really cool, although it feels like it is actually a feature implemented in the driver (client side), so my initial impression is that it is not a meaningful innovation on the server side. This can be implemented with any database, even with current MongoDB versions.

  • by bawolff on 6/8/22, 3:32 AM

    I call bullshit.

    So let me get this right - it's encrypted but you can search prefix and suffix?

    So all the attacker has to do is do it one letter at a time, see if it starts with A, B, C, once they figure that out, go to the next letter and so on. (I presume that the DB is not supposed to be trusted since they make such a big fuss about only being decryptable on the client side)
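
    A minimal sketch of that letter-by-letter attack, assuming the attacker can run arbitrary prefix queries and see whether the target record matches, and that the hidden value only uses characters from a known alphabet. `make_prefix_oracle` and `recover_string` are hypothetical helpers that simulate the idea, not any real API.

```python
import string


def make_prefix_oracle(secret: str):
    """Simulate a store that answers 'does the hidden value start with this prefix?'."""
    calls = {"n": 0}

    def starts_with(prefix: str) -> bool:
        calls["n"] += 1
        return secret.startswith(prefix)

    return starts_with, calls


def recover_string(starts_with, alphabet: str = string.ascii_uppercase, max_len: int = 64) -> str:
    """Extend the known prefix one character at a time until nothing matches."""
    known = ""
    while len(known) < max_len:
        for ch in alphabet:
            if starts_with(known + ch):
                known += ch
                break
        else:
            # No character extends the prefix: the whole value has been recovered.
            break
    return known


oracle, calls = make_prefix_oracle("MONGODB")
recovered = recover_string(oracle)
```

    The cost is linear rather than exponential: at most one pass over the alphabet per character, plus one final pass that finds no match.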

    Also there doesn't seem to be a whitepaper detailing algorithms or their threat model. Bitcoin scams try harder than this.

  • by Redsquare on 6/7/22, 4:28 PM

    If it is going to the likes of AWS KMS every time, it will blow budgets

  • by claudiug on 6/7/22, 2:42 PM

    Can this be done in Postgres, via the client or via the server? I find it really nice.

  • by uberdru on 6/7/22, 2:44 PM

    seriously did not think we would see homomorphic encryption productized for a few more years. pretty impressive!