from Hacker News

RavenDB 6.0.2 (A Jepsen Report)

by aphyr on 1/31/24, 3:04 PM with 69 comments

by hudo on 1/31/24, 3:53 PM
I was using Raven around versions 1-3. Even though it was a single node and a simple app, we observed stale reads and lost writes, so we eventually had to migrate to SQL Server. It was really weird reading claims from Oren (an expert in the .NET space, famous for his great work on NHibernate and a few other frameworks) when his db didn't work as advertised at all (back then it was built on the Esent key/value store plus full-text search for map/reduce, I think it was Lucene.NET - obviously very broken tech for this purpose). Too bad, I really hoped things were fixed by now; the db has really good programming APIs.
Interesting trivia: there's "raven db done right" - https://martendb.io/ , just an API wrapper around PSQL. Named Marten because that's a natural enemy of ravens :)
by mjb on 1/31/24, 5:11 PM
As database builders and users, we’ve made talking about systems a lot harder on ourselves by conflating the ideas of replication, active-active, atomic commitment, and concurrency control.
- Replication is a technique used to achieve higher availability and durability than a single node can offer, by making multiple copies of the data. Techniques include Paxos, Raft, chain replication, quorum protocols, etc.
- Active-active means that transactions can run against multiple different replicas at the same time, while still achieving the desired level of isolation and consistency.
- Atomic commitment is a technique used in sharded/partitioned databases (which themselves exist to scale throughput or size beyond the capabilities of a single machine) to allow transactions to be atomically (“all or nothing”) committed across multiple shards (and allow one or more shards to vote “nah, let’s not commit this”). 2 phase commit (2PC) is the classic technique.
- Concurrency control is a set of techniques to implement isolation, which is needed in any database that allows concurrent sessions (single node or multi-node). Classic techniques include 2PL and OCC, but many exist.
When vendors or projects answer concurrency control questions with replication answers (which appears to be the case here), it’s worth diving deeper into those answers. There are cases where “Paxos” or “Raft” might be answers to atomic commitment or even concurrency control questions, but at best they are very partial answers and building blocks of a larger protocol. Databases that only support “single shot”/predeclared transactions can get away without a lot of concurrency control, for example, and might be able to do the required work as part of their state machine replication protocol. In general, I'd see using words like "Paxos" and "Raft" in the marketing for a database as a negative sign. It's not a fully reliable one, but it's often the least interesting part of the implementation and the choices the database is making.
To be extra clear, I’m not criticizing Aphyr here (the article clearly doesn’t conflate these concepts), but more pointing out what I think lies at the bottom of a lot of the issues we see with distributed database claims.
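For readers who want the atomic commitment bullet above made concrete, here is a minimal in-memory sketch of two-phase commit. The `Shard` class, its `healthy` flag, and `two_phase_commit` are hypothetical names invented for illustration, not any database's API. The sketch ignores coordinator failure, durable logging, and timeouts, and it says nothing about isolation between concurrent transactions, which is the separate concurrency control problem.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """Hypothetical participant holding one partition of the data."""
    name: str
    data: dict = field(default_factory=dict)
    staged: dict = field(default_factory=dict)
    healthy: bool = True  # assumption: an unhealthy shard simply votes "no"

    def prepare(self, writes):
        # Phase 1: stage the writes and vote. A real participant would
        # also take locks and durably log its vote before answering.
        if not self.healthy:
            return False
        self.staged = dict(writes)
        return True

    def commit(self):
        # Phase 2 (commit): make the staged writes visible.
        self.data.update(self.staged)
        self.staged = {}

    def abort(self):
        # Phase 2 (abort): discard anything that was staged.
        self.staged = {}


def two_phase_commit(writes_by_shard):
    """Take a list of (Shard, writes) pairs; return True if committed."""
    votes = [shard.prepare(writes) for shard, writes in writes_by_shard]
    decision = all(votes)
    for shard, _ in writes_by_shard:
        if decision:
            shard.commit()
        else:
            shard.abort()
    return decision


a, b = Shard("a"), Shard("b")
committed = two_phase_commit([(a, {"x": 1}), (b, {"y": 2})])

b.healthy = False  # b will now vote "nah, let's not commit this"
aborted = two_phase_commit([(a, {"x": 99}), (b, {"y": 99})])
```

In this sketch the first call commits on both shards, and the second call aborts everywhere because one shard voted no, so `a` keeps its old value for `x`.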
by CJefferson on 1/31/24, 3:30 PM
I love Jepsen, but it seriously worries me how bad software turns out to be, and how many outrageous claims companies make that turn out to be so easily proved false. Should there be more serious penalties when companies make claims which turn out to be false as soon as they are tested? I think there should be.
by Twirrim on 1/31/24, 4:03 PM
When it comes to databases, that's when I get the most conservative in tech choices. Stick with the tried and tested approaches. Data/Metadata integrity is generally the single most important thing for whatever I'm working on.
by gigatexal on 1/31/24, 3:58 PM
I love when Jepsen's reports hit HN. I always learn a ton about databases from them. Kudos to the projects brave enough to put their claims to the literal test. Jepsen is the best in the biz.
by ukd1 on 1/31/24, 4:42 PM
This aged well - https://github.com/ravendb/ravendb/issues/13218#issuecomment...
by PreInternet01 on 1/31/24, 6:24 PM
My fun RavenDB story: I briefly used it for an analytics (music royalty data, not advertising) solution somewhere late in the 1.x release series. Ayende (the initial RavenDB author) was/is an avid blogger, and made a really good case for their product in the .NET ecosystem.
It did not... go exactly as planned. Initial tests looked OK, but when I did testing with actual users, there were huge issues right away. Like: OK, I just ran your ingestion pipeline. What do you see? And the answer was 'well, nothing', or 'ehhm, a lot less than I expected'. These issues turned out to be pretty much impossible to fix: there were no real errors, but the data just seemed to... disappear randomly, even in a simple single-node cluster. I got community support involved in a bunch of particular issues, but nothing really helped: the aggregate numbers we got never added up to what they should be.
I then migrated the whole thing to a single SQLite database. That file is, as I write this, a good 2TB in size, and still performs as well as the day it was deployed and never had any unexplained-number issues, without any changes to the surrounding code. I did eventually move away from the .NET Entity Framework (as that did cause some rare, yet unexpected and hard-to-fix concurrency issues, but those were hard crashes and not silent data corruptions) to a hand-rolled entity mapper, but all has been good since then...
TL;DR: databases are very hard, and fashionable choices are not necessarily desirable.
by philipbjorge on 1/31/24, 7:11 PM
We used RavenDB 2-4 at Leafly.
Won't go into battle scars here, but this report does not surprise me. We're much happier with Postgres and Elasticsearch.
by dramm on 1/31/24, 8:39 PM
I need a brain colonic after reading through just some of the mess of overhyped claims in RavenDB marketing and documentation. I appreciate Aphyr doing all this wonderful work and how some of the Raven claims triggered that work. I'd have hoped that anybody building a critical system would have read the mess of Raven documentation/claims/hype and run the other way.
by RcouF1uZ4gsC on 1/31/24, 3:52 PM
> AP systems are known for availability, not safety;
I think in 99.9% of cases, you don't want AP. The P only matters when the network is more prone to go down than the machines. For example, if every node goes down, your AP design won't be available.
With the massive improvements in network and connectivity and increased redundancy, you should aim for CP.
If you really, really need AP, then a ground-up design based on CRDTs seems the best, most disciplined approach. With CRDTs, you can have availability because the operations can be entirely local, and you know you can sync to the other nodes when available without conflict.
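As a rough illustration of why CRDT operations can stay local and still sync without conflict, here is a minimal sketch of a grow-only counter (G-Counter), one of the simplest CRDTs. The `GCounter` class and its method names are assumptions made for this example, not taken from any particular library.

```python
class GCounter:
    """Grow-only counter: one slot per node, merged with element-wise max."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}  # node id -> increments observed from that node

    def increment(self, amount=1):
        # Purely local operation: a node only ever bumps its own slot.
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Element-wise max is commutative, associative, and idempotent,
        # so replicas converge no matter how often or in what order they sync.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


# Two replicas accept writes independently, e.g. during a partition.
left, right = GCounter("left"), GCounter("right")
left.increment(3)
right.increment(2)

# Once connectivity returns, they exchange state in either order.
left.merge(right)
right.merge(left)
```

After the two merges both replicas report the same total. The trade-off is that only data types with a well-defined merge fit this model, which is why it has to be a ground-up design decision.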
by jjirsa on 1/31/24, 4:21 PM
It’s disappointing to me that the technologist's desire to experiment with new DBs continually puts naive customers at correctness and durability risk they don’t (won’t) understand.
by endisneigh on 1/31/24, 4:13 PM
I’m still waiting on their report on FoundationDB, which Kyle claims would readily pass their tests, so they didn’t bother to do one.
by neonsunset on 1/31/24, 10:39 PM
The unfortunate thing is .NET deserves to have a proper database written in pure C# because the language offers all the tools to achieve a really performant, safe and cross-platform implementation.
But RavenDB does not do it justice and uses unsafe in catastrophic amounts, in places where it is not necessary or in ways which are straight up UB, despite the fact that JIT/ILC is much more strict than GCC/LLVM. There have been multiple bug reports submitted to dotnet/runtime by RavenDB which required extensive debugging effort only to end up being an issue on RavenDB's end due to explicit misuse of unsafe APIs (in ways, I must reiterate, that have safe alternatives to achieve the same performance).
(if anyone's interested, I can later ask around/dig through issue history and give the references)
by jwr on 1/31/24, 6:45 PM
Healthy reminder that a pretty website and warm fuzzies all over do not make a distributed database actually work.
I witnessed RethinkDB losing to MongoDB in spite of being significantly better. I am now worried that FoundationDB isn't gaining popularity, even though it is arguably the best and most well-tested distributed database out there, with strict serializability (!) guarantees. But it doesn't have a shiny website and doesn't cause warm fuzzies, quite the opposite, it looks complex and intimidating. So it isn't popular.
This is worrying, but perhaps neither new nor surprising: we have a history of picking inferior solutions because the good ones looked too complex or intimidating (Betamax vs VHS in video formats, ATM vs Ethernet in WANs).
by JazCE on 1/31/24, 7:14 PM
Kelly Sommers has been vindicated.
by CyanLite2 on 1/31/24, 8:21 PM
I gave up on RavenDB when Oren would post blog entries regarding interviews with candidates, bashing how they would fail his coding exams.
I mean who posts that kinda stuff on their public website?
by oliverpk on 1/31/24, 9:01 PM
Genuinely one of my favourite posts here
by romanovcode on 1/31/24, 3:40 PM
RavenDB is an expensive pay-to-use database. I do not understand why one would choose this over Postgres.