from Hacker News

Production Twitter on one machine? 100Gbps NICs and NVMe are fast

by trishume on 1/7/23, 6:46 PM with 477 comments

by BeefWellington on 1/7/23, 9:07 PM
I'm going to preface this criticism by saying that I think exercises like this are fun in an architectural/prototyping code-golf kinda way.
However, I think the author critically under-guesses the sizes of things (even just for storage) by a reasonably substantial amount. e.g.: Quote tweets do not go against the size limit of the tweet field at Twitter. Likely they are embedding a tweet reference in some manner or other in place of the text of the quoted tweet itself but regardless a tweet takes up more than 280 unicode characters.
Also, nowhere in the article are hashtags mentioned. For a system like this to work you need some indexing of hashtags so you aren't doing a full scan of the entire tweet text of every tweet anytime someone decides to search for #YOLO. The system as proposed is missing a highly critical feature of the platform it purports to emulate. I have no insider knowledge but I suspect that index is maybe the second largest thing on disk on the entire platform, apart from the tweets themselves.
by aetimmes on 1/8/23, 3:34 PM
(Disclaimer: ex-Twitter SRE)
> There’s a bunch of other basic features of Twitter like user timelines, DMs, likes and replies to a tweet, which I’m not investigating because I’m guessing they won’t be the bottlenecks.
Each of these can, in fact, become their own bottlenecks. Likes in particular are tricky because they change the nature of the tweet struct (at least in the manner OP has implemented it) from WORM to write-many, read-many, and once you do that, locking (even with futexes or fast atomics) becomes the constraining performance factor. Even with atomic increment instructions and a multi-threaded process model, many concurrent requests for the same piece of mutable data will begin to resemble serial accesses - and while your threads are waiting for their turn to increment the like counter by 1, traffic is piling up behind them in your network queues, which causes your throughput to plummet and your latency to skyrocket.
OP also overly focuses on throughput in his benchmarks, IMO. I'd be interested to see the p50/p99 latency of the requests graphed against throughput - as you approach the throughput limit of an RPC system, average and tail latency begin to increase sharply. Clients are going to have timeout thresholds, and if you can't serve the vast majority of traffic in under that threshold consistently (while accounting for the traffic patterns of viral tweets I mentioned above) then you're going to create your own thundering herd - except you won't have other machines to offload the traffic to.
by jameshart on 1/8/23, 4:26 AM
Getting everything onto one machine works great until... it no longer fits on one machine.
You add another feature and it requires a little bit more RAM, and another feature that needs a little bit more, and.. eventually it doesn't all fit.
Now you have to go distributed.
And your entire system architecture and all your development approaches are built around the assumptions of locality and cache line optimization and all of a sudden none of that matters any more.
Or you accept that there's a hard ceiling on what your system will ever be able to do.
This is like building a video game that pushes a specific generation of console hardware to its limit - fantastic! You got it to do realtime shadows and 100 simultaneous NPCs on screen! But when the level designer asks if they can have water in one level you have to say 'no', there's no room to add screenspace reflections, the console can't handle that as well. And that's just a compromise you have to make, and ship the game with the best set of features you can cram into that specific hardware.
You certainly could build server applications that way. But it feels like there's something fundamental to how service businesses operate that pushes away from that kind of hyperoptimized model.
by TacticalCoder on 1/8/23, 12:19 AM
TFA, to me, touches about something I've wondered about a very long time ago: what are the implications of CPU and storage growing at much faster rates than human population?
Back in the 486 days you wouldn't be keeping, in RAM, data about every single human on earth (let's take "every single human on earth" as the maximum number of humans we'll offer our services to with on our hypothetical server). Nowadays keeping in RAM, say, the GPS coordinates of every single human on earth (if we had a mean to fetch the data) is doable. On my desktop. In RAM.
I still don't know what the implications are.
But I can keep the coordinates of every single humans on earth in my desktop's RAM.
Let that sink in.
P.S: no need to nitpick if it's actually doable on my desktop today. That's not the point. If it's not doable today, it'll be doable tomorrow.
by sethev on 1/7/23, 8:54 PM
John Carmack tweeted something that made me noodle on this too:
>It is amusing to consider how much of the world you could serve something like Twitter to from a single beefy server if it really was just shuffling tweet sized buffers to network offload cards. Smart clients instead of web pages could make a very large difference. [1]
Very interesting to see the idea worked out in more detail.
[1] https://twitter.com/id_aa_carmack/status/1350672098029694998
by drewg123 on 1/7/23, 9:08 PM
How much bandwidth does Twitter use for images and videos? Less than 1.4Tb/s globally? If so, we could probably fit that onto a second machine. We can currently serve over 700Gb/s from a dual-socket Milan based server[1]. I'm still waiting for hardware, but assuming there are no new bottlenecks, that should directly scale up to 1.4Tb/s with Genoa and ConnectX-7, given the IO pathways are all at least twice the bandwidth of the previous generation.
There are storage size issues (like how big is their long tail; quite large I'd imagine), but its a fun thing to think about.
[1] https://people.freebsd.org/~gallatin/talks/euro2022.pdf
by habibur on 1/7/23, 9:20 PM
He will be up for surprise.
HTTP with connection: keep-open can serve 100k req/sec. But that's for one client being served repeatedly over 1 connection. And this is the inflated number that's published in webserver benchmark tests.
For more practical down to earth test, you need to measure performance w/o keep-alive. Request per second will drop to 12k / sec then.
And that's for HTTP without encryption or ssl handshake. Use HTTPS and watch it fall down to only 400 req / sec under load test [ without connection: keep-alive ].
That's what I observer.
by summerlight on 1/8/23, 7:31 AM
I think many people in this thread are making the mistake of ignoring evolutionary factors in system engineering. If a system doesn't need to adopt or change, lots of things can be much more efficient, easier and simpler, likely the order of 10x~100x. But you gotta appreciate that we're all paid because we need to swap wheels on running trains (or even engines in flying airplanes). A large fraction of demand for redundancy, introspection, abstraction and generalization comes from this.
Why do we want to apply ML at the cost of a significant fleet cost increase? Because it can make the overall system consistently perform against external changes via generalization, thus the system can evolve more cheaply. Why do we want to implement a complex logging layer although it doesn't bring direct gains on system performance? Because you need to inspect the system to understand its behavior and find out where it needs to change. The list can go on and I can give you hundreds of reasons why we need all these apparently unnecessary complexities and overheads can be important for systems' longevity.
I don't deny the existence of accidental complexities (probably Twitter can become 2~3x simpler and cheaper given sufficient eng resource and time), but in many cases you probably won't be able to confidently say if some overheads are accidental or essential since system engineering is essentially a highly predictive/speculative activity. To make this happen, you gotta have a precise understanding of how the system "currently works" to make a good bet rather than re-imagination of the system with your own wish list of how the system "should work". There's a certain value on the latter option, but it's usually more constructive to build an alternative rather than complaining about the existing system. This post is great since the author actually tried to build something to prove its possibility, this knowledge could turn out to be valuable for other Twitter alternatives later on.
by jasonhansel on 1/7/23, 8:51 PM
If you really wanted to run Twitter on one machine at any cost, wouldn't an IBM mainframe be much more practical?
You can even run Linux on them now. The specs he cites would actually be fairly small for a mainframe, which can reach up to 40TB of memory.
I'm not saying this is a good idea, but it seems better than what the OP proposes.
by varunkmohan on 1/7/23, 8:43 PM
Good analysis. Obviously, this doesn't handle cases like redundancy and doesn't handle some of other critical workloads the company has. However, it does show how much real compute bloat these companies actually have - https://twitter.com/petrillic/status/1593686223717269504 where they use 24 million vcpus and spend 300 million a month on cloud.
by agilob on 1/7/23, 8:37 PM
The title reminded me about this https://www.phoronix.com/news/Netflix-NUMA-FreeBSD-Optimized (2019) and this 2 years later: https://papers.freebsd.org/2021/eurobsdcon/gallatin-netflix-... (2021)
by PragmaticPulp on 1/7/23, 9:27 PM
Very cool exercise. I enjoyed reading it.
I see a lot of comments here assuming that this proves something about Twitter being inefficient. Before you jump to conclusions, take a look at the author’s code: https://github.com/trishume/twitterperf
Notably absent are things like serving HTTP, not to even mention HTTPS. This was a fun exercise in algorithms, I/O, and benchmarking. It wasn’t actually imitating anything that resembles actual Twitter or even a usable website.
by mgaunard on 1/7/23, 11:23 PM
All web and cloud technologies are inherently inefficient, and most programmers don't know networking or even how hardware works sufficiently well to optimize for high througput and low-latency.
There was an article just yesterday about how Jane Street had developed an internal exchange way faster than any actual exchange by building it from the ground up, thinking about how the hardware works and how agents can interact with it.
Modern software like Slack or Twitter are just reinventing what IRC or BBS did in the past, and those were much leaner, more reliable and snappier than their modern counterparts, even if they didn't run at the same scale.
It wouldn't be surprising at all that you could build something equivalent to Twitter on just one beefy machine, maybe two for redundancy.
by samsquire on 1/7/23, 9:44 PM
I recommend this table of latency figures for any software engineer:
https://gist.github.com/jboner/2841832
Essentially IO is expensive except within a datacenter but even in a data center, you can do a lot of loop iterations in a hot loop in the time it takes to ask a server for something.
There is a whitepaper which talks about the raw throughput and performance of single core systems outperforming scalable systems. These should be required reading of those developing distributed systems.
http://www.frankmcsherry.org/assets/COST.pdf A summary: http://dsrg.pdos.csail.mit.edu/2016/06/26/scalability-cost/
by SilverBirch on 1/7/23, 9:48 PM
I think one of the under-estimated interesting points of twitter as a business is that this is the core. Yes, Twitter is 140 characters, it's got "300m users" which is probably 5m real heavy users. So yes, you could do a lot of "140 characters, a few tweets per person, few million users" on very little hardware. But that's why Twitters a shit business!
How much RAM did your advertising network need? Becuase that is what makes twitter a business! How are you building your advertiser profiles? Where are you accounting for fast roll out of a Snapchat/Instagram/BeReal/Tiktok equivalent? Oh look, your 140 characters just turned into a few hundreds megs of video that you're going to transcode 16 different ways for Qos. Ruh Roh!
How are your 1,000 engineers going to push their code to production on one machine?
Almost always the answer to "do more work" or "buy more machines" is "buy more machines".
All I'm saying is I'd change it to "Toy twitter on one machine" not Production.
by jiggawatts on 1/7/23, 10:47 PM
Something I've found a lot modern IT architects seem to ignore is "write amplification" or the equivalent effect for reads.
If you have a 1 KB piece of data that you need to send to a customer, ideally that should require less than 1 KB of actual NIC traffic thanks to HTTP compression.
If processing that 1 KB takes more than 1 KB of total NIC traffic within and out of your data centre, the you have some level of amplification.
Now, for writes, this is often unavoidable because redundancy is pretty much mandatory for availability. Whenever there's a transaction, an amplification factor of 2-3x is assumed for replication, mirroring, or whatever.
For reads, good indexing and data structures within a few large boxes (like in the article) can reduce the amplification to just 2-3x as well. The request will likely need to go through a load balancer of some sort, which amplifies it, but that's it.
So if you need to process, say, 10 Gbps of egress traffic, you need a total of something like 30 Gbps at least, but 50 Gbps for availability and handling of peaks.
What happens in places like Twitter is that they go crazy with the microservices. Every service, every load balancer, every firewall, proxy, envoy, NAT, firewall, and gateway adds to the multiplication factor. Typical Kubernetes or similar setups will have a minimum NIC data amplification of 10x on top of the 2-3x required for replication.
Now multiply that by the crazy inefficient JSON-based protocols, the GraphQL, an the other insanity layered on to "modern" development practices.
This is how you end up serving 10 Gbps of egress traffic with terabits of internal communications. This is how Twitter apparently "needs" 24 million vCPUs to host text chat.
Oh, sorry... text chat with the occasional postage-stamp-sized, potato quality static JPG image.
by bitbckt on 1/8/23, 2:39 AM
Regarding tweet distribution, I was one of the folks who built the first scalable solution to this problem at Twitter (called Haplocheirus). We used the Yahoo “Feeding Frenzy” design, pushing tweets through a Redis-backed caching layer.
Feel free to continue using that (historically-correct) answer in interviews. :P
by justapassenger on 1/7/23, 10:28 PM
Saying this is production Twitter is like saying that rsync is a Dropbox.
by Cyph0n on 1/7/23, 7:30 PM
More like “barebones, in-memory, English-only Twitter clone on one machine”.
Edit: Still a nice writeup!
by Tepix on 1/7/23, 9:09 PM
The new EPYC servers can be filled with 6TB of RAM and 96 cores per socket. Fun times.
by kissgyorgy on 1/8/23, 12:18 AM
I feel like people writing posts like this never worked in a big team at a big company on a big project. It is so obviously impossible to do this and Twitter has so many more features users will never even see, but sure, re-implement it in a couple hundred lines of Rust and Twitter will be saved...
by ricardobeat on 1/7/23, 10:31 PM
> super high performance tiering RAM+NVMe buffer managers which can access the RAM-cached pages almost as fast as a normal memory access are mostly only detailed and benchmarked in academic papers
Isn't this exactly what modern key value stores like RocksDB, LMDB etc are built for?
by morphle on 1/7/23, 8:54 PM
Why not a single FPGA with 100Gbps ethernet or pcie with NVM attached? Around $5K for the hardware and $5K for the traffic per month. The software would be a bit trickier to write, but you now get 100x performance for the same price
by keewee7 on 1/7/23, 9:21 PM
In the coming years we will probably see a lot of complicated microservice architectures be replaced by well-designed and optimized Rust (and modern C++) monoliths that use simple replication to scale horizontally.
by gravypod on 1/8/23, 12:34 AM
As a side note:
> I did all my calculations for this project using Calca (which is great although buggy, laggy and unmaintained. I might switch to Soulver) and I’ll be including all calculations as snippets from my calculation notebook.
I've always wanted an {open source, stable, unit-aware} version of something like this which could be run locally or in the browser (with persistence on a server). I have yet to find one. This would be a massive help to anyone who does systems design.
by eatonphil on 1/7/23, 8:12 PM
This is a great exercise in napkin math, even with constraints you've set for yourself that don't fully approximate Twitter (yet). Thanks!
by pengaru on 1/7/23, 10:29 PM
This post reminds me of an experience I had in ~2005 while @ Hostway Chicago.
Unsolicited story time:
Prior to my joining the company Hostway had transitioned from handling all email in a dispersed fashion across shared hosting Linux boxes with sendmail et al, to a centralized "cluster" having disparate horizontally-scaled slices of edge-SMTP servers, delivery servers, POP3 servers, IMAP servers, and spam scanners. That seemed to be their scaling plan anyways.
In the middle of this cluster sat a refrigerator sized EMC fileserver for storing the Maildirs. I forget the exact model, but it was quite expensive and exotic for the time, especially for an otherwise run of the mill commodity-PC based hosting company. It was a big shiny expensive black box, and everyone involved seemed to assume it would Just Work and they could keep adding more edge-SMTP/POP/IMAP or delivery servers if those respective services became resource constrained.
At some point a pile of additional customers were migrated into this cluster, through an acquisition if memory serves, and things started getting slow/unstable. So they go add more machines to the cluster, and the situation just gets worse.
Eventually it got to where every Monday was known as Monday Morning Mail Madness, because all weekend nobody would read their mail. Then come Monday, there's this big accumulation of new unread messages that now needs to be downloaded and either archived or deleted.
The more servers they added the more NFS clients they added, and this just increased the ops/sec experienced at the EMC. Instead of improving things they were basically DDoSing their overpriced NFS server by trying to shove more iops down its throat at once.
Furthermore, by executing delivery and POP3+IMAP services on separate machines, they were preventing any sharing of buffer caches across these embarrassingly cache-friendly when colocated services. When the delivery servers wrote emails through to the EMC, the emails were also hanging around locally in RAM, and these machines had several gigabytes of RAM - only to never be read from. Then when customers would check their mail, the POP3/IMAP servers always needed to hit the EMC to access new messages, data that was probably sitting uselessly in a delivery server's RAM somewhere.
None of this was under my team's purview at the time, but when the castle is burning down every Monday, it becomes an all hands on deck situation.
When I ran the rough numbers of what was actually being performed in terms of the amount of real data being delivered and retrieved, it was a trivial amount for a moderately beefy PC to handle at the time.
So it seemed like the obvious thing to do was simply colocate the primary services accessing the EMC so they could actually profit from the buffer cache, and shut off most of the cluster. At the time this was POP3 and delivery (smtpd), luckily IMAP hadn't taken off yet.
The main barrier to doing this all with one machine was the amount of RAM required, because all the services were built upon classical UNIX style multi-process implementations (courier-pop and courier-smtp IIRC). So in essence the main reason most of this cluster existed was just to have enough RAM for running multiprocess POP and SMTP sessions.
What followed was a kamikaze-style developed-in-production conversion of courier-pop and courier-smtp to use pthreads instead of processes by yours truly. After a week or so of sleepless nights we had all the cluster's POP3 and delivery running on a single box with a hot spare. Within a month or so IIRC we had powered down most of the cluster, leaving just spam scanning and edge-SMTP stuff for horizontal scaling, since those didn't touch the EMC. Eventually even the EMC was powered down, in favor of drbd+nfs on more commodity linux boxes w/coraid.
According to my old notes it was a Dell 2850 w/8GB RAM we ended up with for the POP3+delivery server and identical hot spare, replacing racks of comparable machines just having less RAM. >300,000 email accounts.
by spullara on 1/8/23, 5:01 AM
When I was working there I implemented my patent during a hack week (given a set of follows return the list of matching tweet ids, very similar to his prototype):
https://patents.google.com/patent/US20120136905A1/en (licensed under Innovators Patent Agreement, https://github.com/twitter/innovators-patent-agreement)
I could have definitely served all the chronological timeline requests on a normal server with lower latency that the 1.1 home timeline API. There are a bunch of numbers in the calculations that he is doing that are off but not by an order of magnitude. The big issue is that since I left back then Twitter has added ML ads, ML timeline and other features that make current Twitter much harder to fit on a machine than 2013 Twitter.
by KaiserPro on 1/8/23, 10:31 AM
> A friend points out that IBM Z mainframes have a bunch of the resiliency software and hardware infrastructure I mention,
Sure its expensive, and you have to deal with IBM, who are either domain experts or mouth breathers. Sure it'll cost you $2m but!
the opex of running a team of 20 engineers is pretty huge. Especially as most of the hard bits of redundant multi-machine scaling are solved for you by the mainframe. Redundancy comes for free(well not free, because you are paying for it in hardware/software)
Plus, IBM redbooks are the golden standard of documentation. Just look at this: https://www.redbooks.ibm.com/redbooks/pdfs/sg248254.pdf its the redbook for GPFS (scalable multi-machine filesystem, think ZFS but with a bunch more hooks.)
Once you've read that, you'll know enough to look after a cluster of storage.
by viraptor on 1/7/23, 11:31 PM
This is in no way a criticism of the analysis. But what I think is a hidden cost of an idea like this (that hasn't been pointed out) is the ability to extend the features. With a tightly integrated system like that you may want to add a frobnicator as a test - now that whole system would need to change to accommodate that, because all the timeline processing happens more or less in memory. Making things external / network based adds overhead, but makes plugging in / removing an extra feature much easier. If you count the cost of work required for making changes, then burning money on the "unnecessary" horizontal scaling may not be a bad idea. Wanna add new ads analytics? Just plug into this common firehouse/summary endpoint without worrying about the internals. Wanna test a new implementation of some component? Run both in parallel, plugging into same inputs. Etc.
by firstSpeaker on 1/8/23, 1:07 PM
This is one of the most interesting part of the whole post for me:
Through intense digging I found a researcher who left a notebook public including tweet counts from many years of Twitter’s 10% sampled “Decahose” API and discovered the surprising fact that tweet rate today is around the same as or lower than 2013! Tweet rate peaked in 2014 and then declined before reaching new peaks in the pandemic. Elon recently tweeted the same 500M/day number which matches the Decahose notebook and 2013 blog post, so this seems to be true! Twitter’s active users grew the whole time so I think this reflects a shift from a “posting about your life to your friends” platform to an algorithmic content-consumption platform.
So, the number of writes has been the same for a good long while.
by swellguy on 1/8/23, 12:37 AM
You have violated the number one rule in Silicon Valley: If it doesn't take at least "N" "engineers" to "solve" a problem who report directly to moi, then how am I relevant? So I agree this is entirely possible, but no one would build this with any funding.
by henning on 1/8/23, 5:27 AM
No rate limiting, API data, quote tweets, view count, threads, likes, mentions, notifications, ads, video, images, account blocking (permanent or TTL), account muting (permanent or TTL), word filtering (permanent or TTL), moderation/reporting, user profile storage, or the fact that tweets that display show more than just the tweet itself. No mention that tweet activity all occurs concurrently and therefore the loading script is not at all a realistic estimate of real activity.
But sure, go ahead and take this as evidence that 10 people could build Twitter as I'm sure that's what will happen to this post. If that's true, why haven't they already done so? It should only take a couple weeks and one beefy machine, right?
by siliconc0w on 1/7/23, 11:52 PM
Enjoyed the write up, would be curious to see the twitter spend broken down by functionality given all the extra stuff they do. I imagine it's a non-linear relationship where the company has to burn more and more cash with every new feature (and esp things like Advertising which you need once your spend surpasses what a simple subscription can offer), more scale adds more complexity, bureaucracy and overhead (management, hr & recruiting, legal&accounting, etc). While it's likely there is waste (some of which is inevitable, see 'overhead' above) a super bareboens twitter can maybe run within one beefy machine but a 'real' twitter ends up needing millions + lots of people.
by systemvoltage on 1/8/23, 6:51 AM
It's worth noting Stackoverflow's production arch: https://stackexchange.com/performance
by mcqueenjordan on 1/8/23, 3:42 PM
Fun thought experiment! I can't help but be reminded of the Good Will Hunting quote, though:
SEAN: So if I asked you about art you’d probably give me the skinny on every art book ever written. Michelangelo? You know a lot about him. Life’s work, political aspirations, him and the pope, sexual orientation, the whole works, right? But I bet you can’t tell me what it smells like in the Sistine Chapel. You’ve never actually stood there and looked up at that beautiful ceiling. Seen that.
by knubie on 1/7/23, 11:08 PM
> I’m not sure how real Twitter works but I think based on Elon’s whiteboard photo and some tweets I’ve seen by Twitter (ex-)employees it seems to be mostly the first approach using fast custom caches/databases and maybe parallelization to make the merge retrievals fast enough.
I think Twitter does (or at some point did) use a combination of the first and second approach. The vast majority of tweets used the first approach, but tweets from accounts with a certain threshold of followers used the second approach.
by fleddr on 1/7/23, 10:40 PM
"Through intense digging I found a researcher who left a notebook public including tweet counts from many years of Twitter’s 10% sampled “Decahose” API and discovered the surprising fact that tweet rate today is around the same as or lower than 2013! Tweet rate peaked in 2014 and then declined before reaching new peaks in the pandemic. Elon recently tweeted the same 500M/day number which matches the Decahose notebook and 2013 blog post, so this seems to be true! Twitter’s active users grew the whole time so I think this reflects a shift from a “posting about your life to your friends” platform to an algorithmic content-consumption platform."
I know it's not the core premise of the article, but this is very interesting.
I believe that 90% of tweets per day are retweets, which supports the author's conclusion that Twitter is largely about reading and amplifying others.
That would leave 50 million "original" tweets per day, which you should probably separate as main tweets and reply tweets. Then there's bots and hardcore tweeters tweeting many times per day, and you'll end up with a very sobering number of actual unique tweeters writing original tweets.
I'd say that number would be somewhere in the single digit millions of people. Most of these tweets get zero engagement. It's easy to verify this yourself. Just open up a bunch of rando profiles in a thread and you'll notice a pattern. A symmetrical amount of followers and following typically in the range of 20-200. Individual tweets get no likes, no retweets, no replies, nothing. Literally tweeting into the void.
If you'd take away the zero engagement tweets, you'll arrive at what Twitter really is. A cultural network. Not a social network. Not a network of participation. A network of cultural influencers consisting of journalists, politicians, celebrities, companies and a few witty ones that got lucky. That's all it is: some tens of thousands of people tweeting and the rest leeching and responding to it.
You could argue that is true for every social network, but I just think it's nowhere this extreme. Twitter is also the only "social" network that failed to (exponentially) grow in a period that you might as well consider the golden age of social networks. A spectacular failure.
Musk bought garbage for top dollar. The interesting dynamic is that many Twitter top dogs have an inflated status that cannot be replicated elsewhere. They're kind of stuck. They achieved their status with hot take dunks on others, but that tactic doesn't really work on any other social network.
by thriftwy on 1/7/23, 10:20 PM
I remember Stack Overflow running on a single Windows Server box and mocking fellow LAMP developers with their propensity towards having dozens of VMs to same effect.
That was some time ago, though.
by VLM on 1/8/23, 12:52 AM
Interesting optimization idea: If 95% of the users are bots, and your ML algorithms are smart enough to figure who's bot and who's bio, you could save a lot of traffic by not publishing tweets to bots as no one is going to read them anyway. Of course if that traffic included advertisements, you'd also lose 95% of your ad revenue.
The ultimate extension of this "run it all on one machine" meme would be to run the bots on the single machine along with the service.
by mizzao on 1/8/23, 4:36 AM
While you might not want to do this with actual Twitter, any sort of high-performance computing workload can run substantially faster on a single optimized machine than on a distributed computing environment.
I learned this the hard way when I was running a medium-sized MapReduce job in grad school that was over 100x faster when run as a local direct computation with some numerical optimizations.
by snotrockets on 1/8/23, 4:19 AM
I ask candidates I interview to design a certain service. Most ask about scale, to which I like to direct the question back at them: it's going to be huge. As big as Twitter. How big would that be, do you think?
Most then suggest scale that would make the service run comfortable from a not-too powerful machine, and then go to design data-center spanning distributed service.
by mattbillenstein on 1/8/23, 1:04 AM
Interesting abstract system design type problem - I think it becomes difficult if you have to shard the data though because all the assumptions about the hot set being in RAM break all the performance guarantees now I think... Which is I think basically what Twitter's existing backend has to deal with.
by z3t4 on 1/7/23, 10:32 PM
A Twitter clone could probably run in a teenagers closet, but not after it has been iterated by 10000 monkeys.
by fortran77 on 1/8/23, 2:39 PM
I've thought about this problem, too, and blocklists seem like a hard problem to implement efficiently. I have a few thousand users blocked, and several hundred keywords, phrases and emoji. How are these processed efficiently?
by jeffbee on 1/7/23, 8:42 PM
I like this kind of exercise. One thing I am not seeing is analytics, logs and so forth that as I understand it are significant portions of Twitter's production cost story.
by betimsl on 1/7/23, 11:45 PM
I'm just curious as to what kind of motherboard this personal computer is going to have? I'm asking this because of the limit on PCIe bandwidth. 100gbit NIC? How?
by sammy2255 on 1/9/23, 2:16 AM
A bit out of touch to think that the bandwidth alliance will let you push 500TB a month through them for free
by surume on 1/8/23, 3:51 PM
Ops: "One of our instances went down" Everyone else: "Gaaaahhh"
by irq on 1/8/23, 7:59 AM
Excellent article! I wish the font size on mobile was bigger.
by truth_seeker on 1/8/23, 4:52 AM
How to subdue the cost of abstraction. Very well explained !
by Halan on 1/8/23, 11:21 AM
Let’s hope Elon doesn’t read this
by castratikron on 1/8/23, 5:21 AM
Does 4chan fit on one machine?
by lightlyused on 1/7/23, 11:59 PM
It is all fun running on one machine until a capacitor leaks or something else goes south.
by wonnage on 1/7/23, 9:10 PM
This doesn’t seem to support fetching a specific tweet by id?
by sitkack on 1/7/23, 11:01 PM
I am both embarrassed and disappointed with the negativity this post has attracted.
A litanny of "gotchas", where someone attempts to best the OP. What about x, y and z? It can't possibly scale. Twitter is so much more than this, etc.
The OP isn't making the assertion that Twitter should replace their current system with a single large machine.
The whole thread paints a picture of HN like it is full of a bunch of half-educated, uncreative negative brats.
To the people that encourage a fun discussion, thank you! Great things are not built by people who only see how something cannot possibly work.
by andrewstuart on 1/7/23, 8:55 PM
Most projects I encounter these days instantly reach for kubernetes, containers and microservices or cloud functions.
I find it much more appealing to just make the whole thing run on one fast machine. When you suggest this tend to people say "but scaling!", without understanding how much capacity there is in vertical.
The thing most appealing about single server configs is the simplicity. The more simple a system easy, likely the more reliable and easy to understand.
The software thing most people are building these days can easily run lock stock and barrel on one machine.
I wrote a prototype for an in-memory message queue in Rust and ran it on the fastest EC2 instance I could and it was able to process nearly 8 million messages a second.
You could be forgiven for believing the only way to write software is is a giant kablooie of containers, microservices, cloud functions and kubernetes, because that's what the cloud vendors want you to do, and it's also because it seems to be the primary approach discussed. Every layer of such stuff add complexity, development, devops, maintenance, support, deployment, testing and (un)reliability. Single server systems can be dramatically mnore simple because you can trim is as close as possible down to just the code and the storage.
by britneybitch on 1/7/23, 8:45 PM
> colo cost + total server cost/(3 year) => $18,471/year
Meanwhile the company I just left was spending more than this for dozens of kubernetes clusters on AWS before signing a single customer. Sometimes I wonder what I'm still doing in this industry.
by jacobsenscott on 1/8/23, 2:34 AM
Nice. While there may be some impracticalities to actually doing this for twitter, 99% of the software out there could run on a fraction of a single commodity server. People complain about the carbon burn of crypto, and they are right, but I bet it is dwarfed by the carbon burn of all the shitty over-provisioned and over-architected CRUD apps running interpreted languages. Unfortunately with universities teaching python of all things we'll have (or maybe already do have) a whole generation of developers that actually have no idea how powerful a modern computer is.
I suppose there's a chance AI will get to the point where we can feed it a ruby/python/js/whatever code base and it can emit the functionally equivalent machine code as a single binary (even a microservices mess).
by throwmeup123 on 1/7/23, 6:52 PM
The title is highly misleading for some theoretical "exploration".
by kierank on 1/7/23, 7:08 PM
This is as realistic as the moon rocket in my back garden.
by twp on 1/8/23, 12:10 AM
This post solves all of the easy problems (i.e. make simple stuff go fast) and none of the hard problems (i.e. build a system that still works when other stuff breaks).
This post is perfect world thinking. We don't live in a perfect world.
by kureikain on 1/7/23, 9:42 PM
Not to the extreme of fitting everything into one machine but I have explorer the idea of separate stateless workload into its own machine.
However, the stateless workload can still operate in a read-only manner if the stateful component failed.
I run an email forwarding service[1], and one of challenge is how can I ensure the email forwarding still work even if my primary database failed.
And I come up with a design that the app boot up, and load entire routing data from my postgres into its memory data structure, and persisted to local storage. So if postgres datbase failed, as long as I have an instance of those app(which I can run as many as I can), the system continue to work for existing customer.
The app use listen/notify to load new data from postgres into its memory.
Not exactly the same concept as the artcile, but the idea is that we try to design the system in a way where it can operate fully on a single machine. Another cool thing is that it easiser to test this, instead of loading data from Postgres, it can load from config files, so essentially the core biz logic is isolated into a single machine.
---
https://mailwip.com