by nwjsmith on 12/5/12, 1:03 AM with 128 comments
by coffeemug on 12/5/12, 2:33 AM
This quote, much like the various quantum mechanics quotes adopted by laymen, keeps haunting honest systems programmers because people with a little knowledge read it, misinterpret (or misunderstand) it, and then share it.
Look, I don't know how Squid is designed, but most database systems use this strategy, and it does not get into wars with the kernel, for a whole slew of reasons that aren't addressed in the article. I know, because we've done a ton of sophisticated benchmarking comparing custom, use-case-specific cache performance to general-purpose page cache performance. Here are a few of the many, many reasons why this quote cannot be applied to sensibly designed pieces of systems software:
1. If the database/proxy/whatever server is designed correctly, it'll never use so much RAM that it goes into swap. That means the kernel won't magically page out its memory, preventing it from doing its job.
2. In fact, kernels provide mechanisms (such as mlock) to guarantee this.
3. Also, if your process misbehaves, modern kernels will deploy the OOM killer (depending on how things are configured), so you can't get into a fight with the page cache without being sniped.
4. Of course you have to be smart and read from the file in a way that bypasses the page cache (via O_DIRECT). Yes, it complicates things greatly for systems programmers (all sorts of alignment issues, journaling filesystem issues, etc.), but if you want high performance, especially on SSDs, and have special use cases to warrant it, it's worth it. (Both this and the mlock trick from point 2 are sketched below.)
5. If you really know what you're doing, a custom cache can be significantly more efficient than the general-purpose kernel cache, which in turn can have a significant impact on the performance bottom line. For example, a B-tree-aware caching scheme does less bookkeeping and has more information to base its decisions on than a general-purpose LRU-K cache.
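To make points 2 and 4 concrete, here is a minimal sketch of both mechanisms together: mlock() to pin a buffer in RAM, and O_DIRECT to bypass the page cache. The file name is a placeholder, and O_DIRECT requires the buffer, offset, and length to be aligned (4096 bytes is a common logical block size):

    #define _GNU_SOURCE   /* for O_DIRECT on Linux */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define BLOCK 4096

    int main(void) {
        /* O_DIRECT requires an aligned buffer. */
        void *buf;
        if (posix_memalign(&buf, BLOCK, BLOCK) != 0) { perror("posix_memalign"); return 1; }

        /* Pin the buffer so the kernel can never page it out (point 2). */
        if (mlock(buf, BLOCK) < 0) { perror("mlock"); return 1; }

        /* Read straight from the device, bypassing the page cache (point 4). */
        int fd = open("data.db", O_RDONLY | O_DIRECT);  /* placeholder file */
        if (fd < 0) { perror("open"); return 1; }

        ssize_t n = pread(fd, buf, BLOCK, 0);  /* offset must also be aligned */
        if (n < 0) { perror("pread"); return 1; }
        printf("read %zd bytes without touching the page cache\n", n);

        close(fd);
        munlock(buf, BLOCK);
        free(buf);
        return 0;
    }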
In fact, it is absolutely astounding how many 1975 abstractions translate wonderfully into the world of 2012. Architecturally, almost everything that worked back then still works now, including OS research, PL research, algorithms research, and software engineering research -- the four pillars that are holding up the modern software world. Some things are obsolete, perhaps, but far, far fewer than one might think.
Incidentally, this is also one of the reasons why I cringe when people say "the world is changing so fast, it's getting harder and harder to keep up". In matters of fashion, perhaps, but as far as core principles go (in computer science, mathematics, human emotions/interaction, and pretty much everything else of consequence) the world is moving at a glacial pace. Shakespeare might be a bit clunky to read these days because the language is a bit out of style, but what Hamlet had to say in 1600 is, amazingly, just as relevant today (and likely much more useful, because instead of actually reading Hamlet, most people read things like The Purple Cow, The 22 Immutable Laws of Marketing, The 99 Immutable Laws of Leadership, etc.)
by jessedhillon on 12/5/12, 9:10 AM
It was a pretty amazing hack, from before magnetic core memory. Because sound moves slowly through a medium like mercury, an acoustic wave (that is, a sound) could be applied at one end of a volume of mercury and be expected to arrive at the other end after a predictable, useful delay. So a column of mercury with transducers on both ends would use them as speaker and microphone, which in an acoustic medium are the equivalent of write and read heads!
The system memory would be a collection of these columns, each I guess storing one bit. The memory would of course have to be refreshed: when the signal arrived at the other end, it would be fed back into the column, assuming, I suppose, that there wasn't a new signal waiting to be written to that bit instead. The article mentions that this was not randomly accessible memory, but rather serially accessible. From that and other bits of information, I gather that the device would visit each bit in sequence, according to some clock, and produce a signal on the read line corresponding to the value of that bit. You had to wait for the device to cycle around to the particular bit you were after.
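To make that concrete, here is a toy model in code of what I think is going on (entirely my own illustration, with one bit per column as guessed above and a made-up 8-tick transit time):

    #include <stdio.h>
    #include <string.h>

    #define DELAY_SLOTS 8  /* acoustic transit time of the column, in clock ticks */

    typedef struct {
        int line[DELAY_SLOTS];  /* bits "in flight" through the mercury */
        int head;               /* slot currently arriving at the microphone end */
    } delay_line;

    /* One clock tick: read the arriving bit, then feed the column again,
       either recirculating the old bit (refresh) or writing a new one. */
    static int tick(delay_line *d, int have_write, int write_bit) {
        int out = d->line[d->head];                /* microphone (read head) */
        d->line[d->head] = have_write ? write_bit  /* speaker (write head)   */
                                      : out;       /* refresh: feed it back  */
        d->head = (d->head + 1) % DELAY_SLOTS;
        return out;
    }

    int main(void) {
        delay_line d;
        memset(&d, 0, sizeof d);

        /* Write the pattern 1,0,1 during the first three ticks. */
        int pattern[3] = {1, 0, 1};
        for (int t = 0; t < 3; t++)
            tick(&d, 1, pattern[t]);

        /* To see slot 0 again we must wait for the whole column to come
           around: serial access, exactly as described above. */
        for (int t = 3; t < DELAY_SLOTS; t++)
            tick(&d, 0, 0);
        for (int t = 0; t < 3; t++)
            printf("bit %d = %d\n", t, tick(&d, 0, 0));
        return 0;
    }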
Does anyone know if this is a correct understanding of how this kind of storage worked? What a cool way to store bits!
by _delirium on 12/5/12, 2:00 AM
Among other things, it contains an interesting alternate perspective from a former Squid developer on some of Squid's design decisions. Some were driven by a goal of being maximally cross-platform and compatible with all possible clients/servers. Others were driven by the fact that Unix VM systems were still not very good well after 1975, even into the 1990s.
by marshray on 12/5/12, 2:22 AM
I used to think that too. Specifically, Windows NT was said to need a pagefile at least as large as physical RAM. This was back when a workstation might have 16MB RAM and a 1GB disk. I thought this was because the kernel might be eliminating the need for some indirection by directly mapping physical RAM addresses to pagefile addresses. I was wrong.
On the Linux side, you would typically see the recommendation to make a swap partition "twice the size of RAM". Despite the possibility of using swap files, most distros still give dire warnings if you don't define a fixed-size swap partition at installation.
I don't think there was ever a solid justification for this "twice RAM" heuristic. A better method might be something like "max amount of memory you're ever going to need minus physical RAM" or "max amount of time you're willing to be stuck in the weeds multiplied by the expected disk bandwidth under heavy thrashing".
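(To put made-up numbers on that second heuristic: if 30 seconds of thrashing is the most you'd tolerate and the disk sustains 50 MB/s under that load, anything beyond roughly 30 x 50 MB = 1.5 GB of swap is just extending the pain.)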
Regardless, if your server is actively swapping at all, you're probably doing it wrong. It's not just that swapping is slow; it's that your database or your web cache has special knowledge about the workload that, in theory, should allow it to cache more intelligently.
I'd prefer to disable swap entirely, but there are occasions where it can make the difference in being able to SSH into a box on which some process has started running away with CPU and RAM.
But this guy is a kernel developer, so he seems to feel that the kernel should manage the "one true cache". I like the ease and performance of memory-mapped files as much as the next guy, but I wouldn't go sneering at other developers for attempting to manage their disk IO in a more hands-on fashion.
by georgemcbay on 12/5/12, 2:03 AM
I'm not familiar with squid, but I'm quite familiar with the idea of programmers writing their own systems on top of other systems that are basically a worse implementation of something the underlying system is already doing.
To my chagrin, I occasionally catch myself doing this sort of thing when I'm first moving into a new language/API/concept and don't really understand what is going on underneath.
It is always a good idea to try the simplest thing that could possibly work first, then measure it, and only then try to improve it, always measuring your "improvements" against the baseline. And make sure you're measuring the right things. I think this is a concept most developers are aware of, but it's one of those things you have to constantly checklist yourself on, because it is too easy to backslide.
by mikeash on 12/5/12, 3:02 AM
by crazygringo on 12/5/12, 3:48 AM
When you're dealing with a web cache, don't you want to explicitly know whether your cache contents are in memory or on disk, and be able to fine-tune that? It seems like the last thing you want is the OS making decisions about memory vs disk for you. Am I missing something?
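One middle ground I'm aware of is hinting: the kernel still owns the cache, but the application tells it what it will and won't need. A minimal sketch using madvise(2) (the file name is a placeholder):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("cache_object.bin", O_RDONLY);  /* placeholder name */
        if (fd < 0) { perror("open"); return 1; }
        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

        char *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        /* Hot object: ask the kernel to pull it into RAM ahead of use. */
        madvise(p, st.st_size, MADV_WILLNEED);

        /* ... serve requests out of p ... */

        /* Gone cold: tell the kernel these pages are fine to drop first. */
        madvise(p, st.st_size, MADV_DONTNEED);

        munmap(p, st.st_size);
        close(fd);
        return 0;
    }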
by javajosh on 12/5/12, 2:11 AM
In object oriented programming there is a thing called a CRC card[1] where you list what the responsibilities of important classes are. This helps the developer visualize and understand how the system works, and to keep things as orthogonal as practical. Here we have an example of someone pointing out that the system-level "CRC cards" are stepping on each other's toes. Pretty compelling stuff.
An aside: would there be any benefit to using Go rather than C for writing something like Varnish if you were starting in 2012?
[1] http://en.wikipedia.org/wiki/Class-responsibility-collaborat...
by halayli on 12/5/12, 2:48 AM
If you manage your own memory/swap, at least you can use async IO and free up the thread while the IO request is being served by the OS.
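A minimal sketch of what I mean, using POSIX AIO (the file name is a placeholder; link with -lrt on older glibc):

    #include <aio.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("cache.dat", O_RDONLY);  /* placeholder file */
        if (fd < 0) { perror("open"); return 1; }

        static char buf[4096];
        struct aiocb cb;
        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        cb.aio_buf    = buf;
        cb.aio_nbytes = sizeof buf;
        cb.aio_offset = 0;

        /* Kick off the read; the call returns immediately. */
        if (aio_read(&cb) < 0) { perror("aio_read"); return 1; }

        /* The thread is free to do useful work while the kernel
           services the request, instead of stalling on a page fault. */
        while (aio_error(&cb) == EINPROGRESS) {
            /* ... handle other connections here ... */
        }

        ssize_t n = aio_return(&cb);
        printf("read %zd bytes\n", n);
        close(fd);
        return 0;
    }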
by antirez on 12/5/12, 8:34 AM
1) You have a threaded implementation; otherwise your single thread blocks every time you access a page that is out on swap.
2) You have decently sized contiguous objects. If instead a request involves many fragments of data from many different pages, it is not going to work well.
There are other issues but probably 1 & 2 are the most important.
by miah_ on 12/5/12, 6:27 AM
“these days so small that girls get disappointed if they think they got hold of something else than the MP3 player you had in your pocket.”
An otherwise interesting article.
by stcredzero on 12/5/12, 3:26 AM
by guilloche on 12/5/12, 4:12 AM
Take an example: in a word processor, can we just keep all possible cursor positions (for moving the cursor around), all line-breaking and page-breaking info, and each character's location in virtual memory?
by khitchdee on 12/5/12, 5:16 AM
by taylorbuley on 12/5/12, 4:09 AM
by ccleve on 12/5/12, 3:52 AM
I would love it if there were just one kind of storage, and my code could ignore the distinction between disk and memory. But it can't, for three reasons: 10 ms seek times, RAM that is much smaller than disk, and garbage collection.
10 ms seek times mean that fast random access across large disk files just isn't possible. There is a vast amount of literature and research devoted to getting over this specific limitation. And it isn't old, either: all of the recent work on big data is aimed at resolving the tension between sequential disk access, which is fast, and random access, which is required for executing queries.
RAM that is smaller than disk means that memory-mapped files don't work very well when you have large data files. If you try to map more than the amount of physical RAM, you get a mess: http://stackoverflow.com/questions/12572157/using-lots-of-ma...
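A minimal sketch of the pattern in question (the striding is deliberate; on a file several times larger than RAM, each touch can fault and evict, and on spinning disks each fault can cost a full seek):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(int argc, char **argv) {
        if (argc < 2) { fprintf(stderr, "usage: %s FILE\n", argv[0]); return 1; }
        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }
        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

        /* Map the whole file; the kernel pages it in and out on demand. */
        char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        /* Touch one byte per megabyte. With the file bigger than RAM,
           this is a steady stream of page faults and evictions: the
           access pattern, not the mmap call itself, is what hurts. */
        long sum = 0;
        for (off_t off = 0; off < st.st_size; off += 1 << 20)
            sum += p[off];
        printf("%ld\n", sum);

        munmap(p, st.st_size);
        close(fd);
        return 0;
    }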
Garbage collection means that it is easy to allocate a bit of memory and then let it go when the reference goes out of scope. There's no need to explicitly deallocate it. It's one of the things that makes modern programming efficient. With disk, you don't get that; if you write something, you've got to erase it, or the disk fills up.
In short, this guy's casual contempt for "1975 programming" is irksome, because it's clear that he isn't working on the same class of problems that the rest of us are. He may be able to get away with virtual memory for his limited application, but the rest of us can't.
by hakaaak on 12/5/12, 2:21 AM
So the question is: if it is so great, why only 5.2%? I'm not being sarcastic. This is a totally serious question.
by jwilliams on 12/5/12, 2:04 AM
by mcfunley on 12/5/12, 2:00 AM
by guilloche on 12/5/12, 6:51 AM
by martinced on 12/5/12, 10:03 AM
"Don't create a ramdisk (a true, fixed size, one, that you prevent from ever getting to disk) because the (Linux) kernel is so good and so sentient that you won't gain anything by doing that"
Yet anyone who compiles big projects made of thousands of source files from scratch knows that it's much faster to write the compiled files to a ramdisk.
I can't count how many times I've seen this argument between the "kernel is sentient" camp and the "compile onto a real ramdisk" camp, but I can tell you that, in my experience (and it's hard to beat that), the ramdisk Just Works [TM] faster than the "sentient kernel".
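For concreteness, the kind of fixed-size ramdisk I mean (a sketch; the mountpoint and size are placeholders, and note that tmpfs pages can still be swapped out, so use ramfs if the contents must never reach disk):

    #include <stdio.h>
    #include <sys/mount.h>

    int main(void) {
        /* Same effect as: mount -t tmpfs -o size=2g tmpfs /mnt/ramdisk
           Needs root, and the placeholder mountpoint must already exist. */
        if (mount("tmpfs", "/mnt/ramdisk", "tmpfs", 0, "size=2g") < 0) {
            perror("mount");
            return 1;
        }
        puts("ramdisk mounted; point the build's output directory here");
        return 0;
    }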
So how is it different this time?
by smegel on 12/5/12, 1:46 AM