from Hacker News

The Itanic Saga: The History of VLIW and Itanium

by blakespot on 1/23/24, 12:14 AM with 72 comments

by cpr on 1/23/24, 9:57 PM
Was at Multiflow (Yale spinoff with Josh Fisher and John O'Donnell) '85-90 and saw the VLIW problem up close (was in the OS group, eventually running it).
The main problem was compiler complexity -- the hoped-for "junk parallelism" gains really never panned out (maybe 2-3X?), so the compiler was best when it could discover, or be fed, vector operations.
But Convex (main competitor at the time) already had the "minisupercomputer vector" market locked up.
So Multiflow folded in early '90 (I had already bailed, seeing the handwriting mural) after burning through $60M in VC, which was a record at the time, I believe.
by ghaff on 1/24/24, 2:44 AM
I actually have a short book on the Itanic/Itanium done and planned to have it released as a free download by now. But various schedule-related stuff happened and it just hasn't happened yet.
I was a mostly hardware-focused industry analyst during Itanium's heyday so I find the topic really interesting. From a technical perspective, compilers (and dependency on them) certainly played a role but there were a bunch of other lessons too around market timing, partner strategies, fighting the last war, etc.
by quic_bcain on 1/24/24, 2:35 AM
A modern history of VLIW should also mention the Hexagon DSP architecture used by Qualcomm in its SoCs.
With a smaller target market it's probably more sustainable than Itanium was.
Disclaimer: Qualcomm employee working on the Hexagon toolchain.
by mjevans on 1/24/24, 9:46 AM
VLIW reminded me of Transmeta, but unfortunately...
"For Sun, however, their VLIW project was abandoned. David Ditzel left Sun and founded Transmeta along with Bob Cmelik, Colin Hunter, Ed Kelly, Doug Laird, Malcolm Wing and Greg Zyner in 1995. Their new company was focused on VLIW chips, but that company is a story for another day."
by chx on 1/24/24, 12:21 PM
> These delays didn’t stop the hypetrain.
This is an understatement. From an older article "How the Itanium killed the Computer Industry" https://www.pcmag.com/archive/how-the-itanium-killed-the-com...
> In 1997 Intel was the king of the hill; in that year it first announced the Itanium or IA-64 processor. That same year, research company IDC predicted that the Itanium would take over the world, racking up $38 billion in sales in 2001.
> What we heard was that HP, IBM, Dell, and even Sun Microsystems would use these chips and discontinue anything else they were developing. This included Sun making noise about dropping the SPARC chip for this thing—sight unseen. I say "sight unseen" because it would be years before the chip was even prototyped. The entire industry just took Intel at its word that Itanium would work as advertised in a PowerPoint presentation.
And then the original article has an Intel leader saying "Everything was new. When you do that, you're going to stumble". Yeah, much as Intel stumbled with the Pentium 4 and basically everything since Skylake in 2015 (which was late). Let's emphasize this: for nearly ten years now, Intel hasn't been able to deliver on time and on target. Just last year, Sapphire Rapids, after being two years late, shipped in March 2023 and had to pause in June because of a bug. Meteor Lake was also two years late. In 2020: https://www.zdnet.com/article/intels-7nm-product-transition-...
> Intel's first 7nm product, a client CPU, is now expected to start shipping in late 2022 or early 2023, CEO Bob Swan said on a conference call Thursday.
> The yield of Intel's 7nm process is now trending approximately 12 months behind the company's internal target.
Well then the internal target must've been late 2021 and it came out late 2023.
by gregw2 on 1/23/24, 11:00 AM
I’d be interested in understanding why the compilers never panned out but have never seen a good writeup on that. Or why people thought the compilers would be able to succeed in the first place at the mission.
by Findecanor on 1/24/24, 3:08 PM
"Something of a tragedy: the Itanium was Bob Rau's design, and he died before he had a chance to do it right. His original efforts wound up being taken over for commercial reasons and changed into a machine that was rather different than what he had originally intended and the result was the Itanium. While it was his machine in many ways, it did not reflect his vision."
Quote from Ivan Godard of Mill Computing: https://www.youtube.com/watch?v=JS5hCjueqQ0&t=4054s
Bob Rau: https://en.wikipedia.org/wiki/Bob_Rau
by flakiness on 1/24/24, 1:01 PM
VLIW is everywhere in client side ML accelerator space for some reason.
Another comment mentioned Snapdragon's Hexagon, which they try to rebrand as NPU with some Mat-mul circuits.
Intel Core's NPU, which is based on Movidius VPU, also has a VLIW based core in it. It is called SHAVE.
And AMD's XDNA NPU, which is based on Xilinx Alveo, also has a VLIW based core they call AI-Engine.
by lastgeniusua on 1/24/24, 7:14 PM
the total lack of sources and references (other than to the articles on this very blog) is annoying to say the least. is there anything at all to read on this alleged Elbrus influence on Itanium plans, in Russian or English?
by dsand on 1/24/24, 7:42 PM
HP partnered with Intel to bring HP's Playdoh VLIW architecture to market, because HP could not afford to continue investing in new leading-edge fabs. Compaq/DEC similarly killed Alpha shortly before getting acquired by HP, because Compaq could not afford its own new leading-edge fab either. SGI spun off its MIPS division and switched to Itanium for the same reason -- fabs were getting too expensive for low-volume parts. The business attraction wasn't Itanium's novel architecture. It was the prospect of using the high-volume, most profitable fab lines in the world. But ironically, Itanium never worked well enough to sell in enough volume to pay its way in either fab investments or design teams.
The entire Itanium saga was based on the theory that dynamic instruction scheduling via OOO hardware could not be scaled up to high IPC with high clock rates. Lots of academic papers said so. VLIW was sold as a path to get high IPC with short pipelines and fast cycle times and less circuit area. But Intel's own x86 designers then showed that OOO would indeed work well in practice, better than the papers said. It just took huge design teams and very high circuit density, which the x86 product line could afford. That success doomed the Itanium product line, all by itself.
Intel did not want its future to lie with an extended x86 architecture shared with AMD. It wanted a monopoly. It wanted a proprietary, patented, complicated architecture that no one could copy, or even retarget its software. That x86-successor arch could not be yet another RISC, because those programs are too easy to retarget to another assembler language. So, way beyond RISC, and every extra gimmick like rotating register files was a good thing, not a hindrance to clock speeds and pipelines and compilers.
HP's Playdoh architecture came from its HP Labs, as had the very successful PA-RISC before it. But the people involved were all different. And they could make their own reputations only by doing something very different from PA-RISC. They sold HP management on this adventure without proving that it would work for business and other nonnumerical workloads.
VLIW had worked brilliantly in numerical applications like Floating Point Systems' vector coprocessor. Very long loop counts, very predictable latencies, and all software written by a very few people. VLIW continues to thrive today in the DSP units inside all cell phone SOCs. Josh Fisher thought his compiler techniques could extract reliable instruction-level parallelism from normal software with short-running loops, dynamically-changing branch probabilities, and unpredictable cache misses. Fisher was wrong. OOO was the technically best answer to all that, and upward compatible with massive amounts of existing software.
Intel planned to reserve the high-margin 64-bit server market for Itanium, so it deliberately held back its x86 team from going to market with their completed 64 bit extensions. AMD did not hold back, so Intel lost control of the market it intended for Itanium.
Itanium chips were targeted only at high-end systems needing lots of ILP concurrency. There was no economic way to make chips with less ILP (or much more ILP), so there were no Itanium chips cheap and low-power enough to be packaged as development boxes for individual open-source programmers like Torvalds. This was only going to market via top-down corporate edicts, not bottom-up improvements.
The first-gen Itanium chip, Merced, included a modest processor for directly executing x86 32-bit code. This ran much slower than Intel's contemporary cheap x86 chips, so no one wanted that migration route. It also ran slower than using static translation from x86 assembler code to Itanium native code. So HP dropped that x86 portion from future Itanium chips. Itanium had to make it on its own via its own native-built software. The large base of x86 software was of no help. In contrast, DEC designed Alpha and migration tools so that Alpha could efficiently run VAX object code at higher speeds than on any VAX.
by DeathArrow on 1/24/24, 11:25 AM
Is there a way for VLIW to succeed in generic computing? Or is it impossible?
by bee_rider on 1/24/24, 5:08 AM
When will compilers be good enough to take another swing at VLIW?