from Hacker News

What happens if your CPU gets something wrong?

by krisgenre on 9/29/24, 6:12 PM with 6 comments

  • by avmich on 9/29/24, 10:50 PM

    > If cosmic rays flip bits in storage or on the network, that can be detected through error coding. But there's no analogy for a CPU that allows cheap online verification of its correctness.

    Note that a CPU can be represented mostly as a set of memory cells, so some memory error-correction techniques can probably be applied to CPUs as well. https://bailleux.net/pub/ob-project-gray1.pdf
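
    One known way to carry error coding over to arithmetic is a residue check: compute the result, then confirm that its remainder modulo a small number matches the remainder predicted from the operands. The sketch below is only an illustration of that idea, not something taken from the article or the linked paper. The names `checked_add`, `faulty_adder`, and `MODULUS` are hypothetical, and in this toy version the check runs in software on the same CPU, whereas a hardware design would use a separate small checker circuit.

```python
MODULUS = 3  # hypothetical check modulus; 2**k is never divisible by 3,
             # so any single flipped bit in the sum changes its residue


def exact_adder(x: int, y: int) -> int:
    """Stand-in for a healthy ALU add."""
    return x + y


def faulty_adder(x: int, y: int) -> int:
    """Stand-in for a defective ALU: flips the lowest bit of the sum."""
    return (x + y) ^ 1


def checked_add(a: int, b: int, adder=exact_adder) -> int:
    """Add a and b with `adder`, then verify the sum with a mod-3 residue check."""
    result = adder(a, b)
    # Residue predicted from the operands alone, independent of the full-width add.
    expected_residue = (a % MODULUS + b % MODULUS) % MODULUS
    if result % MODULUS != expected_residue:
        raise ArithmeticError(
            f"residue mismatch: {a} + {b} gave {result}"
        )
    return result


if __name__ == "__main__":
    print(checked_add(2, 2))  # healthy adder passes the check
    try:
        checked_add(2, 2, adder=faulty_adder)
    except ArithmeticError as err:
        print("detected:", err)
```

    A mod-3 check catches every single-bit error in the sum but misses errors that change the result by a multiple of 3, so it detects rather than corrects, much like a parity bit on memory.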

  • by dsamarin on 9/30/24, 7:52 AM

    Are these errors consistent for the same instructions? For example, in the same ALU, will 2+2 always equal 5, or will it produce 5 once and then not do so again for a "long while"?

  • by musicale on 9/30/24, 2:13 AM

    It's not just CPUs that can cause Silent Data Corruption errors (SDCs). Essentially any chip in the system can give bad results, and those bad results are often not detected.

  • by more_corn on 9/30/24, 12:03 AM

    Funny they should mention Google. Isn’t that the company whose chatbot regularly and consistently gets things wrong?