from Hacker News

Show HN: Makeshift GPU tensor core using 64-bit CPU integer math

by bugfix-66 on 11/23/22, 8:53 PM with 1 comment

  • by bugfix-66 on 11/23/22, 8:57 PM

    This is a "broadword matrix multiplication" as described in Knuth's The Art of Computer Programming Volume 4A (exercise 55 in section 7.1.3).

    Here is a lecture where Knuth explains it:

    https://youtu.be/o22BAuQj3ds?t=1h20s

    It's efficient for 64-bit registers, but even larger registers could be used in the same way.
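
To make the idea concrete, here is a minimal sketch of a broadword 8x8 bit-matrix product packed into one 64-bit word. It is not the code from the linked project. The layout (row i stored in byte i, column j stored in bit j of that byte), the OR-AND semiring, and the names `bmm8`, `COL0` and `IDENTITY` are all assumptions chosen for illustration.

```python
MASK64 = (1 << 64) - 1
COL0 = 0x0101010101010101      # lowest bit of every byte (assumed layout: one row per byte)
IDENTITY = 0x8040201008040201  # byte i holds 1 << i, i.e. the 8x8 identity matrix


def bmm8(a: int, b: int) -> int:
    """Boolean (OR-AND) product of two 8x8 bit matrices packed in 64-bit ints.

    Assumed layout: row i is byte i of the word, column j is bit j of that byte.
    """
    c = 0
    for k in range(8):
        # Column k of a: one bit per row, shifted down to the low bit of each byte.
        col = (a >> k) & COL0
        # Smear each of those bits across its whole byte (0x01 * 0xFF = 0xFF).
        # Bytes never carry into each other, so the product fits in 64 bits.
        col_mask = (col * 0xFF) & MASK64
        # Row k of b, copied into all eight bytes.
        row_rep = ((b >> (8 * k)) & 0xFF) * COL0
        # Row i of the result picks up row k of b wherever a[i][k] is set.
        c |= col_mask & row_rep
    return c


if __name__ == "__main__":
    m = 0x0123456789ABCDEF
    assert bmm8(IDENTITY, m) == m
    assert bmm8(m, IDENTITY) == m
```

Each of the eight loop iterations handles a full column-times-row outer product with a handful of word-wide operations, which is where the speedup over a bit-by-bit triple loop comes from. Replacing `|=` with `^=` would give the product over GF(2) instead of the boolean one. A wider register would hold a larger matrix under the same scheme, with correspondingly wider masks.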