- China has trained a 10 trillion parameter language model
by MrUssek on 10/6/21, 8:29 PM, with comments
- What is your backup if the tech industry crashes?
by MrUssek on 7/1/21, 1:35 AM, with comments
- The Future of Deep Learning Is Photonic
by MrUssek on 6/30/21, 12:29 PM, with comments
- Separating MNIST digits using Optimal Transport
by MrUssek on 6/18/21, 2:31 AM, with comments
- Enigma: GPT-2 trained on 10K Nature Papers: Can you spot the difference?
by MrUssek on 5/13/21, 5:26 PM, with comments
- GShard: Scaling giant models with conditional computation and automatic sharding
by MrUssek on 7/2/20, 12:58 AM, with comments