from Hacker News

Jan Leike joins Anthropic on their superalignment team

by icpmacdo on 5/28/24, 4:54 PM with 33 comments

  • by Lerc on 5/28/24, 5:50 PM

    I was very impressed with Anthropic's paper on concept mapping.

    Post https://www.anthropic.com/news/mapping-mind-language-model

    Paper https://transformer-circuits.pub/2024/scaling-monosemanticit...

    This seems like a very good starting point for alignment. One could almost see a pathway to making something like the laws of robotics from here. It's a long way to go, but a good first step.

  • by mvkel on 5/28/24, 5:39 PM

    These superaligners.

    "I am breaking out on my own! Together we will do bigger and better things!!!"

    "Ok I'll join the other guys."

    I think it's pretty clear that the capital markets have next to no interest in alignment pursuits, and only the best-funded companies put even a token investment toward it.

  • by whimsicalism on 5/28/24, 6:17 PM

    @dang - I find topics like these quite interesting. Are they downweighted for being AI-related (or is Twitter?), or are they just being flagged a lot?

  • by Imnimo on 5/28/24, 5:22 PM

    "Automated alignment research" suggests he's still interested in following the superalignment blueprint from OpenAI. So what do you do while you're waiting for the AI that's capable of doing alignment research for you to arrive? If you believe this is a viable path, what's the point of putzing around doing your own research when you'll allegedly have an army of AI researchers at your command in the near future?

  • by smountjoy on 5/28/24, 5:12 PM

    "Superalignment" is (was?) OpenAI's term, so it might be more accurate to say he is joining Anthropic to work on alignment.

  • by htrp on 5/28/24, 5:22 PM

    it's also completely theoretical, until it isn't (ref paperclip maximizers)

  • by andrewfromx on 5/28/24, 5:18 PM

    I keep getting the names Anthropic and Extropic (Guillaume Verdon / Beff Jezos) mixed up. Anthropic makes Claude; Extropic builds thermodynamic hardware that is supposed to be many orders of magnitude faster and more energy efficient than CPUs/GPUs.*

    * parameterized stochastic analog circuits that implement energy-based models (EBMs). Stochastic computing is a computing paradigm that represents numbers using the probability of ones in a bitstream.
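    To make the bitstream idea concrete, here is a minimal sketch of the common unipolar encoding, where a value in [0, 1] is the probability that any given bit is 1. This is a software toy only, not how Extropic's circuits work; the helper names `encode`, `decode`, and `multiply` are made up for illustration.

```python
import random


def encode(value, length, rng):
    """Encode a value in [0, 1] as a bitstream whose fraction of ones approximates it."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def decode(bits):
    """Recover the represented value as the observed fraction of ones."""
    return sum(bits) / len(bits)


def multiply(a_bits, b_bits):
    """Multiply two independent unipolar streams with a bitwise AND.

    For independent streams, P(a AND b) = P(a) * P(b).
    """
    return [a & b for a, b in zip(a_bits, b_bits)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000             # longer streams give a more precise estimate

a = encode(0.5, n, rng)
b = encode(0.3, n, rng)
estimate = decode(multiply(a, b))

# The estimate should land close to 0.5 * 0.3 = 0.15, with some sampling noise.
print(estimate)
```

    The appeal is that multiplication becomes a single AND gate per bit; the cost is that precision improves only slowly as the stream gets longer.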