Primer on language models, reinforcement learning, or machine learning basics
This ones not really on-topic, but I do see value in a more “getting up to date” focus where experts can give talks or references to learn things (eg “here’s a tutorial for implementing a small GPT-2”). Though I could just periodically ask LW questions on whatever topic ends up interesting me at the moment. Though, I could do my own Google search, but I feel there’s some community value here that won’t be gained. Like learning and teaching together makes it easier for the community to coordinate in the future. Plus connections bonuses.
Timelines and forecasting
Goodhart’s law
Power-seeking
Human values
Learning from human feedback
Pivotal actions
Bootstrapping alignment
Embedded agency
Primer on language models, reinforcement learning, or machine learning basics
This ones not really on-topic, but I do see value in a more “getting up to date” focus where experts can give talks or references to learn things (eg “here’s a tutorial for implementing a small GPT-2”). Though I could just periodically ask LW questions on whatever topic ends up interesting me at the moment. Though, I could do my own Google search, but I feel there’s some community value here that won’t be gained. Like learning and teaching together makes it easier for the community to coordinate in the future. Plus connections bonuses.