I would love to add the YouTube video of this class to my database of safety-relevant videos once it’s out.
Copying and pasting channel reviews I originally wrote in my shortform. This is too much content to include in a single talk, but I share it in the hope that it will be useful to link, and perhaps the students would like to see this question itself and the discussion around it (I’m a big fan of old-fashioned link-web surfing):
CPAIOR has a number of interesting videos on formal verification and how it works, including some that apply it to machine learning, e.g. “Safety in AI Systems—SMT-Based Verification of Deep Neural Networks”; “Formal Reasoning Methods in Machine Learning Explainability”; “Reasoning About the Probabilistic Behavior of Classifiers”; “Certified Artificial Intelligence”; “Explaining Machine Learning Predictions”; and a few others. https://www.youtube.com/channel/UCUBpU4mSYdIn-QzhORFHcHQ/videos
The Schwartz Reisman Institute hosts a multi-agent safety discussion group and is one of the very best AI safety sources I’ve seen anywhere. A few interesting videos, for example:
“An antidote to Universal Darwinism”—https://www.youtube.com/watch?v=ENpdhwYoF5g
I would also encourage directly mentioning recent work from Anthropic, such as this paper from this month, “Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback”: https://arxiv.org/abs/2204.05862
The Simons Institute for the Theory of Computing at UC Berkeley is a contender for my #1 recommendation from this whole list: banger talk after banger talk after banger talk, plus several recent workshops with a kickass AI safety focus. https://www.youtube.com/user/SimonsInstitute
They have a number of “boot camp” lessons that appear to be meant for an advanced interdisciplinary audience as well. The current focus of talks is on causality and games, and they also have some banger talks on “How Not to Run a Forecasting Competition”, “The Invisible Hand of Prediction”, “Communicating With Anecdotes”, “The Challenge of Understanding What Users Want”, and, my personal favorite for its fundamental reframing of what game theory even is, “In Praise of Game Dynamics”: https://www.youtube.com/watch?v=lCDy7XcZsSI
The Collective Intelligence workshop from IPAM at UCLA had some recent banger talks on both human and AI network safety: https://www.youtube.com/watch?v=qhjho576fms&list=PLHyI3Fbmv0SfY5Ft43_TbsslNDk93G6jJ
In general, I have a higher error rate than some folks on LessWrong, and my recommendations should be considered weaker and more exploratory. But here you go: those are my exploratory recommendations, and I have lots and lots more suggestions for more capability-focused stuff on my shortform.