Looking for machine learning and computer science collaborators
I’ve recently been struggling to translate my various AI safety ideas (low impact, truth for AI, Oracles, counterfactuals for value learning, etc.) into formalised versions that can be presented to the machine learning/computer science world in terms they can understand and critique.
What would be useful for me is a collaborator who knows the machine learning world (and preferably has presented papers at conferences) with whom I could co-write papers. They don’t need to know much of anything about AI safety; explaining the concepts to people unfamiliar with them is going to be part of the challenge.
The result of this collaboration should be things like the paper Safely Interruptible Agents, written with Laurent Orseau of DeepMind, and Interactive Inverse Reinforcement Learning, written with Jan Leike of the FHI/DeepMind.
It would be especially useful if the collaborators were located physically close to Oxford (UK).
Let me know in the comments if you are, or know of, a potential candidate.
Cheers!