This post and its precursor from 2018 present a strong and well-written argument for the centrality of mathematical theory to AI alignment. I think the learning-theoretic agenda, as well as Hutter's work on ASI safety in the setting of AIXI, currently seems underrated and will rise in status. It is fashionable to talk about automating AI alignment research, but who is thinking hard about what those armies of researchers are supposed to do? Conceivably one of the main things they should do is solve the problems that Vanessa has articulated here.
The ideas of a frugal compositional language and of infra-Bayesian logic seem very interesting. As Vanessa points out in Direction 2, it seems likely there are possibilities for interesting cross-fertilisation between LTA and SLT, especially in connection with Solomonoff-like ideas and inductive biases.
I have referred colleagues in mathematics who are interested in alignment to this post, and have revisited it a few times myself.
While the agenda is aesthetically nice, and I do think future AI capabilities are surprisingly well-tuned to Vanessa's agenda, I personally think ASI safety in the setting of AIXI is overrated as a way to reduce existential risk, compared to other agendas that rely on automating research.
What do you think are better agendas?
> ASI safety in the setting of AIXI is overrated as a way to reduce existential risk
Could you please elaborate on this?