I think your critique would be better understood were it more concrete. For example, if you write something like
“In paper X, the authors claim that AI alignment requires the following set of assumptions {Y}, which they formalize using a set of axioms {Z}, used to prove a number of theorems {T}. However, the stated assumptions are not well motivated, because [...] Furthermore, the transition from Y to Z is not unique, because of [a counterexample]. Even if the axioms Z are granted, the theorems do not follow without [additional unstated restrictions]. Given the above caveats, the main results of the paper, while mathematically sound and potentially novel, are unlikely to contribute to the intended goal of AI Alignment because [...].”
then it would be easier for the MIRI-adjacent AI Alignment community to engage with your argument.
Thanks for your reply. I am aware of that, but I didn’t want to reduce the discussion to particular papers. I was curious about how other people read this field as a whole and what their opinion of it is. One particular example I had in mind is the Embedded Agency post, often mentioned as good introductory material on AI Alignment. The text often mentions complex mathematical concepts, such as the halting problem, Gödel’s theorem, Goodhart’s law, etc., in a very abrupt fashion and uses these concepts to evoke certain ideas. But a lot is left unsaid: e.g., if Turing completeness is invoked, is there an assumption that AGI will be a deterministic state machine? Is this an assumption for the whole paper or only for that particular passage? What about other models of computation, e.g. theoretical hypercomputers? I think it would be beneficial for the field if these assumptions were stated somewhere in the writing. You need to know what the limitations of individual papers are; otherwise you don’t know what kinds of questions were actually covered previously. E.g., if a paper covers only Turing-computable AGI, that should be clearly stated so others can work on other models of computation.
For the record: I feel that Embedded Agency is a horrible introduction to AI alignment. But my opinion is a minority opinion on this forum.
I don’t think there’s anyone putting their credence on hypercomputation becoming a problem. I’ve since been convinced that Turing machines can do (at least) everything you can “compute”.