Alignment is almost exactly the opposite of abstract math? Math has the nice property of being checkable: you can take a paper, follow all of its content, and become sure that the content is valid. An alignment research paper can have valid math but still be inadequate on questions such as “is this math even related to reality?”, which are much harder to check.
That may be so.
Wentworth's own work is closest to academic math/theoretical physics, perhaps to philosophy.
Are you claiming we have no way of telling good (alignment) research from bad? And if we do, why would private funding be better at figuring this out than public funding?
To be somewhat more fair, there are probably thousands of problems that are much easier to check than they are to solve, and while alignment research may not be one of them, I do think that there’s a general gap between verifying a solution and actually solving the problem.
The canonical examples are NP problems.
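As a rough illustration of that gap, here is a minimal Python sketch using Boolean satisfiability, a standard NP problem. The encoding (clauses as lists of signed integers) and the function names `verify` and `solve` are assumptions made for this example, not anything from the thread. Checking a proposed assignment takes one pass over the formula, while the naive solver may try every one of the 2^n assignments.

```python
from itertools import product

# A CNF formula is a list of clauses; each clause is a list of nonzero ints.
# Literal k means "variable k is true", -k means "variable k is false".
# (Hypothetical encoding chosen for this sketch.)

def verify(formula, assignment):
    """Check a candidate assignment in time linear in the formula size."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )

def solve(formula, num_vars):
    """Brute-force search: may try up to 2**num_vars assignments."""
    for bits in product([False, True], repeat=num_vars):
        assignment = {i + 1: bits[i] for i in range(num_vars)}
        if verify(formula, assignment):
            return assignment
    return None  # no satisfying assignment exists

formula = [[1, 2], [-1, 3], [-2, -3]]
solution = solve(formula, 3)       # expensive direction: search
print(verify(formula, solution))   # cheap direction: check
```

The brute-force solver is only a stand-in; real SAT solvers are far smarter, but no known algorithm solves every instance in polynomial time, while verification stays cheap.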
Another interesting class is problems that are easy to generate but hard to verify.
John Wentworth told me the following delightfully simple example: generating a Turing machine program that halts is easy, but verifying that an arbitrary TM program halts is undecidable.
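The asymmetry in that example can be sketched in Python, with generators standing in for Turing machines (an assumption made purely for illustration; all names below are hypothetical). Building a program that halts by construction is trivial, but a checker that only runs the program for a bounded number of steps can confirm halting and can never rule it out.

```python
def make_halting_program(n):
    """Build a program that halts by construction: it counts down from n."""
    def program():
        counter = n
        while counter > 0:
            counter -= 1
            yield  # one simulated step
    return program

def loops_forever():
    """A program that never halts."""
    while True:
        yield

def halts_within(program, max_steps):
    """Run for at most max_steps steps.

    Returns True if the program halted within the budget, and None
    (meaning "unknown") otherwise. It can never return a definite "no".
    """
    steps = program()
    for _ in range(max_steps):
        try:
            next(steps)
        except StopIteration:
            return True
    return None

print(halts_within(make_halting_program(5), 10))  # halts within budget
print(halts_within(make_halting_program(5), 5))   # budget too small: unknown
print(halts_within(loops_forever, 1000))          # never halts: unknown
```

The last two calls give the same answer, which is the point: from the outside, a slow halting program and a non-halting one look identical at any finite step budget.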
Yep, I was thinking about NP problems, though #P problems for the counting version would count as well.