I agree chess is an extreme example; I think more realistic versions would probably develop instrumental convergence, at least in a local sense.
(We already have o1 exhibiting at least a little instrumental convergence.)
My main substantive claim is that constraining instrumental goals, such that the AI doesn’t try to seize power via long-term strategies, is very useful for capabilities. More generally, instrumental convergence is an area with a positive manifold between capabilities and alignment: alignment methods increase capabilities, and vice versa.