Chess is like a bounded, mathematically described universe where all the instrumental convergence stays contained, and it only accomplishes a very limited instrumentality in our universe (i.e., chess programs gain a limited sort of power here by being good playmates).
LLMs touch on the real world far more than that, so MCTS-like skill at navigating “the LLM world,” in contrast to chess, sounds to me like it may create a concerning level of real-world-relevant instrumental convergence.
I agree chess is an extreme example, and I think more realistic versions would probably develop instrumental convergence, at least in a local sense.
(We already have o1 showing at least a little instrumental convergence.)
My main substantive claim is that constraining instrumental goals so that the AI doesn’t try to take power via long-term methods is very useful for capabilities. More generally, instrumental convergence is an area with a positive manifold between capabilities and alignment: alignment methods increase capabilities, and vice versa.