Would you agree that the current paradigm is almost in direct contradiction to long-term goals?
I agree with something similar, but not this exact claim.
I think this provides a headwind that makes AIs worse at complex skills where performance can only be evaluated over long horizons. But it’s not a strong argument against AIs pursuing long-horizon goals or exhibiting simple long-horizon behaviors. (Superhuman competence at long-horizon tasks doesn’t seem necessary for either of the mechanisms I’m suggesting.)
In particular, systems trained on lots of short-horizon datapoints can still learn a lot about how the world works at larger timescales. For example, existing LMs understand quite a bit about the longer-horizon dynamics of the world despite being trained on next-token prediction. Such systems can make reasonable judgments about which actions would lead to which effects over the longer run. As a result I’d expect smart systems can be quickly fine-tuned to pursue long-horizon goals (or might pursue them organically), even though all of their complex cognitive abilities are ones that help improve loss on the short-horizon pre-training task.
Note that people concerned about AI safety often think about this issue under the heading of horizon length. A relatively common view is that training cost scales roughly linearly with horizon length, and so AI systems will be relatively bad at long-horizon tasks (and perhaps the timeline to transformative AI will be longer than you would expect from extrapolating competent short-horizon behavior).
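To make the linear-scaling intuition concrete, here is a minimal back-of-the-envelope sketch. The compute budget and horizon lengths below are purely illustrative assumptions rather than measurements; the only point is that, holding the budget fixed, the number of end-to-end feedback signals you can collect falls in proportion to horizon length.

```python
# Toy illustration of "training cost scales roughly linearly with horizon length".
# All numbers are illustrative assumptions, not estimates of real systems.

def episodes_per_budget(compute_budget_tokens: float, horizon_tokens: float) -> float:
    """Episodes (i.e. end-to-end feedback signals) affordable under a fixed budget,
    assuming cost per episode is proportional to its horizon in tokens."""
    return compute_budget_tokens / horizon_tokens

budget = 1e12  # hypothetical total token budget for outcome-based training

for horizon in (1e3, 1e6, 1e9):  # roughly: a chat reply, a long project, a multi-year goal
    n = episodes_per_budget(budget, horizon)
    print(f"horizon ~{horizon:.0e} tokens -> ~{n:,.0f} end-to-end feedback signals")
```

Under this toy model, moving from thousand-token to billion-token horizons cuts the amount of end-to-end feedback by six orders of magnitude, which is the intuition behind expecting AI systems to be relatively weak at tasks that can only be evaluated over long horizons.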
There are a few dissenting views: (i) almost all long-horizon tasks have rich feedback over short horizons if you know what to look for, so in practice things that feel like “long-horizon” behaviors mostly aren’t; (ii) although AI systems will be worse at long-horizon tasks, so are humans, so this is unlikely to be a major comparative disadvantage for AIs; (iii) most of the things we think of as sophisticated long-horizon behavior are just short-horizon cognitive behaviors (like carrying out reasoning or iterating on plans) applied to questions about long horizons.
(My take is that most planning and “3D chess” are basically short-horizon behaviors applied to long-horizon questions, but there is an important and legitimate question about how much cognitive work like “forming new concepts”, “organizing information in your head”, or “coming to deeply understand an area” effectively involves longer horizons.)