How could the AI gain practical understanding of long-term planning if it’s only trained on short time scales?
Writing code, how servers work, and how users behave seem like very different types of knowledge, operating with very different feedback mechanisms and learning rules. Why would you use a single, monolithic ‘AI’ to do all three?
I take issue with the initial supposition:
How could the AI gain practical understanding of long-term planning if it’s only trained on short time scales?
Existing language models are trained on the next-word prediction task, but they have a reasonable understanding of the long-term dynamics of the world. It seems like that understanding will continue to improve even without increasing the horizon length of the training.
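For concreteness, here is a minimal sketch (my illustration, not something from this exchange) of the next-word-prediction objective: the loss only ever scores the model one token ahead, even though the text it is fit to describes much longer-horizon dynamics. The toy model and random tokens below are stand-ins for a real Transformer and real text.

```python
# Minimal sketch of the next-word-prediction objective (illustrative toy model,
# random tokens standing in for real text). The loss only ever looks one token
# ahead, regardless of how long-horizon the described events are.
import torch
import torch.nn as nn

vocab_size, d_model = 1000, 64

# Toy "language model": embedding followed by a linear head
# (a real model would be a Transformer, but the objective is the same).
model = nn.Sequential(nn.Embedding(vocab_size, d_model), nn.Linear(d_model, vocab_size))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

tokens = torch.randint(0, vocab_size, (8, 128))   # a batch of token sequences

inputs, targets = tokens[:, :-1], tokens[:, 1:]   # the target is just the next token
logits = model(inputs)                            # (batch, seq, vocab)
loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))

optimizer.zero_grad()
loss.backward()
optimizer.step()
```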
Writing code, how servers work, and how users behave seem like very different types of knowledge, operating with very different feedback mechanisms and learning rules. Why would you use a single, monolithic ‘AI’ to do all three?
Why would you have a single human employee do jobs that touch on all three?
Although they are different types of knowledge, many tasks involve understanding of all of these (and more), and the boundaries between them are fuzzy and poorly defined, such that it is difficult to cleanly decompose the work.
So it seems quite plausible that ML systems will incorporate many of these kinds of knowledge. Indeed, over the last few years it seems like ML systems have been moving towards this kind of integration (e.g. large LMs have all of this knowledge mixed together in the same way it mixes together in human work).
That said, I’m not sure it’s relevant to my point.
To the second point, because humans are already general intelligences.
But more seriously, I think the monolithic AI approach will ultimately be uncompetitive with modular AI for real-life applications. Modular AI dramatically reduces the search space. And I would contend that prediction about complex real-life systems over long time scales will always be data-starved. Therefore, being able to reduce your search space will be a critical competitive advantage, and worth the hit from having suboptimal interfaces.
Why is this relevant for alignment? Because you can train and evaluate the AI modules independently; individually they are much less intelligent and less likely to be deceptive; you can monitor their communications; and so on.
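As a rough illustration of the kind of modular setup I have in mind (the module names, interfaces, and task below are hypothetical, purely for the sketch): each module sits behind a narrow interface, each could be trained and evaluated on its own, and every message between modules passes through a log that a human or a simple monitor can inspect.

```python
# Illustrative sketch of a "modular AI" pipeline with monitored communications.
# The modules, interfaces, and task are hypothetical; the point is only that each
# component can be trained and evaluated separately, and that every inter-module
# message passes through an inspectable log.
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class MessageLog:
    records: List[Tuple[str, str, str]] = field(default_factory=list)

    def send(self, sender: str, receiver: str, content: str) -> str:
        # This is where a human reviewer or a simple monitor could flag or veto.
        self.records.append((sender, receiver, content))
        return content


# Each "module" is a plain function here; in practice each would be a separately
# trained model with its own evaluation suite and narrow interface.
def code_writer(spec: str) -> str:
    return f"# code implementing: {spec}"


def server_modeler(code: str) -> str:
    return f"predicted server load for {code!r}: low"


def user_modeler(report: str) -> str:
    return f"predicted user response given {report!r}: positive"


def run_pipeline(spec: str, log: MessageLog) -> str:
    code = code_writer(log.send("planner", "code_writer", spec))
    report = server_modeler(log.send("code_writer", "server_modeler", code))
    return user_modeler(log.send("server_modeler", "user_modeler", report))


log = MessageLog()
print(run_pipeline("add a signup page", log))
for record in log.records:   # every message between modules is visible for review
    print(record)
```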