Can you summarize this exchange, especially what updates you made as a result of it, if any?
That was the summary :P The full thing was quite a bit longer. I also didn’t want to misquote Eric.
Maybe the shorter summary is: there are two axes which we can talk about. First, will systems be transparent, modular and structured (call this CAIS-like), or will they be opaque and well-integrated? Second, assuming that they are opaque and well-integrated, will they have the classic long-term goal-directed AGI-agent risks or not?
Eric and I disagree on the first one: my position is that for any particular task, while CAIS-like systems will be developed first, they will gradually be replaced by well-integrated ones, once we have enough compute, data, and model capacity.
I’m not sure how much Eric and I disagree on the second one: I think it’s reasonable to predict that the resulting systems are specialized for particular bounded tasks and so won’t be running broad searches for long-term plans. I would still worry about inner optimizers; I don’t know what Eric thinks about that worry.
This summary is more focused on my beliefs than Eric’s, and is probably not a good summary of the intent behind the original comment, which was “what does Eric think Rohin got wrong in his summary + opinion of CAIS”, along with some commentary from me trying to clarify my beliefs.
My updates were mainly about carving up the space along the two axes above. There were probably others, but I often find it hard to introspect on how my beliefs are updating.
I don’t understand why this crux needs to be dichotomous. Setting aside the opacity question for the moment, why can’t services in a CAIS be differentiable w.r.t. each other?
Example: Consider a language-modeling service (L) that is consumed by several downstream tasks, including various text classifiers, an auto-correction service for keyboards, and a machine-translation service. In the end-to-end view, it would be wise for these downstream services to use a language representation from L and to propagate their own error information back to L so that it can improve its shared representation. Since the downstream services ultimately make up L's raison d'être, L will be obliged to accept their gradients.
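As a toy sketch of what that gradient flow looks like (everything here is hypothetical; plain linear maps stand in for the services, and the names W_L, head_a, head_b are mine): a shared encoder receives accumulated gradients from two downstream heads, so error information from every consumer shapes the shared representation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: the "language model" service L is a linear encoder
# W_L; two downstream services (a classifier stand-in and a translator
# stand-in) each own a head on top of L's representation.
W_L = rng.normal(size=(4, 3))      # shared representation: 4-dim input -> 3-dim
head_a = rng.normal(size=(3, 2))   # downstream service A (e.g. a classifier)
head_b = rng.normal(size=(3, 5))   # downstream service B (e.g. a translator)

x = rng.normal(size=(1, 4))        # one input example
rep = x @ W_L                      # L's shared representation

# Each service computes its own squared-error signal against its own
# target and propagates error information back through the representation.
target_a = np.zeros((1, 2))
target_b = np.zeros((1, 5))
err_a = rep @ head_a - target_a
err_b = rep @ head_b - target_b

# Gradients w.r.t. the *shared* weights W_L accumulate across services,
# so L is pulled toward a representation that serves all its consumers:
grad_from_a = x.T @ (err_a @ head_a.T)
grad_from_b = x.T @ (err_b @ head_b.T)
grad_W_L = grad_from_a + grad_from_b

W_L -= 0.01 * grad_W_L             # one end-to-end update step for L
```

The point is just that nothing about the services framing forbids this: as soon as consumers feed error signals back, the "boundary" between L and its consumers becomes a differentiable interface rather than a hard module wall.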
For situations that are not so neatly differentiable, we can describe the services network as a stochastic computation graph if there is a benefit to end-to-end learning of the entire system. This suggests a slightly more precise conjecture about the relationship between a CAIS agent and a utility-maximizing agent: a CAIS agent that can be described as a stochastic computation graph is equivalent to some utility-maximizing agent when trained end-to-end via approximate backpropagation.
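To make "approximate backpropagation" concrete, here is a minimal, hypothetical sketch of the standard trick for training stochastic computation graphs: a score-function (REINFORCE) estimator passes a gradient through a single non-differentiable, stochastic node (a Bernoulli choice standing in for a discrete "service"), and the estimate agrees with the exact gradient in expectation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Hypothetical non-differentiable "service": it makes a discrete choice
# x ~ Bernoulli(p) and the rest of the system scores the outcome with f.
f = lambda x: np.where(x == 1, 2.0, 1.0)   # downstream score signal

theta = 0.5
p = sigmoid(theta)

# Score-function (REINFORCE) estimator of d/dtheta E[f(x)]:
#   E[ f(x) * d/dtheta log pi(x; theta) ],  with d log pi/dtheta = x - p
# for a Bernoulli(sigmoid(theta)) node.
x = (rng.random(200_000) < p).astype(float)
grad_est = np.mean(f(x) * (x - p))

# Exact gradient for comparison: E[f(x)] = 1 + p, so dE/dtheta = p*(1-p).
grad_exact = p * (1 - p)
```

With enough samples the estimate matches the analytic gradient closely, which is the sense in which even non-differentiable service boundaries admit end-to-end training.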
It’s likely that CAIS agents aren’t usefully described as stochastic computation graphs, or that we may need to extend the usage of “stochastic computation graph” here to deal with services that create other services as offspring and attach them to the graph. But the possibility itself suggests a spectrum between the archetypal modular CAIS and an end-to-end CAIS, in which subgraphs of the services network are trained end-to-end. It’s not obvious to me that the CAIS as defined in the text discounts this scenario, despite Eric’s comments here.
I broadly agree, especially if you set aside opacity; I very rarely mean to imply a strict dichotomy.
I do think in the scenario you outlined the main issue would be opacity: the learned language representation would become more and more specialized between the various services, becoming less interpretable to humans and more “integrated” across services.