This seems like a false dichotomy. We shouldn’t think of scaling up as “free” from a complexity perspective—usually when scaling up, you need to make quite a few changes just to keep individual components working. This happens in software all the time: in general it’s nontrivial to roll out the same service to 1000x users.
I agree. But I also think there’s an important sense in which this additional complexity is mundane—if the only sorts of differences between a mouse brain and a human brain were the sorts of differences involved in scaling up a software service to 1000x users, I think it would be fair (although somewhat glib) to call a human brain a scaled-up mouse brain. I don’t think this comparison would be fair if the sorts of differences were more like the sorts of differences involved in creating 1000 new software services.
I think whether the additional complexity is mundane or not depends on how you’re producing the agent. Humans can scale up human-designed engineering products fairly easily, because we have a high-level understanding of how the components all fit together. But if you have a big neural net whose internal composition is mostly determined by the optimiser, then it’s much less clear to me. There are some scaling operations which are conceptually very easy for humans, and also hard to do via gradient descent. As a simple example, in a big neural network where the left half is doing subcomputation X and the right half is doing subcomputation Y, it’d be very laborious for the optimiser to swap it so the left half is doing Y and the right half is doing X—since the optimiser can only change the network gradually, and after each gradient update the whole thing needs to still work. This may be true even if swapping X and Y is a crucial step towards scaling up the whole system, which will later allow much better performance.
In other words, we’re biased towards thinking that scaling is “mundane” because human-designed systems scale easily (and to some extent, because evolution-designed systems also scale easily). It’s not clear that AIs also have this property; there’s a whole lot of retraining involved in going from a small network to a bigger network (and in fact usually the bigger network is trained from scratch rather than starting from a scaled-up version of the small one).
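To make the swap example concrete, here is a minimal sketch (a toy two-layer ReLU network in PyTorch; the layer sizes and framing are illustrative assumptions, not anything specified in the discussion). Swapping which half of the hidden layer does which subcomputation is a single discrete permutation of the weights: trivial for a human to write down and exactly function-preserving, but it is a large jump in parameter space rather than the kind of small local step gradient descent takes.

```python
# Sketch: swapping the "left half" and "right half" of a hidden layer by hand.
# Sizes are arbitrary; the point is that the edit is one line of weight surgery
# and preserves the network's function exactly.
import torch
import torch.nn as nn

torch.manual_seed(0)
fc1, fc2 = nn.Linear(8, 16), nn.Linear(16, 4)

# "Left half" = hidden units 0..7, "right half" = hidden units 8..15.
# Swap the halves by permuting fc1's rows and fc2's columns together.
perm = torch.cat([torch.arange(8, 16), torch.arange(0, 8)])

swapped_fc1, swapped_fc2 = nn.Linear(8, 16), nn.Linear(16, 4)
with torch.no_grad():
    swapped_fc1.weight.copy_(fc1.weight[perm])
    swapped_fc1.bias.copy_(fc1.bias[perm])
    swapped_fc2.weight.copy_(fc2.weight[:, perm])
    swapped_fc2.bias.copy_(fc2.bias)

x = torch.randn(5, 8)
original = fc2(torch.relu(fc1(x)))
swapped = swapped_fc2(torch.relu(swapped_fc1(x)))
print(torch.allclose(original, swapped, atol=1e-6))  # True: same function, rearranged internals
```

Reaching the swapped configuration by gradient updates, by contrast, would mean passing through intermediate networks that still have to perform well after every step.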
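And for the last point, a similarly hedged sketch of what "starting from a scaled-up version of the small one" could look like: Net2Net-style function-preserving widening of a hidden layer (again, the sizes and the PyTorch framing are assumptions for illustration). In practice bigger models are usually trained from scratch rather than grown this way.

```python
# Sketch: widen a hidden layer while preserving the function (Net2Net-style),
# so training could in principle resume from the small network.
import torch
import torch.nn as nn

def widen_hidden_layer(fc1: nn.Linear, fc2: nn.Linear, new_width: int):
    """Return widened copies of fc1/fc2 that compute the same function as before."""
    old_width = fc1.out_features
    # Keep every existing unit once, then duplicate randomly chosen units to fill the new width.
    idx = torch.cat([torch.arange(old_width),
                     torch.randint(0, old_width, (new_width - old_width,))])
    counts = torch.bincount(idx, minlength=old_width).float()

    new_fc1 = nn.Linear(fc1.in_features, new_width)
    new_fc2 = nn.Linear(new_width, fc2.out_features)
    with torch.no_grad():
        new_fc1.weight.copy_(fc1.weight[idx])                    # duplicate incoming weights
        new_fc1.bias.copy_(fc1.bias[idx])
        new_fc2.weight.copy_(fc2.weight[:, idx] / counts[idx])   # split outgoing weights across copies
        new_fc2.bias.copy_(fc2.bias)
    return new_fc1, new_fc2

fc1, fc2 = nn.Linear(8, 16), nn.Linear(16, 4)
wide_fc1, wide_fc2 = widen_hidden_layer(fc1, fc2, 64)
x = torch.randn(5, 8)
print(torch.allclose(fc2(torch.relu(fc1(x))),
                     wide_fc2(torch.relu(wide_fc1(x))), atol=1e-5))  # True before any retraining
```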