I see two main ways to deal mathematically with these optimization processes:
1) The first is a ‘whatever-it-takes’ process that realizes a goal function ideally (in the limit). To get a feel for what the mathematics looks like, I suggest a look at the comparable mathematics of the operational amplifier (op-amp for short).
An ideal op-amp also does whatever it takes to realize the transfer function applied to the input. Non-ideal, i.e. real, op-amps fall short of this goal, but one can give operating ranges by comparing the parameters of the transfer-function elements with the parameters of the op-amp (mostly its open-loop gain A_OL).
I think this is a good model for the limiting case, because we abstract the ‘optimization process’ as a black box and look at what it does to its goal function, namely realize it. We can then make this mathematically precise.
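To make the comparison concrete (all component values here are hypothetical): a non-inverting op-amp stage with feedback fraction β has closed-loop gain G = A_OL / (1 + A_OL·β), which approaches the ideal 1/β only as A_OL → ∞. The pattern, an ideal limit plus a parameter telling you how far a real device falls short, is the kind of thing I have in mind:

```python
# Toy illustration (hypothetical values): how finite open-loop gain A_OL
# makes a real op-amp deviate from the ideal transfer function 1/beta.
def closed_loop_gain(a_ol: float, beta: float) -> float:
    """Non-inverting amplifier: G = A_OL / (1 + A_OL * beta)."""
    return a_ol / (1 + a_ol * beta)

beta = 0.01                      # feedback fraction -> ideal gain 1/beta = 100
ideal = 1 / beta
for a_ol in (1e3, 1e5, 1e7):     # increasingly 'ideal' op-amps
    g = closed_loop_gain(a_ol, beta)
    print(f"A_OL={a_ol:.0e}  G={g:.4f}  error={(ideal - g) / ideal:.2%}")
```

The operating-range statement then becomes quantitative: the relative error from ideal behaviour shrinks roughly as 1/(A_OL·β).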
2) My second model tries to capture the differential equations that follow from EY’s description of Recursive Self-Improvement (RSI), namely the PDEs relating “Optimization slope”, “Optimization resources”, and “Optimization efficiency” to actual physical quantities. I started to write the equations down and put a few into Wolfram Alpha, but didn’t have time for a comprehensive analysis. I’d expect the resulting equations to form classes of functions which could be classified by their associated complexity and risk.
And when searching for RSI, look what I found: Mathematical Measures of Optimization Power
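To illustrate what such classes could look like, here is a deliberately crude toy model (the functional form dI/dt = k·I^α is assumed for illustration, not derived from EY’s posts): the feedback exponent α alone separates qualitatively different risk classes, polynomial growth, exponential growth, and finite-time blow-up (“foom”).

```python
# Toy feedback model (assumed form, for illustration only):
#   dI/dt = k * I**alpha
# alpha < 1 -> polynomial growth, alpha = 1 -> exponential growth,
# alpha > 1 -> finite-time singularity ("foom").
def simulate(alpha: float, k: float = 1.0, i0: float = 1.0,
             dt: float = 1e-3, t_max: float = 5.0) -> float:
    """Forward-Euler integration; returns I(t_max), or inf on blow-up."""
    i, t = i0, 0.0
    while t < t_max:
        i += k * i**alpha * dt
        if i > 1e12:              # treat as blown up in finite time
            return float("inf")
        t += dt
    return i

for alpha in (0.5, 1.0, 1.5):
    print(f"alpha={alpha}: I(5) = {simulate(alpha)}")
```

The classification question is then: which α (and which resource/efficiency couplings) does the actual physics support?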
1) This is an interesting approach. It looks very similar to the approach taken by the mid-20th century cybernetics movement—namely, modeling social and cognitive feedback processes with the metaphors of electrical engineering. Based on this response, you in particular might be interested in the history of that intellectual movement.
My problem with this approach is that it considers the optimization process as a black box. That seems particularly unhelpful when we are talking about the optimization process acting on itself as a cognitive process. It’s easy to imagine that such a thing could just turn itself into a superoptimizer, but that would not be taking into account what we know about computational complexity.
I think that it’s this kind of metaphor that is responsible for “foom” intuitions, but I think those are misplaced.
2) Partial differential equations assume continuous functions, no? But in computation, we are dealing almost always with discrete math. What do you think about using concepts from combinatorial optimization theory, since those are already designed to deal with things like optimization resources and optimization efficiency?
It looks very similar to the approach taken by the mid-20th century cybernetics movement
Interesting. I know a bit about cybernetics but wasn’t consciously aware of a clear analog between cognitive and electrical processes. Maybe I’m missing some background. Could you give a reference I could follow up on?
I think that it’s this [the black-box] kind of metaphor that is responsible for “foom” intuitions, but I think those are misplaced.
That is a plausible interpretation. Fooming is actually the only valid interpretation given an ideal black-box AI modelled this way. We have to look into the box, which is comparable to looking at non-ideal op-amps. Fooming (on human time-scales) may still be possible, but to determine that we have to get a handle on the math going on inside the box(es).
But in computation, we are dealing almost always with discrete math.
One could formulate discrete analogs to the continuous equations relating self-optimization steps. But I don’t think this gains much, as we are not interested in the specific efficiency of a specific optimization step. That wouldn’t work anyway, simply because the effect of each optimization step isn’t known precisely, not even its timing.
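For concreteness, here is what such a discrete analog with the imprecision made explicit might look like (all distributions hypothetical): each optimization step multiplies capability by an uncertain factor, so the unknown effect of a step becomes a noise term rather than something swept under the rug.

```python
import random

# Discrete, noisy analog of a continuous self-improvement equation
# (all distributions hypothetical): each optimization step multiplies
# capability by an uncertain gain factor.
def run(steps: int = 100, mean_gain: float = 1.02,
        noise: float = 0.05, seed: int = 0) -> float:
    rng = random.Random(seed)
    capability = 1.0
    for _ in range(steps):
        # the effect of a step isn't known precisely -> random multiplier
        capability *= max(0.0, rng.gauss(mean_gain, noise))
    return capability

print(run())  # a single sample path; vary the seed to see the spread
```

Whether the ensemble of such paths concentrates or fans out is exactly the kind of question the continuous classification would have to answer too.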
But maybe your proposal to use complexity results from combinatorial optimization theory for specific feedback types (between the optimization stages outlined by EY) could provide better approximations to possible speedups.
Maybe we can approximate the black-box as a set of nested interrelated boxes.
Norbert Wiener is where it all starts. This book has a lot of essays. It’s interesting: he’s talking about learning machines before “machine learning” was a household word, but envisioning them as electrical circuits.
http://www.amazon.com/Cybernetics-Second-Edition-Control-Communication/dp/026273009X
I think that it’s important to look inside the boxes. We know a lot about the mathematical limits of boxes which could help us understand whether and how they might go foom.
Thank you for introducing me to that Concrete Mathematics book. That looks cool.
I would be really interested to see how you model this problem. I’m afraid that op-amps are not something I’m familiar with but it sounds like you are onto something.
Thank you for introducing me to that Concrete Mathematics book. That looks cool.
You’re welcome. It is the most fun math book I ever read.
Thank you for the book. Just ordered it.
I would be really interested to see how you model this problem.
Currently it is just a bunch of PDEs on paper. But I really want to write a post on this, as it could provide some mathematical footing for many of the fooming debates.
One problem I’m stumbling over is the modelling of hard practical physical limits on computational processes. And I mean really practical limits that take thermodynamics into account, not these computronium bounds, which are much too high. Something that takes the entropic cost of replication and message transfer into account.
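For calibration, there is at least a well-established floor here: by Landauer’s principle, erasing one bit dissipates at least k_B·T·ln 2. That is admittedly a bound of the too-generous computronium kind, but any practical bound of the sort I’m after has to sit above it. A minimal sketch:

```python
import math

# Landauer's principle: minimum energy to erase one bit is k_B * T * ln 2.
# A loose theoretical floor; practical limits (replication, message
# transfer) sit well above it.
K_B = 1.380649e-23  # Boltzmann constant, J/K (exact in SI since 2019)

def landauer_bound_joules(temperature_k: float, bits: float = 1.0) -> float:
    """Minimum dissipation for erasing `bits` bits at temperature T."""
    return K_B * temperature_k * math.log(2) * bits

e_bit = landauer_bound_joules(300.0)   # roughly 2.9e-21 J per bit at 300 K
print(f"per bit at 300 K: {e_bit:.3e} J")
print(f"bits erasable per joule: {1 / e_bit:.3e}")
```

A practical model would multiply this floor by overheads for error correction, communication, and copying, which is where the interesting modelling work lies.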