Why would an AI try an upgrade it couldn’t prove would work?
I believe David Wolpert (of No Free Lunch theorem fame) had a paper asserting the impossibility of perfect self-referential modeling.
Doesn’t need to be perfect. Just be smarter and have the same goals.
“Prove would work” is a much more stringent standard than “just be smarter and have the same goals.” The import of the paper is that the budding FOOMster couldn’t prove that a change would work.
Lacking a proof, and with the imperfection sitting precisely in its model of intelligence improvement, the AI could be mistaken in believing that the update will make it smarter while preserving its goals.
I believe Nancy’s point is a very good one. Intelligence has generally improved in evolutionary fashion: some things work, some don’t. People seem to picture the AI as a fully self-referential optimizer, which per Wolpert would be a mistake, and more generally a mistake because fully recursive self-reference explodes in complexity without bound as you iterate it.
Instead, you have some simple rules applied self-referentially to some subset of the system, which seem to improve some local issue but may fail in a wider context. In the end, you make a guess based on your current functioning, and reality tells you whether you were right.
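To make that concrete, here is a toy sketch; the benchmarks, the "threshold" knob, and all the numbers are made up for illustration, not anything from the discussion:

```python
# Toy sketch (all names and numbers made up): a tweak that improves a
# local benchmark can still regress in the wider environment, and you
# only find out after reality weighs in.
def local_benchmark(params):
    # Narrow test the system can run on itself cheaply.
    return -(params["threshold"] - 0.2) ** 2

def wide_environment(params):
    # The "real world" score, only observable after deployment.
    return -(params["threshold"] - 0.8) ** 2

current = {"threshold": 0.5}
candidate = {"threshold": 0.4}   # a simple rule applied to one knob

if local_benchmark(candidate) > local_benchmark(current):
    adopted = candidate          # looks like an improvement locally
else:
    adopted = current

# Reality's verdict on the wider context (negative here: the guess was wrong).
print(wide_environment(adopted) - wide_environment(current))
```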
I suppose it depends on what you mean by ‘smarter’. I mean, many code optimizations are provably behavior-preserving, and if Löb’s theorem says you can’t safely trim a million consecutive no-ops that somehow snuck into your inner loop, then it’s a dumb theorem to use.
Developing new heuristics is a whole different kettle of fish, and yes, it’s a rough-and-tumble world out there.
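To make the "provably safe optimization" half of that concrete, here is a toy sketch; the little stack machine and its NOP/PUSH/ADD/PRINT instructions are entirely invented for the example:

```python
# Toy sketch: trimming consecutive no-ops is the easy, provable kind of
# self-modification. The tiny stack "ISA" here is invented for the example.
def trim_nops(program):
    """Drop NOPs; semantics-preserving by construction, since a NOP in
    this toy machine touches no state at all."""
    return [ins for ins in program if ins[0] != "NOP"]

def run(program):
    stack, output = [], []
    for ins in program:
        op = ins[0]
        if op == "PUSH":
            stack.append(ins[1])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "PRINT":
            output.append(stack.pop())
        # NOP: do nothing
    return output

bloated = [("PUSH", 2)] + [("NOP",)] * 1_000_000 + [("PUSH", 3), ("ADD",), ("PRINT",)]
assert run(trim_nops(bloated)) == run(bloated) == [5]
```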
Upon further reflection, it seems to me that the real upgrades are going to be either heuristics adopted continuously on a Bayesian basis (software), or hardware.
And proving that hardware meets its contract is a much smaller problem altogether. Basically, when DOES this theorem apply?
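As a sketch of what adopting a heuristic "on a Bayesian basis" might look like in practice (the Beta-posterior framing, the 70% stand-in win rate, and the 0.95 cutoff are my own illustrative assumptions, not anything established above):

```python
# Sketch: keep a Beta posterior over how often a candidate heuristic beats
# the incumbent on real tasks, and switch over only once the posterior is
# confident. All names and thresholds are illustrative.
import random

random.seed(1)
alpha, beta = 1.0, 1.0            # uniform prior on the candidate's win rate

def candidate_beats_incumbent():
    # Stand-in for comparing the two heuristics on one task;
    # here the candidate genuinely wins about 70% of the time.
    return random.random() < 0.7

def prob_win_rate_above_half(a, b, samples=20_000):
    # Monte Carlo estimate of P(win rate > 0.5) under Beta(a, b).
    return sum(random.betavariate(a, b) > 0.5 for _ in range(samples)) / samples

for trial in range(200):
    if candidate_beats_incumbent():
        alpha += 1
    else:
        beta += 1
    if prob_win_rate_above_half(alpha, beta) > 0.95:
        print(f"Adopting the candidate heuristic after {trial + 1} trials.")
        break
```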
If the expected gain from the upgrade, assuming it worked, outweighed the expected cost of the upgrade failing.
That, and there’s also the possibility that the AI’s proof might have a serious mistake.
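As a rough sketch of that expected-value rule, folding in the worry that the proof itself might be flawed (the numbers and the multiplicative way of combining the two probabilities are illustrative assumptions):

```python
# Rough sketch: attempt the upgrade only if expected value is positive,
# discounting by the chance that the safety argument itself is flawed.
# All figures are made-up placeholders.
def try_upgrade(gain_if_works, cost_if_fails, p_works, p_proof_sound):
    p_success = p_works * p_proof_sound
    expected_value = p_success * gain_if_works - (1 - p_success) * cost_if_fails
    return expected_value > 0

# Example: big upside, moderate downside, decent confidence in both the
# upgrade and the argument for it.
print(try_upgrade(gain_if_works=100.0, cost_if_fails=40.0,
                  p_works=0.9, p_proof_sound=0.95))   # True
```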