If you see the world just in terms of math, it’s even worse; you’ve got some program with inputs from a USB cable connecting to a webcam, output to a computer monitor, and optimization criteria expressed over some combination of the monitor, the humans looking at the monitor, and the rest of the world. It’s a whole lot easier to call what’s inside a ‘planning Oracle’ or some other English phrase than to write a program that does the optimization safely without serious unintended consequences. Show me any attempted specification, and I’ll point to the vague parts and ask for clarification in more formal and mathematical terms, and as soon as the design is clarified enough to be a hundred light years from implementation instead of a thousand light years, I’ll show a neutral judge how that math would go wrong.
Reminds me of you claiming that AIXI would kill everyone (as opposed to just finding a way to have its button pressed once (no value for holding it forever, btw) and that’s it). I’m not convinced you’d process a much more complicated specification any better at any time in the future. Killing mankind is a very specific type of ‘going wrong’, just as a true multi-kiloton nuclear explosion is a very specific type of nuclear power plant accident. You need an enormous set of things to be exactly right, without the slightest fault, for that particular type of wrong.
Also, btw: ‘optimization criteria over the real world’, rather than over the map, is a very problematic concept, and I’m getting the impression that the belief that it is doable is itself the product of some map/territory error. The optimization criteria of real software would be defined over its internal model of the people looking at the monitor and its model of the rest of the world. An imprecise model, because it has to outrun the real world. You have to have an ideal concept of the world somewhere inside the machine to get anywhere close to killing mankind as the solution to anything, just as you need very carefully arranged, highly precise explosives around a hollow sphere of highly enriched uranium or plutonium to get a true nuclear explosion in the kiloton range. Both the devil and the angels are in the details. If you selectively ignore details, you can argue for anything.
edit: added link. I recall seeing much later posts to the same tune. Seriously, dude, you are useless; you couldn’t even drop the anthropomorphizing ‘what would I do in its shoes’ when presented with the clean, simple AIXI-tl, which has ‘get the reward and then get destroyed’, and probably even ‘get destroyed and then get the reward’, in its solution space, precisely because it is not a proper mind. Don’t tell me the math would work out to what you believe. It won’t.
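To make the map/territory point above concrete, here is a toy sketch (purely illustrative; the names and structure are hypothetical, not anyone’s actual design): whatever ‘optimization criteria over the real world’ you write on paper, the software that actually runs can only rank plans by what its own imprecise model predicts.

```python
# Toy illustration of the map/territory point above (hypothetical names,
# not anyone's actual design): the optimizer can only score candidate
# plans against its internal model, never against the territory itself.

from typing import Callable, Dict, List


class WorldModel:
    """Imprecise, compressed stand-in for the real world.

    It has to be cheaper to run than reality, otherwise the agent
    cannot plan ahead of events at all.
    """

    def predict(self, plan: str) -> Dict[str, float]:
        # Returns a *predicted* state: a feature of the model, not of the world.
        return {"predicted_reward_signal": (len(plan) % 7) / 7.0}  # dummy stand-in


def choose_plan(model: WorldModel,
                plans: List[str],
                utility: Callable[[Dict[str, float]], float]) -> str:
    # The argmax ranges over map-level predictions; the real-world outcome
    # only enters later, as feedback that updates the model.
    return max(plans, key=lambda p: utility(model.predict(p)))


if __name__ == "__main__":
    best = choose_plan(WorldModel(),
                       plans=["press button", "build factory", "do nothing"],
                       utility=lambda state: state["predicted_reward_signal"])
    print(best)
```

Nothing in `choose_plan` touches the territory at all; everything hangs on how faithful `WorldModel.predict` is, which is exactly where the details live.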
finding a way to have its button pressed once (no value for holding it forever, btw)
Page 6, third sentence: “The task of the agent is to maximize its utility, defined as the sum of future rewards.”
The reward here is a function of the input string. So what maximizes utility for AIXI is receiving a high-reward input string at every future time step, so that the sum of future rewards is maximized.
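Roughly, in Hutter’s notation (a sketch from memory of the expectimax expression, not a quote from the paper), AIXI picks its action by

$$\dot a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m} \big[\, r_k + \cdots + r_m \,\big] \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}$$

where each percept $x_t = o_t r_t$ is just an input string split into an observation part and a reward part, $U$ is a universal Turing machine, and $\ell(q)$ is the length of program $q$. Every reward term is read straight off the input string; nothing in the expression refers to the world except through whichever programs $q$ happen to reproduce that string.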
That’s what you get when you skip the math, which is available, and instead reason in fuzzy and largely irrelevant concepts taken from a rather imprecise verbal description. Which is exactly what EY expressed dissatisfaction with, and exactly what he is most guilty of himself.
edit: Actually, I think EY has since changed his mind on the dangerousness of AIXI, which is an enormous plus point for him, but one that should also come with a penalty: it reflects on his meta-understanding of the difficulties involved, and on the tendency to put himself in the AI’s shoes, tasked with complying with a verbal description, instead of understanding the math (and it should also come with the understanding that a failure doesn’t imply everyone dies). Anyhow, the issue is that an AI doing something unintended due to a flaw is as far a cry from killing mankind as a nuclear power plant design having a flaw is from the plant suffering a multi-kiloton explosion. FAI is work on a non-suicidal AI; akin to work on an unmoderated fast-neutron reactor with a positive thermal coefficient of reactivity (plus a ‘proof’ that the control system is perfect). One could move the goalposts and argue that non-minds like AIXI are not true AGI; that’s about as interesting as arguing that submarines don’t really swim.