Then, once you actually construct an arbitrary AGI, you already know how to transform it into an FAI.
Frankly, I don’t trust this claim for a second, because important components of the Friendliness problem are being shunted aside entirely. For one thing, for this to even start making sense, you have to be able to specify a computable utility function for the AGI agent in the first place. The current models used for this “mathematical” research have no such thing; AIXI, for instance, specifies reward as a real-valued percept rather than as a function over its world-model.
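To make that structural point concrete, here is a minimal sketch of the difference I mean (purely illustrative, and every name in it is mine rather than anything from the AIXI formalism): in the AIXI setup the reward is just a field of the percept the environment hands the agent, whereas a Friendliness proposal needs something like a utility function evaluated over the agent’s own world-model.

```python
# Illustrative sketch only; all names here are hypothetical.
from dataclasses import dataclass
from typing import Callable, Dict, List


# AIXI-style interface: the reward arrives as part of the percept itself.
# There is no utility function defined over the agent's model of the world.
@dataclass
class Percept:
    observation: bytes
    reward: float


def aixi_style_return(history: List[Percept]) -> float:
    # The agent simply totals the rewards the environment handed it.
    return sum(p.reward for p in history)


# Utility-function-style interface: value is computed by the agent from its
# own world-model, which is the thing a Friendly utility function would have
# to be defined over (and which AIXI, as specified, does not give you).
WorldModelState = Dict[str, object]  # stand-in for whatever the agent's model is
UtilityFunction = Callable[[WorldModelState], float]


def utility_style_value(state: WorldModelState, u: UtilityFunction) -> float:
    return u(state)
```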
The problem is not a need for large amounts of computing power (i.e., it is not a matter of specifying the right behavior and then “scaling it down” or “approximating” a “tractable example from the class”). The problem is that we cannot specify, in detail, what the agent values. No amount of math wank about “approximation” and a “candidate class of formal models U” is going to solve the basic problem of having to change the structure away from AIXI in the first place.
I really ought to apologize for using the term “math wank”, but this really is the exact opposite of how one constructs correct programs. What you don’t do, to produce a correct computer program whose specification you know, is specify a procedure that will, given an unbounded amount of time, somehow transform an arbitrary program from some class of programs into the one you want. What you do is write the single exact program you want, correct by construction, and prove formally (model checking, dependent types, whatever you please) that it exactly obeys its specification.
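As a toy analogy of what I mean (mine, and deliberately trivial), here is the pattern: state the specification, write the one program you actually intend, then check that program against the spec. A bounded exhaustive check stands in for a real proof here, since plain Python has neither a model checker nor dependent types.

```python
# Toy analogy only: "write the exact program, then verify it against its spec."
from itertools import product


def spec(xs, ys):
    # Specification: ys is the input rearranged into sorted order.
    return ys == sorted(xs)


def program(xs):
    # The single exact program we intend, written directly: insertion sort.
    out = []
    for x in xs:
        i = 0
        while i < len(out) and out[i] <= x:
            i += 1
        out.insert(i, x)
    return out


# Bounded exhaustive check over a small domain, standing in for a formal
# proof that the program obeys its specification.
DOMAIN = range(-2, 3)
for length in range(5):
    for xs in product(DOMAIN, repeat=length):
        assert spec(list(xs), program(list(xs)))
print("program meets its spec on the bounded domain")
```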
If you are wondering where the specification for an FAI comes from, well, that’s precisely the primary research problem to solve! But it won’t get solved by trying to write a function that takes as input an arbitrary instance or approximation of AIXI and returns that same instance of AIXI “transformed” to use a Friendly utility function.
Oh yes, it sounds like I did misunderstand you. I thought you were saying you didn’t understand how such a thing could happen in principle, not that you were skeptical of the currently popular models. The classes U and F above, should something like that ever come to pass, need not be AIXI-like (nor need they involve utility functions).
I think I’m hearing that you’re very skeptical about the validity of the current toy mathematical models. It’s common for people to motte-and-bailey between the mathematics and the phenomena they hope it models, and it’s an easy mistake to make. In a good discussion, you should separate the “math wank” (which I like to just call math) from the transfer of that wank onto the reality you hope to model.
Sometimes toy models are helpful, and sometimes they are distractions that lead nowhere or embody a mistaken preconception. I see you as claiming these models are distractions, not that no model is possible. Accurate?
I very much favor bottom-up modelling based on real evidence rather than mathematical models that come out looking neat by imposing our preconceptions on the problem a priori.
Right. Which is precisely why I don’t like it when we attempt to do FAI research under the assumption of AIXI-like-ness.
I don’t think I understand what you mean here. Everyone favors modeling based on real evidence as opposed to fake evidence, and everyone favors avoiding the import of false preconceptions. It sounds like you prefer more constructive approaches? (edit: I think I might understand after all; it sounds like you’re claiming AIXI-like things are unlikely to be useful, since they’re based mostly on preconceptions that are likely false?)
I agree if you’re saying that we shouldn’t let the assumption of AIXI-like-ness define the field. I disagree, though, if you’re saying it’s a waste for people to explore that idea space: it seems ripe to me.
I don’t think it’s an active waste of time to explore the research that can be done with things like AIXI models. I do, however, think that flaws of AIXI-like models, for instance, should be taken as flaws of AIXI-like models rather than generalized to all possible AI designs.
So, for example, some people (on this site and elsewhere) have said we shouldn’t presume that a real AGI or a real FAI will necessarily use VNM utility theory to make decisions. For various reasons, I think exploring that idea space is worthwhile: relaxing the VNM utility and rationality assumptions can take us both closer to how real, actually-existing minds work and closer to how we normatively want an artificial agent to behave.
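To gesture at what I mean (my own illustration, not anything the VNM theorem itself says): the classic Allais choice pattern, which plenty of actual humans exhibit, cannot be reproduced by any expected-utility agent over the same outcomes, so modelling such minds at all already forces you to relax at least one VNM axiom.

```python
# Illustrative only: the Allais choice pattern versus expected utility (VNM).
from itertools import product

# Lotteries over three outcomes (payoffs in $M), written as {outcome: probability}.
L1A = {1: 1.00}                          # $1M for certain
L1B = {0: 0.01, 1: 0.89, 5: 0.10}
L2A = {0: 0.89, 1: 0.11}
L2B = {0: 0.90, 5: 0.10}


def eu(lottery, u):
    # Expected utility of a lottery under the utility assignment u.
    return sum(p * u[x] for x, p in lottery.items())


def allais_pattern(u):
    # The pattern many real people show: 1A over 1B, but 2B over 2A.
    return eu(L1A, u) > eu(L1B, u) and eu(L2B, u) > eu(L2A, u)


# Rearranging the two strict inequalities gives
#   0.11*u[1] > 0.10*u[5] + 0.01*u[0]   and   0.10*u[5] + 0.01*u[0] > 0.11*u[1],
# which contradict each other, so no utility assignment reproduces the pattern.
# A crude grid search illustrates that:
found = any(allais_pattern({0: a, 1: b, 5: c})
            for a, b, c in product(range(0, 101, 5), repeat=3))
print("expected-utility agent matching the Allais pattern found:", found)  # False
```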
Modulo nitpicking, agreed on both points.
I offered the transform as an example of how things can mathematically factor; like I said, it may not be what the solution actually looks like. My feeling is that it’s too soon to throw out anything that might fit that pattern, though.
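For what it’s worth, the “factoring” I have in mind is nothing more specific than a signature like the following, echoing the classes U and F above (every name here is hypothetical, and the real solution, if there is one, need not look like this at all):

```python
# Hypothetical sketch of the factoring; none of these types exist anywhere.
from typing import Callable, TypeVar

U = TypeVar("U")  # some candidate class of formal AGI models
F = TypeVar("F")  # the corresponding class of Friendly models

# The disputed claim, written as a signature: a procedure taking any member of
# U to a member of F, so that "build something in U" plus "implement a
# FriendlyTransform" would together yield an FAI. Whether such a map exists,
# or could be found, is exactly what is in question.
FriendlyTransform = Callable[[U], F]
```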