Could a theist human with a fixed preference do the following: change their mind about the existence of souls and sign up for cryonics? If they can’t, then that is one situation where having a fixed preference is not good.
Being at the top of meta, preference is not obviously related to likes, wants or beliefs. It is what you want on reflection, given infinite computational power, etc., but not at all necessarily what you currently believe you want. (Compare to the semantics of a computer program, which is probably uncomputable vs. what you can conclude from its source code in finite time.)
I’m not sure you can have a fixed preference if you don’t have a fixed ontology, and not having a fixed ontology has been a good thing, at least in terms of humanity’s ability to control the world.
This is called the ontology problem in FAI, and I believe I have a satisfactory solution to it for the purposes of FAI (roughly, two agents have the same preference if they agree on what should be done/thought in each epistemic state; here, no reference to the real world is made; for FAI, we only need to duplicate human preference in FAI, not understand it), which I’m currently describing on my blog.
I’ve read some of your blog. I find it hard to pin down and understand something that is not obviously related to what is going on around us.
This is called the ontology problem in FAI, and I believe I have a satisfactory solution to it for the purposes of FAI (roughly, two agents have the same preference if they agree on what should be done/thought in each epistemic state; here, no reference to the real world is made; for FAI, we only need to duplicate human preference in FAI, not understand it)
Hmm, interesting. Do you have a way of separating the epistemic state from the other state of a self-modifying intelligence? Would knowledge about what my goals are come under epistemic state?
I find it hard to pin down and understand something that is not obviously related to what is going on around us.
Me too, but it seems that what we really want, and would like an external agent to implement without further consulting with us, is really a structure with these confusing properties.
Would knowledge about what my goals are come under epistemic state?
Yes, everything you are (as a mind) is epistemic state. A rigid boundary around the mind is necessary to fight the ontology problem, even where people obviously externalize some of their computation, and depend on irrelevant low-level events that affect computation within the brain. (A brain won’t work in this context, though an emulated space ship, like in this metaphor, is fine, in which case the preference of the ship is about what should be done on the ship, given each state of the ship.)
(roughly, two agents have the same preference if they agree on what should be done/thought in each epistemic state;
Yes, everything you are (as a mind) is epistemic state. A rigid boundary around the mind is necessary to fight the ontology problem,
Now I am really confused. If an agent has the same epistemic state as me, that is, it is everything that I am (as a mind), then surely it will have the same preference (assuming determinism)!?
Or are you talking about something like the following?
A, B, and C are agents:
forall C. action/thought(A, C) = action/thought(B, C) → same_preference(A, B)
where action/thought is a function that takes two agents and returns the actions and thoughts that the first agent thinks the second should have. Just as two humans will somewhat agree on what a dog should do, depending upon what the dog knows?
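The quantified formulation above can be sketched in code. This is only an illustrative toy, and all names (`same_preference`, the `judgement` table standing in for action/thought) are hypothetical:

```python
# Sketch of the quantified formulation above; all names are hypothetical.
# judgement[J][C] stands in for action/thought(J, C): what judge J
# thinks agent C should do/think.

def same_preference(judgement, A, B, agents):
    """forall C: action/thought(A, C) == action/thought(B, C)."""
    return all(judgement[A][C] == judgement[B][C] for C in agents)

# Two humans judging a dog, per the analogy above: they agree on what
# the dog should do, so under this formulation they share a preference.
agents = ["dog"]
judgement = {
    "human1": {"dog": "fetch the stick"},
    "human2": {"dog": "fetch the stick"},
}
print(same_preference(judgement, "human1", "human2", agents))  # True
```

Note that under this reading, agreement is checked over every judged agent C, not over the judges’ own states.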
Now I am really confused. If an agent has the same epistemic state as me, that is, it is everything that I am (as a mind), then surely it will have the same preference (assuming determinism)!?
Yes, your exact copy has the same preference as you; why?
forall C. action/thought(A, C) = action/thought(B, C) → same_preference(A, B)
More like action/thought(A, A) = action/thought(B, B) → same_preference(A, B). I don’t understand why you gave that particular formulation, so I am not sure if my reply is helpful. The ontologically boxed agents only have preference about their own thoughts/actions; there is no real world or other agents for them, though inside their minds they may have all kinds of concepts that they can consider (for example, agent A can have a concept of agent B, as an ontologically boxed agent).
(roughly, two agents have the same preference if they agree on what should be done/thought in each epistemic state; here
So let’s say there is me and a paper clipper: do we share the same preference? If I were, as a mind, everything the paper clipper was, I would want to make paperclips, right? And similarly, the paper clipper, if it were given my epistemic state, would want to do what I do.
So I don’t see how all agents don’t share the same preference, under this definition.
Yes, stating this rigorously needs work, but the idea should be clear: you and a paperclipper disagree on what should be done by the paperclipper in a given state of the paperclipper.
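One hypothetical way to make this concrete (a sketch, not the formalism under discussion): treat a preference as a map from epistemic states to the action that “should” be taken in that state, and compare two preferences pointwise over states:

```python
# Hypothetical sketch: a preference as a map from epistemic state to
# the action that "should" be taken in that state.

def same_preference(pref_a, pref_b, states):
    """Two agents share a preference iff they prescribe the same
    action in every epistemic state under consideration."""
    return all(pref_a(s) == pref_b(s) for s in states)

# In the paperclipper's own state, the paperclipper prescribes clipping,
# while a human judging that same state prescribes something else, so
# they do not share a preference.
paperclipper = lambda state: "make paperclips"
human = lambda state: "do not make paperclips"

print(same_preference(paperclipper, human, ["paperclipper_state"]))  # False
```

This avoids the collapse worried about above: the human and the paperclipper are compared on their verdicts about the same state, not on what each would do after being rebuilt into the other.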
That was what I was getting at with my A B C example.
A = you
B = paperclipper
C = different paperclipper states
However I am not sure that this solves the ontology problem, as you will have people with bad/simple ontologies judging what people with complex/accurate ontologies should do.
Or is this another stage where we need to give infinite resources? Would that solve the problem?
A = you B = paperclipper C = different paperclipper states
I see. Yes, that should work as an informal explanation.
However I am not sure that this solves the ontology problem, as you will have people with bad/simple ontologies judging what people with complex/accurate ontologies should do.
There is no difference in ontology between different programs, so I’m not sure what you refer to. They are all “boxed” inside their own computations, and they only work with their own computations, though this activity can be interpreted as thinking about the external world. I expect judging similarity of preference to be a generally uncomputable condition, such as asking whether two given programs (not the agent programs themselves, but some constructions from them) have the same outputs; this should be possible to verify theoretically in special cases: for example, you know that two copies of the same program have the same outputs.
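The “generally uncomputable, but verifiable in special cases” point can be illustrated with a toy sketch (the helper name is hypothetical): output-equivalence of arbitrary programs is undecidable in general, since it subsumes the halting problem, but syntactic identity is a trivially checkable sufficient condition, such as the two-copies case mentioned above:

```python
# Toy illustration, not a general decision procedure: checking whether two
# arbitrary programs have the same outputs is undecidable in general, but
# identical code is a trivially checkable *sufficient* condition.

def trivially_same_outputs(f, g):
    """Sufficient, not necessary: identical bytecode and constants
    guarantee identical outputs on every input."""
    return (f.__code__.co_code == g.__code__.co_code
            and f.__code__.co_consts == g.__code__.co_consts)

def p1(x): return x + 1
def p2(x): return x + 1   # an exact copy of p1
def p3(x): return x + 2   # behaviourally different

print(trivially_same_outputs(p1, p2))  # True
print(trivially_same_outputs(p1, p3))  # False
```

The converse direction fails, of course: two syntactically different programs can still compute the same function, and detecting that in general is exactly the uncomputable part.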