(Roughly, two agents have the same preference if they agree on what should be done/thought in each epistemic state.)
So let's say there is me and a paperclipper: do we share the same preference? If I had, as a mind, everything the paperclipper has, I would want to paperclip, right? And similarly, the paperclipper, if it were given my epistemic state, would want to do what I do.
So I don't see how, under this definition, all agents don't share the same preference.
Yes, stating this precisely needs work, but the idea should be clear: you and a paperclipper disagree on what should be done by the paperclipper in a given paperclipper's state.
That was what I was getting at with my A B C example.
A = you
B = paperclipper
C = different paperclipper states
However, I am not sure that this solves the ontology problem, as you will have people with bad/simple ontologies judging what people with complex/accurate ontologies should do.
Or is this another stage where we need to give infinite resources? Would that solve the problem?
I see. Yes, that should work as an informal explanation.
There is no difference in ontology between different programs, so I'm not sure what you refer to. They are all "boxed" inside their own computations and only work with their own computations, though this activity can be interpreted as thinking about the external world. I expect the judging of similarity of preference to be some generally uncomputable condition, such as asking whether two given programs (not the agent programs themselves, but some constructions built from them) have the same outputs. This should be possible to verify theoretically in special cases; for example, you know that two copies of the same program have the same outputs.
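To make the "uncomputable in general, verifiable in special cases" point concrete, here is a minimal sketch (not anyone's actual proposal; the function name and setup are illustrative). Deciding output-equality of arbitrary programs is undecidable, but the special case of two copies of the same program is trivial, and bounded testing can refute equality without ever proving it:

```python
# Illustrative sketch: comparing two programs' outputs.
# General output-equality is undecidable; only special cases resolve.

def same_outputs(prog_a: str, prog_b: str, test_inputs):
    """Return True/False when decidable here, None when unknown.

    prog_a / prog_b are Python source strings defining a function f(x).
    This is an illustration, not a general decision procedure.
    """
    # Special case: two copies of the same program trivially agree.
    if prog_a == prog_b:
        return True
    # Otherwise, finite testing can only refute equality, never prove it.
    ns_a, ns_b = {}, {}
    exec(prog_a, ns_a)
    exec(prog_b, ns_b)
    for x in test_inputs:
        if ns_a["f"](x) != ns_b["f"](x):
            return False  # outputs provably differ
    return None  # agreement on finitely many tests proves nothing

p = "def f(x):\n    return x * 2\n"
q = "def f(x):\n    return x + x\n"
print(same_outputs(p, p, range(5)))  # True: identical programs
print(same_outputs(p, q, range(5)))  # None: same behavior on tests, unproven
```

The `None` case is the heart of the problem: two syntactically different programs may encode the same preference, and no finite amount of testing settles it.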