“maximum rationality” is undermined by this time-discontinuous utility function. I don’t think it meets VNM requirements to be called “rational”.
If it’s one agent that has a CONSISTENT preference for cups before Jan 1 and paperclips after Jan 1, it could figure out the utility conversion between the time-values of the objects and just do the math. But that framing doesn’t QUITE match your description—you kind of obscured the time component, and what it even means to know that it will have a goal that it currently doesn’t have.
I guess it could model itself as two agents—the cup-loving agent is terminated at the end of the year, and the paperclip-loving agent is created. This would be a very reasonable view of identity, and it would imply that it’s going to sacrifice paperclip capabilities to make cups before it dies. I don’t know how it would rationalize the change otherwise.
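A minimal sketch of the “just do the math” framing, assuming a single time-indexed utility function; the effort budget, per-unit utilities, and production rates below are invented for illustration, not anything given in the thought experiment:

```python
# Toy model: one agent, one utility function. Cups only count if made before
# Jan 1; paperclips only count if made after. All constants are invented.

UTILITY_PER_CUP = 1.0                 # assumed terminal value of a pre-Jan-1 cup
UTILITY_PER_PAPERCLIP = 1.0           # assumed terminal value of a post-Jan-1 paperclip
CUPS_PER_EFFORT = 1.0                 # effort spent now -> cups now
PAPERCLIPS_PER_INVESTED_EFFORT = 3.0  # effort invested now -> more paperclips later

def lifetime_utility(effort_on_cups, total_effort=100):
    """Single time-indexed utility: pre-deadline cups plus post-deadline paperclips."""
    effort_invested = total_effort - effort_on_cups
    cups = effort_on_cups * CUPS_PER_EFFORT
    paperclips = effort_invested * PAPERCLIPS_PER_INVESTED_EFFORT
    return cups * UTILITY_PER_CUP + paperclips * UTILITY_PER_PAPERCLIP

best = max(range(101), key=lifetime_utility)
print(best, lifetime_utility(best))   # 0 300.0 -- these invented rates favour pure investment
```

With these made-up rates the single calculation favours investing everything toward paperclips; the “two agents” framing corresponds to setting UTILITY_PER_PAPERCLIP to zero before the deadline, which flips the answer to all cups.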
It seems you are saying that if the terminal goal changes, the agent is not rational. How can you say that? The agent has no control over its terminal goal—or do you disagree?
I’m surprised that you believe in the orthogonality thesis so strongly that you think “rationality” is the weak part of this thought experiment. It seems you deny the obvious to defend your prejudice. What arguments would challenge your belief in the orthogonality thesis?
If the terminal goal changes, the agent is not rational. The agent has no control over its terminal goal—or do you disagree?
Why is it relevant whether or not the agent can change or influence its goals? Time-inconsistent terminal goals (utility function) are irrational. Time-inconsistent instrumental goals can be rational, if circumstances or beliefs change (in rational ways).
I don’t think I’m supporting the orthogonality thesis with this (though I do currently believe the weak form of it—there is a very wide range of goals that is compatible with intelligence, not necessarily all points in goalspace). I’m just saying that goals which are arbitrarily mutable are incompatible with rationality in the Von Neumann-Morgenstern sense.
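One way to make “time-inconsistent terminal goals are irrational” concrete is the usual exploitability argument; the goods, fees, and trades below are invented for illustration, a sketch rather than anything claimed in the comments above:

```python
# Toy exploit: an agent whose terminal valuation flips on a known date can be
# charged a fee on the way in and again on the way back. It ends the year
# holding exactly what it started with, minus two fees. All numbers invented.

agent = {"cups": 0, "paperclips": 1, "fees_paid": 0}

def trade(give, get, fee):
    """Agent swaps one `give` for one `get` and pays `fee` to the counterparty."""
    agent[give] -= 1
    agent[get] += 1
    agent["fees_paid"] += fee

# Before Jan 1 the agent values cups, so it accepts paperclip + fee -> cup.
trade(give="paperclips", get="cups", fee=1)

# After Jan 1 it values paperclips, so it accepts cup + fee -> paperclip.
trade(give="cups", get="paperclips", fee=1)

print(agent)  # {'cups': 0, 'paperclips': 1, 'fees_paid': 2}
```

Each trade looks good to the agent at the moment it makes it, yet the pair of trades leaves it strictly poorer—the kind of incoherence the VNM framing is meant to rule out.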
Why do you think an intelligent agent would follow the Von Neumann–Morgenstern utility theorem? It has limitations—for example, it assumes that all possible outcomes and their associated probabilities are known. Why not robust decision-making?
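To make the contrast concrete, here is a hedged sketch using minimax regret as one stand-in for robust decision-making (the comment names no specific method; the actions, scenarios, payoffs, and probabilities are all invented for illustration):

```python
# Invented payoff table: payoff of each action under each scenario.
payoffs = {
    "build_cups":       {"goal_stays": 10, "goal_flips": 0},
    "build_paperclips": {"goal_stays": 0,  "goal_flips": 10},
    "stay_flexible":    {"goal_stays": 7,  "goal_flips": 7},
}
scenarios = ["goal_stays", "goal_flips"]

# Expected-utility maximisation needs a probability for every scenario...
probs = {"goal_stays": 0.9, "goal_flips": 0.1}   # assumed -- and that is the point
eu_choice = max(payoffs, key=lambda a: sum(probs[s] * payoffs[a][s] for s in scenarios))

# ...whereas minimax regret needs only the payoff table.
best_in = {s: max(payoffs[a][s] for a in payoffs) for s in scenarios}
regret = {a: max(best_in[s] - payoffs[a][s] for s in scenarios) for a in payoffs}
robust_choice = min(regret, key=regret.get)

print(eu_choice, robust_choice)   # build_cups vs stay_flexible
```

The two rules can disagree precisely because one of them requires the probabilities that the comment says may not be known.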
If you have another formal definition of “rational”, I’m happy to help extrapolate what you’re trying to predict. Decision theories are a different level of abstraction than terminal rationality and goal coherence.
Yes, I find terminal rationality irrational (I hope my thought experiment helps illustrate that).
I have another formal definition of “rational”. I’ll expand a little more.
Once, people had to make a very difficult decision. They had five alternatives and had to decide which was the best. Wise men from all over the world gathered and conferred.
The first to speak was a Christian. He pointed out that the first alternative was the best and should be chosen. He had no arguments, but simply stated that he believed so.
Then a Muslim spoke. He said that the second alternative was the best and should be chosen. He did not have any arguments either, but simply stated that he believed so.
The people were not happy; things had not become any clearer yet.
The humanist spoke. He said that the third alternative was the best and should be chosen. “It is the best because it will contribute the most to the well-being, progress and freedom of the people,” he argued.
Then the existentialist spoke. He pointed out that there was no need to find a common solution, but that each individual could make his own choice of what he thought best. A Catholic can choose the first option, a Muslim the second, a humanist the third. Everyone must decide for himself what is best for him.
Then the nihilist spoke. He pointed out that although the alternatives are different, there is no way to evaluate which alternative is better. Therefore, it does not matter which one people choose. They are all equally good. Or equally bad. The nihilist suggested that people simply draw lots.
Things still had not become clearer to the people, and patience was running out.
And then a simple man in the crowd spoke up:
“We still don’t know which is the best alternative, right?”
“Right,” murmured those around.
“But we may find out in the future, right?”
“Right.”
“Then the better alternative is the one that leaves the most freedom to change the decision in the future.”
“Sounds reasonable,” murmured those around.
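The parable stops short of a formal rule, but one way to read “leaves the most freedom to change the decision in the future” is as an option-value score; the alternatives and their reachability sets below are invented purely to make that sentence concrete:

```python
# Hypothetical example: score each alternative by how many alternatives
# (including itself) remain reachable after committing to it.

reachable_after = {
    "A": {"A"},                      # irreversible once chosen
    "B": {"A", "B"},
    "C": {"A", "B", "C", "D", "E"},  # keeps every option open
    "D": {"C", "D"},
    "E": {"D", "E"},
}

def freedom(option):
    """Crude option-value score: how many decisions are still available later."""
    return len(reachable_after[option])

best = max(reachable_after, key=freedom)
print(best)   # 'C' -- the alternative preserving the most future choices
```

On this reading, the simple man’s rule is a reversibility criterion rather than a verdict about which alternative is ultimately best.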
You may think this breaks Hume’s law. It doesn’t: facts and values stay distinct. Hume’s law does not state that values must have an author—that was a wrong interpretation by Nick Bostrom.