A speck in Adam’s eye vs. Eve being tortured is not a utility comparison but a happiness comparison. Happiness is hard to compare, but it can be compared because it is a state; utility is an ordering function. There is no utility meter.
You’re correct. In the previous post, it was implicitly assumed that the score for a wrong answer was 0. In that case, the only proper scoring function is the log.
If you have a score function f1(q) for the right answer and f0(q) for a wrong answer, and there are n possible choices, then the true probabilities are a critical point of the expected score only if

f0'(x) = (k - x·f1'(x)) / (1 - x)

for some constant k.
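For reference, a sketch of where this condition comes from, assuming the setup above: you report probabilities x_i for the n options (summing to 1), and receive f1 on the option that turns out to be correct and f0 on each wrong one.

```latex
\[
  E = \sum_{c} q_c \Bigl[ f_1(x_c) + \sum_{j \neq c} f_0(x_j) \Bigr]
    = \sum_{i} \bigl[ q_i f_1(x_i) + (1 - q_i) f_0(x_i) \bigr],
  \qquad \sum_i x_i = 1 .
\]
Setting the gradient of the Lagrangian to zero, with multiplier $k$,
\[
  q_i f_1'(x_i) + (1 - q_i) f_0'(x_i) = k \quad \text{for all } i,
\]
and requiring honest reporting $x_i = q_i$ to be the critical point gives
\[
  x f_1'(x) + (1 - x) f_0'(x) = k
  \quad\Longleftrightarrow\quad
  f_0'(x) = \frac{k - x f_1'(x)}{1 - x}.
\]
```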
If we set f1(x) = 1 - (1-x)^p, we can set f0(x) = -(1-x)^p + (1-x)^(p-1) · p/(p-1).

For p = 2, we find f0(x) = -(1-x)^2 + 2(1-x) = 1 - x^2; this is the Brier score. For p = 3, we find f0(x) = -(1-x)^3 + (3/2)(1-x)^2 = x^3 - 3x^2/2 + 1/2.

1 - (1-x)^3 and x^3 - 3x^2/2 shall be known as ArthurB’s score (dropping the constant 1/2, which does not change the incentives).
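A quick numerical sanity check (a sketch in Python; the helper name and the example beliefs are mine) that such a pair is proper, i.e. that the expected score is maximized by reporting your actual beliefs:

```python
import numpy as np

def expected_score(f1, f0, belief, report):
    """Expected score over n options: you believe option i is correct with
    probability belief[i], report report[i], and get f1 on the correct option
    plus f0 on each wrong one."""
    return sum(
        q * (f1(x) + sum(f0(report[j]) for j in range(len(report)) if j != i))
        for i, (q, x) in enumerate(zip(belief, report))
    )

# The p = 3 pair from above (p = 2 gives the Brier pair).
f1 = lambda x: 1 - (1 - x) ** 3
f0 = lambda x: x ** 3 - 1.5 * x ** 2 + 0.5

belief = np.array([0.6, 0.3, 0.1])

# Brute-force search over reported distributions on the 3-option simplex.
best, best_report = -np.inf, None
grid = np.linspace(0, 1, 101)
for a in grid:
    for b in grid:
        if a + b <= 1:
            r = np.array([a, b, 1 - a - b])
            s = expected_score(f1, f0, belief, r)
            if s > best:
                best, best_report = s, r

print(best_report)  # ~ [0.6, 0.3, 0.1]: honesty maximizes the expected score
```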
A good thing about a log score rule is that if students try to maximize their expected score, they should write down their actual beliefs.
For the same reason, when confronted with a set of odds on the outcomes of an event, betting on each outcome in proportion to your belief will maximize the expected log of your gain (regardless of what the current odds are).
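A small numerical illustration of that claim (a sketch; the beliefs and odds below are made up, and the whole bankroll is assumed to be split across mutually exclusive outcomes): among all ways of dividing your wealth between the outcomes, betting fractions equal to your beliefs maximizes the expected log of the result, whatever the odds are.

```python
import numpy as np

def expected_log_wealth(fractions, beliefs, odds):
    """Expected log wealth when the whole bankroll is split across mutually
    exclusive outcomes: if outcome i occurs, wealth becomes fractions[i] * odds[i]."""
    fractions, beliefs, odds = map(np.asarray, (fractions, beliefs, odds))
    return float(np.sum(beliefs * np.log(fractions * odds)))

beliefs = np.array([0.5, 0.3, 0.2])   # your probabilities (made up)
odds    = np.array([1.8, 4.0, 7.0])   # decimal odds offered (made up)

# Compare proportional betting against many random splits of the bankroll.
rng = np.random.default_rng(0)
best, best_f = -np.inf, None
for _ in range(100_000):
    f = rng.dirichlet(np.ones(3))
    v = expected_log_wealth(f, beliefs, odds)
    if v > best:
        best, best_f = v, f

print(best_f)                                                # ~ [0.5, 0.3, 0.2]
print(expected_log_wealth(beliefs, beliefs, odds) >= best)   # True: proportional betting wins
```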
I confess I do not grasp the objection well enough to see where the problem lies in my comment. I am trying to formalize the problem, and I think the formalism I describe is sensible.
Once again, I’ll reword it, but I think you’ll still find it too vague: to win, one must act rationally, and the set of possible actions includes modifying one’s own code.
The question was:
My timeless decision theory only functions in cases where the other agents’ decisions can be viewed as functions of one argument, that argument being your own choice in that particular case—either by specification (as in Newcomb’s Problem) or by symmetry (as in the Prisoner’s Dilemma). If their decision is allowed to depend on how your decision depends on their decision—like saying, “I’ll cooperate, not ‘if the other agent cooperates’, but only if the other agent cooperates if and only if I cooperate—if I predict the other agent to cooperate unconditionally, then I’ll just defect”—then in general I do not know how to resolve the resulting infinite regress of conditionality, except in the special case of predictable symmetry
I do not know the specifics of Eliezer’s timeless decision theory, but it seems to me that if one looks at the decision process of others based on their beliefs about your code, not on your decisions, there is no infinite regress.
You could say: Ah, but there is your belief about an agent’s code, then his belief about your belief about his code, then your belief about his belief about your belief about his code, and that looks like an infinite regress. However, there is really no regress, since “his belief about your belief about his code” is entirely contained in “your belief about his code”.
On this page, the cumulative probability refers to the probability of obtaining at most a given number of successes. You want to run it with 30 trials and 9 successes, which gives you the right answer, 2.14%.

Or you could put in 30 and 20, which gives you the complement.

What is lower than 1% is the probability of getting 8 or fewer right answers.
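For reference, the numbers can be checked directly (a short sketch assuming 30 independent guesses, each right with probability 1/2):

```python
from math import comb

def cdf(n, k, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

print(cdf(30, 9))    # ~0.0214 -> 2.14% chance of 9 or fewer right answers
print(cdf(30, 20))   # ~0.9786 -> the complement
print(cdf(30, 8))    # ~0.0081 -> below 1% for 8 or fewer right answers
```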
Well, if you want practicality, I think Omega problems can be disregarded; they’re not realistic. It seems that the only feature needed for the real world is the ability to make trusted promises as we encounter the need to make them.
If we are not concerned with practicality but with the theoretical problem behind these paradoxes, the key is that other agents make predictions about your behavior, which is the same as saying they have a theory of mind, which is simply a belief distribution over your own code.
To win, you should take the actions that make their belief about your own code favorable to you, which can include lying, or modifying your own code and showing it to make your point.
It’s not our choice that matters in these problems but our choosing algorithm.
I think you got your math wrong.

If you get 20 out of 30 questions wrong, you break even; therefore, the probability of losing points by guessing is

Sum( C(30, i), i = 21..30 ) / 2^30 ≈ 2.14% > 1%
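Evaluating that sum directly (a quick sketch) confirms the figure:

```python
from math import comb

# Probability of 21 or more wrong answers out of 30 fair guesses.
p_lose = sum(comb(30, i) for i in range(21, 31)) / 2**30
print(p_lose)   # ~0.0214, i.e. about 2.14% > 1%
```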
Instead of assuming that others will behave as a function of our choice, we look at the rest of the universe (including other sentient beings, including Omega) as a system where our own code is part of the data.

Given a prior on physics, there is a well-defined code that maximizes our expected utility.

That code wins. It one-boxes, it pays Omega when the coin comes up heads, etc.

I think this solves the infinite regress problem, albeit in a very impractical way.
I find that going to the original paper generally does the trick. When an idea is new, the author will spell out the details more carefully.
SilasBarta mentions difficulty with Boltzmann machines; Ackley et al.’s article is actually quite detailed, including a proof of the learning algorithm: http://tinyurl.com/q4azfl
It would be interesting to try the experiment with Versed. You remove the dialectical aspect (steps 2,3,4) but you keep the wisdom of the crowd aspect.
There’s an ambiguity here. You’re talking about valuing something like world justice; I was talking about valuing acting justly. In particular, I believe that if optimal deterrence is unjust, it is also unjust to seek it.
How does this relate to the subject again? Well, my point is that we should not change our sense of justice. It’s tautological.
Your decision-making works as a value scale; morality, not so much. There is a subset of actions you can take which are just. If you do not give a high weight to acting justly, you’re a dangerous person.
When you say we “should” change our sense of justice, you’re making a normative statement because no specific goal is specified.
In this case, it seems wrong. Our sense of justice is part of our morality; therefore, we should not change it.
“We should seek justice” is tautological. If justice and optimal deterrence are contradictory, then we should not seek optimal deterrence.
In some corporate structures, you may want to avoid overperforming at a job interview. Your manager wants to hire someone who is competent, but not so competent that the new hire will replace him.
For an older scenario, imagine you’re a hunter in a tribe with an alpha leader. You want to be perceived as a good hunter so that you’ll get a larger share of resources, but not so good that you threaten his power.
It then pays to signal that you do not intend to challenge the authority. One way to do that is to have poor self-esteem, attributing your successes to luck.
The equivalence class of a utility function should be the set of monotonic functions of a canonical element.
However, what von Neumann–Morgenstern shows, under mild assumptions, is that for each class of utility functions there is a subset, generated by the affine transforms of a single canonical element, for which you can make decisions by computing expected utility. Therefore, looking at the set of all affine transforms of such a utility function really is the same as looking at the whole class. Still, it doesn’t make utility commensurable.
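A small illustration of the distinction (a sketch; the lotteries and the utility functions are made up): positive affine transforms of a utility function rank lotteries the same way under expected utility, while a merely monotonic transform of the same function can rank them differently.

```python
import numpy as np

def expected_utility(u, outcomes, probs):
    """Expected utility of a lottery given outcome values and probabilities."""
    return float(np.dot(probs, [u(x) for x in outcomes]))

# Two lotteries over money outcomes (made-up numbers).
lottery_a = ([100.0], [1.0])             # 100 for sure
lottery_b = ([0.0, 300.0], [0.5, 0.5])   # 50/50 between 0 and 300

u      = lambda x: np.sqrt(x)            # a canonical utility function
affine = lambda x: 3 * u(x) + 7          # positive affine transform of u
mono   = lambda x: u(x) ** 4             # monotonic but non-affine transform of u

for name, v in [("u", u), ("affine", affine), ("monotone", mono)]:
    ea = expected_utility(v, *lottery_a)
    eb = expected_utility(v, *lottery_b)
    print(name, "prefers", "A" if ea > eb else "B")

# u and its affine transform both prefer A (sqrt(100) = 10 > 0.5*sqrt(300) ~ 8.66),
# but the monotone transform prefers B: same ordinal preferences over sure
# outcomes, different decisions under uncertainty.
```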