endoself
I, for one, have the terminal value of continued personal existence (a.k.a. being alive). On LW I’m learning that continuity, personhood, and existence might well be illusions. If that is the case, my efforts to find ways to survive amount to extending something that isn’t there in the first place.
I am confused about this as well. I think the right thing to do here is to recognize that there is a lot we don’t know about, e.g., personhood, and that there is a lot we can do to clarify our thinking about it. When we aren’t confused about this stuff anymore, we can look over it and decide what parts we really valued; our intuitive idea of personhood clearly describes something, even though a lot of the ideas of the past are wrong. Note also that we don’t gain anything by remaining ignorant (I’m not sure if you’ve realized this yet).
Can you elaborate? This sounds interesting.
Neural signals represent things cardinally rather than ordinally, so those voting paradoxes probably won’t apply.
Even conditional on humans not having even approximately transitive preferences, I find it likely that it would be useful to come up with some ‘transitivization’ of human preferences.
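To make the contrast with the ordinal case concrete, here’s a toy example with made-up scores (the “submodules” and numbers are purely illustrative): three submodules whose pairwise majority preferences form a Condorcet cycle, even though summing their cardinal scores gives a perfectly transitive aggregate.

```python
from itertools import combinations

# Made-up cardinal scores that three submodules assign to options A, B, C.
scores = {
    "module1": {"A": 3.0, "B": 2.0, "C": 1.0},   # prefers A > B > C
    "module2": {"A": 1.0, "B": 2.5, "C": 2.0},   # prefers B > C > A
    "module3": {"A": 2.0, "B": 1.0, "C": 2.2},   # prefers C > A > B
}

# Ordinal aggregation: pairwise majority vote cycles (A beats B, B beats C, C beats A).
for x, y in combinations("ABC", 2):
    wins_x = sum(s[x] > s[y] for s in scores.values())
    winner = x if wins_x >= 2 else y
    print(f"{x} vs {y}: majority prefers {winner}")

# Cardinal aggregation: summing the scores is always transitive.
totals = {opt: sum(s[opt] for s in scores.values()) for opt in "ABC"}
print("summed scores:", totals)   # A: 6.0 > B: 5.5 > C: 5.2
```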
Agreed that there’s a good chance that game-theoretic reasoning about interacting submodules will be important for clarifying the structure of human preferences.
What’s wrong with the surreals? It’s not like we have reason to keep our sets small here. The surreals are prettier, don’t require an arbitrary nonconstructive ultrafilter, are more likely to fall out of an axiomatic approach, and can’t accidentally end up being too small (up to some quibbles about Grothendieck universes).
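For reference, the construction is short enough to state in a sentence: a surreal number is any pair of sets of previously constructed surreals, $x = \{X_L \mid X_R\}$, with no member of $X_R$ less than or equal to any member of $X_L$. For example,

$$0 = \{\;\mid\;\}, \quad 1 = \{0 \mid\;\}, \quad -1 = \{\;\mid 0\}, \quad \tfrac{1}{2} = \{0 \mid 1\}, \quad \omega = \{0, 1, 2, \ldots \mid\;\}, \quad \tfrac{1}{\omega} = \{0 \mid 1, \tfrac{1}{2}, \tfrac{1}{4}, \ldots\}.$$

No ultrafilter appears anywhere; the infinite and infinitesimal elements come from the same two rules that generate the finite ones.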
No, that’s not what I meant at all. In what you said, the agent needs to be separate from the system in order to perform do-actions. I want an agent that knows it’s an agent, so it has to have a self-model and, in particular, has to be inside the system that is modelled by our causal graph.
One of the guiding heuristics in FAI theory is that an agent should model itself the same way it models other things. Roughly, the agent isn’t actually tagged as different from nonagent things in reality, so any desired behaviour that depends on making this distinction correctly can’t be checked against evidence about whether the agent is drawing the line the way we want it to. A common example of this is the distinction between self-modification and creating a successor AI; an FAI should not need to distinguish these, since they’re functionally the same. These sorts of ideas are why I want the agent to be modelled within its own causal graph.
Look, HIV patients who get HAART die more often (because people who get HAART are already very sick). We don’t get to see the health status confounder because we don’t get to observe everything we want. Given this, is HAART in fact killing people, or not?
Well, of course I can’t give the right answer if the right answer depends on information you’ve just specified I don’t have.
You’re sort of missing what Ilya is trying to say. You might have to look at the actual details of the example he is referring to in order for this to make sense. The general idea is that even though we can’t observe certain variables, we still have enough evidence to justify the causal model in which HAART leads to fewer deaths, so we can conclude that we should prescribe it.
I would object to Ilya’s more general point, though. Saying that EDT would use E(death|HAART) to determine whether to prescribe HAART is making the same sort of reference class error you discuss in the post. EDT agents use EDT, not the procedures used by A0 and A1 in the example, so we really need to calculate E(death|EDT agent prescribes HAART). I would expect this to produce essentially the same results as a Pearlian E(death | do(HAART)), and would probably regard it as a failure of EDT if it did not add up to the same thing, but I think that there is value in discovering how exactly this works out, if it does.
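To make the gap between the two quantities concrete, here’s a toy calculation with made-up numbers, treating the health confounder as observable for illustration (in Ilya’s example it isn’t observed, which is exactly what makes identification nontrivial): sicker patients are more likely to get HAART and more likely to die, so conditioning and intervening come apart.

```python
# Toy numbers (made up) illustrating confounding by health status.
# HAART lowers the death rate within each health stratum, but sicker
# patients are both more likely to be treated and more likely to die.

p_sick = 0.5                                   # P(health = sick)
p_treat = {"sick": 0.9, "healthy": 0.1}        # P(HAART | health)
p_death = {                                    # P(death | health, HAART)
    ("sick", True): 0.40, ("sick", False): 0.60,
    ("healthy", True): 0.05, ("healthy", False): 0.10,
}
p_health = {"sick": p_sick, "healthy": 1 - p_sick}

def observational(treated):
    """E(death | HAART = treated): condition on receiving the treatment."""
    num = sum(p_health[h] * (p_treat[h] if treated else 1 - p_treat[h])
              * p_death[(h, treated)] for h in p_health)
    den = sum(p_health[h] * (p_treat[h] if treated else 1 - p_treat[h])
              for h in p_health)
    return num / den

def interventional(treated):
    """E(death | do(HAART = treated)): backdoor adjustment over health."""
    return sum(p_health[h] * p_death[(h, treated)] for h in p_health)

print(observational(True), observational(False))    # 0.365 vs 0.15: treated die more often
print(interventional(True), interventional(False))  # 0.225 vs 0.35: the intervention helps
```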
If you want to change what you want, then you’ve decided that your first-order preferences were bad. EDT recognizing that it can replace itself with a better decision theory is not the same as it getting the answer right; the thing that makes the decision is not EDT anymore.
No. For example, AIXI is what I would regard as essentially a Bayesian agent, but it has a notion of causality because it has a notion of the environment taking its actions as an input.
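Schematically (and suppressing some bookkeeping), AIXI’s action choice looks roughly like

$$a_k \;=\; \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m} \big[\,r_k + \cdots + r_m\,\big]\; \xi\big(o_1 r_1 \ldots o_m r_m \,\big|\, a_1 \ldots a_m\big),$$

with the percepts before step $k$ held fixed at their observed values. The point is that the environment mixture $\xi$ is conditioned on the action sequence rather than assigning it a probability; that asymmetry between actions and observations is the ‘notion of causality’ I mean.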
This looks like a symptom of AIXI’s inability to self-model. Of course causality is going to look fundamental when you think you can magically intervene from outside the system.
Do you share the intuition I mention in my other comment? I feel that the way this post reframes CDT and TDT as attempts to clarify bad self-modelling by naive EDT is very similar to the way I would reframe Pearl’s position as an attempt to clarify bad self-modelling by naive probability theory a la AIXI.
These three causal graphs cannot be distinguished by the observational statistics. The causal information given in the problem is an essential part of its statement, and no decision theory which ignores causation can solve it.
I think this isn’t actually compatible with the thought experiment. Our hypothetical agent knows that it is an agent. I can’t yet formalize what I mean by this, but I think that it requires probability distributions corresponding to a certain causal structure, which would allow us to distinguish it from the other graphs. I don’t know how to write down a probability distribution that contains myself as I write it, but it seems that such a thing would encode the interventional information about the system that I am interacting with on a purely probabilistic level. If this is correct, you wouldn’t need a separate representation of causality to decide correctly.
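To be clear about what the observational-equivalence claim amounts to in the simplest case (made-up numbers, and a much smaller example than the graphs in the post): X→Y and Y→X can be parameterized to give exactly the same joint distribution, so no amount of observational data separates them, yet they disagree about do(X). The hope in the previous paragraph is that a distribution containing the agent itself would pin down the extra interventional facts that this joint distribution leaves open.

```python
import itertools

# Two structures over binary X, Y that produce the same joint distribution.
# Structure 1: X -> Y,  P(X=1) = 0.5, P(Y=1|X) = 0.8 if X else 0.2
# Structure 2: Y -> X,  P(Y=1) = 0.5, P(X=1|Y) = 0.8 if Y else 0.2

def joint1(x, y):
    px = 0.5
    py_given_x = 0.8 if x else 0.2
    return (px if x else 1 - px) * (py_given_x if y else 1 - py_given_x)

def joint2(x, y):
    py = 0.5
    px_given_y = 0.8 if y else 0.2
    return (py if y else 1 - py) * (px_given_y if x else 1 - px_given_y)

for x, y in itertools.product([0, 1], repeat=2):
    assert abs(joint1(x, y) - joint2(x, y)) < 1e-12   # observationally identical

# But under do(X=1) the two structures disagree about Y:
p_y_do_x1_struct1 = 0.8   # X -> Y: setting X changes Y
p_y_do_x1_struct2 = 0.5   # Y -> X: setting X leaves Y's marginal alone
```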
UDT corresponds to something more mysterious
Don’t update at all, but instead optimize yourself, viewed as a function from observations to actions, over all possible worlds.
There are tons of details, but it doesn’t seem impossible to summarize in a sentence.
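For instance, here’s a minimal sketch of that sentence with made-up worlds, observations, and payoffs: enumerate policies (functions from observations to actions), score each by its expected utility across all possible worlds, and commit to the best one, without updating on the observation.

```python
from itertools import product

# Hypothetical setup: two possible worlds, two observations, two actions.
worlds = {"w1": 0.5, "w2": 0.5}                 # prior over possible worlds
observations = ["obs_a", "obs_b"]
actions = ["act_x", "act_y"]

def observe(world):
    # What the agent sees in each world (made up).
    return "obs_a" if world == "w1" else "obs_b"

def utility(world, action):
    # Made-up payoffs; in real problems this is where the interesting structure lives.
    table = {("w1", "act_x"): 10, ("w1", "act_y"): 0,
             ("w2", "act_x"): 0,  ("w2", "act_y"): 5}
    return table[(world, action)]

# A policy is a function from observations to actions, encoded as a dict.
policies = [dict(zip(observations, choice))
            for choice in product(actions, repeat=len(observations))]

def expected_utility(policy):
    return sum(p * utility(w, policy[observe(w)]) for w, p in worlds.items())

best = max(policies, key=expected_utility)
print(best, expected_utility(best))
```

The interesting cases are the ones where the contents of the worlds depend on the policy itself (predictors, copies of the agent), which this toy leaves out; that’s where most of the ‘tons of details’ live.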
I’d like to make explicit the connection of this idea to hard takeoff, since it’s something I’ve thought about before but isn’t stated explicitly very often. Namely, this provides some reason to think that by the time an AGI is human-level in the things humans have evolved to do, it will be very superhuman in things that humans have more difficulty with, like math and engineering.
It provides a useful concept, which can be carried over into other domains. I suppose there are other techniques that use a temperature, but I’m much less familiar with them and they are more complicated. Is understanding other metaheuristics any more useful than just understanding simulated annealing for people who aren’t actually writing a program that performs some optimization?
But it’s actually important to the example. If someone intends to allocate their time searching for small and large improvements to their life, then simulated annealing suggests that they should make more of the big ones first. (The person you describe may not have done this, since they’ve settled into a local optimum but have now decided to find a completely different point on the fitness landscape, though without more details it’s entirely possible they’ve decided correctly here.)
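For anyone who hasn’t seen the algorithm written down, here’s a bare-bones sketch minimizing a made-up one-dimensional function (the objective, step size, and schedule are all arbitrary): the temperature controls how often a worse point is accepted, and the cooling schedule is what makes big, risky moves happen early and fine-tuning happen late, which is the analogue of making the big changes to your life first.

```python
import math
import random

def objective(x):
    # Made-up bumpy function with many local minima.
    return x * x + 10 * math.sin(3 * x)

def simulated_annealing(steps=10000, temp0=10.0, cooling=0.999):
    x = random.uniform(-10, 10)
    temp = temp0
    for _ in range(steps):
        candidate = x + random.gauss(0, 1)       # propose a nearby point
        delta = objective(candidate) - objective(x)
        # Always accept improvements; accept worsenings with probability
        # exp(-delta / temp), which shrinks as the temperature drops.
        if delta < 0 or random.random() < math.exp(-delta / temp):
            x = candidate
        temp *= cooling                          # cooling schedule
    return x

print(simulated_annealing())
```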
Your second paragraph could benefit from the concept of simulated annealing.
I’m not sure what you mean. Do you mean the scores given that you choose to cooperate and defect? There’s a lot of complexity hiding in ‘given that’, and we don’t understand a lot of it. This is definitely not a trivial fix to Lumifer’s program.
Another problem is that you cooperate against CooperateBot.
From If Many-Worlds had Come First:
the thought experiment goes: ‘Hey, suppose we have a radioactive particle that enters a superposition of decaying and not decaying. Then the particle interacts with a sensor, and the sensor goes into a superposition of going off and not going off. The sensor interacts with an explosive, that goes into a superposition of exploding and not exploding; which interacts with the cat, so the cat goes into a superposition of being alive and dead. Then a human looks at the cat,’ and at this point Schrödinger stops, and goes, ‘gee, I just can’t imagine what could happen next.’ So Schrödinger shows this to everyone else, and they’re also like ‘Wow, I got no idea what could happen at this point, what an amazing paradox’. Until finally you hear about it, and you’re like, ‘hey, maybe at that point half of the superposition just vanishes, at random, faster than light’, and everyone else is like, ‘Wow, what a great idea!’
Obviously this is a parody and Eliezer is making an argument for many worlds. However, this isn’t that far from how the thought experiment is presented in introductory books and even popularizations. Why, then, don’t more people realize that many worlds is correct? Why aren’t tons of bright middle-school children who read science fiction and popular science spontaneously rediscovering many worlds?
I agree with this; the ‘e.g.’ was meant to point toward the most similar theories that have names, not pin down exactly what Eliezer is doing here. I thought that it would be better to refer to the class of similar theories here since there is enough uncertainty that we don’t really have details.
What? This makes no sense.
I guess you haven’t seen this stated explicitly, but the framework of causal networks makes an iid assumption. The idea is that the causal network represents some process that occurs a lot, and we can watch it occur until we get a reasonably good understanding of the joint distribution of variables. Part of this is that it is the same process occurring each time, so there is no time dependence built into the framework.
For some purposes, we can model time by simply including it as an observed variable, which you do in this post. However, the different measurements of each variable are associated because they come from the same sample of the (iid) causal process, whether or not we are conditioning on time. The way you are trying to condition on time isn’t correct, and the correlation does exist in both cases. (Really, we care about dependence rather than correlation, but it doesn’t make a difference here.)
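On the parenthetical, the standard toy example of why dependence is the more general notion (the sample here is made up, but the math is exact in the limit): take X uniform on {-1, 0, 1} and Y = X². Y is completely determined by X, yet their correlation is zero.

```python
import random

# X uniform on {-1, 0, 1}; Y = X^2. Y is a deterministic function of X
# (maximally dependent), yet the covariance/correlation is exactly zero.
xs = [random.choice([-1, 0, 1]) for _ in range(100000)]
ys = [x * x for x in xs]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)
print(cov)  # approximately 0, despite the perfect dependence
```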
I do think that this is a useful general direction of analysis. If the question is meaningful at all, then the answer is probably that given by Armok_GoB in the original thread, but it would be useful to clarify what exactly the question means. There is probably a lot of work to be done before we really understand such things, but I would advise you to better understand the ideas behind causal networks before trying to contribute.