But if the premise is impossible, then the thought experiment has no consequences in the real world, and we shouldn’t consider its results in our decision theory, which is about consequences in the real world.
That equation you quoted is in branch 2: “2. Omega is a ‘nearly perfect’ predictor. You assign P(general) a value very, very close to 1.” So it IS correct, by stipulation.
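For concreteness, the expected-value comparison in branch 2 presumably has roughly this form (an illustrative reconstruction assuming the standard $1,000 / $1,000,000 payoffs, not a quote of the actual equation), where $p$ is the probability that Omega predicts correctly:

$$
\begin{aligned}
\mathbb{E}[\text{one-box}] &= p \cdot 1{,}000{,}000 \\
\mathbb{E}[\text{two-box}] &= p \cdot 1{,}000 + (1 - p) \cdot 1{,}001{,}000
\end{aligned}
$$

One-boxing comes out ahead whenever $p > 1000/1999 \approx 0.5005$, so stipulating that $p$ is very, very close to 1 settles the comparison.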
But there is no possible world with a perfect predictor, unless it has a perfect track record by chance. More obviously, there is no possible world in which we can deduce, from a finite number of observations, that a predictor is perfect. The Newcomb paradox requires the decider to know, with certainty, that Omega is a perfect predictor. That hypothesis is impossible, and thus inadmissible; so any argument that deduces something from it is invalid.
I appreciated this comment a lot. I didn’t reply at the time, because I thought doing so might resurrect our group-selection argument. But thanks.
What about using them to learn foreign-language vocabulary? E.g., to learn that “dormir” in Spanish means “to sleep” in English.
To reach statistical significance, they must have tested each of the 8 pianists more than once.
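To give a rough sense of why: with only 8 subjects measured once each, a paired comparison needs a very large effect to reach p < 0.05. A minimal simulation sketch, assuming a paired design and a medium effect size of d = 0.5 (both assumptions of mine, not details from the study):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_pianists = 8       # one measurement per pianist per condition
effect_size = 0.5    # assumed standardized effect (Cohen's d), not taken from the study
n_sims = 10_000

hits = 0
for _ in range(n_sims):
    # paired differences in standardized units: Normal(d, 1)
    diffs = rng.normal(loc=effect_size, scale=1.0, size=n_pianists)
    _, p = stats.ttest_1samp(diffs, popmean=0.0)
    hits += p < 0.05

print(f"estimated power with n = 8: {hits / n_sims:.2f}")
# Under these assumptions this lands somewhere around 0.2 to 0.3, well short of
# the conventional 0.8; repeated measurements per pianist are one way to raise it.
```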
I think you need to get some data and factor out population density before you can causally relate environmentalism to politics. People who live in rural environments don’t see as much need to worry about the environment as people who live in cities. It just so happens that today, rural people vote Republican and city people vote Democrat. That didn’t use to be the case.
Though, sure, if you call the Sierra Club “environmentalist”, then environmentalism is politically polarized today. I don’t call them environmentalists anymore; I call them a zombie organization that has been parasitized by an entirely different political organization. I’ve been a member for decades, and they completely stopped caring about the environment during the Trump presidency. As in, I did not get one single letter from them in those years that was aimed at helping the environment. Lots on global warming, but none of that was backed up by science. (I’m not saying global warming isn’t real; I’m saying the issues the Sierra Club was raising had no science behind them, like “global warming is killing off the redwoods”.)
Isn’t LessWrong a disproof of this? Aren’t we thousands of people? If you picked two active LWers at random, do you think the average overlap in their reading material would be 5 words? More like 100,000, I’d think.
I think it would be better not to use the word “wholesome”. Using it is cheating, by letting us pretend at the same time that (A) we’re explaining a new kind of ethics, which we name “wholesome”, and (B) that we already know what “wholesome” means. This is a common and severe epistemological failure mode which traces back to the writings of Plato.
If you replace every instance of “wholesome” with the word “frobby”, does the essay clearly define “frobby”?
It seems to me to be a way to try to smuggle virtue ethics into the consequentialist rationality community by disguising it with a different word. If you replace every instance of “wholesome” with the word “virtuous”, does the essay’s meaning change?
Thank you! The 1000-word max has proven to be unrealistic, so it’s not too long. You and g-w1 picked exactly the same passage.
Thank you! I’m just making notes to myself here, really:
Harry teaches Draco about blood science and scientific hypothesis testing in Chapter 22.
Harry explains that muggles have been to the moon in Chapter 7.
Quirrell’s first lecture is in Chapter 16, and it is epic! Especially the part about why Harry is the most dangerous student.
[Question] Good HPMoR scenes / passages?
I think the problem is that each study has to make many arbitrary decisions about aspects of the experimental protocol. Each such decision will be made the same way for every subject in a single study, but will vary across studies. There are so many such decisions that, if the meta-analysis were to include them as moderator variables, each study would introduce enough new variables to cancel out the statistical power gained by adding that study.
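To make the degrees-of-freedom bookkeeping explicit (an illustrative sketch, not a claim about any particular meta-analysis): in a meta-regression over $S$ studies that codes $m$ protocol-level moderators, the residual degrees of freedom are roughly

$$
\mathrm{df}_{\mathrm{resid}} \approx S - m - 1.
$$

Each new study adds one effect estimate to $S$, but if its protocol differs from earlier studies in even one or two coded respects, $m$ grows about as fast as $S$, so the residual degrees of freedom, and with them the power gained, stay near zero.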
You have it backwards. The difference between a Friendly AI and an unfriendly one is entirely one of restrictions placed on the Friendly AI. So an unfriendly AI can do anything a Friendly AI could, but not vice versa.
The Friendly AI could lose out because it would be restricted from committing atrocities, or at least atrocities that were strictly bad for humans, even in the long run.
Your comment that they can commit atrocities for the good of humanity without worrying about becoming corrupt is a reason to be fearful of “friendly” AIs.
By “just thinking about IRL” (inverse reinforcement learning), do you mean “just thinking about the robot using IRL to learn what humans want”? ’Coz that isn’t alignment.
“But potentially a problem with more abstract cashings-out of the idea ‘learn human values and then want that’” is what I’m talking about, yes. But it also seems to be what you’re talking about in your last paragraph.
“Human wants cookie” is not a full-enough understanding of what the human really wants, and under what conditions, to take intelligent actions to help the human. A robot learning that would act like a paper-clipper, but with cookies. It isn’t clear whether a robot which hasn’t resolved the de dicto / de re / de se distinction in what the human wants will be able to do more good than harm in trying to satisfy human desires, nor what will happen if a robot learns that humans are using de se justifications.
Here’s another way of looking at that “nor what will happen if” clause: We’ve been casually tossing about the phrase “learn human values” for a long time, but that isn’t what the people who say that want. If AI learned human values, it would treat humans the way humans treat cattle. But if the AI is to learn to desire to help humans satisfy their wants, it isn’t clear that the AI can (A) internalize human values enough to understand and effectively optimize for them, while at the same time (B) keeping those values compartmentalized from its own values, which make it enjoy helping humans with their problems. To do that the AI would need to want to propagate and support human values that it disagrees with. It isn’t clear that that’s something a coherent, let’s say “rational”, agent can do.
How is that de re and de dicto?
You’re looking at the logical form and imagining that that’s a sufficient understanding to start pursuing the goal. But it’s only sufficient in toy worlds, where you have one goal at a time, and the mapping between the goal and the environment is so simple that the agent doesn’t need to understand the value, or the target of “cookie”, beyond “cookie” vs. “non-cookie”. In the real world, the agent has many goals, and the goals will involve nebulous concepts and have many considerations and conditions attached, e.g. how healthy is this cookie, how tasty is it, how hungry am I. It will need to know /why/ it, or human24, wants a cookie in order to intelligently decide when to get the cookie, to resolve conflicts between goals, and to do probability calculations involving the degree to which different goals are correlated in the higher goals they satisfy.
There’s a confounding confusion in this particular case: you seem to be hoping the robot will infer that the agent of the desired act is the human, both when the human represents the desire and when the AI does. But for values in general, we often want the AI to act the way the human would act, not merely to want the human to do something. Your posited AI would learn the goal of wanting human24 to get a cookie.
What it all boils down to is: You have to resolve the de re / de dicto / de se interpretation in order to understand what the agent wants. That means an AI also has to resolve that question in order to know what a human wants. Your intuitions about toy examples like “human24 always wants a cookie, unconditionally, forever” will mislead you, in the ways toy-world examples misled symbolic AI researchers for 60 years.
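For concreteness, here is one conventional way the three readings of “human24 wants a cookie” get formalized (a rough sketch in the usual philosophy-of-language notation; details vary by author):

$$
\begin{aligned}
\text{de dicto:}\quad & \mathrm{Want}\big(h_{24},\ \exists x\,[\mathrm{Cookie}(x) \wedge \mathrm{Eats}(h_{24},x)]\big) \\
\text{de re:}\quad & \exists x\,\big[\mathrm{Cookie}(x) \wedge \mathrm{Want}\big(h_{24},\ \mathrm{Eats}(h_{24},x)\big)\big] \\
\text{de se:}\quad & \mathrm{Want}\big(h_{24},\ \lambda s.\ \exists x\,[\mathrm{Cookie}(x) \wedge \mathrm{Eats}(s,x)]\big)
\end{aligned}
$$

De dicto, any cookie will do so long as human24 ends up eating one; de re, the desire is about one particular cookie; de se, the content is a self-ascribed property (“that I eat a cookie”), so an AI that copies the desire verbatim ends up wanting itself to eat a cookie, while an AI that translates it de re wants human24 to get one. Which reading the AI extracts determines who is supposed to end up with the cookie, which is exactly the “agent of the desired act” ambiguity above.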
So, “mesa” here means “tabletop”, and is pronounced “MAY-suh”?
All right, yes. But that isn’t how anyone has ever interpreted Newcomb’s Problem. AFAIK it is literally always used to support some kind of acausal decision theory, which it does /not/ support if what is in fact happening is that Omega is cheating.