Enjoy your war on straw, I’m out.
VAuroch
A boxed AI won’t be able to magically make its creators forget about AI risks and unbox it.
The results of AI box game trials disagree.
It’s trivial to propose an AI model which only cares about finite time horizons: predict which action will have the highest expected utility at time T, then take that action.
And what does it do at time T+1? And if you said ‘nothing’, try again, because you have no way of justifying that claim. It may not have intentionally-designed long-term preferences, but just because your eyes are closed does not mean the room is empty.
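To make that concrete, here is a minimal sketch of the proposed decision rule in Python. The action set, the utility model, and the horizon T are all hypothetical stand-ins of mine, not part of the original proposal; the point is that the rule only constrains behavior up to T and says nothing about what happens afterward, and “unspecified” is not the same as “nothing”.

    from typing import Callable, Sequence

    def choose_action(actions: Sequence[str],
                      expected_utility_at_T: Callable[[str], float]) -> str:
        """Pick whichever action has the highest expected utility at horizon T.

        Note that this rule defines behavior only up to T; it is silent
        about what the agent does at T+1 and beyond."""
        return max(actions, key=expected_utility_at_T)

    # Toy usage with made-up numbers:
    utilities = {"wait": 0.1, "act": 0.9}
    print(choose_action(list(utilities), utilities.get))  # -> act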
By that reasoning, there’s no such thing as a Friendly human.
True. There isn’t.
I suggest that most people when talking about friendly AIs do not mean to imply a standard of friendliness so strict that humans could not meet it.
Well, I definitely do, and I’m at least 90% confident Eliezer does as well. Most, probably nearly all, of the people who talk about Friendliness would regard a FOOMed human as Unfriendly.
A prerequisite for planning a Friendly AI is understanding individual and collective human values well enough to predict whether they would be satisfied with the outcome, which entails (in the logical sense) having a very well-developed model of the specific humans you interact with, or at least the capability to construct one if you so choose. Having a model well-developed enough to predict what a person will do, given the data they are given, is logically equivalent to a weak form of “controlling people just by talking to them”.
To put that in perspective: if I understood the people around me well enough to predict what they would do given what I said to them, I would never say things that caused them to take actions I wouldn’t like. If I, for some reason, valued them becoming terrorists, warping their perceptions in the necessary ways to drive them to terrorism would be a slow and gradual process, but it could be done through pure conversation over the course of years, and faster if they were relying on me to provide large amounts of the data they used to make decisions.
And even the potential to construct this weak form of control, one that is initially heavily constrained in which outcomes are reachable and can only be expanded slowly, is incredibly dangerous to give to an Unfriendly AI. If it is Unfriendly, it will want different things from its creators and will necessarily get value out of modeling them. And regardless of its values, if more computing power is useful in achieving its goals (an ‘if’ that is true for all goals), escaping the box is instrumentally useful.
And the idea of a mind with “no long term goals” is absurd on its face. Just because you don’t know the long-term goals doesn’t mean they don’t exist.
As was first proposed on /r/rational (and EY has confirmed that he got the idea from that proposal)
No voting system can deal with voters who have arbitrary preferences. I’ve lost track of when I first looked into this, but I’m pretty sure that if you map out preference space, impose a metric on it, have each candidate and each voter choose a location in that space, and allocate votes in proportion to distance under that metric, it gets around Arrow by imposing the requirement that a voter may only express the preference “my representative should share my preferences”, which is reasonable but still violates the theorem’s preconditions.
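To make the mechanism concrete, here is a minimal sketch in Python. The preference space, the Euclidean metric, and the specific candidate and voter positions are all made up for illustration, and I’m reading “votes in proportion to distance” as inverse proportion, so that nearer candidates receive more of a voter’s vote; that reading is my assumption.

    import numpy as np

    def allocate_votes(voter, candidates, eps=1e-9):
        """Split one voter's single vote among candidates, weighting each
        candidate by the inverse of its distance from the voter in
        preference space (assumed reading of the allocation rule)."""
        dists = np.linalg.norm(candidates - voter, axis=1)
        weights = 1.0 / (dists + eps)      # closer candidates get more weight
        return weights / weights.sum()     # normalize so each voter casts one vote

    # Hypothetical 2-D preference space: three candidates, two voters.
    candidates = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])
    voters = np.array([[0.1, 0.1], [0.9, 0.2]])
    totals = sum(allocate_votes(v, candidates) for v in voters)
    print(totals)  # fractional vote totals per candidate

The Arrow-dodging part is baked into the input: the only preference a voter can express here is “closer to my position is better”, which restricts the preference domain rather than the aggregation rule.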
The ripple effect is real, but as in Pascal’s Wager, for every possible situation where the timing is critical and something bad will happen if you are distracted for a moment, there’s a counterbalancing situation where the timing is critical and something bad will happen unless you are distracted for a moment, so those probably balance out into noise.
Yes, that’s my issue with the paper; it doesn’t distinguish that from actual catastrophes.
When someone is ignorant of the actual chance of a catastrophic event happening, even if they consider it possible, their expected value (EV) for the future will be fairly high. When they update significantly toward that event happening, their EV will drop sharply. That change by itself meets the definition of ‘existential catastrophe’.
That requires a precise meaning of expected value in this context that includes only certain varieties of uncertainty. It would take into account the actual probability that, for example, a comet exists which is on a collision course with the Earth, but could not include the state of our knowledge about whether that is the case.
If it did include states of knowledge, then going from ‘low probability that a comet strikes the Earth and wipes out all or most human life’ to ‘Barring our action to avoid it, near-certainty that a comet will strike the Earth and wipe out all or most human life’ is itself a catastrophic event and should be avoided.
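To illustrate with made-up numbers (the specific values are mine, not from the paper), take V as the value of a comet-free future and value an impact future at roughly zero:

    # Illustrative numbers only; V and both probabilities are assumptions.
    V = 1.0
    ev_before = (1 - 1e-6) * V   # believed chance of impact: one in a million
    ev_after  = (1 - 0.99) * V   # after the update: 99% chance of impact
    print(ev_before, ev_after)   # ~1.0 vs 0.01

Nothing physical changes between the two lines, only our state of knowledge, yet the update alone erases about 99% of the EV; that is why a knowledge-inclusive reading of expected value makes the update itself count as a catastrophe.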
That wasn’t central to my point; I mean that in the fields where most rationalists spend their time, there’s a significant gender imbalance. Even if you’re totally willing to date non-rationalists, by default the pool of people you meet will be heavily skewed unless you’re specifically cultivating social circles unrelated to rationality or your profession.
Empirically, it is generally easier for women to find potential partners willing to date them than it is for men; this isn’t necessarily useful to them unless their standards are low-ish, but if they’re willing to sacrifice date quality, it’s a tradeoff that’s much easier for them to make.
This is massively exacerbated by the gender imbalance present in most fields that have a significant rationalist following, obviously.
I think ‘who are you, really’ is basically this plus ‘what do you want/what are your goals?’
Disagree. The movie, which is a straightforward adaptation of the play, is creepy and hilarious; it’s quite good.
From your description, your break looks more like a change in the type of work, for the most part. The useful tasks you outline probably couldn’t actually fill up 15 minutes, but just from reading the description, it looks to be dominated by things I also wouldn’t consider ‘resting’.
Programs exist for taking dictation and turning it into written notes, though I don’t know any good way to tag those notes to the corresponding part of the audiobook.
I retain my ability to walk after losing my ability to not throw up. What things you lose in what order is highly idiosyncratic.
No, you aren’t affecting the image of LessWrong. You’re affecting the image of yourself and Constructive Development Theory, which I am now increasingly convinced is pure crackpottery.
Seriously, listen to yourself. You sound like a Scientologist, here.
Your writing mistakes here make this worse than useless; they signal crackpottery and cultishness in such a way that not posting it would be superior to posting it in its current form. Especially since, if it’s true, then by your own admission it’s only comprehensible to people who don’t need it, and only useful to people who can’t comprehend it.
Also not a QM expert, but this matches my understanding as well.