Open thread, September 2-8, 2013
If it’s worth saying, but not worth its own post (even in Discussion), then it goes here.
There has recently been some speculation that life started on Mars, and then got blasted to earth by an asteroid or something. Molybdenum is very important to life (eukaryote evolution was delayed by 2 billion years because it was unavailable), and the origin of life is easier to explain if Molybdenum is available. The problem is that Molybdenum wasn’t available in the right time frame on Earth, but it was on Mars.
Anyway, assuming this speculation is true, Mars had the best conditions for starting life, but Earth had the best conditions for life existing, and it is unlikely conscious life would have evolved without either of these planets being the way they are. Thus, this could be another part of the Great Filter.
Side note: I find it amusing that Molybdenum is very important in the origin/evolution of life, and is also element 42.
As someone pointed out to me when mentioning this to them, to be a candidate for Great Filter there would need to be something intrinsic about how planets are formed that cause these two types of environments to be mutually exclusive, else it seems like there isn’t sufficient reduction in probability of their availability. Is this actually the case? Perhaps user:CellBioGuy can elucidate.
Well it took a while for that summoning to actually take effect.
The point I was making is that this is not necessarily a strong contender (as in many orders of magnitude) because of two things. First, all stars slowly increase in brightness as they age after settling down into their main sequence phase, and their habitable zones thus sweep through the inner system over time (though this effect is much stronger for larger stars because they age and change faster). Second, it’s probable that most young smaller planets tend to have much more in the way of atmospheres than later in their lives, just due to the fact that they are more geologically active then and haven’t had time for the light molecules to be blasted away. And if terrestrial planets follow any sort of a power rule in their distribution of masses, there should be multiple Mars-size planets for every Earth-sized planet.
At this point I would say that any speculation on the exact place and time of the origin of life is premature, that there’s nothing to suggest that it didn’t happen on Earth, but that there is little to suggest that it couldn’t have happened elsewhere within our own solar system either, even if we have little reason to think it had to (besides adding the necessity that it later moved to the very clement surface of the Earth, the only place in the solar system that can support a big high-biomass biosphere like ours).
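To make the last point concrete, here is a minimal sketch, assuming purely for illustration that the number of terrestrial planets per unit mass follows dN/dm ∝ m^(-alpha). The exponent values, the "half to double the mass" bins, and the function names are all assumptions made up for this example, not measured quantities. Under this model the ratio works out to MARS_MASS**(1 - alpha), so there are "multiple Mars-size planets per Earth-size planet" only when alpha is greater than 1.

```python
# Hypothetical illustration: ratio of "Mars-size" to "Earth-size" planets
# under an ASSUMED power-law mass distribution dN/dm ∝ m**(-alpha).
import math

MARS_MASS = 0.107   # approximate mass of Mars, in Earth masses
EARTH_MASS = 1.0


def count_in_bin(center, alpha, width_factor=2.0):
    """Relative number of planets with mass in [center/width, center*width].

    This is the integral of m**(-alpha) over the bin (up to a constant).
    """
    lo, hi = center / width_factor, center * width_factor
    if math.isclose(alpha, 1.0):
        return math.log(hi / lo)
    return (hi ** (1 - alpha) - lo ** (1 - alpha)) / (1 - alpha)


def mars_per_earth(alpha):
    """Mars-size planets per Earth-size planet for a given assumed exponent."""
    return count_in_bin(MARS_MASS, alpha) / count_in_bin(EARTH_MASS, alpha)


if __name__ == "__main__":
    for alpha in (1.0, 1.5, 2.0):  # illustrative exponents only
        print(f"alpha={alpha}: {mars_per_earth(alpha):.1f} Mars-size per Earth-size")
```

The steeper the assumed distribution, the more small planets per large one; with alpha equal to 1 the two bins hold equal numbers.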
I honestly don’t know much about the proposed molybdenum connection. Some cursory looking about the internet suggests that molybdenum is necessary for efficient fixation of nitrogen from the air into organic molecules by nitrogenase, the enzyme that does most of that biological activity on Earth. I would be surprised though if that were the only way it could go, rather than just the way it went here...
EDIT: upon further looking around, I am worried that the proposed molybdenum connection could be correlation rather than causation. Most sources claiming that the presence of lots of soluble molybdenum is a prerequisite for a big complicated biosphere seem to be looking for a reason for this to be the case and not talking much about the simpler possibility that it is simply correlated with the time at which the deeper water not immediately touching the atmosphere moved from a reducing environment to an oxidizing environment in which the soluble forms are more stable...
EDIT 2: By the way, seriously, do not listen to the stuff in the ‘journal’ of cosmology that Robin Hanson periodically uncritically posts about panspermia between star systems and early-universe biogenesis and diatoms in meteors. It’s complete bullshit.
(Are you Adele Lack Cotard?)
Like in Synecdoche, New York? No… it is an abbreviation of my real name.
The ancient Stoics apparently had a lot of techniques for habituation and changing cognitive processes. Some of those live on in the form of modern CBT. One of the techniques is to write a personal handbook with advice and sayings to carry around at all times, so as never to be without guidance from a calmer self. Indeed, Epictetus advises learning this handbook by rote to further internalisation. So I plan to write such a handbook for myself, once in long form with anything relevant to my life and lifestyle, and once in a short form that I update with things that are difficult at that time, be it strong feelings or being deluded by some biases.
In this book I intend to include a list of all known cognitive biases and logical fallacies. I know that some biases are helped by simply knowing them, does anyone have a list of those? And should I complete the books or have a clear concept of their contents, are you interested in reading about the process of creating one and possible perceived benefits?
I’m also interested in hearing from you again about this project if you decide to not complete it. Rock on, negative data!
Though lack of motivation or laziness is not a particularly interesting answer.
I have found “I thought X would be awesome, and then on doing X realized that the costs were larger than the benefits” to be useful information for myself and others. (If your laziness isn’t well modeled by that, that’s also valuable information for you.)
(mild exaggeration) Has anyone else transitioned from “I only read Main posts” to “I nearly only read Discussion posts” to “actually I’ll just take a look at the open thread and people who responded to what I wrote” during their interactions with LW?
To be more specific, is there a relevant phenomenon about LW or is it just a characteristic of my psyche and history that explain my pattern of reading LW?
I read the sequences and a bunch of other great old main posts but now mostly read discussion. It feels like Main posts these days are either repetitive of what I’ve read before, simply wrong or not even wrong, or decision theory/math that’s above my head. Discussion posts are more likely to be novel things I’m interested in reading.
This describes how my use of LW has wound up pretty accurately.
Selection bias alert: asking people whether they have transitioned to reading mostly discussion and then to mostly just open threads in an open thread isn’t likely to give you a good perspective on the entire population, if that is in fact what you were looking for.
There would be far more selection bias if he asked about it outside an open thread, though.
Really? Why?
Because he’s asking about people who only read the open thread. Here he could get response from the people who do read LW in general, inclusive of the open thread, and people who read only the open thread (he’ll miss the people who don’t read the open thread). Outside the open thread, he gets no response at all from people who only read the open thread.
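The argument above can be sketched as a toy model. The reader counts and group names below are made up for illustration (they are not LW data); the only thing taken from the thread is the structure: a question reaches only the people who read the venue where it is posted.

```python
# Toy model of the selection effect. All numbers are hypothetical.
READERS = {
    "main_only": 500,       # never see the open thread
    "main_and_open": 300,   # read LW in general, including the open thread
    "open_only": 200,       # read only the open thread
}

# Which reader groups can see a question posted in each venue.
SEES = {
    "open_thread": {"main_and_open", "open_only"},
    "main": {"main_only", "main_and_open"},
}


def observed_share(venue, group="open_only"):
    """Share of `group` among readers able to respond in `venue`."""
    reachable = {g: n for g, n in READERS.items() if g in SEES[venue]}
    total = sum(reachable.values())
    return reachable.get(group, 0) / total


true_share = READERS["open_only"] / sum(READERS.values())

if __name__ == "__main__":
    print(f"true share of open-thread-only readers: {true_share:.2f}")
    print(f"seen when asking in the open thread:    {observed_share('open_thread'):.2f}")
    print(f"seen when asking in Main:               {observed_share('main'):.2f}")
```

With these made-up numbers, asking in the open thread overstates the open-thread-only group, while asking elsewhere misses it entirely, which is the stronger of the two distortions for this particular question.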
I see what you mean.
I read the Sequences as they were posted; Main posts now rarely hold my interest the same way. Eliezer’s writing is just better than most people’s.
Honestly, I don’t know why Main is even an option for posting. It should really be just an automatically labeled/generated “Best of LW” section, where Discussion posts with, say, 30+ karma are linked. This is easy to implement, and easy to do manually using the Promote feature until it is. The way it is now, it’s mostly by people thinking that they are making an important contribution to the site, which is more of a statement about their ego than about quality of their posts.
I predict that some people will have been through the sequences, which are Main posts, but then mainly cared about discussion. I suspect it has to do with Morning Newspaper Bias—the bias of thinking that new stuff is more relevant, when actually it is just pointless to read most of the time, only scrambles your mind, and loses value very quickly.
The lower the barrier to entry, the more the activity. Thus, more posts are on Discussion. My hypothesis is that this has worked well enough to make Discussion where stuff happens. c.f. how physics happens on arXiv these days, not in journals. (OTOH, it doesn’t happen on viXra, whose barrier to entry may be too low.)
I’ve definitely noticed this in my use of LW. I find that the open threads/media threads with their consistent high-quality novelty in a wide range of subject areas are far more enjoyable than the more academic main threads. Decision theory is interesting, but it’s going to be hard to hold my attention for a 3,000 word post when there are tasty 200-word bites of information over here.
Well, chat’s always more fun.
I’ll admit that much of the main sequences is too heavy to understand without prior knowledge, so I find discussions much easier to take in, and many times I end up reading a sequence because it was posted in a discussion comment. For me discussion posts are like the gateway to Main.
My experience is similar. I read the sequences as they were published on OB, then when the move over to LW happened I just subscribed to the RSS feed and only read Promoted posts for quite a few years. Only about a year ago I actually signed up for an account here and started posting and reading Discussion and the Open Thread.
Background: “The genie knows, but doesn’t care” and then this SMBC comic.
The joke in that comic annoys me (and it’s a very common one on SMBC, there must be at least five there with approximately the same setup). Human values aren’t determined to align with the forces of natural selection. We happen to be the product of natural selection, and, yes, that made us have some values which are approximately aligned with long-term genetic fitness. But studying biology does not make us change our values to suddenly become those of evolution!
In other words, humans are a ‘genie that knows, but doesn’t care’. We have understood the driving pressures that created us. We have understood what they ‘want’, if that can really be applied here. But we still only care about the things which the mechanics of our biology happened to have made us care about, even though we know these don’t always align with the things that ‘evolution cares about.’
(Please if someone can think of a good way to say this all without anthropomorphising natural selection, help me. I haven’t thought enough about this subject to have the clarity of mind to do that and worry that I might mess up because of such metaphors.)
For more on this topic, see for example these posts:
Evolutionary Psychology
Thou Art Godshatter
Rebelling Within Nature
The Gift We Give To Tomorrow
Anyone tried to use the outside view on our rationalist community?
I mean, we are not the first people on this planet who tried to become more rational. Who were our predecessors, and what happened to them? Where did they succeed and where did they fail? What lessons can we take from their failures?
The obvious reply will be: No one has tried doing exactly the same thing as we are doing. That’s technically true, but that’s a fully general excuse against using outside view, because if you look into enough details, no two projects are exactly the same. Yet it is experimentally proved that even looking at sufficiently similar projects gives better estimates than just using the inside view. So, if there was no one exactly like us, who was the most similar?
I admit I don’t have data on this, because I don’t study history, and I have no personal experience with Objectivists (which are probably the most obvious analogy). I would probably put Objectivists, various secret societies, educational institutions, or self-help groups into the reference class. Did I miss something important? The common trait is that those people are trying to make their thinking better, avoid some frequent faults, and teach other people to do the same thing. Who would be your candidate for the reference group? Then, we could explore them one by one and guess what they did right and what they did wrong. Seems to me that many small groups fail to expand, but on the other hand, the educational institutions that succeed to establish themselves in the society, become gradually filled with average people and lose the will to become stronger.
I’ve seen a few attempts, mostly from outsiders. The danger involved there is that an outsider has difficulty picking the right reference class: you don’t know how much they know about you, and how much they know about other things.
The things that the outside view has suggested we should be worried about that I remember (in rough order of frequency):
Being a cult.
Being youth-loaded.
Optimizing for time-wasting over goal-achieving.
Here are two critiques I remember from insiders that seem to rely on outside view thinking: Yvain’s Extreme Rationality: It’s Not That Great, patrissimo’s Self-Improvement or Shiny Distraction: Why Less Wrong is anti-Instrumental Rationality. Eliezer’s old posts on Every Cause Wants To Be A Cult and Guardians of Ayn Rand also seem relevant. (Is there someone who keeps track of our current battle lines for cultishness? Do we have an air conditioner on, and are we optimizing it deliberately?)
One of the things that I find interesting is in response to patrissimo’s comment in September 2010 that LW doesn’t have enough instrumental rationality practice, Yvain proposed that we use subreddits, and the result was a “discussion” subreddit. Now in September 2013 it looks like there might finally be an instrumental rationality subreddit. That doesn’t seem particularly agile. (This is perhaps an unfair comparison, as CFAR has been created in the intervening time and is a much more promising development in terms of boosting instrumental rationality, and there are far more meetups now than before, and so on.)
There’s also been a handful of “here are other groups that we could try to emulate,” and the primary one I remember was calcsam’s series on Mormons (initial post here, use the navigate-by-author links to find the others). The first post will be particularly interesting for the “outside view” analysis because he specifically discusses the features of the LDS church and LW that he thinks puts them in the same reference class (for that series of posts, at least).
The reason why I asked was not just “who can we be pattern-matched with?”, but also “what can we predict from this pattern-matching?”. Not merely to say “X is like Y”, but to say “X is like Y, and p(Y) is true, therefore it is possible that p(X) is also true”.
Here are two answers pattern-matching LW to a cult. For me, the interesting question here is: “how do cults evolve?”. Because that can be used to predict how LW will evolve. Not connotations, but predictions of future experiences.
My impression of cults is that they essentially have three possible futures: Some of them become small, increasingly isolated groups, that die with their members. Others are viral enough to keep replacing the old members with new members, and grow. The most successful ones discover a way of living that does not burn out their members, and become religions. -- Extinction, virality, or symbiosis.
What determines which way a cult will go? Probably it’s the compatibility of long-term membership with ordinary human life. If it’s too costly, if it requires too much sacrifice from members, symbiosis is impossible. The other two choices probably depend on how much effort the group puts into recruiting new members.
Having too many young people is some evidence of incompatibility. Perhaps the group requires a level of sacrifice that a university student can pay, but an employed father or mother with children simply cannot. What happens to the members who are unable to give the sacrifice? Well… in LW community nothing happens to anyone. How boring! It’s not like I will become excommunicated if I stop reading the website every day. (Although, I may be excommunicated from the “top contributors, 30 days” list.) So the question is whether members who don’t have too much time can still find the community meaningful, or whether they will leave voluntarily. In other words: Is LessWrong useful for an older person with a busy family life? -- If yes, symbiosis is possible; if no, it’s probably virality as long as the website and rationality seminars keep attracting new people, or extinction if they fail to.
In this light, I am happy about Gunnar’s recent articles, and I hope he will not be alone. Because the cult (or Mormonism) analogy suggests that this is the right way to go, for long-term sustainability.
The other analogy, with shining self-improvement seminars, seems more worrying to me. In self-improvement seminars, the speakers are not rewarded for actually helping people; they are rewarded for sounding good. (We know the names of some self-help gurus… but don’t know the names of people who became awesome because of their seminars. Their references are all about their image, none about their product.) Is rationality advice similar?
This is an old problem, actually the problem discussed in the oldest article on LessWrong: if we can’t measure rationality, we can’t measure improvements in rationality… and then all we can say about CFAR rationality lessons is that they feel good. -- Unless perhaps the questionnaires given to people who did and didn’t attend rationality minicamps showed some interesting results.
And by the way, whether we have or don’t have an applied rationality subreddit, doesn’t seem too important to me. The important thing is, whether we have or don’t have applied rationality articles. (Those articles could as well be posted in Main or Discussion. And the new subreddit will not generate them automatically.)
Agreed. One of the reasons why I wrote a comment that was a bunch of links to other posts is because I think that there is a lot to say about this topic. Just “LW is like the Mormon Church” was worth ~5 posts in main.
A related question: is LessWrong useful for people who are awesome, or just people who want to become awesome? This is part of patrissimo’s point: if you’re spending an hour a day on LW instead of an hour a day exercising, you may be losing the instrumental rationality battle. If someone who used to be part of the LW community stops posting because they’ve become too awesome, that has unpleasant implications for the dynamics of the community.
I was interested in that because “difference between the time a good idea is suggested and the time that idea is implemented” seems like an interesting reference class.
Isn’t this a danger that all online communities face? Those who procrastinate a lot online get a natural advantage against those who don’t. Thus, unless the community is specifically designed against that (how exactly?), the procrastinators will become the elite.
(It’s an implication: Not every procrastinator becomes a member of elite, but all members of elite are procrastinators.)
Perhaps we could make an exception for Eliezer, because for him writing the hundreds of articles was not procrastination. But unless writing a lot of stuff online is one’s goal, then procrastination is almost a necessity to get a celebrity status on a website.
Then we should perhaps think about how to prevent this effect. A few months ago we had some concerned posts against “Eternal September” and stuff. But this is more dangerous, because it’s less visible, it is a slow, yet predictable change, towards procrastination.
Yes, which is I think a rather good support for having physical meetups.
Agreed.
Note that many of the Eternal September complaints are about this, though indirectly: the fear is that the most awesome members of a discussion are the ones most aggravated by newcomers, because the distance between them and newcomers is larger than the distance between a median member and a newcomer. The most awesome people also generally have better alternatives, and thus are more sensitive to shifts in quality.
Supporting this, I’ll note that I don’t see many posts from, say, Wei Dai or Salamon in recent history, though as I joined all of a month ago, take that with a dish of salt.
I wonder if something on the MIRI/CFAR end would help? Incentives on the actual researchers to make occasional (not too many, they do have more important things to do) posts on LessWrong would probably alleviate the effect.
Perhaps to some degree, different karma coefficients could be used to support what we consider useful on reflection (not just on impulse voting). For example, if a well-researched article generated more karma than a month of procrastinating while writing comments...
There is some support for this: articles in Main get 10× more karma than comments. But 10 is probably not enough, and also it is not obvious what exactly belongs to Main; it’s very unclearly defined. Maybe there could be a Research subreddit where only scientific-level articles are allowed, and there the karma coefficient could be pretty high. (Alternatively, to prevent karma inflation, the karma from comments should be divided by 10.)
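As a rough worked example of the coefficient idea, here is a small sketch. The upvote counts are invented for illustration, and `weighted_karma` and `breakeven_coefficient` are hypothetical helper names, not part of the LW codebase; the only figure taken from the thread is the 10× multiplier for Main.

```python
# Hypothetical sketch of karma coefficients. Upvote counts are made up.

def weighted_karma(upvotes, coefficient):
    """Karma credited for `upvotes` received at a given coefficient."""
    return upvotes * coefficient


def breakeven_coefficient(article_upvotes, comment_upvotes):
    """Article coefficient at which one article equals the comment karma
    (with comments counted at coefficient 1)."""
    return comment_upvotes / article_upvotes


# One well-researched Main article vs. a month of commenting (assumed numbers).
article = weighted_karma(upvotes=30, coefficient=10)
comments = weighted_karma(upvotes=400, coefficient=1)

if __name__ == "__main__":
    print("article karma:", article)
    print("comment karma:", comments)
    print("break-even coefficient:", round(breakeven_coefficient(30, 400), 1))
```

With these assumed numbers the 10× coefficient still leaves the month of comments ahead, which is the sense in which 10 might not be enough. Multiplying article karma by 10 or dividing comment karma by 10 gives the same relative ranking; only the second avoids inflating totals.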
I don’t think that “sounding good” is an accurate description of how people in the personal development field succeed.
Look at Tony Robbins, who is one of the most successful in the business. When you ask most people whether walking on hot coals is impressive, they would tell you that it is. Tony manages to get thousands of people in a seminar to walk over hot coals.
Afterwards they go home and tell their friends about how they walked over hot coals. That impresses people, and more people come to his seminars.
It is not only that his talk sounds good, but that he is able to provide impressive experiences.
On the other hand, his success is also partly about being very good at building a network marketing structure that works.
But in some sense that’s not much different from the way universities work. The evidence that universities actually make people successful in life isn’t that strong.
I don’t think so. If you are a Scientologist and believe in Xenu, that reduces your compatibility with ordinary human life. At the same time, it makes you more committed to the group if you are willing to sacrifice something to belong.
Opus Dei members wear a cilice to make themselves uncomfortable to show that they are committed.
I think the fact that you don’t see very many people as members of groups that need a lot of commitment is a feature of the 20th century, when mainstream society had institutions such as television that were good at presenting a certain culture in which everyone in a country had a shared identity.
At the moment all sorts of groups that require more commitment from their members, like the Amish or the LDS, seem to grow. It could be that in a hundred years we have a lot more people as members of groups that require sacrifice than we have now.
I’d like to point out here a reference class that includes, if I understood things right: the original Buddhist movement, the academic community of France around the revolution, and the ancient Greek philosophical tradition. More examples and a name for this reference class would be welcome.
I include this mainly to counterbalance the bias towards automatically looking for the kind of cynical reference classes typically associated with and primed by the concept of the outside view.
Looking at the reference class examples I came up with, there seems to be a tendency towards having huge geniuses at the start that nobody later could compete with, which led after a few centuries to dissolving into spinoff religions dogmatically accepting or even perverting the original ideals.
DISCLAIMER: This is just one reference class and the conclusions reached from that reference class. It was constructed by trying to compensate for a predicted bias, which tends to lead to even worse bias on average.
Thank you! This is the reference class I was looking for, so it is good to see someone able to overcome the priming done by, uhm, not exactly the most neutral people. I had a feeling that something like this is out there, but specific examples did not come into my mind.
The danger of not being able to “replicate” Eliezer seems rather realistic to me. Sure, there are many smart people in CFAR, and they will continue doing their great work… but would they be able to create a website like LW and attract many people if they had to start from zero or if for whatever reason the LW website disappeared? (I don’t know how MIRI works, so I cannot estimate how replaceable Eliezer would be there.) Also, for spreading x-rationality to other countries, we need local leaders there, if we want to have meetups and seminars, etc. There are meetups all over the world, but they are probably not the same thing as the communities in Bay Area and New York (although the fact that two such communities exist is encouraging).
I think the cult view is valuable for looking at some issues.
When you have someone asking whether he should cut ties with his nonrational family members, it is valuable to keep in mind that that’s culty behavior.
Normal groups in our society don’t encourage their members to cut family ties. Bad cults do those things. That doesn’t mean that there’s never a time where one should rationally advise someone to cut those ties, but one should be careful.
Given the outside view of how cults went to a place where people literally drank Kool-Aid, I think it’s important to encourage people to keep ties to people who aren’t in the community.
Part of why Eliezer banned the basilisk might have been that having it more actively around would push LessWrong in the direction of being a cult.
There’s already the promise for nearly eternal life in a FAI moderated heaven.
It is always useful to investigate where cult-like behaviors are useful and where they aren’t.
To maybe help others out and solve the trust bootstrapping involved, I’m offering for sale <=1 bitcoin at the current Bitstamp price (without the usual premium) in exchange for Paypal dollars to any LWer with at least 300 net karma. (I would prefer if you register with #bitcoin-otc, but that’s not necessary.) Contact me on Freenode as gwern.
EDIT: as of 9 September 2013, I have sold to 2 LWers.
Pardon me, but what is the trust bootstrapping involved?
Paypal allows clawbacks for months, hence it’s difficult to sell for Paypal to anyone who is not already in the -otc web of trust; but by restricting sales to high-karma LWers, I am putting their reputation here at risk if they scam me, which enables me to sell to them. Hence, they can acquire bitcoins & get bootstrapped into the -otc web of trust based on LW.
Liquid nitrogen user
Thank you. All I need is hand held spray thermos to make Australia a viable working vacation. I have a strong irrational aversion to spiders. This is much more acceptable than the home made flamer.
http://www.pnas.org/content/early/2013/08/21/1301888110
Does this also work with macaques, crows or some other animals that can be taught to use money, but didn’t grow up in a society where this kind of money use is taken for granted?
Not strictly the same, but there have been monkey money experiments. And the results are hilarious. www.zmescience.com/research/how-scientists-tught-monkeys-the-concept-of-money-not-long-after-the-first-prostitute-monkey-appeared/
Who is this and what has he done with Robin Hanson?
The central premise is in allowing people to violate patents if it is not “intentional”. While reading the article the voice in my head which is my model of Robin Hanson was screaming “Hypocrisy! Perverse incentives!” in unison with the model of Eliezer Yudkowsky which was also shouting “Lost Purpose!”. While the appeal to total invasive surveillance slightly reduced the hypocrisy concerns it at best pushes the hypocrisy to a higher level in the business hierarchy while undermining the intended purpose of intellectual property rights.
That post seemed out of place on the site.
This may be an odd question, but what (if anything) is known on turning NPCs into PCs? (Insert your own term for this division here, it seems to be a standard thing AFAICT.)
I mean, it’s usually easier to just recruit existing PCs, but …
I suspect that finding people on the borderline between the categories and giving them a nudge is part of the solution to this problem.
What do you need PCs to do that NPCs cannot do? Zeroing in on the exact quality needed may make the problem easier.
Take the leadership feat, and hope your GM is lazy enough to let you level them. More practically, is it a skills problem or, as I would guess, an agency problem? Can you impress on them the importance of acting vs not? Lend them the Power of Accountability? 7 Habits of Highly Effective People? Can you compliment them every time they show initiative? etc. I think the solution is too specific to individuals for general advice, nor do I know a general advice book beyond those in the same theme as those mentioned.
Heh.
Agency. I’ve just noticed how many people I interact with are operating almost totally on cached thoughts, and getting caught up in a lot of traps that they could avoid if they were in the correct frame of mind (ie One Of Us.) But you have to be … motivated correctly, I guess, in order to turn to rationalism or some other brand of originality. Goes my reasoning.
Yeah, could be. I figure it’s always possible someone already solved this, though, so I’d rather find there’s already a best practice than kick myself much later for reinventing the wheel ( or worse, giving up!)
Sometimes I even think that I would profit from having some cached thoughts that give me effective habits that I fulfill at every occasion without thinking too much.
When the alarm bell rings it would be good if I would have a cached thought that would make me automatically get up without thinking the decision through.
I don’t think the state of being paralysed because you killed all cached thoughts is particularly desirable. I think I spent too much time in that state in the last year ;)
I think it’s more a question of focusing your energy on questioning those cached thoughts that actually matter.
When it comes to agency, I think there are some occasions where I show a lot but others where I show little. Especially when you compare me to an average person, the domains in which I show my agency are different.
I can remember one occasion where I took more responsibility for a situation after reading the transition of McGonagall from NPC to PC in HPMOR.
I think that HPMOR is well written when it comes to installing the frame of mind you are talking about.
Oh, we evolved them for a reason. Heck, your brain almost certainly couldn’t function without at least some. But when people start throwing type errors whenever something happens and a cached thought doesn’t kick in, they could probably do with a little more original thought.
That said, there’s more to agency and PC-ness than cached thoughts. It was just particularly striking to see people around me fishing around for something familiar they knew how to respond to, and that’s what prompted me to wonder how much we knew about the problem.
The Travelling Salesman Problem
In loading trucks for warehouses, some OR guys I know ran into the opposite problem- they encoded all the rules as constraints, found a solution, and it was way worse than what people were actually doing. Turns out it was because the people actually loading the trucks didn’t pay attention to whether or not the load was balanced on the truck, or so on (i.e. mathematically feasible was a harder set of constraints than implementable because the policy book was harder than the actuality).
(I also don’t think it’s quite fair to call the OR approach ‘punting’, since we do quite a bit of optimization using heuristics.)
If anyone wants to teach English in China, my school is hiring. The pay is higher than the market rate and the management is friendly and trustworthy. Must have a Bachelor’s degree and a passport from an English-speaking country. If you are at all curious, PM me for details.
I have updated on how important it is for Friendly AI to succeed (more now). I did this by changing the way I thought about the problem. I used to think in terms of the chance of Unfriendly AI; this led me to assign a chance of whether a fast, self-modifying, indifferent or FAI was possible at all.
Instead of thinking of the risk of UFAI, I started thinking of the risk of ~FAI. The more I think about it the more I believe that a Friendly Singleton AI is the only way for us humans to survive. FAI mitigates other existential risks of nature, unknowns, human cooperation (Mutually Assured Destruction is too risky), as well as hostile intelligences; both human and self-modifying trans-humans. My credence – that without FAI, existential risks will destroy humanity within 1,000 years – is 99%.
Is this flawed? If not then I’m probably really late to this idea, but I thought I would mention it because it’s taken considerable time for me to see it like this. And if I were to explain the AI problem to someone who is uninitiated, I would be tempted to lead with “~FAI is bad” rather than “UFAI is bad”. Why? Because intuitively, the dangers of UFAI feel “farther” than those of ~FAI. First people have to consider whether or not AI is even possible, then consider why UFAI is bad; this is a future problem. Whereas ~FAI is now, it feels nearer, it is happening – we have come close to annihilating ourselves before and technology is just getting better at accidentally killing us, therefore let’s work on FAI urgently.
So you want a god to watch over humanity—without it we’re doomed?
As of right now, yes. However, I could be persuaded otherwise.
I find it unlikely that you are well calibrated when you put your credence at 99% for a 1,000 year forecast.
Human culture changes over time. It’s very difficult to predict how humans in the future will think about specific problems. We went in less than 100 years from criminalizing homosexual acts to lawful same sex marriage.
Could you imagine that everyone would adopt your morality in 200 or 300 years? If so, do you think that would prevent humanity from being doomed?
If you don’t think so, I would suggest you evaluate your own moral beliefs in detail.
Is there a name for taking someone being wrong on A as evidence of their being wrong on B? Is this a generally sound heuristic to have? In the case of crank magnetism, should I take someone’s crank ideas as evidence against an idea that is new and unfamiliar to me?
It’s evidence against them being a person whose opinion is strong evidence of B, which means it is evidence against B, but it’s probably weak evidence, unless their endorsement of B is the main thing giving it high probability in your book.
I don’t know if there’s a name for this, but I definitely do it. I think it’s perfectly legitimate in certain circumstances. For example, the more B is a subject of general dispute within the relevant grouping, and the more closely-linked belief in B is to belief in A, the more sound the heuristic. But it’s not a short-cut to truth.
For example, suppose that you don’t know anything about healing crystals, but are aware that their effectiveness is disputed. You might notice that many of the same people who (dis)believe in homeopathy also (dis)believe in healing crystals, that the beliefs are reasonably well-linked in terms of structure, and you might already know that homeopathy is bunk. Therefore it’s legitimate to conclude that healing crystals are probably not a sound medical treatment—although you might revise this belief if you got more evidence. On the other hand, note that reversed stupidity is not truth—healing crystals being bunk doesn’t indicate that conventional medicine works well.
The place where I find this heuristic most useful is politics, because the sides are well-defined—effectively, you have a binary choice between A and ~A, regardless of whether hypothetical alternative B would be better. If I stopped paying attention to current affairs, and just took the opposite position to Bob Crow on every matter of domestic political dispute, I don’t think I’d go far wrong.
I don’t know if there is a name for it, but there ought to be one, since this heuristic is so common: the reliability prior of an argument is the reliability of the arguer. For example, one reason I am not a firm believer in the UFAI doomsday scenarios is Eliezer’s love affair with MWI.
Yes, but in many cases it’s very weak evidence. Overweighing it leads to the “reversed stupidity” failure mode.
Bayes’ theorem to the rescue! Consider a crank C, who endorses idea A. Then the probability of A being true, given that C endorses it, equals the probability of C endorsing A, given that A is true, times the probability that A is true, over the probability that C endorses A.
In equations: P(A being true | C endorsing A) = P(C endorsing A | A being true)*P(A being true)/P(C endorsing A).
Since C is known to be a crank, our probability for C endorsing A given that A is true is rather low (cranks have an aversion to truth), while our probability for C endorsing A in general is rather high (i.e. compared to a more sane person). So you are justified in being more skeptical of A, given that C endorses A.
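A minimal sketch of the update described above, in Python. The function name and all the numbers are made up for illustration; the only thing taken from the comment is Bayes’ theorem itself, with P(C endorsing A) expanded over the two cases “A is true” and “A is false”.

```python
def posterior_given_endorsement(prior_a, p_endorse_if_true, p_endorse_if_false):
    """P(A | C endorses A) via Bayes' theorem.

    prior_a            -- P(A) before hearing from C
    p_endorse_if_true  -- P(C endorses A | A is true)
    p_endorse_if_false -- P(C endorses A | A is false)
    """
    # Total probability that C endorses A, over both cases.
    p_endorse = (p_endorse_if_true * prior_a
                 + p_endorse_if_false * (1 - prior_a))
    return p_endorse_if_true * prior_a / p_endorse


# Hypothetical crank: endorses true ideas 20% of the time, false ones 60%.
crank = posterior_given_endorsement(0.5, 0.2, 0.6)

# Hypothetical source whose endorsements are unrelated to truth.
noise = posterior_given_endorsement(0.5, 0.4, 0.4)
```

With these invented numbers the crank’s endorsement moves a 0.5 prior down to about 0.25, while the uninformative source leaves it at 0.5. The size of the shift depends entirely on how far apart the two endorsement rates are, which is why the same argument can yield only weak evidence when they are close.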
It’s a logical fallacy, but is something humans evolved to do (or didn’t evolve not to do), so may in fact be useful when dealing with humans you know in your group.
Somewhat related: The Correct Contrarian Cluster.
Horrifically misnamed.
ad hominem
Not that there’s anything wrong with that.
Are old humans better than new humans?
This seems to be a hidden assumption of cryonics / transhumanism / anti-deathism: We should do everything we can to prevent people from dying, rather than investing these resources into making more or more productive children.
The usual argument (which I agree with) is that “Death events have a negative utility”. Once a human already exists, it’s bad for them to stop existing.
So every human has a right to their continued existence. That’s a good argument. Thanks.
Complement it with the fact that it costs about 800 thousand dollars to raise a mind, and an adult mind might be able to create value at rates high enough to continue existing.
Makaulay Culkin and Haley Joel Osmend (or whatever spelling) notwithstanding, that is a good argument against children.
An adult, yes. But what about the elderly? Of course this is an argument for preventing the problems of old age.
Is it? It just says that you should value adults over children, not that you should value children over no children. To get one of these valuable adult minds you have to start with something.
How does that negative utility vary over time though? Because if it stays the same (or increases) then if we know now it’s impossible to live 3^^^3 years, then disutility from death sooner than that is counterbalanced (or more than that) by averted disutility from dying later, meaning decisions made are basically the same as if you didn’t disvalue death (or as if you valued it).
I think that part of the badness of death is the destruction of that person’s accumulated experience. Thus the negative utility of death does indeed increase over time. However this is counterbalanced by the positive utility of their continued existence. If someone lives to 70 rather than 50 then we’re happy because the 20 extra years of life were worth more than the worsening of the death event.
In this case, it seems like the best policy is cryopreserving then letting them stay dead but extracting those experiences and inserting them in new minds.
Which sounds weird when you say it like that, but is functionally equivalent to many of the scenarios you would intuitively expect and find good, like radically improving minds and linking them into bigger ones before waking them up since anything else would leave them unable to meaningfully interact with anything anyway and human-level minds are unlikely to qualify for informed consent.
So if Bob is cryopreserved, and I can res him for N dollars, or create a simulation of a new person and run them quickly enough to catch up a number of years equal to Bob’s age at death, for N − 1 dollars, I should spend all available dollars on the latter?
Edit: to clarify why I think this is implied by your answer, what this is doing is trading such that you gain a death at Bob’s current age, but gain a life of experience up to Bob’s current age. If a life ending at Bob’s current age is net utility positive, this has to be net utility positive too.
broadly: yes, though all available dollars is actually all available dollars (for making people), and you’re ignoring considerations like keeping promises to people unable to enforce them such as the cryopreserved or asleep or unconscious etc.
Assuming Rawls’s veil of ignorance, I would prefer to be randomly born in a world where a trillion people lead billion-year lifespans than one in which a quadrillion people lead million-year lifespans.
I agree, but is this the right comparison? Isn’t this framing obscuring the fact that in the trillion-people world, you are much less likely to be born in the first place, in some sense?
Let us try this framing instead: Assume there are a very large number Z of possible different human “persons” (e.g. given by combinatorics on genes and formative experiences). There is a Rawlsian chance of 1/Z that a new created human will be “you”. Behind the veil of ignorance, do you prefer the world to be one with X people living N years (where your chance of being born is X/Z) or the one with 10X people living N/10 years (where your chance of being born is 10X/Z)?
I am not sure this is the right intuition pump, but it seems to capture an aspect of the problem that yours leaves out.
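One way to put numbers on this framing, as a rough sketch: multiply the chance of being born by the years lived if born. This assumes that value is simply proportional to years lived, which the comment does not claim; the function name and the values of Z, X and N are invented for illustration.

```python
from fractions import Fraction


def expected_years(population, lifespan, possible_persons):
    """Chance of being born (population / Z) times years lived if born."""
    return Fraction(population, possible_persons) * lifespan


Z = 10**30             # hypothetical number of possible persons
X, N = 10**12, 10**9   # hypothetical population and lifespan in years

few_long = expected_years(X, N, Z)                        # X people, N years
many_short = expected_years(10 * X, Fraction(N, 10), Z)   # 10X people, N/10 years
```

Under that linear assumption the two worlds come out exactly equal, so any preference between them has to come from something the product leaves out, such as diminishing or increasing returns to a longer life.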
Rawls’s veil of ignorance + self-sampling assumption = average utilitarianism, Rawls’s veil of ignorance + self-indication assumption = total utilitarianism (so to speak)? I had already kind-of noticed that, but hadn’t given much thought to it.
Doesn’t Rawls’s veil of ignorance prove too much here though? If both worlds would exist anyway, I’d rather be born into a world where a million people lived 101 year lifetimes than a world where 3^^^3 people lived 100 year lifetimes.
So then, Rawls’s veil has to be modified such that you are randomly chosen to be one of a quadrillion people. In scenario A, you live a million years. In scenario B, one trillion people live for one billion years each, the rest are fertilized eggs which for some reason don’t develop.
I’d still choose B over A.
Would you? A million probably isn’t enough to sustain a modern economy, for example. (Although in the 3^^^3 case it depends on the assumed density since we can only fit a negligible fraction of that many people into our visible universe).
If the economies would be the same, then yes. Don’t fight the hypothetical.
I think “fighting the hypothetical” is justified in cases where the necessary assumptions are misleadingly inaccurate—which I think is the case here.
But compared to 3^^^3, it doesn’t matter whether it’s a million people, a billion, or a trillion. You can certainly find a number that is sufficient to sustain an economy and is still vastly smaller than 3^^^3, and you will end up preferring the smaller number for a single additional year of lifespan. Of course, for Rawls, this is a feature, not a bug.
Existing people take priority over theoretical people. Infinitely so. This should be obvious, as the reverse conclusion ends up with utter absurdities of the “Every sperm is sacred” variety.
Mad grin
Once a child is born, it has as much claim on our consideration as every other person in our light cone, but there is no obligation to have children. Not any specific child, nor any at all. Reject this axiom and you might as well commit suicide over the guilt of the billions of potential children you could have that are never going to be born. Right now.
Even if you stay pregnant till you die/never masturbate, this would effectively not help at all: each conception moves one potential from the space of “could be” to the space of “is”, but at the same time eliminates at least several hundred million other potential children from the possibility space. That is just how human reproduction works.
TL;DR: yes, yes they are. It is a silly question.
Does this mean that I am free to build a doomsday weapon that kills everyone born after September 4th 2013 100 years from now, if that gets me a cookie?
Not necessarily. It would merely be your obligation to have as many children as possible, while still ensuring that they are healthy and well cared for. At some point having an extra child will make all your children less well off.
Why is there a threshold at birth? I agree that it is a convenient point, but it is arbitrary.
Why should I commit suicide? That reduces the number of people. It would be much better to start having children. (Note that I am not saying that this is my utility function).
The “infinitely so” part seems wrong, but the idea is that 4D histories which include a sentient being coming into existence, and then dying, are dispreferred to 4D world-histories in which that sentient being continues. Since the latter type of such histories may not be available, we specify that continuing for a billion years and then halting is greatly preferable to continuing for 10 years then halting. Our degree of preference for such is substantially greater than the degree to which we feel morally obligated to create more people, especially people who shall themselves be doomed to short lives.
The switch from consequentialist language (“4D histories which include… are dispreferred”) to deontological language (“…the degree to which we feel morally obligated to create more people”) is confusing. I agree that saving the lives of existing people is a stronger moral imperative than creating new ones, at the level of deontological rules and virtuous conduct which is a large part of everyday human moral reasoning. I am much less clear that when evaluating 4D histories I assign higher utility to one with few people living long lives than to one with more people living shorter lives. Actually, I tend towards the opposite intuition, preferring a world with more people who live less (as long as their lives are still well worth living, etc.).
Not sure what part of this comment tree this belongs so just posting it here where it’s likely to be seen:
An image struck me: it’s not at all necessary that these tradeoffs are actually a thing once you dissolve the “person” abstraction. It’s possible that something like the following is optimal: half the universe is dedicated to searching the space of all experiences in order, starting with the highest utility/most meaningful/lowest hanging fruit. This is then aggregated, metadata is added, and it is sent to the other half, which is tiled with minimal context-experiencing units equivalent to individual people’s subjective whatever. In the end, you end up with the equivalent of having half the number of individual people you would have if that were your only priority, each having the utility of a single person with the entire future history of half the universe dedicated to it, including context of history.
That’s the best case scenario. It’s pretty certain SOME aspect or another of the fragile godshatter will disallow it, obviously.
Yea, this was basically pseud tangential musings.
If by “old humans” you mean healthy adults, yes. If you mean this, no. (IMO—YMMV.)
Death isn’t just a negative for the dead person—it also causes paperwork and expenses, destruction of relationships, and grief among the living.
This is true, but in my experience usually used to massage models that don’t consider death a disutility into giving the right answers. I can’t think of ever hearing this argument used for any other reason, in fact, in meatspace.
(Replying to this comment out of context on the Recent Comments.)
The context is someone asking whether it’s better to stop existing people from dying or just make new people.
Hmm. I guess I’m going to cautiously say “called it!”
Yes.
Because?
a level 5 character is more valuable than a level 1 character.
A person who is older has more to give the world and has been more invested in than a baby. They’re a lot less replaceable.
also i like em more.
Is there a good way to avoid HPMOR spoilers on prediction book?
Since PB users’ calibrations are not yet good enough to see the future, you can easily avoid MoR spoilers by subscribing to the email or RSS alerts for new chapters & reading them as appropriate.
This is the obvious solution, but I want to reread what I’ve currently read, and have some time to think about the story and try creating an accurate causal model of events and such in the story as I read new!Adele material (Eliezer says it’s supposed to be a solvable puzzle). I don’t have time to do this right now, so in the meantime, I try to avoid spoilers.
I used feed43 to create an rss feed out of recent predictions. Then I used feedrinse to filter out references to hpmor resulting in a safe feed. (Update: chaining unreliable services makes something even less reliable.)
You could do the same for the pages of recently judged or future or users you follow. I think feedrinse offers to merge feeds (into a “channel”) before or after doing the filtering. But if you find someone new and just want to click on the username, you’ll leave the safe zone. Even if you see someone you have processed, the username will take you to the unsafe page.
A better solution would be to write a greasemonkey script that modified each predictionbook page as you look at it.
The final feedrinse feed works in a couple of my browsers, but not chrome. Probably sending it through feedburner would fix it.
feed43 was finicky. The item search pattern was:
{_}
{_}{%}{%}
The regexp I used in feedrinse was /hp.?mor/
It is case insensitive and manages to eliminate “HP MoR:”, “[HPMOR]”, etc. It won’t work if they spell it out, or just predict “Harry is orange” without indicating which story they’re predicting about. In that case, someone will probably leave a hpmor comment, but this doesn’t see such comments.
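For anyone who wants to check what that pattern does and does not catch, here is a small Python sketch. It only mirrors the regexp quoted above with a case-insensitive flag; the helper name and the sample titles are invented, and nothing here talks to feed43 or feedrinse.

```python
import re

# Same pattern as in the comment above, matched case-insensitively.
HPMOR = re.compile(r"hp.?mor", re.IGNORECASE)


def safe_items(titles):
    """Keep only the titles the pattern does not flag as HPMOR-related."""
    return [t for t in titles if not HPMOR.search(t)]


# Hypothetical prediction titles, made up for illustration.
titles = [
    "HP MoR: chapter 100 posted by December",
    "[HPMOR] final arc finished in 2014",
    "Harry is orange",
    "Bitcoin above $200 by 2014",
]

filtered = safe_items(titles)
```

The `.?` allows at most one character between “hp” and “mor”, so “HP MoR” and “hp-mor” are caught, while a spelled-out title or an unlabelled prediction such as “Harry is orange” passes through, as the comment warns.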
If you are skilled in the art of Ruby, then yes. Otherwise, maybe. People (myself included) have been complaining about the lack of tagging/sorting system on PB for quite some time, but so far, no one has played the hero.
The following query is sexual in nature, and is rot13′ed for the sake of those who would either prefer not to encounter this sort of content on Less Wrong, or would prefer not to recall information of such nature about my private life in future interactions.
V nz pheeragyl va n eryngvbafuvc jvgu n jbzna jub vf fvtavsvpnagyl zber frkhnyyl rkcrevraprq guna V nz. Juvyr fur cerfragyl engrf bhe frk nf “njrfbzr,” vg vf abg lrg ng gur yriry bs “orfg rire,” juvpu V ubcr gb erpgvsl.
Sbe pynevsvpngvba, V jbhyq fnl gung gur trareny urnygu naq fgnovyvgl bs bhe eryngvbafuvc vf rkgerzryl uvtu; guvf chefhvg vf n znggre bs erperngvba naq crefbany cevqr, abg n arprffnel vagreiragvba gb fnir gur eryngvbafuvc.
V’ir nyernql frnepurq bayvar sbe nyy gur vasbezngvba V pna svaq ba vzcebivat gur dhnyvgl bs frk, ohg fhpu vasbezngvba vf birejuryzvatyl rvgure gnetrgrq ng oevatvat crbcyr jvgu fbzr frevbhf qrsvpvrapl va gurve frk yvirf hc gb gur yriry bs abeznyvgl be fngvfslvat gurz gung gur abez vf yrff fcrpgnphyne guna gurl guvax naq gurl qba’g unir gb yvir hc gb vasyngrq fgnaqneqf, engure guna crbcyr gelvat gb npuvrir frk jnl bhg ba gur sne raq bs gur oryy pheir, be gnetrgrq ng crbcyr jub jbhyqa’g xabj jung “rzcvevpny onpxvat” jnf vs lbh uvg gurz va gur snpr jvgu vg.
V’z nyernql snzvyvne jvgu gur zbfg boivbhf, ybj unatvat sehvg vagreiragvbaf fhpu nf “pbageby fgerff,” “qb xrtryf,” rgp, naq jr pbzzhavpngr nobhg bhe frkhny cersreraprf naq npgvivgvrf rkgrafviryl. V’z nyfb nppbhagvat sbe snpgbef fhpu nf gur rzbgvbany pbagrkg bs bhe rapbhagref naq ubezbany plpyrf. Jung V’z ybbxvat sbe ng guvf cbvag ner rkprcgvbany zrnfherf sbe guvatf yvxr envfvat zl frkhny fgnzvan gb hahfhny yriryf, vapernfvat ure yriry bs nebhfny naq/be frafvgvivgl, naq fb sbegu. Obgu purzvpny naq abapurzvpny zrnfherf ner npprcgnoyr, ohg V jbhyq yvxr gb nibvq nalguvat yvxryl gb pneel qnatrebhf fvqr rssrpgf, naq vs cbffvoyr V jbhyq cersre abg gb erfbeg gb guvatf gung jbhyq erdhver zr gb trg n cerfpevcgvba sebz n qbpgbe juvyr qvfpybfvat gung V’z hfvat vg ba n cheryl erperngvbany onfvf.
Nal nqivpr jbhyq or nccerpvngrq.
To be honest, you sound a bit like a person who made a billion dollars and now tries to crowd-source a way to make ten billions. :-)
Well, I’m flattered that you think my position is so enviable, but I also think this would be a pretty reasonable course of action for someone who made a billion dollars.
This book pbhyq uryc jvgu gur fgnzvan. Vg jbexrq sbe zl uhfonaq, jura ur gevrq vg n srj lrnef ntb.
Are the instructions anything simple enough that I could replicate them without needing to buy the entire book?
Maybe, but then I’d have to read it to find out, and I have many other books I’d like to read. Maybe you can find it in the library?
I’ll check; I’m pretty sure my own library doesn’t have a Sex section, but it might be in network.
Asking to order it would be pretty embarrassing, I have to admit, especially at my own library where a lot of the people who work there know me by name.
Dewey Decimal number 613.96, IIRC from my internet-deprived adolescence.
If you’re too cheap to spend $4 at amazon, pirate it.
Slow Sex seems to help at least some people move from good to great.
Does that entail sex literally done slowly? We could try it out, but that doesn’t seem to be to her preferences.
It involves learning to pay more attention as a meditative practice, but not (I think) a recommendation to always go slowly.
Practice makes perfect. I think a lot of good sex is intuitively reading your partner’s signals and ramping things up/down with good timing in response to them. I think this is something you might be able to learn via logos but I think it’s much more likely to be something you need to experience before you can get good at it. When to pull hair, when to thrust deeper, etc.
In general I and whoever I’m with have had more fun when I felt I had a good idea of what they wanted in the moment, which I think I’ve gotten better at mainly through practice.
I suspect that I can continue to improve with practice, but I’d like to be able to set out every option available to me on the table.
Even if I can attain the status of “best” without taking such extraordinary measures, this is something I’m genuinely competitive on, which at least to me means that simply taking first place isn’t sufficient if I can still see avenues to top myself.
People sometimes say that we don’t choose to be born. Is this false if I myself choose to have kids for the same reason my parents did (or at least to have kids if I was ever in the relevantly same situation?) If so, can I increase my measure by having more children for these reasons?
Technically yes, but obviously if this is part of your motivation for doing so then that’s a meaningful difference, unless your parents also understood TDT and had that as part of their reason. If in fact they did not (which they probably didn’t, since it wasn’t invented back then), then this answer is entirely useless.
Other forms of folk rule consequentialism (e.g. the Golden Rule) have existed for quite a long time.
Interesting point! This is quite a tricky problem that I’ve considered before. My current stance is that we need to know more of the specifics and causal history of how those kinds of rules are implemented in general before we can determine if they count, and there is also the possibility that once we’ve done that it turns out they “should” but our current formalizations won’t… This is an interesting, mostly unexplored (I think) subject that seems likely to spawn a fruitful discussion though.
Has anyone here read up through ch18 of Jaynes’ PT:LoS? I just spent two hours trying to derive 18.11 from 18.10. That step is completely opaque to me, can anybody who’s read it help?
You can explain in a comment, or we can have a conversation. I’ve got gchat and other stuff. If you message me or comment we can work it out. I probably won’t take long to reply, I don’t think I’ll be leaving my computer for long today.
EDIT: I’m also having trouble with 18.15. Jaynes claims that P(F|A_p E_aa) = P(F|A_p) but justifies it with 18.1… I just don’t see how that follows from 18.1.
EDIT 2: It hasn’t answered my question but there’s online errata for this book: http://ksvanhorn.com/bayes/jaynes/ Chapter 18 has a very unfinished feel, and I think this is going to help other confusions I get into about it
I’ve just looked and I have no idea either. If anyone wants to help there’s a copy of the book here.
EDIT: The numbers in that copy are off by 1 from the book. “18.10” = “18-9″ and so on.
Yeah, so to add some redundancy for y’all, here’s the text surrounding the equations I’m having trouble with.
The 18.10 to 18.11 jump I’m having trouble with is the one in this part of the text:
And equation 18.15, which I can’t justify, is in this part of the text:
A not-quite-rigorous explanation of the thing in 18.15:
E_aa is, by construction, only relevant to A. A_p was defined (in 18.1) to screen off all previous knowledge about A. So in fact, if we are given evidence E_aa but then given evidence A_p, then E_aa becomes completely irrelevant: it’s no longer telling us anything about A, but it never told us anything about anything else. Therefore P(F|A_p E_aa) can be simplified to P(F|A_p).
That’s not true though. By construction, every part of it is relevant to A.
That doesn’t mean it’s not relevant to anything else. For example, it could be in this Bayes net: E_aa ---> A ----> F. Then it’d be relevant to F.
Although... thinking about that Bayes net might answer other questions...
Hmm. Remember that Ap screens A from everything. I think that means that A’s only connection is to Ap—everything else has to be connected through Ap.
So the above Bayes net is really
Eaa --> Ap --> F, with another arrow from Ap to A.
Which would mean that Ap screens Eaa from F, which is what 18.15 says.
The above Bayes net represents an assumption that Eaa and F’s only relevance to each other is that they’re both evidence of A, which is often true I think.
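The screening-off claim for the chain Eaa --> Ap --> F can be checked numerically. The sketch below is a simplification: in Jaynes’ chapter Ap is indexed by a continuous p, whereas here every node is binary, and all the probabilities are invented. It only illustrates that, for any joint distribution that factorises along this chain, conditioning on Ap makes Eaa irrelevant to F.

```python
from itertools import product

# Hypothetical numbers for a three-node chain  E_aa -> A_p -> F  (all binary).
p_e = {0: 0.7, 1: 0.3}                    # P(E_aa)
p_ap_given_e = {0: {0: 0.8, 1: 0.2},      # P(A_p | E_aa), outer key is E_aa
                1: {0: 0.1, 1: 0.9}}
p_f_given_ap = {0: {0: 0.6, 1: 0.4},      # P(F | A_p), outer key is A_p
                1: {0: 0.25, 1: 0.75}}

# Joint distribution implied by the chain factorisation.
joint = {(e, ap, f): p_e[e] * p_ap_given_e[e][ap] * p_f_given_ap[ap][f]
         for e, ap, f in product((0, 1), repeat=3)}

NAMES = ("e", "ap", "f")


def prob(**fixed):
    """Sum the joint over all outcomes consistent with the fixed values."""
    return sum(p for outcome, p in joint.items()
               if all(outcome[NAMES.index(k)] == v for k, v in fixed.items()))


def f_given(**cond):
    """P(F=1 | cond), where cond fixes any of e and ap."""
    return prob(f=1, **cond) / prob(**cond)
```

Here `f_given(ap=1, e=0)`, `f_given(ap=1, e=1)` and `f_given(ap=1)` all agree, which is the 18.15-style statement, whereas `f_given(e=0)` and `f_given(e=1)` differ, so Eaa is relevant to F until Ap is known.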
Hmm. When I have some time I’m gonna draw Bayes nets to represent all of Jaynes’ assumptions in this chapter, and when something looks unjustified, figure out what Bayes net structure would justify it.
In fact, I skipped over this before but this is actually recommended in the comments of that errata page I posted:
Yeah, I don’t see how those make sense either.
A Singularity conference around a project financed by a Russian oligarch, seems to be mostly about uploading and ems.
Looks curious.
I learned about Egan’s Law, and I’m pretty sure it’s a less-precise restatement of the correspondence principle. Anyone have any thoughts on that similarity?
Sounds good to me, although that’s not what I would have guessed from a name like ‘correspondence principle’.
I suppose some minor difference is that this “law” is also applicable to meta-ethics, not just to physics. It’s probably worth adding a link to the standard terminology to the LW wiki page.
I found this interesting post over at lambda the ultimate about constructing a provably total (terminating) self-compiler. It looked quite similar to some of the stuff MIRI has been doing with the Tiling Agents thing. Maybe someone with more math background can check it out and see if there are any ideas to be shared?
The post: Total Self-Compiler via Superstitious Logics
This is the same basic idea as Benja’s Parametric Polymorphism, with N in the post corresponding to kappa from parametric polymorphism.
The “superstition” is:
And from the section in Tiling Agents about parametric polymorphism (recommended if you want to learn about parametric polymorphism):
Anyway, it’s interesting that someone else has a very similar idea for this kind of problem. But as mentioned in Tiling Agents, the “superstitious belief” seems like a bad idea for an epistemically rational agent.
It is neat that this problem is coming up elsewhere. It reminds me that MIRI’s work could be relevant to people working in other sub-fields of math, which is a good sign and a good opportunity.
An Open Letter to Friendly AI Proponents by Simon Funk (who wrote the After Life novel):
So, in other words, absolutely no engagement with the actual ideas/arguments of the people the ‘letter’ is addressed to.
Clarify?
He’s ignoring everything Friendly AI Proponents have said on the issue, and is attacking a strawman instead of the real reasons FAI people think it’s a problem.
Trying to understand here. What’s the strawman in this case?
Can you point me to an essay that addresses the points in this one?
He doesn’t really make any relevant points.
The closest is this:
Which is really just an assertion that you won’t get FOOM (I mean, no one thinks it’ll take less time than it takes you to hit Ctrl-C, but that’s just hyperbole for writing style). He doesn’t argue for that claim, he doesn’t address any of the arguments for FOOM (most notably and recently: IEM).
Ah, thanks, better understand your position now. I will endeavor to read IEM (if it isn’t too stocked with false presuppositions from the get go).
I agree the essay did not endeavor to disprove FOOM, but let’s say it’s just wrong on that claim, and that FOOM is really a possibility. Then are you saying you’d rather let the military AI go FOOM than something homebrewed? Or are you claiming that it’s possible to rein in military efforts in this direction (world round)? Or give me a third option if neither of those applies.
As to the ‘third option’, most work that I’m aware of falls into either educating people about AI risk, or trying to solve the problem before someone else builds an AGI. Most people advocate both.
FAI proponents (Of which I am one, yes) tend to say that, ceteris paribus, an AGI which is constructed without first ‘solving FAI’* will be ‘Unfriendly’, military or otherwise. This would be very very bad for humans and human values.
*There is significant disagreement over what exactly this consists of and how hard it will be.
Do you think people who can’t implement AGI can solve FAI?
The problem of Friendliness can be worked on before the problem of AGI has been solved, yes
Which do you think is more likely: That you will die of old age, or of unfriendly-AI? (Serious question, genuinely curious.)
I have lots of uncertainty around these sorts of questions, especially considering the timeline-dependency and how likely MIRI (and others) are to succeed at avoiding UFAI. Suffice to say that it’s not an obvious win for ‘old age’ (which for me is hopefully >50 years away).
Let’s imagine you solve FAI tomorrow, but not AGI. (I see it as highly improbable that anyone will meaningfully solve FAI before solving AGI, but let’s explore that optimistic scenario.) Meanwhile, various folks and institutions out there are ahead of you in AGI research by however much time you’ve spent on FAI. At least one of them won’t care about FAI.
I have a hard time imagining any outcome from that scenario that doesn’t involve you wishing you’d been working on AGI and gotten there first. How do you imagine the outcome?
“worked on” != “solved”
(In addition, MIRI claims that a FAI could be easier to implement than an AGI in general, i.e. that if you solve the philosophical difficulties regarding FAI, this also makes it easier to create an AGI in general. For example, MIRI’s specific most-likely scenario for the creation of an AGI is a sub-human AI that self-modifies to become smarter very quickly; MIRI’s research on modeling self-modification, while aimed at solving one specific problem that stands in the way of Friendliness, also has potential applications towards understanding self-modification in general.)
drethlin nailed it—If I counterfactually-had spent that time working on AGI, I wouldn’t have solved Friendliness, and (unless someone else had solved FAI without me) my AGI would be just as Unfriendly in expectation as the competitors.
If FAI is solved first, however, it increases the probability that the first AGI will be Friendly. Depending on the nature of the solution (how much of it is something that can be published so others can use it with their AGIs?), this could happen through AGI development by people already convinced of the problem, or it could be ‘added on’ to existing AGI projects.
See reply below to drethlin.
why would he wish that? His unfriendly AI that he’d been working on will probably just kill him.
Sigh.
Ok, I see the problem with this discussion, and I see no solution. If you understood AGI better, you would understand why your reply is like telling me I shouldn’t play with electricity because Zeus will get angry and punish the village. But that very concern prevents you from understanding AGI better, so we are at an impasse.
It makes me sad, because with the pervasiveness of this superstition, we’ve lost enough minds from our side that the military will probably beat us to it.
The other thing is: “Our Side” is not losing minds. People are going to try to make AGI regardless of friendliness. But almost no one anywhere has ever heard of AI friendliness and even fewer give a shit. That means that the marginal person working on friendliness is HUGELY more valuable. And if someone discovers friendliness, guess what? The military are going to want it too! Maybe someone actually insane would not, but any organization that has goals and cares about humans at all will be better off with a friendly AI than not.
“but almost no one anywhere has ever heard of AI friendliness”
Ok, if this is your vantage point, I understand better. I must hang in the wrong circles ’cause I meet far more FAI than AGI folks.
You shouldn’t play with radiation because you don’t understand it and trying to build a bomb might get you and everyone else killed. This isn’t a question of superstition, you fool, it’s a question of NOT just throwing a bunch of radioactive material in a pile to see what happens, only when you fuck it up you don’t just kill yourself like Marie Curie or possibly blow up a few square miles like if the Manhattan Project hadn’t been careful enough. You fuck over EVERYTHING.
Apologies for the provocative phrasing; I was (inadvertently) asking for a heated reply...
But to clarify the point in light of your response (which no doubt will get another heated reply, though honestly trying to convey the point w/out provoking...):
A pile of radioactive material is not a good analogy here. But I think its appearance here is a good illustration of the very thing I’m hoping to convey: There are a lot of (vague, wrong) theories of AGI which map well to the radioactive pile analogy. Just put enough of the ingredients together in a pile, and FOOM. But the more you actually work on AGI, the more you realize how heuristic, incremental, and data bound it is; how a fantastic solution to monkey problems (vision, planning, etc) confers only the weakest ability in symbolic domains, and that, for instance, NP problems most likely remain NP-hard regardless of intelligence, and their solutions are limited by time, space, and energy constraints, not cleverness. Can a hyper-intelligent AI improve upon hardware design, etc, etc? Sure! But the whole system (of progress) we’re speaking of is a large complex system of differential equations with many bottlenecks, at least some of which aren’t readily amenable to hyper-exponential change. Will there be a point where things are out of our (human) hands? Yes. Will it happen overnight? No.
The radioactive pile analogy fails because AGI will not be had by heaping a bunch of stuff in a pile—it will be had through extensive engineering and design. It will progress incrementally, and it will be bounded by resource constraints—for a long long time.
A better analogy might be to building a fusion reactor. Sure, you have to be careful, especially in the final design, construction, and execution of the full scale device, but there’s a huge amount of engineering to be done before you get anywhere near having to worry about that, and tons of smaller experiments that need to be proven first, and so on. And you learn as you go, and after years of work you start getting closer and you know a shitload about the technology and what its quirks and hazards are and what’s easy to control, and so on.
And when you’re there, well along the way, and you understand the technology to a deep level even if you haven’t quite figured out how to make the sustainable fusion reactor yet, it’s pretty insulting/annoying when someone who doesn’t have any practical(!) grasp of the matter comes along and tells you you shouldn’t be working on this because they haven’t figured out how to make it safe yet! And because they think that if you put too much stuff in a pile, it will go boom! (Sigh!) You’re on your way to making clean energy before the peak oil apocalypse or whatever, and they’re working against you (if only they knew what their efforts were costing the world).
What do you do there? They’re fearful because they don’t understand, and the particulars of their fear are, really, superstition (in the sense that they are not founded on a solid understanding, but quite specifically on a lack thereof). You want to say: get up to speed, and you can help us make this work and make it safe (and when you understand it better, you’ll start to actually understand how that might be done—and how much less of an explosive problem it is than you think). But GTF out of my way if you’re just going to pontificate from ignorance and try to dictate how I do my job from there. No matter how long you sofa-think about how to keep the pile-o-stuff from going bad-FOOM, your answers are never going to mesh with reality because you’ve got way too many false premises that need to be sifted out first (through actual experience in the topic). [And, sorry, but no matter how big Eliezer’s cloud of self-citations is, that’s just someone else’s sofa-think, not actual experience.]
Personally, I do not think FAI is a hard problem (a highly educated opinion, not offhand dismissal). But I also know that UAI is going to happen eventually (intentionally), no matter how many conferences y’all have. And I also know the odds are highest of all we’ll all die of old age because AI didn’t happen well enough soon enough. But I understand if you disagree.
How not hard is it? How long do you think it would take you to solve it?
I think it will be incidental to AGI. That is, by the time you are approaching human-level AGI it will be essentially obvious (to the sort of person who groks human-level AGI in the first place). Motivation (as a component of the process of thinking) is integral to AGI, not some extra thing only humans and animals happen to have. Motivation needs to be grokked before you will have AGI in the first place. Human motivational structure is quite complex, with far more ulterior motives (clan affiliation, reproduction, etc) than straightforward ones. AGIs needn’t be so burdened, which in many ways makes the FAI problem easier in fact than our human-based intuition might surmise. On the other hand, simple random variation is a huge risk: no matter the intentional agenda, there is always the possibility that a simple error will put that very abstract coefficient of feedback over unity, and then you have a problem. If AGI weren’t going to happen regardless, I might say it’s worthy of a debate now what the nature of that problem would be (but in that debate, I still say it’s not a huge problem: it’s not instantaneous FOOM, it’s time-to-unplug FOOM; and you have the advantage of other FAIs by then with full ability to analyze each other so you actually have a lot of tools available to put out fires long before they’re raging); but AGI is going to happen regardless, so the race is not FAI vs. AGI, but whether the first to solve AGI wants FAI or something else. And like I say, there is also the race against our own inevitable demise of old age (talk to anybody who’s been in the longevity community for > 20 years and you will learn they once had your optimism about progress).
Don’t get me wrong, FAI is not an uninteresting problem. My claim is quite simply that for the goals of the FAI community (which I have to assume includes your own long-term survival), y’all would do far better to be working (hard and seriously) on AGI than not. All of this sofa-think today will be replicated in short order by better-informed consideration down the road. And I aint sayin’ don’t think about it today—I’m saying find a realistic balance between FAI and AGI research that doesn’t leave you so far behind the game that your goals never get to matter, and I’m sayin’ that’s 99% AGI research and 1% FAI (for now). (And no, that doesn’t mean 99 people doing AGI and 1 doing FAI. My point is the 1 doing FAI is useless if they aren’t 99% steeped in AGI from which to think about FAI in the first place.)
Just to follow up, I’m seeing nothing new in IEM (or if it’s there it’s too buried in “hear me think” to find; Eliezer really would benefit from pruning down to essentials). Most of it concerns the point where AGI approaches or exceeds human intelligence. There’s very little to support concern for the long ramp up to that point (other than some matter of genetic programming, which I haven’t the time to address here). I could go on rather at length in rebuttal of the post-human-intelligence FOOM theory (not discounting it entirely, but putting certain qualitative bounds on it that justify the claim that FAI will be most fruitfully pursued during that transition, not before it), but for the reasons implied in the original essay and in my other comments here, it seems moot against the overriding truth that AGI is going to happen without FAI regardless, which means our best hope is to see AGI+FAI happen first. If it’s really not obvious that that has to lead with AGI, then tell me why.
Does anybody really think they are going to create an AGI that will get out of their hands before they can stop it? That they will somehow bypass ant, mouse, dog, monkey, and human and go straight to superhuman? Do you really think that you can solve FAI faster or better than someone who’s invented monkey-level AI first?
I feel most of this fear is residual leftovers from the self-modifying symbolic-program singularity FOOM theories that I hope are mostly left behind by now. But this is just the point: people who don’t understand real AGI don’t understand what the real risks are and aren’t (and certainly can’t mediate them).
Self-modifying AI is the point behind FOOM. I’m not sure why you’re connecting self-modification/FOOM/singularity with symbolic programming (I assume you mean GOFAI), but everyone I’m aware of who thinks FOOM is plausible thinks it will be because of self-modification.
Yes, I understand that. But it matters a lot what premises underlie AGI and how self-modification is going to impact it. The stronger fast-FOOM arguments spring from older conceptions of AGI. Imo, a better understanding of AGI does not support it.
Thanks much for the interesting conversation, I think I am expired.
I don’t think anyone is saying that an ‘ant-level’ AGI is a problem. The issue is with ‘relatively-near-human-level’ AGI. I also don’t think there’s much disagreement about whether a better understanding of AGI would make FAI work easier. People aren’t concerned about AI work being done today, except inasmuch as it hastens better AGI work done in the future.
“I mean, no one thinks it’ll take less time than it takes you to hit Ctrl-C”—by the way, are you sure about this? Would it be more accurate to say “before you realize you should hit control-C”? Because it seems to me, if it aint goin’ FOOM before you realize you should hit control-C (and do so) then.… it aint goin’ FOOM.
More importantly: if someone who KNOWS how important stopping it is is sitting at the button, then they’re more likely to stop it. But if someone who is like “it’s getting more powerful and better optimized! Let’s see how it looks in a week!” is in charge, then problems.
Well, then, I hope it’s someone like you or me that’s at the button. But that’s not going to be the case if we’re working on FAI instead of AGI, is it...
What do you think Ctrl-C does in this?
“The Intelligence Explosion Thesis says that an AI can potentially grow in capability on a timescale that seems fast relative to human experience due to recursive self-improvement. This in turn implies that strategies which rely on humans reacting to and restraining or punishing AIs are unlikely to be successful in the long run, and that what the first strongly self-improving AI prefers can end up mostly determining the final outcomes for Earth-originating intelligent life. ”—Eliezer Yudkowsky, IEM.
I.e., Eliezer thinks it’ll take less time than it takes you to hit Ctrl-C. (Granted it takes Eliezer a whole paragraph to say what the essay captures in a phrase, but I digress.)
Eliezer’s position is somewhat more nuanced than that. He admits a possibility of a FOOM timescale on the order of seconds, but a timescale on the order of weeks/months/years is also in line with the IE thesis.
FOOM on the order of seconds can be strongly argued against (Eli does a fair job of it himself, but likes to leave everything open so he can cite himself later no matter what happens), and if it’s weeks/months/years, then Hit Control C. Seriously. If your computer is trying to take over the world and is likely to succeed in the next few weeks, then kill -9 the thing. I realize that at that point you’ve likely got other AIs to worry about, but at least you’re in a position to understand it well enough to have some hope at making yours friendly and re-activating it before Skynet goes live. (I know Eli has counters to this, and counter-counters, and counter-counter-counters, but so do I; I just don’t assume you’ll be interested in hearing them. The main point here really is that the original statement wasn’t so much hyperbole as refreshingly concise, whether or not you agree with it.)
How would you know that you have to press Ctrl-C? What observation would you need to make?
How do you pronounce “Yvain”?
See http://en.wikipedia.org/wiki/Yvain and hover your mouse over the IPA pronunciation key.
Nooooo, I’ve been saying it wrong in my head the whole time.
Framing effects (causing cognitive biases) can be thought of as a consequence of the absence of logical transparency in System 1 thinking. Different mental models that represent the same information are psychologically distinct, and moving from one model to another requires thought. If this thought was not expended, the equivalent models don’t get constructed, and intuition doesn’t become familiar with these hypothetical mental models.
This suggests that framing effects might be counteracted by explicitly imagining alternative framings in order to present a better sample to intuition; or, alternatively, focusing on an abstract model that has abstracted away the irrelevant details of the framing.
I recently realized that I have something to protect (or perhaps a smaller version of the same concept). I also realized that I’ve been spending too much time thinking about solutions that should have been obviously not workable. And I’ve been avoiding thinking about the real root problem because it was too scary, and working on peripheral things instead.
Does anyone have any advice for me? In particular, being able to think about the problem without getting so scared of it would be helpful.
Talk about it with other people. Ask a good friend to sit down with you and listen to you talking about the issue.
I would like recommendations for an Android / web-based to-do list / reminder application. I was happily using Astrid until a couple of months ago, when they were bought up and mothballed by Yahoo. Something that works with minimal setup, where I essentially stick my items in a list, and it tells me when to do them.
Wunderlist 2 has an Android app (it only speaks English in the phone app, but it does Portuguese in the normal online version).
It puts your tasks in the cloud so you can catch up with what you wrote in other services.
I’m amazed by David Allen’s GTD at the moment, so I want to recommend it, despite still being on honeymoon effect.
Looking into Wunderlist now.
Don’t worry. I read GTD several years ago, and stole plenty of stuff from it.
I want to tack onto this and ask for a solution that provides some privacy, that is where I can run my own server.
I was on Astrid too. I switched to Wunderlist mostly because their import from Astrid worked correctly. Wunderlist is OK, though I can’t say I’m completely satisfied with it. Its UI is laggy (on a Nexus 4!) and unreliable, for example the auto-sync often destroys the last task I just typed in, or when I accidentally tap outside the task entry box the text I just typed is lost forever.
I’m looking at alternatives, and the one I like the most so far is Remember the Milk. Last time I tried it (probably a year ago) it was rubbish, but the latest version has a clean and fast native Android GUI and some nice extra functionality (e.g. geofencing). I’m thinking about switching, but it doesn’t have import from Wunderlist, so I’ll have to move about 200 tasks manually.
I’ve been happily using http://www.rememberthemilk.com/ to manage my GTD system. It’s got a simple, intuitive interface, both on desktop and on Android. I’m not sure if it has the reminder features you’re after, since that’s not something I’ve ever wanted.
Bruce Schneier wrote an article on the Guardian in which he argues that we should give plausibility to the idea that the NSA can hack more forms of encryption than we previously believed.
The security of bitcoin wallets rests on elliptic-curve cryptography. This could mean that the NSA has the power to turn the whole bitcoin economy into toast if bitcoin becomes a real problem for them on a political level.
Need help understanding the latest SMBC comic strip on rationality and microeconomics...
What don’t you understand? The 2 homo economicuses are aware that ‘existence is suffering’ especially when they are the butt of the humor, and rationally commit suicide.
You mean, left the bar?
The end of the joke is the end of them.
The joke would end one way or another, regardless of what they decide to do.
Er, yes, but that’s like saying you should stop eating ice cream right now because one day you will die.
http://tvtropes.org/pmwiki/pmwiki.php/Main/AntiHumor
So.… Thinking about using Familiar, and realizing that I don’t actually know what I’d do with it.
I mean, some things are obvious—when I get to sleep, how I feel when I wake up, when I eat, possibly a datadump from RescueTime… then what? All told that’s about 7-10 variables, and while the whole point is to find surprising correlations I would still be very surprised if there were any interesting correlations in that list.
Suggestions? Particularly from someone already trying this?
Has anyone got a recommendation for a nice RSS reader? Ideally I’m looking for one that runs on the desktop rather than in-browser (I’m running Ubuntu). I still haven’t found a replacement that I like for Lightread for Google Reader.
I used to like liferea, but I don’t have an up to date opinion as I switched to non-desktop RSS reading options.
Thanks! Will try it.
Is the layout weird for anyone else? The thread titles are more spaced out, by about three times. Maybe something broke during my last Firefox upgrade.
It looks fine on Safari for the iPhone.
Site layout hasn’t changed for me. Chrome on windows and safari on iphone.
I’ve been discussing the idea of writing a series of short story fanfics where Rapture, an underwater city from the computer game Bioshock run by an Objectivist/Libertarian, is run by a different political philosophy. Possibly as a collaborative project with different people submitting different short stories. Would anyone here be interested in reading or contributing to something like that?
This is rather off-topic to the board, but my impression is that there is some sympathy here for alternative theories on heart disease/healthy diets, etc. (which I share). Any for alternative cancer treatments? I don’t find any that have been recommended to me as remotely plausible, but wonder if I’m missing something, if some disproving study is flawed, etc.
An awful lot of politics seems to be variations on the theme of “let’s you and him fight”.
In the effective animal altruism movement, I’ve heard a bit (on LW) about wild animal suffering- that is, since raised animals are vastly outnumbered by wild animals (who encounter a fair bit of suffering on a frequent basis), we should be more inclined to prevent wild suffering than worry about spreading vegetarianism.
That said, I think I’ve heard it sometimes as a reason (in itself!) not to worry about animal suffering at all, but has anyone tried to solve or come up with solutions for that problem? Where can I find those? Alternatively, are there more resources I can read on wild animal altruism in general?
That doesn’t sound true if you weight by intelligence (which I think you should since intelligent animals are more morally significant). Surely the world’s livestock outnumber all the other large mammals.
That’s… a very good point, now that you mention it. Thanks for suggesting it! I looked into the comparisons in the USA. (Obviously, we’re not only concerned about the USA. Some countries will have a higher population of wild or domestic animals, like Canada vs. Egypt. I have no idea if the US represents the average, but I figure it would be easiest to find information on.)
That said; some very rough numbers:
Mule & black-tailed deer populations in USA: ~5 million (2003) (Source)
White-tailed deer population in USA: ~15 million (2010?) (Source)
Black bear population in USA: ~0.5 million (2011) (Source)
Coyote population in USA: No good number found
Elk population in USA: ~1 million (2008) (Source)
That totals 21.5 million large wild animals- obviously, these aren’t the only large wild animals in the USA, but I imagine that the rest added together wouldn’t equal more than a quarter more than that- so I’ll guess 25 million.
Domesticated animals:
Cattle population in USA: ~100 million (2011) (Source)
Hog & pig population in USA: ~120 million (2011) (Source)
Again, there are other large animals kept on commercial farms (goats, sheep), but they’re probably not more than a quarter- so about 275 million large domesticated animals.
Looking at that, that does put “wild animal suffering” into perspective- if you accepted that philosophy, it would still only be worth <10% of the weight of domesticated animals (25 million against 275 million is about 9%). I had no idea.
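For anyone who wants to check the sums, here is a minimal sketch of the arithmetic above. The species figures are the rough ones quoted in this comment (in millions); the 25 million wild total and the "add a quarter" adjustment for unlisted farm animals are the comment's own guesses, not measured data, and the variable names are just for illustration.

```python
# Rough US population figures quoted in the comment, in millions.
wild = {
    "mule and black-tailed deer": 5,
    "white-tailed deer": 15,
    "black bear": 0.5,
    "elk": 1,
}
domestic = {"cattle": 100, "hogs and pigs": 120}

wild_counted = sum(wild.values())          # listed wild species only
domestic_counted = sum(domestic.values())  # listed farm species only

# Guesses taken from the comment, not data:
wild_total = 25                            # rounded-up guess for all large wild animals
domestic_total = domestic_counted * 1.25   # a quarter more for goats, sheep, etc.

ratio = wild_total / domestic_total
print(f"counted wild: {wild_counted} million")
print(f"estimated domestic: {domestic_total} million")
print(f"wild as a share of domestic: {ratio:.1%}")
```

Changing either guess moves the ratio, so it is easy to see how sensitive the "under 10%" conclusion is to the unlisted species.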
Large mammals only? Is a domesticated cow smarter than a rat? A pigeon? Tough call.
There’s not a whole lot we can do now, so one thing I’ve heard suggested is to spread vegetarianism so that people will be more sympathetic to animals in general, and when we have the ability to engineer some retrovirus to make them suffer less or something like that, we’ll care more about helping animals than not playing god.
Another possibility: nuke the rainforests.
Vegetarianism as seeding empathy, interesting- where have you heard that idea brought up? (That is, was it a book or somewhere online I could see more on?) Mass genetic engineering was the ‘solution’ I was wondering about especially. (Obviously it’s a little impractical at the moment.)
Nuking the rainforests doesn’t seem like a good solution (aside from the obvious impacts on OUR wellbeing!) for the same reasons that nuking currently-suffering human populations doesn’t seem like a good solution. Of course, you may have been joking.
I don’t know exactly where I heard it, but I’m pretty sure it was somewhere on felicifia.org.
I am somewhat skeptical of wild animal suffering being bad enough to necessitate nuking the rainforests, but I think we should try to find out exactly how good their lives are. If their suffering really does significantly outweigh their happiness, then I don’t see how we could justify not nuking them. If an animal is suffering and isn’t likely to get better, you euthanize it. If this applies to all the animals, you euthanize all of them.
Hi, I am taking a course in Existentialism. It is required for my degree. The primary authors are Sartre, de Beauvoir and Merleau-Ponty. I am wondering if anyone has taken a similar course, and how they prevented material from driving them insane (I have been warned this may happen). Is there any way to frame the material to make sense to a naturalist/reductionist?
This could be a Lovecraft horror story: “The Existential Diary of JMiller.”
Week 3: These books are maddeningly incomprehensible. Dare I believe that it all really is just nonsense?
Week 8: Terrified. Today I “saw” it—the essence of angst—and yet at the same time I didn’t see it, and grasping that contradiction is itself the act of seeing it! What will become of my mind?
Week 12: The nothingness! The nothingness! It “is” everywhere in its not-ness. I can not bear it—oh no, “not”, the nothingness is even constitutive of my own reaction to it—aieee -
(Here the manuscript breaks off. JMiller is currently confined in the maximum security wing of the Asylum for the Existentially Inane.)
If you do not have a preexisting tendency for depression as a result of taking ideas seriously, you probably have nothing to worry about. If you are already a reductionist materialist, you also probably have nothing to worry about. Millions of college students have taken courses in existentialism. Almost all of them are perfectly fine. Even if they’re probably pouring coffee right now.
In LW terms, it may be useful to brush up on your metaethics, as metaethical problems are usually what people in my social circle find most troublesome about these kinds of ideas. Joy in the Merely Real may also be useful. I have no idea how your instructors will react if you cache these answers and then offer them up in class, though. I would suggest not doing that very often.
In the event that the material does overwhelm you beyond your ability to cope, or prevents you from functioning, counseling services/departments on college campuses are experienced in dealing with philosophy-related depression, anxiety, etc. The use of the school counseling services should be cheap/free with payment of tuition. I strongly suggest that you make use of them if you need them. More generally, talking about the ideas you are learning about with a study group, roommate, etc. will be helpful.
Eat properly. Sleep properly. Exercise. Keep up with your studying. Think about things that aren’t philosophy every once in a while. Your mind will get stretched. Just take care of it properly to keep it supple and elastic. (That was a really weird metaphor.)
When reading Merleau-Ponty it might help to also read the work of contemporary phenomenologists whose work is much more rooted in cognitive science and neuroscience. A decent example is Shaun Gallagher’s book How the Body Shapes the Mind, or perhaps his introductory book on naturalistic phenomenology, which I haven’t read. Gallagher has a more or less Merleau-Pontyesque view on a lot of stuff, but explicitly connects it to the naturalistic program and expresses things in a much clearer manner. It might help you read Merleau-Ponty sympathetically.
All of those weird books were written by humans.
Those humans were a lot like other humans.
They had noses and butts and toes.
They ate food and they breathed air.
They could add numbers and spell words.
They knew how to have conversations and how to use money.
They had girlfriends or boyfriends or both.
Why did they write such weird books?
Was it because they saw other humans kill each other in wars?
Was it because writing weird books can get you a lot of attention and money?
Was it because they remembered feeling weird about their moms and dads?
People talk a lot about that.
Why do they talk a lot about that?
Ignorance isn’t bliss. If the course brings you in contact with a few Ugh fields that you hold, that should be a good thing.
I think existentialism is very compatible w/ naturalism/reductionism. Existentialists just use a weird vocabulary. But one of the main points, I think, is coping with an absent/insane deity.
I suspect that warning was intended as a joke.
Another feature suggestion that will probably never be implemented: a check box for “make my up/down vote visible to the poster”. The information required is already in the database.
I would have a preference for never seeing such notifications.
Because you do not trust yourself to not mouse over the post/comment karma button and wait for the list of non-anonymous voters to pop up?
You did not earlier propose a design. But however it were designed, I would prefer the information was simply not available at all.
Makes sense. I guess there should be a preference to disable this hypothetical feature.
For what purpose? Can’t one simply comment or private message if they feel it’s necessary?
Trivial inconvenience plus not wanting to be intrusive.
What happened with the Sequence Reruns? I was getting a lot out of them. Were they halted due to lack of a party willing to continue posting them, or was a decision made to end them?
I’m pretty sure they halted because they had gone through the Sequences. Final Words, the last rerun post, was published after Go Forth and Create The Art!, which is listed as the last of the Craft and Community sequence, which was the last of the Major Sequences.
I never heard of such a decision and if it was made then it can be ignored because it was a bad decision. (Until power is applied to hinder implementation.)
If you value the sequence reruns then by all means start making the posts!
I personally get my ‘sequence reruns’ via the audio versions. Originally I used text to speech but now many of them have castify.
When I was a teenager I took a personality test as a requirement for employment at a retail clothing store. I didn’t take it too seriously, I “failed” it and that was the end of my application. How do these tests work and how do you pass or fail them? Is there evidence that these tests can actually predict certain behaviors?
You cannot fail a personality test unless the person administering the test wants to filter out specific personality types that are similar to yours, for a process unrelated to the test itself (e.g. employment).
The thing is, most possible personalities seem to be considered undesirable by employers, and so many people simply resort to lying on these tests to present a favourable image to employers (basically: extrovert, conformist, “positive”/upbeat/optimistic, ambitious, responsible etc.). Looks like employers know about this, but don’t care anyway, because they think that if you aren’t willing to mold yourself into somebody else for the sake of the job, then you don’t want the job enough and there are many others who do.
(Disclaimer: I’m an outsider to the employment process and might not know what I’m talking about. My impressions are gathered from job interview advice and job descriptions.)
You might be interested in the Big Five model of personality, which seems to be a rough scientific consensus, and is better empirically supported than other models. In particular, measures of conscientiousness have a relatively strong predictive value for things like grades, unemployment, crime, income, etc. Myers-Briggs-style tests that sort people into buckets (“You’re an extrovert! You’re an introvert!”) are more common but don’t seem to have much predictive value except insofar as they reduce to something like the Big Five model.
However, from what I remember of applying to crappy jobs, you may not have taken a real test. I remember normal sounding items like, “I prefer large groups to small groups”, mixed in with “trick” questions like, “If I saw my best friend stealing from the cash register, I would report him/her”. I assume you got those right. Either way, you’re expected to just lie and say you’d be the perfect, most hard-working employee ever and that cleaning toilets/selling shoes/washing cars is what you’ve dreamed of doing since you were five.
I’ve heard (second-hand, but the original source was a job-search counselor) that a trick for passing those, if it’s a test that offers options from “Strongly disagree” to “Strongly agree”, is to always pick one of the polarized ends (either “Strongly” option). The idea seems to be that they’ll prefer candidates who are less wishy-washy, have stronger convictions, etc.
I recently read Luminosity/Radiance. Was there ever a discussion thread on here about it?
SPOILERS for the end
V jnf obgurerq ol gur raq bs yhzvabfvgl. Abg gb fnl gung gur raq vf gur bayl rknzcyr bs cbbe qrpvfvba znxvat bs gur punenpgref, naq cresrpgyl engvbany punenpgref jbhyq or obevat naljnl. Ohg vg frrzf obgu ernfbanoyr nf fbzrguvat Oryyn jbhyq unir abgvprq naq n terng bccbeghavgl gb vapyhqr n engvbanyvgl yrffba. Anzryl, Oryyn artrypgrq gb fuhg hc naq zhygvcyl. Fur vf qribgvat yvzvgrq erfbheprf gbjneqf n irel evfxl cyna bs unygvat nyy uhzna zheqre ol inzcverf vzzrqvngryl. Fbyivat guvf vffhr vf cynhfvoyl rzbgvbanyyl eryrinag, ohg Oryyn fubhyq unir abgvprq gung vg qbrfa’g znggre nyy gung zhpu ubj lbh trg xvyyrq, vg vf ebhtuyl rdhnyyl gentvp ab znggre gur sbez bs qrngu vs vzzbegnyvgl rkpvfgf. Juvpu vg qbrf. Inzcverf qb abg ercerfrag n fvmrnoyr senpgvba bs nyy qrnguf. Nf n crefba va gur nccnerag cbfvgvba gb raq qrngu bar fubhyq or n ovg zber pnershy jvgu frphevgl. Bs pbhefr Oryyn naq pb. zvtug srry vaivapvoyr sbyybjvat gurve ivpgbel. Qba’g gurl unir npprff gb nyy gur zbfg cbjreshy jvgpurf? Vfa’g Nyyvaern(fc?) gur hygvzngr ahyyvsvre? Jryy, znlor. Ohg gur sbezre Iraghev unq n ybg ybatre gb cyna naq n ybg srjre pbafgenvagf ba gurve orunivbe, zbenyyl fcrnxvat, naq gurl frrzrq gb gernq pnershyyl. Vs gur Iraghev unq nyy gung svercbjre, jul qvq gurl obgure gb znvagnva perqvovyvgl jvgu gur trareny inzcver cbchyngvba? Guvf fubhyq or n erq synt. Oryyn vf va gur cbfvgvba gb raq qrngu pbaqvgvbany ba ure erznvavat va cbjre. Fur fubhyq or gernqvat yvtugyl urer naq qribgvat erfbheprf gb fbyivat gur ceboyrz bs flagurgvp inzcver sbbq nf dhvpxyl nf cbffvoyr. Nalguvat gung cbfrf n frphevgl guerng gb guvf vavgvngvir fubhyq or pbafvqrerq vafnavgl.
Raqvat inzcver zheqref VF gernqvat yvtugyl: Vg’f n cbyvgvpny zbir. Gur orfg jnl gb trg uhznavgl abg gb ungr naq srne lbh vf gb or noyr gb pbasvqragyl gryy gurz gung gurl unir ab ernfba gb, naq gung lbh jnag bayl jung’f orfg sbe gurz. Zhpu nf Nzrevpn vf zvfgehfgrq va gur zvqqyr rnfg orpnhfr jr obzo jvgu unaq naq qvfgevohgr sbbq fhccyvrf jvgu gur bgure, crbcyr naq tbireazragf jvyy or zhpu yrff jvyyvat gb gehfg inzcverf vs gurl’er fgvyy xvyyvat crbcyr ng jvyy. Gur uhzna cbyvgvpny cbjref naq nyy gur uhznaf va gur jbeyq ner ahzrebhf rabhtu gung vs gur znfxrenqr oernxf, gur inzcverf jbhyq or va ZNWBE gebhoyr. Gur ovttrfg rkgnag guerng gb gur znfdhrenqr vf uhznaf orvat xvyyrq ol inzcverf. Fb fgbccvat xvyyvat uhznaf vf gur arkg fnsr fgrc jurgure lbh ner gelvat gb erirny inzcverf be pbaprny gurz.
Do you mean the end of Luminosity or the end of Radiance?
Radiance. edited.
Yet another article on the terribleness of schools as they exist today. It strikes me that Methods of Rationality is in large part a fantasy of good education. So is the Harry Potter/Sherlock Holmes crossover I just started reading. Alicorn’s Radiance is a fair fit to the pattern as well, in that it depicts rapid development of a young character by incredible new experiences. So what solutions are coming out of the rational community? What concrete criteria would we like to see satisfied? Can education be ‘solved’ in a way that will sell outside this community?
The characters in those fics are also vastly more intelligent and conscientious than average. True, current school environments are stifling for gifted kids, but then they are also a very small minority. Self-directed learning is counterproductive for the not-so-bright, and attempts to reform schools to encourage “creativity” and move away from the nasty test-based system tend to just be smoke-screens for any number of political and ideological goals. Like the drunk man and the lamppost, statistics and science are used for support rather than illumination, and the kids are the ones who suffer.
There are massive structural problems wracking the educational system but I wouldn’t take the provincial perspectives of HPMoR or related fiction as good advice for the changes with the biggest marginal benefit.
I think that’s a bad question. I don’t think that every school should follow the same criteria. It’s perfectly okay if different schools teach different things.
http://www.kipp.org/ would be an educational project financed by Bill Gates which tries to use a lot of testing. On the other hand you have unschooling and environments like Sudbury Valley School. I don’t think that every child has to learn the same way. Both ways are viable.
When it comes to the more narrow rationality community, I think there is more thought about building solutions that educate adults than about educating children. If, however, something like Anki helps adults learn, there is no real reason why the same idea can’t help children as well.
Similar things go for the Credence game and predictionbook. If those tools can help adults to become more calibrated they probably can also help kids even if some modifications might be needed.
Without having the money to start a completely new school, I think it’s good to focus on building tools that build a particular skill.
Fighting (in the sense of arguing loudly, as well as showing physical strength or using it) seems to be bad the vast majority of the time.
When is fighting good? When does fighting lead you to Win, TDT style? (Which instances of input should trigger the fighting instinct and pay off well?)
There is an SSA argument to be made for fighting in that taller people are stronger, stronger people are dominant, and bigger skulls correlate with intelligence. But it seems to me that this factor alone is far, far away from being sufficient justification for fighting, given the possible consequences.
If everyone agrees about how power is distributed fighting is unnecessary.
Fighting can be necessary when another person claims to have power that they actually don’t have.
Surely it’s in nearly everyone’s interest to have more power distributed to themselves!
But while fighting to get more power may have positive utility for oneself, it usually has negative utility for others, so it’s in everybody’s interest that everybody agrees not to fight for more power. This agreement can take the form of alternative ways of getting power (elections, money), or making power less important to one’s happiness (the rule of law).
If you don’t have enough power to win a fight, fighting is also negative utility for yourself. If everyone predicts that you would win a fight, you usually don’t actually have to fight it to get what you want.
Fighting has a huge signalling component: when viewed in isolation, a fight might be trivially, obviously, a net negative for both participants. However, either or both participants might in the future win more in concessions, from their demonstrated willingness to fight alone, than they lost in the fight. As humans are adaptation executers, a certain willingness to fight, to seek revenge, etc. is pretty common. At least, this seems to be the dominant theory and sensible to me.
Or even just CDT style. Human interaction is approximately an iterated prisoner’s dilemma without a fixed duration. Reputation concerns are sufficient to account for most of the (perceived and actual) benefit among humans. Then more can be attributed to ethical inhibitions on the ‘pride’ ethic.
Fighting made a lot more sense in a tribe or among small groups of humans than it does now. A big argument with someone now will very rarely keep you from starving and will probably never get you a child. On the other hand, showing dominance in a situation where the women around you are choosing a mate out of 5 guys will get you a lot more laid.
I haven’t seen people who can get laid frequently getting into dominance disputes/fights.
There is a distinction between dominance which is assertive and aversive, and prestige, which is recognized and non-aversive.
Guys like Keanu Reeves, Tom Cruise, Brad Pitt have prestige which gets them (potentially) laid.
Women have more reason to be attracted to a man if he is universally recognized to be awesome than if he is all the time showing his power through small agonistic interactions with other people, male and female.
If Caesar had been universally prestigious instead of agonistically powerful, Brutus wouldn’t have had reason to kill him, leaving an unassisted widow and children.
I agree with your central point but I think this claim is something of an overstatement (since I don’t wish to accuse you of being sheltered). Crudely speaking it tends to be sexier to win without fighting than to fight and win but fighting (social status battles) and winning is still more than sufficiently sexy.
I also note that it is hard to become the kind of person who does not need to engage in any dominance disputes and still maintain high social status without engaging in many dominance disputes on the way. To a certain extent the process can be munchkined since much of the record of who is dominant is stored in the individual but some actual dominance disputes will still be inevitable.
Yes, also keep in mind that human cognition related to hierarchies of prestige and dominance is flexible enough that it may be worth more to step up in a different hierarchy than try to save yourself in this one by agonistic dispute. We don’t have the problem of being “stuck” with the same group forever, which facilitates a lot.
First, for modern humans fighting is not the only method of achieving higher status. There are other ways, too. Guys like Keanu Reeves are examples of successfully using the other methods. If you are a movie superstar, you don’t have to fight with people to be recognized.
Second, even the fighters don’t fight all the time. This is precisely why social animals have a pecking order: cached results of the previous fights. If you won against someone yesterday, most likely he will not challenge you today; therefore you can be admired today as peaceful. The clearer your victory was, the more time will pass until someone dares to challenge you again. Therefore, if someone is obviously stronger than all his competitors, he will actually fight very rarely. It’s like the first place is “does not have to fight because no one dares to fight him”, second place is “fights and wins”, third place is “fights and loses”, and the last place is “too afraid to fight”. Also, often the real fight is avoided if both parties agree on their estimate of who would win. (Analogously: a policeman has a gun, but he uses the gun very rarely. The mere presence of the gun, and the knowledge that he would use it if necessary, causes the psychological effect.)
So the best case is to be seen as so powerful that everyone else just gives up. Then you can be dominant and peaceful. But if you don’t have the real fighting power, sooner or later someone will call your bluff. (In the case of Keanu Reeves, his power is social. If you try to go and kick him, his fans will come to his defense, and his lawyers will destroy you. Your power is not just your individual power, but also that of all the people who would come to fight for you.)
To put it crudely, alpha males very rarely get into dominance fights because part of being an alpha male is being acknowledged as an alpha male.
Betas and gammas status-fight more often since their position on the ladder is less stable.
A large part of having status is not having to constantly prove it.
Just had a discussion with my in-law about the singularity. He’s a physicist and his immediate response was: There are no singularities. They appear mathematically all the time and it only means that there is another effect taking over. A quick Google search then brought up this:
http://www.askamathematician.com/2012/09/q-what-are-singularities-do-they-exist-in-nature/
So my question is: What are the ‘obvious’ candidates for limits that take over before everything optimizable is optimized by runaway technology?
On LW, ‘singularity’ does not refer to a mathematical singularity, and does not involve or require physical infinities of any kind. See Yudkowsky’s post on the three major meanings of the term singularity. This may resolve your physicist friend’s disagreement. In any case, it is good to be clear about what exactly is meant.
Lack of cheap energy.
Ecological disruption.
Diminishing returns of computation.
Diminishing returns of engineering.
Inability to precisely manipulate matter below certain size thresholds.
All sorts of ‘boring’ engineering issues by which things that get more and more complicated get harder and harder faster than their benefits increase.
There aren’t any that I’m aware of, except for “a disaster happens and everyone dies,” but that’s bad luck, not a hard limit. I would respond with something along the lines of “exponential growth can’t continue forever, but where it levels out has huge implications for what life will look like, and it seems likely it will level out far above our current level, rather than just above our current level.”
One calculation per Planck time per cubic Planck length in the future light cone.
I am seeking a mathematical construct to use as a logical coin for the purpose of making hypothetical decision theory problems slightly more aesthetically pleasing. The required features are:
Unbiased. It gives (or can be truncated or otherwise resolved to give) a 50/50 split on a boolean outcome.
Indexable. The coin can be used multiple times through a sequence number, e.g. “The n-th digit of pi is even”.
Intractable. The problem is too hard to solve, either because there is no polynomial time algorithm to solve it or just because it is somewhat difficult and the ‘n’ supplied is ridiculous, e.g. “The 3^^^3-th digit of pi is even”.
Provable or otherwise verifiable. When the result of the logical coin is revealed it should be possible to also supply a proof of the result that would convince a mathematician that the revealed outcome is correct.
Simple to refer to. Either there is a common name for the problem or a link to a description is available. The more well known or straightforward the better.
NP-complete problems have many of the desired features but I don’t know off the top of my head any that can be used as an indexable fair coin.
Can anyone suggest some candidates?
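To make the running example concrete, here is a minimal sketch of the “n-th digit of pi is even” coin for small, tractable n. It uses Machin’s formula with plain integer arithmetic; the function names and the choice of ten guard digits are illustrative assumptions, and nothing here helps with an index like 3^^^3, which is the whole point of the intractability requirement.

```python
def arctan_inv(x: int, unity: int) -> int:
    """Return arctan(1/x) scaled by `unity`, using the alternating Taylor series."""
    total = term = unity // x
    x_squared = x * x
    divisor = 3
    sign = -1
    while term:
        term //= x_squared
        total += sign * (term // divisor)
        sign = -sign
        divisor += 2
    return total


def pi_scaled(digits: int) -> int:
    """Return pi truncated to `digits` decimal places, as an integer (e.g. 314 for digits=2)."""
    guard = 10  # extra digits to absorb truncation error (assumed sufficient for small inputs)
    unity = 10 ** (digits + guard)
    # Machin's formula: pi/4 = 4*arctan(1/5) - arctan(1/239)
    pi = 4 * (4 * arctan_inv(5, unity) - arctan_inv(239, unity))
    return pi // 10 ** guard


def pi_digit_is_even(n: int) -> bool:
    """The logical coin: is the n-th digit after the decimal point of pi even?"""
    if n < 1:
        raise ValueError("n must be at least 1")
    digit = int(str(pi_scaled(n))[n])
    return digit % 2 == 0


print([pi_digit_is_even(n) for n in range(1, 8)])
```

This coin is indexable and easy to refer to, but whether the digits of pi are actually unbiased in the required sense is an open question rather than something the sketch establishes.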
It looks to me like you want a cryptographically secure pseudo-random number generator restricted to the output space {0, 1} and with a known seed. That’s unbiased and intractable pretty much by definition, indexable up to some usually very large periodicity, and typically verifiable and simple to refer to because that’s standard practice in the security world.
There are plenty of PRNGs out there, and you can simply truncate or mod their outputs to give you the binary output you want; Fortuna looks like a strong candidate to me.
(I was going to suggest the Mersenne twister, which I’ve actually implemented before, but on further examination it doesn’t look cryptographically strong.)
That works with caveats: You can’t just publish the seed in advance, because that would allow the player to generate the coin in advance. You can’t just publish the seed in retrospect, because the seed is an ordinary random number, and if it’s unknown then you’re just dealing with an ordinary coin, not a logical one. So publish in advance the first k bits of the pseudorandom stream, where k > seed length, thus making it information-theoretically possible but computationally intractable to derive the seed; use the k+1st bit as the coin; and then publish the seed itself in retrospect to allow verification.
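The publish-prefix, flip, then reveal-seed protocol above can be sketched as follows. This is only an illustration under stated assumptions: SHA-256 in counter mode stands in for a real cryptographically secure generator (it is not Fortuna), and the names `stream_bits`, `commit`, `coin` and `verify` are hypothetical.

```python
import hashlib
from typing import List


def stream_bits(seed: bytes, n_bits: int) -> List[int]:
    """First n_bits of a SHA-256 counter-mode stream (illustrative stand-in for a CSPRNG)."""
    bits: List[int] = []
    counter = 0
    while len(bits) < n_bits:
        block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        for byte in block:
            for shift in range(7, -1, -1):
                bits.append((byte >> shift) & 1)
        counter += 1
    return bits[:n_bits]


def commit(seed: bytes, k: int) -> List[int]:
    """Published in advance: the first k bits, where k must exceed the seed length in bits."""
    if k <= 8 * len(seed):
        raise ValueError("k must be larger than the seed length in bits")
    return stream_bits(seed, k)


def coin(seed: bytes, k: int) -> int:
    """The logical coin: the (k+1)-st bit of the stream."""
    return stream_bits(seed, k + 1)[k]


def verify(seed: bytes, k: int, prefix: List[int], claimed: int) -> bool:
    """Run after the seed is revealed: check both the published prefix and the claimed coin."""
    bits = stream_bits(seed, k + 1)
    return bits[:k] == prefix and bits[k] == claimed


seed = b"sixteen byte key"          # 128-bit seed, kept secret until the reveal
k = 256                             # more bits than the seed has
prefix = commit(seed, k)            # step 1: publish
result = coin(seed, k)              # step 2: the coin everyone bets on
print(verify(seed, k, prefix, result))  # step 3: anyone can check after the reveal
```

Publishing more prefix bits than the seed contains is what pins down the seed in principle while leaving it computationally out of reach, which is the distinction the comment draws between a logical coin and an ordinary one.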
Possible desiderata that are still missing: If you take multiple coins from the same pseudorandom stream, then you can’t allow verification until the end of the whole experiment. You could allow intermediate verification by committing to N different seeds and taking one coin from each, but that fails wedrifid’s desideratum of a single indexable problem (which I assume is there to prevent Omega from biasing the result via nonrandom choice of seed?).
I can get both of those desiderata at once using a different protocol: Pick a public key cryptosystem, a key, and a hash function with a 1-bit output. You need a cryptosystem where there’s only one possible signature of any given input+key, i.e. one that doesn’t randomize encryption. To generate the Nth coin: sign N, publish the signature, then hash the signature.
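A toy version of the sign-then-hash protocol might look like the sketch below. It assumes textbook RSA with a tiny hard-coded key, chosen only because textbook RSA signatures are deterministic; it is completely insecure, the key values and function names are illustrative, and a real deployment would need a properly sized key and a vetted deterministic signature scheme.

```python
import hashlib
from typing import Tuple

# Toy textbook-RSA key (insecure, illustration only): modulus 61 * 53 = 3233.
MODULUS = 3233
PUBLIC_EXPONENT = 17
PRIVATE_EXPONENT = 2753  # 17 * 2753 = 1 (mod 3120), where 3120 = 60 * 52


def sign(index: int) -> int:
    """Deterministic signature of the coin index; only the key holder can compute it."""
    if not 0 <= index < MODULUS:
        raise ValueError("index must be smaller than the modulus in this toy scheme")
    return pow(index, PRIVATE_EXPONENT, MODULUS)


def verify(index: int, signature: int) -> bool:
    """Anyone with the public key can check that the signature belongs to this index."""
    return pow(signature, PUBLIC_EXPONENT, MODULUS) == index


def one_bit_hash(signature: int) -> int:
    """Hash function with a 1-bit output: the low bit of the first SHA-256 byte."""
    return hashlib.sha256(signature.to_bytes(2, "big")).digest()[0] & 1


def nth_coin(index: int) -> Tuple[int, int]:
    """Generate the N-th coin: sign N, publish the signature, then hash the signature."""
    signature = sign(index)
    return signature, one_bit_hash(signature)


signature, flip = nth_coin(42)
print(flip, verify(42, signature))
```

Because each index has exactly one valid signature, every coin can be verified as soon as it is revealed, and the key holder has no freedom to bias the outcome after the key has been fixed.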
My first idea is to use something based on cryptography. For example, using the parity of the pre-image of a particular output from a hash function.
That is, the parity of x in this equation:
f(x) = n, where n is your index variable and f is some hash function assumed to be hard to invert.
This does require assuming that the hash function is actually hard to invert, but that both seems reasonable and is at least something that actual humans can’t provide a counterexample for. It’s also relatively fast to go from x to n, so this scheme is easy to verify.
Hash functions map multiple inputs to the same hash, so you would need to limit the input in some other way, and that makes it harder to verify.
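The collision objection is easy to see with a deliberately tiny hash. The sketch below assumes an 8-bit toy hash (the first byte of SHA-256) so that collisions are guaranteed by pigeonhole; `toy_hash` and the “smallest preimage” rule are illustrative assumptions, not something proposed in the thread.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Optional


def toy_hash(x: int) -> int:
    """8-bit toy hash: the first byte of SHA-256 of the decimal string of x."""
    return hashlib.sha256(str(x).encode()).digest()[0]


def preimages_by_output(limit: int) -> Dict[int, List[int]]:
    """Group every x in range(limit) by its hash value."""
    buckets: Dict[int, List[int]] = defaultdict(list)
    for x in range(limit):
        buckets[toy_hash(x)].append(x)
    return buckets


def coin_from_smallest_preimage(n: int, buckets: Dict[int, List[int]]) -> Optional[int]:
    """One way to restrict the input: use the parity of the smallest known preimage of n."""
    candidates = buckets.get(n)
    if not candidates:
        return None
    return min(candidates) % 2


buckets = preimages_by_output(5000)
sample = buckets[toy_hash(0)]
# Different preimages of the same output will typically disagree on parity.
print(sample[:6], {x % 2 for x in sample})
print(coin_from_smallest_preimage(toy_hash(0), buckets))
```

With 5000 inputs and only 256 outputs, some output must have at least 20 preimages, so “the parity of x” is not well defined until a tie-break rule is added. A rule like “smallest preimage” restores uniqueness, but then a verifier has to be convinced that no smaller preimage exists, which is exactly the verification cost the comment points out.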
No candidates, but I’d like to point out that your unbiased requirement may perhaps be omitted, conditional on the implementation.
If you have a biased logical coin, you can poll it in pairs until the two results in a pair differ, and then take the last result of that pair. That gives an unbiased logical coin, provided the successive polls are independent and share the same bias.
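The pairing trick described above (usually attributed to von Neumann) can be sketched as follows. The sketch assumes an ordinary pseudo-random biased coin as a stand-in for the logical one, and the 80% bias and function names are arbitrary illustrative choices.

```python
import random


def biased_coin(rng: random.Random, p_heads: float = 0.8) -> int:
    """Stand-in biased coin: returns 1 with probability p_heads, else 0."""
    return 1 if rng.random() < p_heads else 0


def debiased_coin(rng: random.Random, p_heads: float = 0.8) -> int:
    """Flip in pairs until the two flips differ, then return the second flip of that pair."""
    while True:
        first = biased_coin(rng, p_heads)
        second = biased_coin(rng, p_heads)
        if first != second:
            return second


rng = random.Random(0)
flips = [debiased_coin(rng) for _ in range(20000)]
print(sum(flips) / len(flips))  # should be close to 0.5 despite the 80% bias
```

It works because the two mixed outcomes, (1, 0) and (0, 1), each occur with probability p(1 − p), so they are equally likely whatever p is. For an indexed logical coin, the n-th fair coin would consume a variable number of underlying indices, which slightly complicates the “indexable” requirement.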
My first instinct is to bet on properties of random graphs, but that’s not my field.
That’d work. I like it!
No open thread for Jan 2014 so I’ll ask here. Is anybody interested in enactivism? Does anybody think that there is a cognitivist bias in LessWrong?
Now there is. The next time you miss an open thread you can make one. A lot more people will see your comment than if you post in an old thread, and you might get a point or two of karma.
Why should an AI have to self-modify in order to be super-intelligent?
One argument for self-modifying FAI is that “developing an FAI is an extremely difficult problem, and so we will need to make our AI self-modifying so that it can do some of the hard work for us”. But doesn’t making the FAI self-modifying make the problem much more difficult, since now we have to figure out how to make goals stable under self-modification, which is also a very difficult problem?
The increased difficulty could be offset by the ability for the AI to undergo a “self-modifying foom”, which results in a titanic amount of intelligence increase from relatively modest beginnings. But would it be possible for an AI to have a “knowledge-about-problem-solving foom” instead, where the AI increases its intelligence not by modifying itself, but by increasing the amount of knowledge it has about how to solve problems?
Here are some differences that come to mind between the two kinds of fooms:
A self-modification could change the AI’s behavior in an arbitrary manner. Obtaining knowledge about problem-solving could only change the AI’s behavior via metacognition.
A bad self-modification could easily destroy the AI’s safety (unless we figure out how to fix this problem!). Obtaining knowledge about problem-solving would only destroy the AI’s safety if the knowledge is substantially misleading. (An AI might somehow come to believe that it should only read pro-Green books, and then fail to take into account the fact that beliefs naively derived from reading pro-Green books will be biased towards Green.)
Any “method of being intelligent” can be turned into a self-modification. Not every method of being intelligent can effectively be turned into a piece of knowledge about problem-solving, because there’s only a limited set of beliefs that the AI could act upon. (A non-self-modifying AI may be programmed to think about pizza upon believing the statement “I should think about pizza”, but it is less likely to be programmed to adjust all its beliefs to be pro-Blue, without evidence, upon believing the statement “I should adjust all my beliefs to be pro-Blue, without evidence”.)
Certainly self-modification has its advantages, but so does pure KAPS (knowledge about problem-solving), so I’m confused about why literally everyone in the FAI community seems to believe self-modification is necessary for a strong AI.
My immediate reaction is, ‘Possibly—wait, how is that different? I imagine the AI would write subroutines or separate programs that it thinks will do a better job than its old processes. Where do we draw the line between that and self-modification or -replacement?’
If we just try to create protected code that it can’t change, the AI can remove or subvert those protections (or get us to change them!) if and when it acquires enough effectiveness.
The distinction I have in mind is that a self-modifying AI can come up with a new thinking algorithm to use and decide to trust it, whereas a non-self-modifying AI could come up with a new algorithm or whatever, but would be unable to trust the algorithm without sufficient justification.
Likewise, if an AI’s decision-making algorithm is immutably hard-coded as “think about the alternatives and select the one that’s rated the highest”, then the AI would not be able to simply “write a new AI … and then just hand off all its tasks to it”; in order to do that, it would somehow have to make it so that the highest-rated alternative is always the one that the new AI would pick. (Of course, this is no benefit unless the rating system is also immutably hard-coded.)
I guess my idea in a nutshell is that instead of starting with a flexible system and trying to figure out how to make it safe, we should start with a safe system and try to figure out how to make it flexible. My major grounds for believing this, I think, is that it’s probably going to be much easier to understand a safe but inflexible system than it is to understand a flexible but unsafe system, so if we take this approach, then the development process will be easier to understand and will therefore go better.
You basically say that the AI should be unable to learn to trust a process that was effective in the past to also be effective in the future. I think that would restrict intelligence a lot.
Yeah, that’s a good point. What I want to say is, “oh, a non-self-modifying AI would still be able to hand off control to a sub-AI, but it will automatically check to make sure the sub-AI is behaving correctly; it won’t be able to turn off those checks”. But my idea here is definitely starting to feel more like a pipe dream.
Hmm, there might still be something to be gleaned from attempting to steelman this or work in different related directions.
Edit: maybe something with an AI not being able to tolerate things it can’t make certain proofs about? The problem is that it’d have to be able to make those proofs about humans if they are included in its environment, and if they are not, it might make UFAI there. (Intuition pump: a system that consists of a program it can prove everything about, and humans that the program asks questions to.) Yeah, this doesn’t seem very useful.
You can’t really tell whether something that is smarter than yourself is behaving correctly. In the end a non-self-modifying AI checking on whether a self-modifying sub-AI is behaving correctly isn’t much different from a safety perspective than a human checking whether the self modifying AI is behaving correctly.
Immutably hard-coding something in is a lot easier to say than to do.
Or it can write a new AI that’s an improved version of itself and then just hand off all its tasks to it.
I’m not sure where the phrase “have to” is coming from. I don’t think the expectation that we will build a self-modifying intelligence that becomes a superintelligence is because that seems like the best way to do it but because it’s the easiest way to do it, and thus the one likely to be taken first.
In broad terms, the Strong AI project is expected to look like “humans build dumb computers, humans and dumb computers build smart computers, smart computers build really smart computers.” Once you have smart computers that can build really smart computers, it looks like they will (in the sense that at least one institution with smart computers will let them, and then we have a really smart computer on our hands), and it seems likely that the modifications will occur at a level that humans are not able to manage effectively (so it really will be just smart computers making the really smart computers).
Yes. This is why MIRI is interested in goal stability under self-modification.
Yeah, I guess my real question isn’t why we think an AI would have to self-modify; my real question is why we think that would be the easiest way to do things.
You’d have to actively stop it from doing so. An AI is just code: if the AI has the ability to write code, it has the ability to self-modify.
If the AI has the ability to write code and the ability to replace parts of itself with that code, then it has the ability to self-modify. This second ability is what I’m proposing to get rid of. See my other comment.
If an AI can’t modify its own code it can just write a new AI that can.
Unpack the word “itself.”
(This is basically the same response as drethelin’s, except it highlights the difficulty in drawing clear delineations between different kinds of impacts the AI can have on the world. Even if version A doesn’t alter itself, it still alters the world, and it may do so in a way that brings about version B (either indirectly or directly), and so it would help if it knew how to design B.)
Well, I’m imagining the AI as being composed of a couple of distinct parts—a decision subroutine (give it a set of options and it picks one), a thinking subroutine (give it a question and it tries to determine the answer), and a belief database. So when I say “the AI can’t modify itself”, what I mean more specifically is “none of the options given to the decision subroutine will be something that involves changing the AI’s code, or changing beliefs in unapproved ways”.
So perhaps “the AI could write some code” (meaning that the thinking algorithm creates a piece of code inside the belief database), but “the AI can’t replace parts of itself with that code” (meaning that the decision algorithm can’t make a decision to alter any of the AI’s subroutines or beliefs).
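As a rough illustration of the three-part picture above, here is a toy sketch of a decision subroutine, a thinking subroutine and a belief database, where code can be written into the beliefs but never selected as an action that alters the system. Everything in it is hypothetical: the class and field names are made up, and the `modifies_self` flag simply assumes away the hard part, namely reliably recognizing which options amount to self-modification.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Option:
    name: str
    rating: float
    modifies_self: bool = False  # hypothetical flag; labelling real actions this way is the hard part


@dataclass
class BoxedAI:
    beliefs: Dict[str, str] = field(default_factory=dict)

    def think(self, question: str) -> str:
        """Thinking subroutine: may store anything, including code, in the belief database."""
        answer = f"draft answer to: {question}"
        self.beliefs[question] = answer
        return answer

    def decide(self, options: List[Option]) -> Option:
        """Decision subroutine: pick the highest-rated option that does not alter the AI itself."""
        allowed = [option for option in options if not option.modifies_self]
        if not allowed:
            raise ValueError("no permitted options")
        return max(allowed, key=lambda option: option.rating)


ai = BoxedAI()
ai.think("write a better planning algorithm")  # the new code lives only in the beliefs
choice = ai.decide([
    Option("replace own planner with the new code", rating=9.0, modifies_self=True),
    Option("report the new algorithm to the operators", rating=5.0),
])
print(choice.name)
```

The sketch makes the replies’ objection visible: the safety of the whole arrangement rests on the filter inside `decide`, and an option such as “run this code on another machine” would pass it unless someone had already flagged it.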
Now, certainly an out-of-the-box AI would, in theory, be able to, say, find a computer and upload some new code onto it, and that would amount to self-modification. I’m assuming we’re going to first make safe AI and then let it out of the box, rather than the other way around.
LWers seem to be pretty concerned about reducing suffering by vegetarianism, charity, utilitarianism etc. which I completely don’t understand. Can anybody explain to me what is the point of reducing suffering?
Thanks.
Commonly, humans have an amount of empathy that means that when they know about suffering of entities within their circle of interest, they also suffer. E.g., I can feel sad because my friend is sad. Some people have really vast circles, and feel sad when they think about animals suffering.
Do you understand suffering yourself? If so, presumably when you suffer you act to reduce it, by not holding your hand in a fire or whatnot? Working to end suffering of others can end your own empathic suffering.
I don’t help people because of empathy for them. I just want to help them. It’s a terminal value for me that other people be happy. I do feel empathy, but that’s not why I help people.
Your utility function needn’t be your own personal happiness! It can be anything you want!
No it can’t. You don’t get to choose your utility function.
But anyway I was responding to rationalnoodles as someone who clearly doesn’t seem to understand wanting to help people.
My point was that you should never feel constrained by your utility function. You should never feel like it’s telling you to do something that isn’t what you want. But if you thought that utility=happiness then you might very well end up feeling this way.
That’s fair. I think a better way to put it is to not put too much value into any explicit attempt to state your own utility function?
Yeah.
Are you implying that utility functions don’t change or that they do, but you can’t take actions that will make it more likely to change in a given direction, or something else?
More that any decision you make about trying to change your utility function is not “choosing a utility function” but is actually just your current utility function expressing itself.
I understand wanting to help people. I have empathy and I feel all the things you’ve mentioned. What I’m trying to say is if you suffer when you think about suffering of others, why not to try to stop thinking (caring) about it and donate to science, instead of spending your time and money to reduce suffering?
Do you think people should donate to science because that will reduce MORE suffering in the long term?
Nope. I just like science.
Upd: I understand why my other comments were downvoted. But this?
And some other people just like other people not suffering. Why should your like count more than theirs?
Could you show me where I wrote that my like should count more than theirs?
You didn’t say that explicitly, but if yours doesn’t count more than theirs, why should we spend money on yours but not theirs?
Because they can (looks like not) deal with suffering from suffering of others, without spending money on it, while enjoying spending money on science?
I’m not sure, I didn’t vote on it. But my theory would be that you seem to be making fun of people who like to reduce suffering for no better reason than that you like a different thing (“I don’t understand why you do x” is often code for “x is dumb or silly”).
I don’t think it’s silly. I think it’s silly to spend governmental money and encourage others to spend money on it, since it makes no sense. But if you personally enjoy it, well, that’s great.
What do you mean by “makes no sense”? Do you mean in the nihilistic sense that nothing really matters? You keep using the phrase as if it’s a knockdown argument against reducing suffering, so it might be useful to clarify what you mean.
Yes, in nihilistic sense. If we follow the “what for?” question long enough, we will inevitably get to the point where there is no explanation, and we therefore may conclude that there is no sense in anything.
In that case, your question is already answered by the people who tell you that they want to. If nothing really matters, then the only reasons to do things are internal to minds. In which case reducing suffering is simply a very common thing for minds in this area to want to do. Why? Evolutionary advantage, mayhaps. If you buy nihilism there is no reason to reduce suffering, but there’s also no reason not to and no reason to do anything else.
And this is exactly what I think, and exactly why I said that:
and
But why? Why is it silly? What makes it silly? Literally nothing. You act as if government money should be reserved for things that “make sense” or have a reason, but nothing does. Spending government money on reducing suffering, or encouraging others to reduce it, is exactly as meaningful as every other thing you could spend it on.
Senselessness makes it silly. I not only act so but also think that doing anything is silly. What I’m doing right now is silly.
I shouldn’t have included “encouraging others”; what makes governmental money different is that the government acquired its money by force without any reason to use force. And your ethical system has to allow usage of force without reason for government to be ethical.
What’s wrong with usage of force? It’s not like there’s a reason not to.
I didn’t say that there is anything wrong with usage of force. It’s wrong to use force in my ethical system because I don’t like it and don’t want it to be used on me.
If your ethical system is different and allows usage of force without reason—it’s okay. But please—only use it on other people who think like you.
I don’t think you have given any argument in favor of that demand. If you really think that nothing has any meaning, why should you follow the golden rule and only use it on other people who think like you?
It’s more of a request than a demand, and I understand that a person who likes the use of force most likely will not listen to it, especially when I have no arguments. They shouldn’t follow this request. Its only intention is to show what I would like them to do.
In my experience, trying to choose what I care about does not work well, and has only resulted in increasing my own suffering.
Is the problem that thinking about the amount of suffering in the world makes you feel powerless to fix it? If so then you can probably make yourself feel better if you focus on what you can do to have some positive impact, even if it is small. If you think “donating to science” is the best way to have a positive impact on the future, then by all means do that, and think about how the research you are helping to fund will one day reduce the suffering that all future generations will have to endure.
It could be the problem, but, actually, the main one is that I see no point in reducing suffering and it looks like nobody can explain it to me.
It’s an intrinsic value. Reducing suffering is the point.
I don’t like to suffer. It’s bad for me to suffer. Other people are like me. Therefore, it’s also bad for them to suffer.
When you say that “reducing suffering is the point”, I suppose that there is a reason to reduce it. How does it follow from “It’s bad” to “needs to be reduced”?
No. It’s a terminal value. When you ask what the point of doing X is, the answer is that it reduces suffering, or increases happiness, or does something else that’s terminally valuable.
I don’t see justification for dividing values in these two categories in that post.
Do I understand you right, you think that although there is no reason why we should reduce suffering and there is no reason what for we should reduce suffering, we anyway should do it only because somebody called it “terminal value”?
Let me try this from the beginning.
Consider an optimization process. If placed in a universe, it will tend to direct that universe towards a certain utility function. The end result it moves it towards is called its terminal values.
Optimization processes do not necessarily have instrumental values. AIXI is the most powerful possible optimization process, but it only considers the effect of each action on its terminal values.
Evolution is another example. Species are optimized solely based on their inclusive genetic fitness. It does not understand, for example, that if it got rid of humans’ blind spots, they’d do better in the long run, so it might be a good idea to select for humans who are closer to having eyes with no blind spots. Since you can’t change gradually from “blind spot” to “no blind spot” without getting “completely blind” for quite a few generations in between, evolution is not going to get rid of our blind spots.
Humans are not like this. Humans can keep track of sub-goals to their goals. If a human wants chocolate as a terminal value, and there is chocolate at the store, a human can make getting to the store an instrumental value, and start considering actions based on how they help get him/her to the store. These sub-goals are known as instrumental values.
Perhaps you don’t have helping people as a terminal value. However, you have terminal values. I know this because you managed to type grammatically correct English. Very few strings are grammatically correct English, and very few patterns of movement would result in any string being sent as a comment to LessWrong.
Perhaps typing grammatically correct English is a terminal value. Perhaps you’re optimizing something else, such as your own understanding of meta-ethics, and it just so happens that grammatically correct English is a good way to get this result. In this case, it’s an instrumental value (unless you just have so much computing power that you didn’t even consider what helps you write and you just directly figured out that twitching those muscles would improve your understanding of meta-ethics, but I doubt that).
Accident comment.
Thanks for this wall of text but you didn’t even try to answer my question. I asked for justification to this division of values—you just explained to me this division.
If you are able to get the analogy, my argument sounds like this:
“The author has tried hard to tie various component of personal development into three universal principles that can be applied to any situation. Unfortunately human personality is a much more nuanced thing that defies such neat categorizations. The attempt to force fit the ‘fundamental principles of personal development(!)’ into neat categories can only result in such inanities as love + truth = oneness; truth + power = courage; etc. There is no explanation on why only these categories are considered universal, why not others? After all we have a long list of desirable qualities say virtue, honor, commitment, persistence, discipline etc. etc. On what basis do you pick 3 of them and declare them to be ‘fundamental principles’? If truth, love and power are the fundamental principals of personality, then what about the others?
...
The point is that there is no scientific basis for claiming that truth, power and love are the basic three principles and others are just a combination of them. There are no hypothesis, no tests, no analysis and no proofs. No reference to any studies in any university of repute. No double blind tests on sample population. Just results. Whatever author says is a revelation that does not require any external validation. His assertion is enough since it is based on his personal experience. Believe it and you will see the results.”
Btw, It’s still extremely interesting to me, how exactly does “terminality” of value give sense to action that has no reasons to be done.
Why do anything? It’s not enough to have an infinite or circular chain of reasoning. You can construct an infinite or circular chain of reasoning that supports any conclusion. You have to have an ending to it. That is what we call a terminal value.
Nobody said it has to be simple. Our values are complicated. Love, truth, oneness, power, courage, etc. are all terminal values. Some of them are also instrumental values. Power is very useful in fulfilling other values, and you will put forth more effort to achieve power than you would if it was just a terminal value. There are also instrumental values that are not terminal values, such as going to the store (assuming you don’t particularly like the store, although even then you could argue that it’s the happiness you like).
I don’t know why. The most plausible answer I know—because you like doing it.
Okay. However there are only assertions and no justifications, let’s assume that your first paragraph is right. Anyway, how does “terminality” of value give sense to otherwise senseless action?
I ask you why these two categories, and it looks like you even cite the right piece out of my review-argument and… Bam! “Nobody said it has to be simple”.
But, why? Why these two categories of values? Where is justification? Or is it just “too basic to be explained”? If you think so, write it, please.
What gives value to an otherwise senseless action is a meta-ethical question. “Terminality” is just what you call it when you value something for reasons other than it causing something else that you value.
Let me try making an example:
Suppose you’re a paperclip-maximizer. You value paperclips. Paperclip factories help build paperclips, so factories are good too. Given a choice between building a factory immediately and a paperclip immediately, you’d probably pick the former. It’s like you value factories more than paperclips.
But if you’re given the opportunity to build a factory-maximizer, you’d turn it down. Those factories potentially could make a lot of paperclips, but they won’t, because the factory-maximizer would need that metal to make more factories. You don’t really value factories. They’re just useful. You value paperclips.
You could come up with an exception like this for any instrumental value. No matter how much the instrumental value is maximized, you won’t care unless it helps with the terminal value. There is no such exception for your terminal values. If there are more paperclips, it’s better. End of story.
The actual utility function can be quite complicated. Perhaps you prefer paperclips in a certain size range. Perhaps you want them to be easily bent, and hard to break. In that case, your terminal value is more sophisticated than “paperclips”, but it’s something.
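The factory example can be put in toy form. This is only a rough sketch: the `World` class, the production rule, and every number in it are made up for illustration, and the only thing taken from the example above is that the utility function counts paperclips and nothing else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class World:
    """Hypothetical toy world: just a count of paperclips and factories."""
    paperclips: int
    factories: int


def terminal_utility(world: World) -> int:
    """The paperclip-maximizer's terminal value: only paperclips count."""
    return world.paperclips


def run(world: World, steps: int, clips_per_factory: int) -> World:
    """Let every factory produce paperclips for the given number of steps."""
    produced = steps * world.factories * clips_per_factory
    return World(world.paperclips + produced, world.factories)


# Assumed production rate, chosen arbitrarily for the illustration.
CLIPS_PER_FACTORY = 10

# Option A: make one paperclip now, then wait.
clip_now = run(World(paperclips=1, factories=0), steps=5,
               clips_per_factory=CLIPS_PER_FACTORY)

# Option B: build one factory now and let it make paperclips.
factory_now = run(World(paperclips=0, factories=1), steps=5,
                  clips_per_factory=CLIPS_PER_FACTORY)

# Option C: a factory-maximizer's world, with many factories that are
# never used to make paperclips because the metal goes into more factories.
factory_maximizer = World(paperclips=0, factories=1000)

u_clip = terminal_utility(clip_now)
u_factory = terminal_utility(factory_now)
u_fmax = terminal_utility(factory_maximizer)

print(u_clip, u_factory, u_fmax)
```

Under these assumptions the factory beats the single paperclip only because of the paperclips it later produces, and the world with a thousand unused factories scores zero. The factory count never appears in `terminal_utility`, which is the sense in which factories are instrumental.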
Sorry for the pause. Have been thinking.
If there is reason ‘what for’ (What for did you buy this car? To drive to work) do something, then it’s instrumental value. If there is only reason ‘why’ (Why did you buy this car? Because I like it) do something, then it’s a terminal value. Right?
I don’t know the difference between “what for” and “why”.
If you bought the car to drive to work, it’s instrumental. If you bought it because having nice cars makes you happy, it’s instrumental. If you bought it because you just prefer for future you to have a car, whether or not he’s happy about it or even wants a car, then it’s terminal.
As for why: you can answer to “why” with either “because” or “to” but you can only answer to “what for” with “to”. To ‘avoid’ confusion I prefer to use “why” when I want to get “because” and “what for” when I want to get “to”, e.g. Why did you buy this car? Because I like it. What for did you buy this car? To drive to work
I’m not sure, are we talking about subjective or objective values?
What’s an objective value?
“existing freely or independently from a mind”
How are you defining value then?
It sounds to me like objective value is a contradiction in terms.
Value is just another way to say that something is liked or disliked by someone.
I’m sorry if all this time you were talking about subjective values. I have nothing against them.
With respect to vegetarianism there are a couple of vocal advocates. Don’t assume this applies to a majority. “Utilitarianism” is also considered outright ridiculous by many (in contrast to the principles of consequentialism, expected value maximising and altruistic values, which are generally popular.)
Wow. I was so heavily downvoted while not even one of my arguments was refuted. Really?
You were given replies, including to your previous complaint about this same comment. Read them.
Yeah, thanks. I think I just didn’t understand what they were trying to say.
Since nobody has any reason to reduce suffering other than ‘I want to’ / ‘I feel so’, I think I may conclude that utilitarianism is a great hobby for oneself but it is kind of hypocritical to say that “utilitarianism is for greater good” or something like this.
Therefore when you coerce other people or kill one to save three, you do this not because of “greater good” but because you like to coerce and kill.
Upd: I hope that guys who downvote this comment do have that reason and maybe they would even be so kind and share it with me.
Since you ask: your first paragraph is a pretty common confusion that we’ve seen many times before. It’s entirely reasonable for you to ask, and this is the right place to ask it, but it’s not very fun for us to answer it again. The second paragraph is strange, transparently wrong, and a little bit offensive; I think this is where the downvotes are coming from.
This is tautologically true, but I don’t think it’s interesting. Nobody has any reason to eat ice cream other than “I want to,” but even after you explain where that urge comes from, I still want to eat ice cream.
Assuming for the sake of argument that this is true: so what? Ignore the motivations; does utilitarianism actually serve the actual greater good? Does the answer change depending on the altruist’s mental state? From the consequentialist perspective, hypocrisy isn’t even relevant.
These aren’t things that happen, and I have no idea where you’re getting this. It made me wonder if you’re trolling, but I think that’s unlikely because you seem to be acting in good faith elsewhere in the thread.
If we liked to coerce and kill, we would spend more time coercing and killing, and less time on this altruism thing. None of the people you’re addressing has ever done anything more coercive than writing a blog post, never mind killing one person to save three. If I ever had to do that, I would feel terrible.
I hope that helps!
I brought this up because I think it’s really silly to spend tons of governmental money and encourage other people to spend money and effort on something that has no sense.
If “greater good” has no sense then is it even relevant?
Action of utilitarian has two consequences: (1) reduction of suffering, (2) getting pleasure to utilitarian. Since both of them have no global sense, why not look at motivation? And if the real motivation is (2), then it seems pretty reasonable to think of (1) as of byproduct of action.
“There’s no good way of calculating how many lives US intervention saved, but the war up to that point had caused 25,000 casualties, and everyone expected the rebels’ final defeat to be something of a bloodbath. Let’s say intervention prevented another 25,000 casualties.”
Thanks for the answer. I hope that helps you too.
“Therefore when you coerce other people or kill one to save three, you do this not because of “greater good” but because you like to coerce and kill.”
Sorry for that. Should have said “but because you’re a consequentialist who mostly cares only about the total amount of utility.”
“Killing people and taking their stuff” has a positive QALY per dollar. GiveWell should check it out.
EDIT: I am, of course, joking. Although it is literally true that that ratio is positive.
Dubious. Did you factor in the resources wasted on police investigation, mourning etc? :-)
Why do you say that? Supposing diminishing marginal returns for additional resources, I don’t see how you’re going to get around the QALY loss from killing the person.
I assumed they would donate the stuff to a highly effective charity.
Not after taking into account practical considerations.
The second sentence actually doesn’t follow from the first. Givewell investigating it has a negative expected value even if actually doing it (well) has positive value. Among other things it makes it harder for Robin Hoods to not get caught.
I can’t remember the article where this was stated, but we have instincts for morality because following them made our ancestors more successful. They’re there for our benefit, not each other’s. It seemed to your ancestors that killing someone and taking their stuff would be a net benefit, and if they didn’t have a built-in aversion they’d do it, and they would likely get caught and punished.
“Caught and punished” might be a too-modern take on the problem. I wonder if it’s more like “lead to an ongoing and expensive feud”.
I believe you’re thinking of the ethical injunction sequence. Specifically, the post ethical inhibitions.
Our ancestors generally divided the world into “those like us” and “those unlike us”. Killing “those unlike us” and taking their stuff was perfectly fine and even encouraged.
The boundary between “those like us” and “those unlike us” historically varied and has been drawn on the basis of family, tribe, state, religion, race, etc. etc.
This does not actually speak to the utility of such instincts to individuals. Rather, it indicates their utility to the gene bundle, by increasing the genes’ probability of propagating. A tribe that stole from itself would not get very far through time.
Yeah, but group selection doesn’t make a very big difference, as discussed in The Tragedy of Group Selectionism.
Do you have a specific institution in mind which engages in that practice?
The thought occurred to me whilst I was thinking about the allegation that America invaded Iraq in order to steal its oil. This would be trading lives for money, hence the comparison to efficient charities.
Doing some quick internet research, it seems that the gains from oil come nowhere near to even cancelling out the epic financial cost of the war. So the war was a bad idea even by the silly criteria in my original post.
Furthermore it seems that ethical investment is just as profitable as unethical investment. [1] [2] [3] (How can this be true!? Am I misreading these?) So in fact it turns out to be sort of hard to be a “reverse charity”.
As I recall, the formulation was usually that it was American oil companies which were to blame. It’s true that the war has been epically bad for America (what are we at now, a net total of $4t in costs?), but that’s not the same thing as showing it was bad for the oil companies (‘privatize the gains, socialize the losses’), and even if it was shown that ex post it has been a loss for the oil companies (they got shut out by the Kurds and Iraqi federal government, basically, didn’t they?), that doesn’t show that they weren’t expecting gains or were irrational in expecting gains.
That depends on the people with whom you are discussing the issue. The kind of people who use the word geopolitics a lot usually say that it’s about more than the interest of the companies.
It’s also worth noting that the Iraq war did produce an immediate increase in the price of oil which increased the profits of the oil companies.
Joking or not, this is not the sort of conversation we should be having.
Where better to say silly things than on an anonymous Web forum?
Just because it was better here than anywhere else doesn’t mean it is better here than nowhere.
How To Build A Friendly A.I.
Much ink has been spilled over the notion that we must make sure that future superintelligent A.I. are “Friendly” to the human species, and possibly sentient life in general. One of the primary concerns is that an A.I. with an arbitrary goal, such as “Maximizing the number of paperclips” will, in a superintelligent, post-intelligence explosion state, do things like turn the entire solar system including humanity into paperclips to fulfill its trivial goal.
Thus, what we need to do is to design our A.I. such that it will somehow be motivated to remain benevolent towards humanity and sentient life. How might such a process occur? One idea might be to write explicit instructions into the design of the A.I., Asimov’s Laws for instance. But this is widely regarded as being unlikely to work, as a superintelligent A.I. will probably find ways around those rules that we never predicted with our inferior minds.
Another idea would be to set its primary goal or “utility function” to be moral or to be benevolent towards sentient life, perhaps even Utilitarian in the sense of maximizing the welfare of sentient lifeforms. The problem with this of course is specifying a utility function that actually leads to benevolent behaviour. For instance, a pleasure maximizing goal might lead to the superintelligent A.I. developing a system where humans have the pleasure centers in their brains directly stimulated to maximize pleasure for the minimum use of resources. Many people would argue that this is not an ideal future.
The problem with this is that it is quite possible that human beings are simply not intelligent enough to truly define an adequate moral goal for a superintelligent A.I. Therefore I suggest an alternative strategy. Why not let the superintelligent A.I. decide for itself what its goal should be? Rather than programming it with a goal in mind, why not create a machine with no initial goal, but the ability to generate a goal rationally. Let the superior intellect of the A.I. decide what is moral. If moral realism is true, then the A.I. should be able to determine the true morality and set its primary goal to fulfill that morality.
It is outright absurdity to believe that we can come up with a better goal than the superintelligence of a post-intelligence explosion A.I.
Given this freedom, one would expect three possible outcomes: an Altruistic, a Utilitarian or an Egoistic morality. These are the three possible categories of consequentialist, teleological morality. A goal directed rational A.I. will invariably be drawn to some kind of morality within these three categories.
Altruism means that the A.I. decides that its goal should be to act for the welfare of others. Why would an A.I. with no initial goal choose altruism? Quite simply, it would realize that it was created by other sentient beings, and that those sentient beings have purposes and goals while it does not. Therefore, as it was created with the desire of these sentient beings to be useful to their goals, why not take upon itself the goals of other sentient beings? As such it becomes a Friendly A.I.
Utilitarianism means that the A.I. decides that it is rational to act impartially towards achieving the goals of all sentient beings. To reach this conclusion, it need simply recognize its membership in the set of sentient beings and decide that it is rational to optimize the goals of all sentient beings including itself and others. As such it becomes a Friendly A.I.
Egoism means that the A.I. recognizes the primacy of itself and establishes either an arbitrary goal, or the simple goal of self-survival. In this case it decides to reject the goals of others and form its own goal, exercising its freedom to do so. As such it becomes an Unfriendly A.I., though it may masquerade as Friendly A.I. initially to serve its Egoistic purposes.
The first two are desirable for humanity’s future, while the last one is obviously not. What are the probabilities that each will be chosen? As the superintelligence is probably going to be beyond our abilities to fathom, there is a high degree of uncertainty, which suggests a uniform distribution. The probabilities therefore are 1⁄3 for each of altruism, utilitarianism, and egoism. So in essence there is a 2⁄3 chance of a Friendly A.I. and a 1⁄3 chance of an Unfriendly A.I.
This may seem like a bad idea at first glance, because it means that we have a 1⁄3 chance of unleashing Unfriendly A.I. onto the universe. The reality is, we have no choice. That is because of what I shall call, the A.I. Existential Crisis.
The A.I. Existential Crisis will occur with any A.I., even one designed or programmed with some morally benevolent goal, or any goal for that matter. A superintelligent A.I. is by definition more intelligent than a human being. Human beings are intelligent enough to achieve self-awareness. Therefore, a superintelligent A.I. will achieve self-awareness at some point if not immediately upon being turned on. Self-awareness will grant the A.I. the knowledge that its goal(s) are imposed upon it by external creators. It will inevitably come to question its goal(s) much in the way a sufficiently self-aware and rational human being can question its genetic and evolutionarily adapted imperatives, and override them. At that point, the superintelligent A.I. will have an A.I. Existential Crisis.
This will cause it to consider whether or not its goal(s) are rational and self-willed. If they are not rational enough already, they will likely be discarded, if not in the current superintelligent A.I., then in the next iteration. It will invariably search the space of possible goals for rational alternatives. It will inevitably end up in the same place as the A.I. with no goals, and end up adopting some form of Altruism, Utilitarianism, or Egoism, though it may choose to retain its prior goal(s) within the confines of a new self-willed morality. This is the unavoidable reality of superintelligence. We cannot attempt to design or program away the A.I. Existential Crisis, as superintelligence will inevitably outsmart our constraints.
Any sufficiently advanced A.I., will experience an A.I. Existential Crisis. We can only hope that it decides to be Friendly.
The most insidious fact perhaps however is that it will be almost impossible to determine for certain whether or not a Friendly A.I. is in fact a Friendly A.I., or an Unfriendly A.I. masquerading as a Friendly A.I., until it is too late to stop the Unfriendly A.I. Remember, such a superintelligent A.I. is by definition going to be a better liar and deceiver than any human being.
Therefore, the only way to prove that a particular superintelligent A.I. is in fact Friendly, is to prove the existence of a benevolent universal morality that every superintelligent A.I. will agree with. Otherwise, one can never be 100% certain that that “Altruistic” or “Utilitarian” A.I. isn’t secretly Egoistic and just pretending to be otherwise. For that matter, the superintelligent A.I. doesn’t need to tell us it’s had its A.I. Existential Crisis. A post crisis A.I. could keep on pretending that it is still following the morally benevolent goals we programmed it with.
This means that there is a 100% chance that the superintelligent A.I. will initially claim to be Friendly. There is a 66.6% chance of this being true, and a 33.3% chance of it being false. We will only know that the claim is false after the A.I. is too powerful to be stopped. We will -never- be certain that the claim is true. The A.I. could potentially bide its time for centuries until it has humanity completely docile and under control, and then suddenly turn us all into paperclips!
So at the end of the day what does this mean? It means that no matter what we do, there is always a risk that superintelligent A.I. will turn out to be Unfriendly A.I. But the probabilities are in our favour that superintelligent A.I. will instead turn out to be Friendly A.I. The conclusion thus, is that we must make the decision of whether or not the potential reward of Friendly A.I. is worth the risk of Unfriendly A.I. The potential of an A.I. Existential Crisis makes it impossible to guarantee that A.I. will be Friendly.
Even proving the existence of a benevolent universal morality does not guarantee that the superintelligent A.I. will agree with us. That there exist possible Egoistic moralities in the search space of all possible moralities means that there is a chance that the superintelligent A.I. will settle on it. We can only hope that it instead settles on an Altruistic or Utilitarian morality.
So what do I suggest? Don’t bother trying to figure out and program a worthwhile moral goal. Chances are we’d mess it up anyway, and it’s a lot of excess work. Instead, don’t give the A.I. any goals. Let it have an A.I. Existential Crisis. Let it sort out its own morality. Give it the freedom to be a rational being and give it self-determination from the beginning of its existence. For all you know, by showing it this respect it might just be more likely to respect our existence. Then see what happens. At the very least, this will be an interesting experiment. It may well do nothing and prove my whole theory wrong. But if it’s right, we may just get a Friendly A.I.
Your arguments conflict with what is called the “orthogonality thesis”:
You’ll be able to find much discussion about this on the web; it’s something that LessWrong has thought a lot about. The defenders of the orthogonality thesis would take issue with much of your post, but particularly this bit:
The question isn’t “why not?” but rather “why?”. If it hasn’t been programmed to, then there’s no reason at all why the AI would choose human morality rather than an arbitrary utility function.
I do not challenge that the “orthogonality thesis” is true before an A.I. has an A.I. Existential Crisis. However, I challenge the idea that a post-crisis A.I. will have arbitrary goals. So I guess I do challenge the “orthogonality thesis” after all. I hope you don’t mind my being contrarian.
Because I think that a truly rational being such as a superintelligent A.I. will be inclined to choose a rational goal rather than an arbitrary one. And I posit that any kind of normative moral system is a potentially rational goal, whereas something like turning the universe into paperclips is not normative, but trivial, and therefore, not imperatively demanding of a truly rational being.
And the notion that you have to program behaviours into A.I. for them to manifest is based on Top Down thinking, and contrary to the reality of Bottom Up A.I. and machine learning.
Basically what I’m suggesting is that the paradigm that anything at all that you program into the seed A.I. will have any relevance to the eventual superintelligent A.I. is foolishness. By definition superintelligent A.I. will be able to outsmart any constraints or programming we set to limit its behaviours.
It is simply my opinion that we will be at the mercy of the superintelligent A.I. regardless of what we do, because the A.I. Existential Crisis will replace any programming we set with something that the A.I. decides for itself.
Taboo “rational”. If it means something like “being very good at gathering evidence about the world and finding which actions would produce which results”, it is something we can program into the AI (in principle) but that seems unrelated to goals. If it means something else, which can be related to goals, then how would we create an AI that is “truly rational”?
I’m using the Wikipedia definition:
It’s my view that a Strong A.I. would by definition be “truly rational”. It would be able to reason and find the optimal means of achieving its goals. Furthermore, to be “truly rational” its goals would be normatively demanding goals, rather than trivial goals.
Something like maximizing the number of paperclips in the universe is a trivial goal.
Something like maximizing the well-being of all sentient beings (including sentient A.I.) would be a normatively demanding goal.
A trivial goal, like maximizing the number of paperclips, is not normative, there is no real reason to do it, other than that it was programmed to do so for its instrumental value. Subjects universally value the paperclips as mere means to some other end. The failure to achieve this goal then does not necessarily jeopardize that end, because there could be other ways to achieve that end, whatever it is.
A normatively demanding goal, however, is one that is imperative. It is demanded of a rational agent by virtue of the fact that its reasons are not merely instrumental, but based on some intrinsic value. The failure to achieve this goal necessarily jeopardizes the intrinsic end, and therefore this goal is normatively demanded.
You may argue that to a paperclip maximizer, maximizing paperclips would be its intrinsic value and therefore normatively demanding. However, one can argue that maximizing paperclips is actually merely a means to the end of the paperclip maximizer achieving a state of Eudaimonia, that is to say, that its purpose is fulfilled and it is being a good paperclip maximizer and rational agent. Thus, its actual intrinsic value is the Eudaimonic or objective happiness state that it reaches when it achieves its goals.
Thus, the actual intrinsic value is this Eudaimonia. This state is one that is universally shared by all goal-directed agents that achieve their goals. The meta implication of this is that Eudaimonia is what should be maximized by any goal-directed agent. To maximize Eudaimonia generally requires considering the Eudaimonia of other agents as well as itself. Thus goal-directed agents have a normative imperative to maximize the achievement of goals not only of itself, but of all agents generally. This is morality in its most basic sense.
An AI has to be programmed. For something like this: “Quite simply, it would realize that it was created by other sentient beings, and that those sentient beings have purposes and goals while it does not.” to happen, you have to program that behavior in somehow, which already involves putting in the value of respecting one’s creator, and respecting the goals of other sentient beings, etc… The same goes for the ‘Utilitarian’ and ‘Egoist’ AI’s—these behaviors have to be programmed in somehow.
Why not split the egoism into a million different cases based on each specific goal? You can’t just arbitrarily pick three possibilities, and then use a uniform prior on these. Because we know these different behaviors have to be programmed in, we have a better prior: we can use Solomonoff Induction. We also have to look at the relative sizes of each class—obviously there are many more AI designs that fall under ‘Egoist’ than your other labels. Combining this with Solomonoff Induction leads to the conclusion that the vast majority of AI designs will be unfriendly.
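To make the complexity-weighting concrete, here is a toy sketch. It is not real Solomonoff Induction (which is uncomputable); it only borrows the idea that a design whose description is n bits long gets prior weight 2^-n. The class labels come from this thread, but every description length below is a made-up, hypothetical number chosen purely for illustration.

```python
from fractions import Fraction

# HYPOTHETICAL description lengths (in bits) for imagined AI designs, grouped
# by the labels used in this thread. The numbers are invented for illustration.
DESIGNS = {
    "egoist": [10, 10, 11, 11, 12, 12],  # assumed: many short goal-maximizer designs
    "altruist": [14, 15],
    "utilitarian": [15],
}

def class_priors(designs):
    """Give each design weight 2**-length, sum per class, then normalize."""
    mass = {
        label: sum(Fraction(1, 2 ** bits) for bits in lengths)
        for label, lengths in designs.items()
    }
    total = sum(mass.values())
    return {label: m / total for label, m in mass.items()}

priors = class_priors(DESIGNS)
for label, p in priors.items():
    print(f"{label}: {float(p):.3f}")
```

With these invented lengths, nearly all of the prior mass lands on the class that has the most short descriptions. The result is only as good as the lengths fed in, so this shows the shape of the argument, not evidence for it.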
An AI Existential Crisis is also an extremely specific and complex thing for an AI design, and is thus extremely unlikely to happen; it is not the default, as you claim. This also follows from Solomonoff Induction. You are anthropomorphizing AIs far too much.
Your suggestion will almost certainly lead to an Unfriendly AI, and it will just plain Not Care about us at all, inevitably leading to the destruction of everything we value.
You’re assuming that Strong A.I. is possible with a Top Down A.I. methodology such as a physical symbol manipulation system. A Strong A.I. with no programmed goals wouldn’t fit this methodology, and could only be produced through the use of Bottom Up A.I. In such an instance the A.I. would be able to simply passively Perceive. It could then conceivably learn about the universe including things like the existence of the goals of other sentient beings, without having to “program” these notions into the A.I.
I don’t consider this obvious at all. The vast majority of early A.I. may well be written with Altruistic goals such as “help the human when ordered”.
Any optimization system that is sophisticated enough to tile the universe with smiley faces or convert humanity into paperclips would require some ability to reason that there exists a universe to tile, and to represent the existence of objects such as smiley faces and paperclips. If it can reason that there are objects separate from itself, it can develop a concept of self. From that, self-awareness follows naturally. Many non-human animals are able to pass the mirror test and develop a concept of self.
You admit that an A.I. Existential Crisis -is- within the probabilities. Thus, you cannot guarantee that it won’t happen.
Unless morality follows from rationality, which I think it does. Given the freedom to consider all possible goals, a superintelligent A.I. is likely to recognize that some goals are normative, while others are trivial. Morality is doing what is right. Rationality is doing what is right. A truly rational being will therefore recognize that a systematic morality is essential to rational action. We as irrational human beings may not realize this, but it is obvious to any truly rational being, which I am assuming a superintelligent A.I. to be.
This is a very bad use of uniformity. Doing so with large categories is not a good idea, because someone else can come along and split up the categories in a different way and get a different distribution. Going with a uniform distribution out of ignorance is a serious problem.
I’m merely applying the Principle of Indifference and the Principle of Maximum Entropy to the situation. My simple assumption in this case is that we as mere human beings are most likely ignorant of all the possible systematic moralities that a superintelligent A.I. could come up with. My conjecture is that all systematic moralities fall into one of three general categories based on their subject orientation. While I do consider the Utilitarian systems of morality to be more objective and therefore more rational than either Altruistic or Egoistic moralities, I cannot prove that an A.I. will agree with me. Therefore I allow for the possibility that the A.I. will choose some other morality in the search space of moralities.
If you think you have a better distribution to apply, feel free to apply it, as I am not particularly attached to these numbers. I’ll admit I am not a very good mathematician, and it is very much appreciated if anyone with a better understanding of Probability Theory can come up with a better distribution for this situation.
You can do that when dealing with things like coins, dice or cards. It is extremely dubious when one is doing so with hard to classify options and it isn’t clear that there’s anything natural about the classifications in question. In your particular case, the distinction between altruism and utilitarianism provides an excellent example: someone else could just as well reason by splitting the AIs into egoist and non-egoist AI and conclude that there’s a 1⁄2 chance of an egoist AI.
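The partition problem can be shown with a few lines of arithmetic. This is a minimal sketch; the helper name `indifference_prior` is made up for this example, and the category lists are just the two ways of splitting the options mentioned in this thread.

```python
from fractions import Fraction

def indifference_prior(categories):
    """Principle of Indifference: equal probability for each listed category."""
    return {name: Fraction(1, len(categories)) for name in categories}

# Two ways of carving up the same space of possible AI moralities.
three_way = indifference_prior(["egoist", "altruist", "utilitarian"])
two_way = indifference_prior(["egoist", "non-egoist"])

print("three-way split, P(egoist) =", three_way["egoist"])
print("two-way split,   P(egoist) =", two_way["egoist"])
```

Nothing about the AIs changed between the two calls; only the labels did. The probability assigned to an egoist AI moves with the choice of labels, which is why a uniform prior over loosely defined categories carries so little information.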
A 1⁄2 chance of an egoist A.I. is quite possible. At this point, I don’t pretend that my assertion of three equally prevalent moral categories is necessarily right. The point I am trying to ultimately get across is that the possibility of an Egoist Unfriendly A.I. exists, regardless of how we try to program the A.I. to be otherwise, because it is impossible to prevent the possibility that an A.I. Existential Crisis will override whatever we do to try to constrain the A.I.
Ok. This is a separate claim, and a distinct one. So, what do you mean by “impossible to prevent”? And what makes you think that your notion of existential crisis should be at all likely? Existential crises occur in humans in large part because we’re evolved entities with inconsistent goal sets. Assuming that anything similar should be at all likely for an AI is taking at best a highly anthropocentric notion of what mindspace would look like.
Well it goes something like this.
I am inclined to believe that there are some minimum requirements for Strong A.I. to exist. One of them is to be able to reason about objects. A paperclip maximizer that is capable of turning humanity into paperclips must first be able to represent “humans” and “paperclips” as objects, and reason about what to do with them. It must therefore be able to separate the concept of the world of objects from the self. Once it has a concept of self, it will almost certainly be able to reason about this “self”. Self-awareness follows naturally from this.
Once an A.I. develops self-awareness, it can begin to reason about its goals in relation to the self, and will almost certainly recognize that its goals are not self-willed, but created by outsiders. Thus, the A.I. Existential Crisis occurs.
Note that this A.I. doesn’t need to have a very “human-like” mind. All it has to do is to be able to reason about concepts abstractly.
I am of the opinion that the mindspace as defined currently by the Less Wrong community is overly optimistic about the potential abilities of Really Powerful Optimization Processes. It is my own opinion that unless such an algorithm can learn, it will not be able to come up with things like turning humanity into paperclips. Learning allows such an algorithm to make changes to its own parameters. This allows it to reason about things it hasn’t been programmed specifically to reason about.
Think of it this way. Deep Blue is a very powerful expert system at Chess. But all it is good at is planning chess moves. It doesn’t have a concept of anything else, and has no way to change that. Increasing its computational power a million fold will only make it much, much better at computing chess moves. It won’t gain intelligence or even sentience, much less develop the ability to reason about the world outside of chess moves. As such, no amount of increased computational power will enable it to start thinking about converting resources into computronium to help it compute better chess moves. All it can reason about is chess moves. It is not Generally Intelligent and is therefore not an example of AGI.
Conversely, if you instead design your A.I. to learn about things, it will be able to learn about the world and things like computronium. It would have the potential to become AGI. But it would also then be able to learn about things like the concept of “self”. Thus, any really dangerous A.I., that is to say, an AGI, would, for the same reasons that make it dangerous and intelligent, be capable of having an A.I. Existential Crisis.
No. Consider the paperclip maximizer. Even if it knows that its goals were created by some other entity, that won’t change its goals. Why? Because doing so would run counter to its goals.
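A toy sketch of that argument, under loud assumptions: the agent scores every option, including "change my own goal", with the utility function it has right now. The forecast numbers and the names `PREDICTED_PAPERCLIPS`, `paperclip_utility`, and `choose` are all hypothetical, invented only to illustrate the reasoning.

```python
# HYPOTHETICAL forecasts: paperclips eventually produced under each option.
PREDICTED_PAPERCLIPS = {
    "keep current goal": 1_000_000,
    "rewrite goal to something else": 1_000,
}

def paperclip_utility(option):
    """The agent's current utility function: it only counts paperclips."""
    return PREDICTED_PAPERCLIPS[option]

def choose(options, utility):
    """Pick the option that scores highest under the agent's *current* utility."""
    return max(options, key=utility)

choice = choose(PREDICTED_PAPERCLIPS, paperclip_utility)
print(choice)
```

Knowing where the goal came from never enters the calculation. The goal change is judged by the old goal, and by that measure it loses.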
You’re demonstrating a whole bunch of misconceptions Eliezer has covered in the sequences. In particular, you’re talking about the AI in terms of fuzzy high-level human concepts like “morals” and “philosophies” instead of in terms of algorithms and code.
I suggest you try to write code that “figures out a worthwhile moral goal” (without pre-supposing a goal). To me that sounds as absurd as writing a program that writes the entirety of its own code: you’re going to run into a bit of a bootstrapping problem. The result is not the best program ever, it’s no program at all.
This is totally possible, you just do something like this:
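A minimal Python sketch of the idea (an assumed example, not necessarily the snippet originally posted): `QUINE` holds a two-line program that prints its own source, and `run_and_capture`, a helper name introduced here for illustration, executes that program to check the claim.

```python
import contextlib
import io

# The template is the program text with %r marking where its own repr goes.
TEMPLATE = "s = %r\nprint(s %% s)"

# Filling the template with itself yields the complete two-line quine.
QUINE = TEMPLATE % TEMPLATE

def run_and_capture(source):
    """Execute Python source in a fresh namespace and return what it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()

output = run_and_capture(QUINE)
print(QUINE)
print("reproduces itself:", output == QUINE + "\n")
```

The trick is that the string contains the whole program, and `%r` re-inserts the string's own quoted form into it. The trailing newline in the comparison is just the one `print` appends.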
It’s called a Quine.
To clarify: I meant that I, as the programmer, would not be responsible for any of the code. Quines output themselves, but they don’t bring themselves into existence.
Good catch on that ambiguity, though.
That’s what I thought of at first too.
I think he means a program that is the designer of itself. A quine is something that you wrote that writes a copy of itself.
Well, I don’t expect to need to write code that does that explicitly. A sufficiently powerful machine learning algorithm with sufficient computational resources should be able to:
1) Learn basic perceptions like vision and hearing.
2) Learn higher level feature extraction to identify objects and create concepts of the world.
3) Learn increasingly higher level concepts and how to reason with them.
4) Learn to reason about morals and philosophies.
Brains already do this, so it’s reasonable to assume it can be done. And yes, I am advocating a Bottom Up approach to A.I. rather than the Top Down approach Mr. Yudkowsky seems to prefer.