Building toward a Friendly AI team
Series: How to Purchase AI Risk Reduction
A key part of SI’s strategy for AI risk reduction is to build toward hosting a Friendly AI development team at the Singularity Institute.
I don’t take it to be obvious that an SI-hosted FAI team is the correct path toward the endgame of humanity “winning.” That is a matter for much strategic research and debate.
Either way, I think that building toward an FAI team is good for AI risk reduction, even if we decide (later) that an SI-hosted FAI team is not the best thing to do. Why is this so?
Building toward an SI-hosted FAI team means:
Growing SI into a tighter, larger, and more effective organization in general.
Attracting and creating people who are trustworthy, altruistic, hard-working, highly capable, extremely intelligent, and deeply concerned about AI risk. (We’ll call these people “superhero mathematicians.”)
Both (1) and (2) are useful for AI risk reduction even if an SI-hosted FAI team turns out not to be the best strategy.
This is because: Achieving part (1) would make SI more effective at whatever it is doing to reduce AI risk, and achieving part (2) would bring great human resources to the cause of AI risk reduction, which will be useful to a wide range of purposes (FAI team or otherwise).
So, how do we accomplish both these things?
Growing SI into a better organization
Like many (most?) non-profits with less than $1m/yr in funding, SI has had difficulty attracting the top-level executive talent often required to build a highly efficient and effective organization. Luckily, we have made rapid progress on this front in the past 9 months. For example, we now have (1) a comprehensive donor database, (2) a strategic plan, (3) a team of remote contractors used to more efficiently complete large and varied projects requiring many different skill sets, (4) an increasingly “best practices” implementation of central management, (5) an office we actually use to work together on projects, and many other improvements.
What else can SI do to become a tighter, larger, and more effective organization?
Hire a professional bookkeeper, implement additional bookkeeping and accounting best practices. (Currently underway.)
Create a more navigable and up-to-date website. (Currently underway.)
Improve our fundraising strategy, e.g. by creating a deck of slides for major donors which explains what we’re doing and what we can do with more funding. (Currently underway.)
Create standard policy documents that lower our risk of being distracted by an IRS audit. (Currently underway.)
Shift the Singularity Summit toward being more directly useful for AI risk reduction, and also toward greater profitability—so that we have at least one funding source that is not donations. (Currently underway.)
Spin off the Center for Applied Rationality so that SI is more solely focused on AI safety. (Currently underway.)
Build a fundraising/investment-focused Board of Trustees (à la IAS or SU) in addition to our Board of Directors and Board of Advisors.
Create an endowment to ensure ongoing funding for core researchers.
Consult with the most relevant university department heads and experienced principal investigators (e.g. at IAS and Santa Fe) about how to start and run an effective team for advanced technical research.
Do the things recommended by these experts (that are relevant to SI’s mission).
The key point, of course, is that all of these things cost money. They may be “boring,” but they are incredibly important.
Attracting and creating superhero mathematicians
The kind of people we’d need for an FAI team are:
Highly intelligent, and especially skilled in maths, probably at the IMO medal-winning level. (FAI team members will need to create lots of new math during the course of the FAI research initiative.)
Trustworthy. (Most FAI work is not “Friendliness theory” but instead AI architectures work that could be made more dangerous if released to a wider community that is less concerned with AI safety.)
Altruistic. (Since the fate of humanity may be in their hands, they need to be robustly altruistic.)
Hard-working, determined. (FAI is a very difficult research problem and will require lots of hard work and also an attitude of “shut up and do the impossible.”)
Deeply committed to AI risk reduction. (It would be risky to have people who could be pulled off the team—with all their potentially dangerous knowledge—by offers from hedge funds or Google.)
Unusually rational. (To avoid philosophical confusions, to promote general effectiveness and group cohesion, and more.)
There are other criteria, too, but those are some of the biggest.
We can attract some of the people meeting these criteria by using the methods described in Reaching young math/compsci talent. The trouble is that the number of people on Earth who qualify may be very close to 0 (especially given the “committed to AI risk reduction” criterion).
Thus, we’ll need to create some superhero mathematicians.
Math ability seems to be even more “fixed” than the other criteria, so a (very rough) strategy for creating superhero mathematicians might look like this:
Find people with the required level of math ability.
Train them on AI risk and rationality.
Focus on the few who become deeply committed to AI risk reduction and rationality.
Select from among those people the ones who are most altruistic, trustworthy, hard-working, and determined. (Some training may be possible for these features, too.)
Try them out for 3 months and select the best few candidates for the FAI team.
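A toy funnel calculation makes vivid why the number of people surviving all these steps may land very close to zero. Every number below is an illustrative assumption, not an estimate from the post:

```python
# Hypothetical recruiting-funnel sketch. The pool size and all conversion
# rates are made-up assumptions for illustration only.
pool = 5000  # people reached who have roughly IMO-medal-level math ability
stages = [
    ("engages seriously with AI risk / rationality material", 0.20),
    ("becomes deeply committed to AI risk reduction", 0.05),
    ("meets the altruism / trustworthiness / work-ethic bar", 0.30),
    ("passes a 3-month trial", 0.50),
]

remaining = float(pool)
for name, rate in stages:
    remaining *= rate  # each stage keeps only a fraction of the previous one
    print(f"{name}: ~{remaining:.0f} remain")
```

Under these (invented) rates, four multiplicative filters cut a pool of thousands down to single digits, which is why widening the top of the funnel and "creating" candidates both matter.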
All these steps, too, cost money.
Some comments on the recruiting plan:
I think a highly rational person would have high moral uncertainty at this point and not necessarily be described as “altruistic”. For example I consider Eliezer’s apparent high certainty in utilitarianism (assuming it’s not just a front for PR purposes) as evidence against his rationality. Given a choice between a more altruistic candidate and a more rational candidate, I think SI ought to choose the latter.
Similarly for “deeply committed to AI risk reduction”. I think a highly rational person would think that working on AI risk reduction is probably the best thing to do at this point but would be pretty uncertain about this and be ready to change their mind if new evidence or theories come along.
What does “trustworthy” mean, apart from “rationality”? Something like psychological stability?
It seems like the plan is to have one Eliezer-type (philosophy oriented) person in the team with the rest being math focused. I don’t understand why it isn’t more like half and half, or aiming for a balance of skills in all recruits. If there is only one philosophy oriented person in the team, how will the others catch his mistakes? If the reason is that you don’t expect to be able to recruit more than one Eliezer-type (of sufficient skill), then I think that’s enough reason to not build an FAI team.
I think a highly desirable trait in an FAI team member is having a strong suspicion that flaws lurk in every idea. This seems to work better in motivating one to try to find flaws than just “having something to protect”.
Regarding 5, I would think an important subskill would be recognizing arbitrariness in conceptual distinctions, e.g. between belief and preference, agent and environment, computation and context, ethics and meta-ethics, et cetera. Relatedly, not taking existing conceptual frameworks and their distinctions as the word of God. The word of von Neumann is a lot like the word of God, but still not quite.
By the way I love comments like yours here that emphasize moral uncertainty.
Do you think the correct level of moral uncertainty would place so much probability on egoism-like hypotheses that the behavior it outputs, even after taking into account various game-theoretical concerns about cooperation as well as the surprisingly large apparent asymmetry between the size of altruistic returns available vs. the size of egoistic returns available, doesn’t end up behaving substantially more altruistically than a typical human or a typical math genius is likely to behave? It seems implausible to me, but I’m not that confident, and as I’ve been saying earlier, the topic is weirdly neglected here for one with such high import.
Surely it depends on how much more altruistic and how much more rational.
Most people have some pre-theoretic intuitions about cooperation, which game theory may merely formalize. It’s not clear to me that familiarity with such theoretical concerns implies one ought to be more “altruistic” than average.
If someone is altruistic because they’ve maxed out their own egoistic values (or has gotten to severely diminishing returns), I certainly wouldn’t count that against their rationality. But if “egoistic returns” include abstract values that the rest of humanity doesn’t necessarily share, “large apparent asymmetry” is unclear to me.
Where did you say that? (I wrote Shut Up and Divide? which may or may not be relevant depending on what you mean by “the topic”.)
Why “surely”, given that I’m not a random member of humanity, and may have more values in common with a less altruistic candidate than a more altruistic candidate?
I just meant that it seems to be possible to improve a lot of other people’s expected quality of life at the expense of relatively small decreases to one’s own (but that people are generally not doing so), and that this seems like it should cause the outcome of a process with moral uncertainty between egoism and altruism to skew more toward the altruist side in some sense, though I don’t understand how to deal with moral uncertainty (if anyone else does, I’d be interested in your answers to this). If by “abstract values” you mean something like making the universe as simple as possible by setting all the bits to zero, then I agree there’s no asymmetry, but I wouldn’t call that “egoistic” as such.
Here. Yes, SUAD was a good and relevant contribution.
You’re right that it’s not certain that altruism in a FAI team candidate is, all else equal, more desirable. I guess I’m just saying that if it is, then sufficiently large differences in altruism outweigh sufficiently small differences in rationality.
I have written a few more posts that are relevant to the “egoism vs altruism” question:
http://lesswrong.com/lw/8gk/where_do_selfish_values_come_from/
http://lesswrong.com/lw/6ta/what_if_sympathy_depends_on_anthropomorphizing/
http://lesswrong.com/lw/2b7/hacking_the_cev_for_fun_and_profit/
http://lesswrong.com/lw/1mo/the_preference_utilitarians_time_inconsistency/
I guess we don’t have more discussions of altruism vs egoism because making progress on the problem is hard. Typical debates about moral philosophy are not very productive, and it’s probably fortunate that LW is good at avoiding them.
Do you agree? Do you think there are good arguments to be had that we’re not having for some reason? Does it seem to you that most LWers are just not very interested in the problem?
From what I understand from past utterances, core SingInst folks tend to extend their “elite math” obsession to very nearly equating it with capability for philosophy.
Can you give some examples of such utterances?
One somewhat close quote that popped to mind (from lukeprog’s article on philosophy):
My view is that if you take someone with philosophical talents and interests (presumably inherited or caused by the environment in a hard-to-control manner), you can make a better philosopher out of them by having them study more math and science than the typical education for a philosopher. But if you take someone with little philosophical talent and interest and do the same, they’ll just become mathematicians and scientists.
I think this is probably similar to the views of SIAI people, and your quote doesn’t contradict my understanding.
Do you have ideas about how to find philosophical talent, especially the kind relevant for Friendliness philosophy? I don’t think SingInst folk have worked very thoroughly on the problem, but someone might have. Geoff Anders has spent a lot of time thinking about the problem and he runs summer programs teaching philosophy. Dunno how much progress he’s made. (Um, for whatever it’s worth, he seems to think I have philosophical aptitude—modus ponens or modus tollens, take your pick.)
Unfortunately, none of the core SingInst guys seem to have any interesting accomplishments in math or to have actually studied that math in depth; it is a very insightful remark by Luke, but it would be great if they had applied it to themselves; otherwise it just looks like the Dunning-Kruger effect. I don’t see any reason to think that the elite math references are anything but lame signaling of the sort usually done by those who don’t know math well enough to signal the knowledge properly (by actually doing something new in math). Sadly it works: if you use jargon and say something like what Luke said, then some of the people who can’t independently evaluate your math skills will assume they must be very high. Meanwhile I will assume them to be rather low, because those with genuinely high skill will signal such skill in a different way.
The most recent example is hard to give—it was in person from Anna. Other examples I would have to search through Eliezer’s comments from years back to find.
I think “trustworthy” here means something along the lines of “committed to the organization/project”, in the sense that they’re not going to take the ideas/code used in SI conversations and ventures to Google or some other project. In other words, they’re not going to be bribed away.
Thanks for this. I’m writing a followup to this post that incorporates the points you’ve raised here.
What is with you guys and the math olympiad?
Are successes at the IMO a reliable and objective measure of the skills you need?
Well, of course, this is a major filter for intelligence, creativity, and plain math “basic front kick” proficiency. However, you also exclude lots and lots of people who may possess the necessary skills but did not choose to enter the IMO, were not aware of it, had other personal commitments, simply procrastinated too much, etc. Also, the skills IMO medalists have acquired so far are in no way a guarantee that more skills will follow, and may be the result of good teachers or enthusiastic parents. Let me present some weak evidence: behold, a personal anecdote!
At the behest of my chemistry teacher, and owing to my school’s policy in general, I entered one of these sciency olympiads a while ago. I procrastinated over the first round, which was a homework assignment, and got it in just in time, not quite completed, but was allowed into the next round. From there on, I made it to the national team and won a medal at the international competition. Looking back on what I did back then, I’d say the questions were quite easy, not at a level I’d call requiring serious skill. Of course I’ve continued to learn much since then, but it could’ve gone another way. I would not say such medal winners are especially altruistic or more devoted to their cause than others are (the two of my teammates that I know about are now studying medicine, “to make money,” as they told me).
I’d conclude that if you want bright mathematicians, you might do well to take IMO medalists. But if you want people who are also trustworthy, altruistic, and deeply committed to AGI, let alone especially rational, you should probably widen the filter for intelligence and math skills (if IMO medals are a good measure of these) a lot. Perhaps ask Mensa, take applications and filter from those, reach out to the best 1% on some college entry test, or something. Just wild, uneducated guesses. But focusing on IMO medalists, which should give you fewer than 500 potential candidates a year, doesn’t sound like a good strategy. More so since those people are approached not only by you but also by quite a few companies.
Perhaps I misunderstood what you mean with “at medal-winning level”, but since you’re doing this SPARC camp and I can’t think of another way of attracting such people, I assumed you were reaching out to people who actually compete.
Unless we have different understandings of what it means to be ‘approached’ this statement clashes quite strongly with my experience in the year since I won my IMO silver.
Another data point: I have a gold IMO medal (only 4 people in my country ever had it), and in the following 20 years I never had a phone call or an e-mail saying: “We have this project where we need math talents, and I saw your name online, so if you are interested I would like to tell you more details.”
No one cares. Seems to me the only predictor companies use is: “Did you do the same kind of work at some previous company? How many years?” and then the greater number wins. (Which effectively means that one is paid for their age and their ability to pick the right technology when they finish university and stick with it. Kinda depressing.)
EDIT: As a sidenote, I have not used my math skills significantly since university, so in my case the IMO medal is probably not such a good predictor now.
Cynic. You’re neglecting the influence of a pretty face, good clothes, and a bit of charm!
I’m an IOI silver medalist and have been approached many times by companies, mostly quantitative finance / tech start ups.
Interesting. If I may ask, how many of those were within the first year?
I have been pretty continuously approached throughout my four years of undergrad. Probably this also has a good deal to do with social connections formed during contest participation, not just the contest performance itself.
I said “the IMO medal-winning level.”
IMO performance isn’t the only metric which strongly predicts elite math ability.
It probably has something to do with the fact that Paul Christiano was a silver medalist. (Paul is currently just a Research Associate, but I think he is much more involved with SI than the others. See here for recent discussion of some of his work.)
No, we were IMO fetishists before we met Paul.
If people know of stronger predictors of raw math ability than IMO performance, Putnam performance, and early-age 800 on Math SAT, I’d like to know what they are.
Past achievement in original math research, of course.
This is generally not an indicator available for young people (and I think it’s reasonable for MIRI recruiting efforts to target young people), and when it is, it isn’t obviously a good idea to use. There are research opportunities available at the high school and undergraduate level, but they are not universally available, and I have been told by people who sit on graduate admissions committees that most of the research that gets produced by these opportunities is bad. Among the research that is not bad, I think it’s likely to be unclear to what extent quality of the work is due to the student’s efforts or the program / advisor’s. (I have heard rumors that in one particular such program, the advisor plots out the course of the research in advance and leads students through it as something like a series of guided exercises.) Edit: I worded that somewhat poorly; I didn’t mean to suggest that the work has been done by someone other than the student, but a nontrivial portion of the success of a research project at this level is due to the careful selection of a research problem and careful guidance on the part of the program / advisor, which is not what we want to select for.
By contrast, tests like the AMC (which leads to the IMO in the US), the Putnam, and the SAT are widely available and standardized.
The fact that you are looking for “raw” math ability seems questionable. If their most recent achievements are IMO/SAT, you’re looking at high schoolers or early undergrads (Putnam winners have their tickets punched at top grad schools and will be very hard to recruit). Given that, you’ll have at least a 5-10 year lag while they continue learning enough to do basic research.
Yes. So? During that time, you can get them interested in rationality and x-risk.
The IMO/IOI and qualification processes for them seem to be useful as early indicators of general intelligence; they obviously don’t capture everyone or even a huge fraction of all comparably smart people, but they seem to have fewer false positives by far than almost any other external indicators until research careers begin in earnest.
We used contests heavily in the screening process for SPARC in part for this reason, and in part because there is a community surrounding contests which the SPARC instructors understand and have credibility with, and which looks like it could actually benefit from exposure to (something like) rationality, which seems like an awesome opportunity.
“IMO medal-winning level” is (I presume) intended to refer to a level of general intelligence / affinity for math. As I said, the majority of people at this level don’t in fact have IMO medals, and some IMO medalists aren’t at this level. The fact that this descriptor gets used, instead of something like “top 0.01%”, probably comes down to a combination of wanting to avoid precision (both about what is being measured and how high the bar is), and wanting to use a measure which reflects well on the current state of affairs. There may be similar, but I expect (and hope) much smaller, effects on thinking in addition to talking.
I don’t think I’ve had a large effect on the way SI folks view possible researchers. I don’t know how large an effect I’ve had on the way Luke talks about possible researchers.
Also note: I’m a very marginal IMO medalist. People analogous to me don’t have medals in most nearby possible worlds.
What’s your evidence that you’re a marginal IMO medalist?
I only ask because I’ve noticed that my perception of a person’s actual ability and my perception of their ego seem to be negatively correlated among the people I’ve met, including Less Wrong users. For example, I once met a guy at a party who told me he wasn’t much of a coder; next semester he left undergrad to be the CTO of a highly technical Y Combinator startup.
This is part of the reason why I’m a little skeptical of SI’s approach of telling people “send us an e-mail if you did well on the Putnam”—I would guess a large fraction of those who did well on the Putnam think they did well by pure luck. (Impostor syndrome.) SI might be better off trying to collect info on everyone who thinks they might want to work on FAI, no matter how untalented, and judging relative competence for themselves instead of letting FAI-contributor wannabes judge themselves. (Or at least specify a score above which one should definitely contact them, regardless of how lucky one feels one got.)
Less Wrong post on mathematicians and status:
http://lesswrong.com/lw/2vb/vanity_and_ambition_in_mathematics/
IAWYC, and so does Wikipedia:
(I personally am a very good example of this, because although I think I’m not terribly bright, I am in fact a genius.)
Sure, but with the obsession as the antecedent.
Seriously. What is with the IMO fetish?
For what it’s worth, I have no maths (well, at best I had Maths A level and now I have simple calculus), but my experience from university was that most of those selected to study maths (at a maths-oriented college in Cambridge, so this is highly selective) had been Maths Olympiad participants. So it’s obviously not just a Singularity Institute thing. In the case of unis, I suspect it’s used largely because the general academic qualifications don’t differentiate enough among the best people. I suppose the question is whether it remains a useful indicator later on in life.
The weirder question is why they think potential donors shouldn’t apply the same sort of criteria to the sellers of AI risk reduction. Say you want a photo-realistic, ray-traced, photon-mapped MMORPG. The hardware doesn’t support this, but I’m sure there are a few startups making something like that, talking to investors, drafting plans to hire TopCoder and other programming contest winners and optics experts… hell, I myself have been approached by such folks at least four times that I remember, with ‘job offers’. The common feature of such startups? They aren’t founded by the awesome tech guys themselves, because those tech guys not only are tech guys but also have a much better grasp of the big-picture issues and can see that it’s too early.
Hi, I’m new here, so I’m not quite familiar with all the ideas here. However, I am a young mathematician who has some familiarity with how mathematical theories are developed.
It might be much cheaper to accept more average mathematicians who meet the other criteria. Generally, to build a new theory, you’ll need a few people who can come up with lots of creative ideas, and lots of people who are capable of understanding the ideas, and then taking those ideas and building them into a fleshed out theory. Many mathematicians accept that they are of the second type, and work towards developing a theory to the point where a new creative type can clearly see what new ideas are needed.
Shouldn’t this just be a subset of number 5? I’m sure you would rather have someone who would lie to keep AI risk low than someone who would tell the truth no matter what the cost.
On mathematician personalities:
http://lesswrong.com/lw/2z7/draft_three_intellectual_temperaments_birds_frogs/
Seems likely that the distribution of personalities among math competition winners isn’t the same as the distribution of personalities you’d want in an FAI team.
More potential problems with math competitions. Quote by a Fields medalist:
Under ideal conditions, maybe SI would identify “safe” problems that seemed representative of the problem space as a whole and farm these problems out (similar to the way decision theory has been farmed out to Less Wrong), inviting the best performers on the safe problems to work on more dangerous ones.
Or SI could simply court proven mathematical researchers.
It should be noted that I’m not a mathematician.
Yeah, that was the analogy I had in mind. I wasn’t sure if people here would be familiar with it though.
And yeah, I agree that math competition winners wouldn’t have the ideal distribution, although it probably wouldn’t hurt to recruit from them as well. Also, I may have some bias here, since I never liked competitions and avoided participating in them. But I agree with the points made in that article.
That seems like a good idea, although it’s hard to know what the problem space looks like without going there. My intuition says that it would be a good idea to try to have a good amount of diversity in whatever team is chosen.
One other issue is that a near-precondition for IMO-type recognition is coming from at least a middle-class family and having either an immediate family member or an early teacher able to recognize and direct that talent. Worse, as these competitions have increased in stature, an increasing number of the students are pushed by parents and given regular tutoring and preparation. Those sorts of hothouse personalities would seem to be among the riskier to put on an FAI team.
Are you sure about this? I don’t know of that many people who did super-well in contests as a result of being tutored from an early age (although I would agree that many that do well in contests took advanced math classes at an early age; however, others did not). Many top-scorers train on their own or in local communities. Now that there are websites like AoPS, it is easier to do well even without a local community, although I agree that being in a better socioeconomic situation is likely to help.
I think we can safely stipulate that there is no universal route to contest success or Luke’s other example of 800 math SATs.
But, I can answer your question that, yes, I’m sure that at least some of the students are receiving supplemental tutoring. Not necessarily contest-focused, but still.
Anecdotally: the two friends I had from undergrad who were IMO medalists (about 10 years ago) had both gone through early math tutoring programs (and both had a parent who was a math professor). All of my undergrad friends who had 800 math SAT had either received tutoring or had their parents buy them study materials (most of them did not look back fondly on the experience).
Remember, for any of these tests, there’s a point where even a small amount of training to the test overwhelms a good deal of talent. Familiarity with problem types, patterns, etc can vastly improve performance.
I have no way to evaluate the scope of your restrictions on doing “super-well” or the particular that the tutoring start at an “early age” (although at least one of the anecdotal IMO cases did a Kumon-type program that started at pre-school).
Are there some people who don’t follow that route? Certainly. However, I do think that it’s important to be aware of other factors that may be present.
This sounds terribly arrogant until you realize that the requirement of “Deeply committed to AI risk reduction” is on the list. Should probably be emphasized around this statement in the post.
Updated.
Or it may be exactly zero, if the person with the relevant abilities can’t be “deeply committed to AI risk reduction” now: because the risk is sufficiently low, or there’s no means of reducing it, or what you consider “AI risk reduction” doesn’t actually reduce the risk to mankind, or the like (and those abilities let such a person see this straight away). I would expect such a good team, if and when the effort is at all useful, to be started by some genius who already had a lot of inventions by age 20, maybe 25 at most.
(Should distinguish raw intelligence, contest training and research math training. Raw intelligence is crucial for good performance in both contests and math research, but getting good at math takes many years of training that IMO winners won’t automatically have.)
Strongly agree. I would also make explicit what is implied above, namely that IMO (etc.) winners will in fact tend to have years of training of a different sort: solving (artificially-devised) contest problems, which may not be as relevant of a skill for SI’s purposes.
It seems to me that what SI really wants/needs is a mathematically-sophisticated version of Yudkowsky. Unfortunately, I’m not sure where one goes to find such people. IMO may not be a bad place to start, but one is probably going to have to look elsewhere as well.
To me this seems naive. Having someone who actually worked at SI on FAI go to Google might be a good thing. It creates a connection between Google and SI. If he sees major issues inside Google that invalidate your work on FAI, he might be able to alert you. If Google does something dangerous according to the SI consensus, then he’s around to tell them about the danger.
Being open is a good thing.
This.
At this point most of my belief in SI’s chance of success lies in its ability to influence more likely AGI developers or teams towards friendliness.
And, if they’re relying on perfect secrecy/commitment over a group of even a half-dozen researchers as the key to their safety strategy, then by their own standards they should not be trying to build an FAI.
Being specific is a better thing.
I’m going to open my clueless mouth again: Many of the problems associated with FAI haven’t been defined to that well yet. Maybe solving them will require new math, but it seems possible that existing math already provides the necessary tools. Perhaps it would be a good idea to have a generalist who has limited familiarity with a large variety of mathematical tools and can direct the team towards existing tools that might solve their problem. See the section called “The Right Way To Learn Math” in this post for more:
http://steve-yegge.blogspot.com/2006/03/math-for-programmers.html
And a metalevel comment: presumably folks at SI are discussing these issues independently of the discussion on Less Wrong; they don’t seem to be posting here much. I’m curious why this is considered optimal. It seems to me that posting your arguments on the Internet is a good way to get falsifying evidence for them. If the box does not contain a diamond, I wish to believe the box does not contain a diamond, and whatnot.
I’ve been complaining about this too. But it does seem that SI is more open than before (e.g., lukeprog’s recent series of posts on future SI plans), which we ought to give them credit for.
Strongly agreed.
Not when you’re the bastard who makes a living selling those boxes. Then to know the boxes are empty would be to know you are scamming people, so you wouldn’t want to know.
There seems to be far more commitment to a particular approach than is justified by the evidence (at least what they’ve publicly revealed).
I question this assumption. I think that building an FAI team may damage your overall goal of AI risk reduction for several reasons:
By setting yourself up as a competitor to other AGI research efforts, you strongly decrease the chance that they will listen to you. It will be far easier for them to write off your calls for consideration of friendliness issues as self-serving.
You risk undermining your credibility on risk reduction by tarring yourselves as crackpots. In particular, looking for good mathematicians to work out your theories comes off as “we already know the truth, now we just need people to prove it.”
You’re a small organization. Splitting your focus is not a recipe for greater effectiveness.
On the other hand, SI might get taken more seriously if it is able to demonstrate that it actually does know something about AGI design and isn’t just a bunch of outsiders to the field doing idle philosophizing.
Of course, this requires that SI is ready to publish part of its AGI research.
I agree but, as I’ve understood it, they’re explicitly saying they won’t release any AGI advances they make. What will it do to their credibility to be funding a “secret” AI project?
I honestly worry that this could kill funding for the organization which doesn’t seem optimal in any scenario.
Potential Donor: I’ve been impressed with your work on AI risk. Now, I hear you’re also trying to build an AI yourselves. Who do you have working on your team?
SI: Well, we decided to train high schoolers since we couldn’t find any researchers we could trust.
PD: Hm, so what about the project lead?
SI: Well, he’s done brilliant work on rationality training and wrote a really fantastic Harry Potter fanfic that helped us recruit the high schoolers.
PD: Huh. So, how has the work gone so far?
SI: That’s the best part, we’re keeping it all secret so that our advances don’t fall into the wrong hands. You wouldn’t want that, would you?
PD: [backing away slowly] No, of course not… Well, I need to do a little more reading about your organization, but this sounds, um, good...
Indeed.
“Wish You Were Here”—R. Waters, D. Gilmour
That also requires that SI really isn’t just a bunch of outsiders to the field doing idle philosophizing about infinitely powerful, fully general-purpose minds that would be so general-purpose they’d be naturally psychopathic (seeing psychopathy as a type of intelligent behaviour that a fully general intelligence would engage in).
If SI is just that, then its best course of action is to claim that it does, or would have to do, research so awesome that publishing it would risk mankind’s survival, and so, to protect mankind, it only does philosophizing.
The idea of getting FAI contributors who are unlikely to ever switch jobs seems like it might be the most stringent hiring requirement. It might be worthwhile to look into people who gain government clearances and then move to a nongovernment job, to see if they abuse the top-secret information they had access to.
There are two issues here; you’ve only described one. The first is someone moving to a team that builds a non-Friendly AGI. The second is simply someone moving away—no matter what they go on to do, SI has lost the benefit of their contribution. Someone who is really “deeply committed to AI risk reduction”, would not leave the FAI effort for “mere money” offered by Google. Or so the OP suggests.
This in particular seems to be a good subgoal, and I would be interested in the details of what a more directly useful Singularity Summit looks like, and how you get there. (I attended in 2010, and found it to be fun, somewhat educational, but unfocused. (And somewhat useful in that it attracted attention to SIAI.))
If FAI is or can be made tractable, it will be a technological system: some combination of hardware and software, an actual practical invention. The parenthetical comment in your second point indicates you seem to acknowledge that FAI work mainly consists of safe AI architecture work.
If you look back on the mountain of historical evidence concerning invention, there is a rough general pattern or archetype for inventors. They may have more names or synonyms today: entrepreneur, hacker, programmer, engineer, etc, but the historical pattern remains.
The inventor mentality is characterized by relentless curiosity, creativity, dedication, knowledge, and intelligence. Formal education appears to be of little importance, and may in fact slightly negatively correlate with invention capability. Bill Gates dropped out of college, but Orville Wright dropped out of high school, and this trend appears across the technological landscape. Early mathematical ability may correlate with later inventorhood, but studying formal mathematics (being a mathematician) has a strong negative correlation with later invention. One immediate explanation is that any time spent studying formal mathematics is a waste of extremely precious higher cortical capacity which rivals are entirely devoting to pure technological study.
The Wright brothers didn’t need much or any mathematics to innovate in flight. They needed to understand flight, understand the landscape of prior art, and then quickly iterate and innovate within that space. Their most powerful tool was not mathematics or ‘rationality’, but rather the wind tunnel.
You don’t need accomplished mathematicians, if anything they would actually reduce your chances of success.
You need to attract the specific people who are going to or would develop AGI before you. They will almost certainly not be mathematicians (think Wright, Farnsworth, Edison, Tesla, Bill Gates, and not Terence Tao). The historical evidence says they are unlikely to have any existing record of high-status work.
How could you identify a future AGI-inventor? That should be the question, and the answer is clearly not “recruit math contest winners”.
Aren’t Turing and von Neumann (surely they invented “computers” as much as anyone) counterexamples to your thesis?
No, not if you actually read into the history.
Turing published some conceptual math papers that would eventually get the field of computability and thus computer science started, but by no means did he invent the computer.
Computer evolution was already well under way when Turing published his paper on computability introducing Turing Machines in 1936.
The early British programmable digital computer, the Colossus, was developed by colleagues/contemporaries of Turing, but Turing was not involved, and at the time his abstract Turing Machine concept was not viewed as important:
Colossus was designed by the engineer Tommy Flowers.
The first Turing-complete computer was the Z3, developed in Germany by the engineer Konrad Zuse. Zuse is unlikely to have even heard of Turing, and the Z3 wasn’t proven Turing-complete until many decades later.
Concerning von Neumann’s architecture:
Eckert was an electrical engineer, Mauchly a physicist.
Turing and von Neumann both made lasting contributions in the world of ideas, but they did not invent computers, not even close.
Imagine the world without something as basic as public-private key cryptography, which is pure math (or used to be pure before computer engineers hijacked it). Suppose there is at least one essential technology on the way to constructing AGI that requires advanced math skills, and your team is ill-equipped to recognize it. Result: you lose.
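To make the point concrete: the math behind public-key cryptography really is just number theory. Here is a toy RSA sketch in Python (the primes and exponents are deliberately tiny and insecure; this is an illustration of the underlying modular arithmetic, not a real implementation):

```python
# Toy RSA, illustrating that public-key crypto is "pure math":
# Euler's theorem plus modular arithmetic. Parameters are far too
# small for real security; illustration only.

p, q = 61, 53             # secret primes
n = p * q                 # public modulus (3233)
phi = (p - 1) * (q - 1)   # Euler's totient of n (3120)
e = 17                    # public exponent, coprime to phi
d = pow(e, -1, phi)       # private exponent: modular inverse of e mod phi

def encrypt(m):
    # Anyone can do this knowing only the public pair (n, e).
    return pow(m, e, n)

def decrypt(c):
    # Only the holder of the private exponent d can do this.
    return pow(c, d, n)

m = 42
assert decrypt(encrypt(m)) == m
```

The security rests entirely on a number-theoretic fact (factoring n is hard at realistic sizes); nothing about it came from wind-tunnel-style iteration.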
Hardly.
Mathematicians publish their work, it is freely available. It doesn’t need to be purchased and privately developed.
Engineering builds on conceptual advances and mathematical tools, but typically said tools are developed long before the engineering work begins.
But if you don’t have mathematicians on your team, you might never realize the importance of the work that the other mathematicians publish, presuming that you even hear about it.
In the world I live in, results in one field that are actually important in other fields have a funny way of becoming known.
In the world I live in, inventors use and read about math, without the services of some personal conduit to the higher math gods.
Who determines what’s important? The actual inventors, period.
The historical example shows that inventors don’t have this problem. Perhaps you believe otherwise, that invention has proceeded sub-optimally to date and would have been faster if only mathematicians and their ideas had more status. I don’t see evidence for this.
Actually I see evidence that our society tends to overrate the historical contributions of mathematicians to technical inventions.
Also, like I said in the other thread, it depends what one means by math.
LW-folk in particular (and perhaps lukeprog in extra particular), appear to have an especially strange mathematician fetish.
This is often true in the regular circumstances, but SI is clearly in a rush to avert the x-risk from UFAI, and the relevant math is apparently not yet available, so they have to develop it as they go along. I would compare it to theoretical physics, where available math is often a limiting factor in constructing better models.
This is actually a really interesting and potentially apt comparison. FAI may end up being something like String theory: a region in math space that has zero practical applications. (but given the published work in FAI to date, String Theorists may take offense at such a comparison)
Earlier I said:
SI’s conception of ‘FAI’ as math (whatever that means) is competing with the growing number of pragmatic mainstream approaches, most of which are loosely brain inspired. Humans have internal mechanisms for empathy and altruism which could be reverse engineered and magnified in machines.
But it all depends on what one means by “math”. If you count algorithms as new math, then the vast numbers of computer scientists and programmers, and most of the folks working on AGI designs, are thus mathematicians. If by “math”, you mean the stuff that academic mathematicians typically work on, then one is hard pressed to find any connection to AGI (friendly or not).
The other part of the pattern is that the competent, inventive ones are the ones doing the recruiting, not the other way around, especially as the vast majority of inventions are not someone’s first invention, and inventions tend to make money.
Hopefully you have also considered extracting specific limited-scope math problems and farming them out with grants, like you do for papers. This would increase the pool of available talent and not require training them in AI or rationality.
Shouldn’t the very first goal be to fully define an ethical theory of friendliness before even starting on a goal system to implement the theory? I have some doubts that an acceptable theory can be formalized. Our ethical systems are so human-centric that formalizing them in rational terms will likely lead either to a very weak human-centric theory with potential loopholes, or to a general theory of ethics that places no particular importance on the concerns of humans, even large groups of them.

For instance, I find it much more likely that a general theory of ethics would have little ability to differentiate between the desires of rocks and dirt, bacteria, and humans. What measurable properties of humans make our actions worthy of more attention than the actions of bacteria, or of a solar system or galaxy? Is it the ability for self-referential thought, consciousness, or conscience?

Even assuming a suitable definition can be found for the class of beings whose utilities are valued, the weighting of utility for each being will almost certainly not be in our favor. There are potentially infinitely many future beings whose values and existence depend on actions in the present, and there are potentially existing extraterrestrial beings whose values must be considered in case we ever affect them.
Maybe I have missed important discussions where the details of friendliness theory have already been hashed out. If so, I apologize and would appreciate links to them.
I think the other strategy you mentioned, about promoting FAI research by paying known academics to write papers on the topic, is a better idea. It is more plausible, more direct, less cultish, etc.
Ah, so this explains why there is no source code visible. To be honest, I was a little worried that all the effort was purely going to (very useful) essays and not actually building things, but it is clear that it is kept under wraps. That is such a shame, but I suppose necessary.
One caution worth noting here is that “trustworthiness” and “altruism” may not be traits that are stable across different situations. As I noted in this post, there’s good reason to think human behavior evolved to follow conditional rules, so observed trustworthiness and altruism under some conditions may be very poor evidence of Friendliness for superintelligence-coding purposes.
Professional cultivation of big donors
I’ve seen a couple sources argue that intelligence enhancement will ideally come before AGI. This could deal with the math ability constraint, which seems to be your strongest. Maybe you feel that sponsoring an intelligence enhancement effort would be beyond SI’s organizational scope?
If IA (intelligence augmentation) comes before AGI, we will need FIA—the IA equivalent of Friendliness theory. When a human self-modifies using IA, how do we ensure value stability?
To create FIA, we may need a full understanding of human intelligence—which, apart from gathering data we don’t yet have, may prove to be a hard problem. Because IA involves modifying existing human brains, it might be developed before anyone fully understands human intelligence. In addition, there is the problem of causing everyone who uses IA to use the FIA theory.
FIA is analogous in these ways to FAI. If you think IA is likely to exist before AGI, then uFAI and uFIA may be comparably dangerous (for instance, successful IA may jump-start AGI development by the intelligence-augmented humans).
Are there organizations, forums, etc. dedicated to building FIA the way SIAI etc. are dedicated to building FAI?
ETA: the standard usage may be “Intelligence Amplification”, still abbreviated as IA. The meaning is the same.
Some important aspects of the future AI ‘friendliness’ would probably link up with the greater economy surrounding us; and more importantly, it would depend upon the nature of AI interaction with people as well as their behaviour. So, besides the obvious component of mathematics, I feel that some members of the FAI team should also have some background in subjects such as psychology; and also a generic perspective on global issues such as resource management.
Some members of an FAI team should have a background in human psychology, as this is highly relevant to figuring out the Friendly utility function. However, global resource management seems like the sort of problem that could be left to the FAI to figure out.
My current opinion is that it’s completely irrelevant. The typical tools developed around the study of human psychology are vastly less accurate than necessary to do the job. Background in mathematics, physics or machine learning seems potentially much more relevant, specifically for the problem of figuring out human goals and not just for other AI-related problems.
No matter how smart you are, looking at the data is essential. Cognitive scientists have spent a long time looking at the data of how humans think / behave, and can probably appreciate subtleties that would be missed by even the most clever mathematicians (unless those mathematicians looked at the same set of data).
I believe Vladimir is thinking in terms of a general theory which could, say, take an arbitrary computational state-machine, interpret it as a decision-theoretic agent, and deduce the “state-machine it would want to be”, according to its “values”, where the phrases in quotes represent imprecise or even misleading designations for rigorous concepts yet to be identified. This would be a form of the long-sought “reflective decision theory” that gets talked about.
From this perspective, the coherent extrapolation of human volition is a matter of reconstructing the human state machine through first-principles physical and computational analysis of the human brain, identifying what type of agent it is, and reflectively idealizing it according to its type and its traits. (An example of type-and-traits analysis: 1) identifying an agent as an expected-utility maximizer (its “type”), and 2) identifying its specific utility function (a “trait”). But the cognitive architecture underlying human decision-making is expected to be a lot more complicated to specify.)
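To make the type-and-traits idea a bit more concrete, here is a minimal sketch (all outcomes, lotteries, and observed choices are invented for illustration): given an agent’s binary choices between lotteries, brute-force search for a utility function under which those choices maximize expected utility. The real problem would of course involve vastly richer agent types than “EU maximizer over three outcomes.”

```python
# Toy "type and traits" analysis: given an agent's observed choices
# between lotteries, search for a utility function (the "trait")
# under which the agent is an expected-utility maximizer (the "type").
# All names and numbers below are illustrative assumptions.
from itertools import product

outcomes = ["A", "B", "C"]

# Lotteries: mapping from outcome to probability.
l1 = {"A": 1.0}
l2 = {"B": 0.5, "C": 0.5}
l3 = {"B": 1.0}

# Observed behavior: (chosen, rejected) pairs.
observed = [(l1, l2), (l3, l1)]

def eu(lottery, u):
    # Expected utility of a lottery under utility assignment u.
    return sum(p * u[o] for o, p in lottery.items())

def rationalizes(u):
    # Does u make every observed choice an EU-maximizing one?
    return all(eu(chosen, u) > eu(rejected, u)
               for chosen, rejected in observed)

# Brute-force over a coarse grid of candidate utility values.
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
candidates = [dict(zip(outcomes, vals))
              for vals in product(grid, repeat=len(outcomes))
              if rationalizes(dict(zip(outcomes, vals)))]
```

Any surviving candidate is a utility function consistent with the observed choices; when no candidate survives, the agent is not (on this data and grid) an expected-utility maximizer, i.e., it has a different “type.”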
So the paradigm really is one in which one hopes to skip over all the piecemeal ideas and empirical analysis that cognitive scientists have produced, by coming up with an analytical and extrapolative method of perfect rigor and great generality. In my opinion, people trying to develop this perfect a-priori method can still derive inspiration and knowledge from science that has already been done. But the idea is not “we can neglect existing science because our team will be smarter”, the idea is that a universal method—in the spirit of Solomonoff induction, but tractable—can be identified, which will then allow the problem to be solved with a minimum of prior knowledge.
From an outside view, such a plan seems unlikely to succeed. Science moves forward by data, engineering moves forward by trying things out. This is just intuition though, I would guess there is a reasonable amount of empirical evidence to be gained by looking at theoretical work and seeing how often it runs awry of unexpected facts about the world (I’m embarrassingly unsure of what the answer would be here; added to my list of things to try to figure out).
I agree that the “typical tools developed around the study of human psychology are vastly less accurate than necessary to do the job”, but it still seems like figuring out what humans value is a problem of human psychology. I don’t see how theoretical physics has anything to do with it.
Whether it’s a “problem of human psychology” is a question of assigning an area-of-study label to the problem. The area-of-study characteristic doesn’t seem to particularly help with finding methods appropriate for solving the problem in this case. So I propose to focus on the other characteristics of the problem, namely the necessary rigor in an acceptable solution and the potential difficulty of the concepts necessary to formulate the solution (in the study of a real-world phenomenon). These characteristics match mathematics and physics best (probably more mathematics than physics).
I would expect all FAI team members to have strong math skills in addition to whatever other background they may have, and I expect them to approach the psychological aspects of the problem with greater rigor than is typical of mainstream psychology, and that their math backgrounds will contribute to this. But I think that mainstream psychology would be of some use to them, even if just to provide some concepts to be explored more rigorously.
As I see it, there might be considerable difficulty of concepts in formulating even the exact problem statement. For instance, given that we want a ‘friendly’ AI; our problem statement very much depends on our notion of friendliness; hence the necessity of including psychology.
Going further, considering that SI aims to minimize AI risk, we need to be clear on which AI behavior is said to constitute a “risk”. If I remember correctly, the AI in the movie “I, Robot” inevitably concludes that killing the human race is the only way to save the planet. The definition of risk in such a scenario is a very delicate problem.
Somehow absent from the objectives is “finding out whether SI’s existence is warranted at all.” You also want people “deeply committed to AI risk reduction”, not people deeply committed to, e.g., actually finding the truth (which could be that all the properties of AI you take as true were a wrong wild guess, which should be considered highly likely given the shot-in-the-dark nature of those guesses). This puts the nail in your coffin completely. A former theologian building a religious organization. Jesus Christ, man.
Also, something else: in game development there’s a recurring pattern of newbies with MMORPG ideas who just need to find super tech guys to implement them. Thing is, the tech guys do not need those newbies at all; the newbies and their ideas are a net negative (because the ideas are misinformed), and they are exactly the category of people to be kept out of development entirely. The worst thing you can do is channel money through such newbies. I don’t see why you guys (SI, MMORPG-development newbies, all sorts of startup founders who only have ideas) think that you, rather than others, having the money to hire people to develop your ideas is not a net negative. You think you can have incredibly useful ideas without the power to develop anything, while in fact the power to have useful ideas stems directly from the power to build those ideas bottom-up. I think the best recommendation for those with money would be to hold on and not give it to SI, so that SI would not put incompetent ideas first and waste people’s time. In all likelihood AI risk reduction doesn’t even need your money: that kind of work won’t be someone’s first innovative work, and innovation pays off pretty well. Whether you want to decrease AI risk, or want an awesome MMORPG made, or anything else, please do not funnel your money through this kind of idea guy. It really is a net negative.
If I understand this article correctly, it is exactly about SIAI trying to attract the people with powers to develop and analyze FAI.
That seems like a negative self-fulfilling prophecy: don’t give money to SIAI; SIAI can’t pay developers; SIAI doesn’t develop anything; say, “I told you.”
SI had money enough to do far more than it did. Giving money to SI is the same as giving money to an organization that would buy coal just to burn it: SI would put the money into wasting R&D effort on the ideas of the incompetent.