CFAR’s new focus, and AI Safety
A bit about our last few months:
We’ve been working on getting a simple clear mission and an organization that actually works. We think of our goal as analogous to the transition that the old Singularity Institute underwent under Lukeprog (during which chaos was replaced by a simple, intelligible structure that made it easier to turn effort into forward motion).
As part of that, we’ll need to find a way to be intelligible.
This is the first of several blog posts aimed at causing our new form to be visible from outside. (If you’re in the Bay Area, you can also come meet us at tonight’s open house.) (We’ll be talking more about the causes of this mission-change; the extent to which it is in fact a change, etc. in an upcoming post.)
-
We care a lot about AI Safety efforts in particular, and about otherwise increasing the odds that humanity reaches the stars.
-
Also, we[1] believe such efforts are bottlenecked more by our collective epistemology than by the number of people who verbally endorse or act on “AI Safety”, or any other “spreadable viewpoint” disconnected from its derivation.
-
Our aim is therefore to find ways of improving both individual thinking skill, and the modes of thinking and social fabric that allow people to think together. And to do this among the relatively small sets of people tackling existential risk.
Existential wins and AI safety
Who we’re focusing on, why
AI and machine learning graduate students, researchers, project-managers, etc. who care; who can think; and who are interested in thinking better;
Students and others affiliated with the “Effective Altruism” movement, who are looking to direct their careers in ways that can do the most good;
Rationality geeks, who are interested in seriously working to understand how the heck thinking works when it works, and how to make it work even in domains as confusing as AI safety.
Brier-boosting, not Signal-boosting
CFAR’s mission statement (link post, linking to our website).
This is just a guess, but I think CFAR and the CFAR-sphere would be more effective if they focused more on hypothesis generation (or “imagination”, although that term is very broad). E.g., a year or so ago, a friend of mine in the Thiel-sphere proposed starting a new country by hauling nuclear power plants to Antarctica, and then just putting heaters on the ground to melt all the ice. As it happens, I think this is a stupid idea (hot air rises, so the newly heated air would just blow away, pulling in more cold air from the surroundings). But it is an idea, and the same person came up with (and implemented) a profitable business plan six months or so later. I can imagine HPJEV coming up with that idea, or Elon Musk, or von Neumann, or Google X; I don’t think most people in the CFAR-sphere would. It’s just not the kind of thing I think they’ve focused on practicing.
There’s a difference between optimizing for truth and optimizing for interestingness. Interestingness is valuable for truth in the long run because the more hypotheses you have, the better your odds of stumbling on the correct hypothesis. But naively optimizing for truth can decrease creativity, which is critical for interestingness.
I suspect “having ideas” is a skill you can develop, kind of like making clay pots. In the same way your first clay pots will be lousy, your first ideas will be lousy, but they will get better with practice.
Source.
If this is correct, this also gives us clues about how to solve Less Wrong’s content problem.
Online communities do not have a strong comparative advantage in compiling and presenting facts that are well understood. That’s the sort of thing academics and journalists are already paid to do. If online communities have a comparative advantage, it’s in exploring ideas that are neglected by the mainstream—things like AI risk, or CFARish techniques for being more effective.
Unfortunately, LW’s culture has historically been pretty antithetical to creativity. It’s hard to tell in advance whether an idea you have is a good one or not. And LW has often been hard on posts it considers bad. This made the already-scary process of sharing new ideas even more fraught with the possibility of embarrassment.
Same source.
I recommend recording ideas in a private notebook. I’ve been doing this for a few years, and I now have way more ideas than I know what to do with.
Relevant: http://waitbutwhy.com/2015/11/the-cook-and-the-chef-musks-secret-sauce.html
Oh yes. For example, Physical Review Letters is mostly interested in the former, while HuffPo—in the latter.
That’s not true because you must also evaluate all these hypotheses and that’s costly. For a trivial example, given a question X, would you find it easier to identify a correct hypothesis if I presented you with five candidates or with five million candidates?
Yes, subject to native ability. I suspect it’s more like music than like clay pots: some people find it effortless, most can improve with training, and some won’t do well regardless of how much time they spend practicing.
Kinda. On the one hand, pop-sci continues to be popular. On the other hand, journalists are very very bad at it.
I would like to suggest attaching less self-worth and less status to ideas you throw out. Accept that it’s fine that most of them will be shot down.
I don’t like the kindergarten alternative: Oh, little Johnny said something stupid, like he usually does! He is such a creative child! Here is a gold star!
I concur. Note that LW is not that private notebook.
OK, so I told you the other day that I find you a difficult person to have discussions with. I think I might find your comments less frustrating if you made an effort to think of things I would say in response to your points, and then wrote in anticipation of those things. If you’re interested in trying this, I converted all my responses using rot13 so you can try to guess what they will be before reading them.
UhssCb vf gelvat gb znkvzvmr nq erirahr ol jevgvat negvpyrf gung nccrny gb gur fbeg bs crbcyr jub pyvpx ba nqf. Gur rkvfgrapr bs pyvpxonvg gryyf hf onfvpnyyl abguvat nobhg ubj hfrshy vg jbhyq or sbe lbhe nirentr Yrff Jebatre gb fcraq zber gvzr trarengvat ulcbgurfrf. Vg’f na nethzrag ol nanybtl, naq gur nanybtl vf dhvgr ybbfr.
V jbhyq thrff Culfvpny Erivrj Yrggref cevbevgvmrf cncref gung unir vagrerfgvat naq abiry erfhygf bire cncref gung grfg naq pbasvez rkvfgvat gurbevrf va jnlf gung nera’g vagrerfgvat. Shegurezber, V fhfcrpg gung gur orfg culfvpvfgf gel gb qb erfrnepu gung’f vagrerfgvat, naq crre erivrj npgf nf n zber gehgu-sbphfrq svygre nsgrejneqf.
Gur nafjre gb lbhe dhrfgvba vf gung V jbhyq cersre svir zvyyvba pnaqvqngrf. Vs svir ulcbgurfrf jrer nyy V unq gvzr gb rinyhngr, V pbhyq fvzcyl qvfpneq rirelguvat nsgre gur svefg svir.
Ohg ulcbgurfvf rinyhngvba unccraf va fgntrf. Gur vavgvny fgntr vf n onfvp cynhfvovyvgl purpx juvpu pna unccra va whfg n srj frpbaqf. Vs n ulcbgurfvf znxrf vg cnfg gung fgntr, lbh pna vairfg zber rssbeg va grfgvat vg. Jvgu n ynetre ahzore bs ulcbgurfrf, V pna or zber fryrpgvir nobhg juvpu barf tb gb gur evtbebhf grfgvat fgntr, naq erfgevpg vg gb ulcbgurfrf gung ner rvgure uvtuyl cynhfvoyr naq/be ulcbgurfvf gung jbhyq pnhfr zr gb hcqngr n ybg vs gurl jrer gehr.
Gurer frrzf gb or cerggl jvqrfcernq nterrzrag gung YJ ynpxf pbagrag. Jr qba’g frrz gb unir gur ceboyrz bs gbb znal vagrerfgvat ulcbgurfrf.
V pvgrq fbzrbar V pbafvqre na rkcreg ba gur gbcvp bs perngvivgl, Vfnnp Nfvzbi, ba gur fbeg bs raivebazrag gung ur guvaxf jbexf orfg sbe vg. Ner gurer ernfbaf jr fubhyq pbafvqre lbh zber xabjyrqtrnoyr guna Nfvzbi ba guvf gbcvp? (Qvq lbh gnxr gur gvzr gb ernq Nfvzbi’f rffnl?)
Urer’f nabgure rkcreg ba gur gbcvp bs perngvivgl: uggcf://ivzrb.pbz/89936101
V frr n ybg bs nterrzrag jvgu Nfvzbi urer. Lbhe xvaqretnegra nanybtl zvtug or zber ncg guna lbh ernyvmr—V guvax zbfg crbcyr ner ng gurve zbfg perngvir jura gurl ner srryvat cynlshy.
uggc://jjj.birepbzvatovnf.pbz/2016/11/zlcynl.ugzy
Lbh unir rvtugrra gubhfnaq xnezn ba Yrff Jebat. Naq lrg lbh unira’g fhozvggrq nalguvat ng nyy gb Qvfphffvba be Znva. Lbh’er abg gur bayl bar—gur infg znwbevgl bs Yrff Jebat hfref nibvq znxvat gbc-yriry fhozvffvbaf. Jul vf gung? Gurer vf jvqrfcernq nterrzrag gung YJ fhssref sebz n qrsvpvg bs pbagrag. V fhttrfg perngvat n srj gbc-yriry cbfgf lbhefrys orsber gnxvat lbhe bja bcvavba ba gurfr gbcvpf frevbhfyl.
Yes. This is unfortunate, but I cannot help you here.
I think it’s a bad idea. I can’t anticipate your responses well enough (in other words, I don’t have a good model of you); for example, I did not expect you to take five million candidate hypotheses. And if I want to have a conversation with myself, why, there is no reason to involve you in the process.
We haven’t gotten to an average Lesswronger generating hypotheses yet. You’ve introduced a new term, “interestingness”, and set it in opposition to truth (or should it have been truthiness?). As far as I can see, clickbait is just a subtype of “interestingness”; if you want to optimize for “interestingness”, you would tend to end up with clickbait of some sort. And I’m not quite sure what it has to do with the propensity to generate hypotheses.
If a correct hypothesis was guaranteed to be included in your set, you would discard the true one in 99.9999% of the cases, then.
Let’s try it. “Earth rotates around the Sun”—ha-ha, what do I look like, an idiot? Implausible. Next!
Where “it” is “writing fiction”?
LOL. Kids are naturally playful—they don’t need a kindergarten for it. In fact, kindergartens tend to use their best efforts to shut down kids’ creativity and make them “less disruptive”, “respectful”, “calm”, and all the things required of a docile shee… err… member of society.
I neither see much reason to do so, nor do I take my own opinion seriously, anyway :-P
Do you want playfulness or seriousness? Pick a side.
Is this due to lack of ability or lack of desire? If lack of ability, why do you think you lack this ability?
I lack the ability to change you and I lack the desire to change myself.
But academics write for other academics, and journalists don’t and can’t. (They’ve tried. They can’t. Remember Vox?)
AFAIK, there isn’t a good outlet for compilations of facts intended for and easily accessible by a general audience, reviews of books that weren’t just written, etc. Since LW isn’t run for profit and is run as outreach for, among other things, CFAR, whose target demographic would be interested in such an outlet, this could be a valuable direction for either LW or a spinoff site. But given the reputational risk (both personal and institutional) inherent in the process of generating new ideas, we may be better served by pivoting LW toward the niche I’m thinking of (a cross between a review journal, SSC, and, I don’t know, maybe CIA (think World Factbook) or RAND), and moving the generation and refinement of ideas into a separate container, maybe an anonymous blog or forum.
Would that be Vox, Vox, or Vox?
Edit, 5 minutes later: a bit more seriously, I’m not sure I’d agree that “academics write for other academics” holds as a strong generalization. Many academics focus on writing for academics, but many don’t. I think the (relatively) low level of information flow from academia to general audiences is at least as much a demand-side phenomenon as a supply-side one.
Given “publish or perish”, usually the latter won’t stay in academia for long.
I’d be reluctant to go as far as “usually”, but yes, publish-or-perish norms are playing a role here too.
Academics write textbooks, popular books, and articles that are intended for a lay audience.
Nevertheless, I think it’s great if LW users want to compile & present facts that are well understood. I just don’t think we have a strong comparative advantage.
LW already has a reputation for exploring non-mainstream ideas. That attracts some and repels others. If we tried to sanitize ourselves, we probably would not get back the people who have been repulsed, and we might lose the interest of some of the people we’ve attracted.
Definitely agree with the importance of hypothesis generation and the general lack of it–at least for me, I would classify this as my main business-related weakness, relative to successful people I know.
Interesting idea; shall consider.
headline: CFAR considering colonizing Antarctica.
History repeats itself. Seafarers have always been fond of colonizing distant lands.
...first time as a tragedy and second time as a farce.
Colonizing Antarctica and making a whole slew of new countries is actually a good idea IMO, but it doesn’t have enough appeal. The value to humanity of creating new countries that can innovate on institutions is large.
You can think of Mars colonization as a more difficult version of Antarctic colonization which is actually going to be attempted because it sounds cooler.
“which is actually going to be attempted”
I’m not convinced yet. Yes that is why people are talking about it instead of talking about attempting to colonize Antarctica, or the bottom of the ocean, or whatever. But they currently aren’t attempting to colonize any of those places, including Mars, and we have yet to see them attempt any of them, including Mars.
Well, I suppose it depends where you draw the line.
SpaceX has built real physical components for its Interplanetary Transport System, which is specifically designed for missions to Mars. That’s more than just talk.
But I suppose there was seasteading… though that actually did fully close down.
For the sake of counterfactual historical accuracy: if anyone would have come up with it, it would be Leo Szilard.
I can imagine thinking of such an idea. If you start with the assumption that colonizing Mars is really hard, the natural next step is to think about what we could colonize on Earth.
There’s much empty land in Australia that could be colonized more easily than Antarctica.
This reminds me of a silly plan I put together in high school; I put it here just for the amusement value (because it’s in the same league of absurdity as the plan outlined above): collect from eBay enough radioactive balls used for Geiger counter testing to make a critical mass, and with that seize control of San Marino (a terribly small but wealthy independent state inside Italy).
I think Paul Christiano is an example of someone in the CFAR-sphere who is good at doing this. Might be a useful example to learn from.
A few nitpicks on choice of “Brier-boosting” as a description of CFAR’s approach:
Predictive power is maximized when Brier score is minimized
Brier score is the sum of squared differences between the probabilities assigned to events and indicator variables that are 1 or 0 according to whether the event did or did not occur. Good calibration therefore corresponds to minimizing Brier score rather than maximizing it, and “Brier-boosting” suggests maximization.
What’s referred to as “quadratic score” is essentially the same as the negative of Brier score, and so maximizing quadratic score corresponds to maximizing predictive power.
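As a minimal sketch of the definition above (function name and example forecasts are illustrative, not from the post), the Brier score of a set of probabilistic predictions can be computed as:

```python
def brier_score(probs, outcomes):
    """Sum of squared differences between forecast probabilities and
    the 0/1 indicators of what actually happened; lower is better."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes))

# A sharper, well-calibrated forecast scores lower than a hedged one:
confident = brier_score([0.9, 0.1], [1, 0])  # approx. 0.02
hedged = brier_score([0.5, 0.5], [1, 0])     # 0.5
```

Here a perfect forecast (probability 1 on every event that occurs) scores exactly 0, which is why minimizing, not boosting, is the goal.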
Brier score fails to capture our intuitions about assignment of small probabilities
A more substantive point is that even though the Brier score is minimized by being well-calibrated, the way in which it varies with the probability assigned to an event does not correspond to our intuitions about how good a probabilistic prediction is. For example, suppose four observers A, B, C and D assigned probabilities 0.5, 0.4, 0.01 and 0.000001 (respectively) to an event E occurring, and the event turns out to occur. Intuitively, B’s prediction is only slightly worse than A’s prediction, whereas D’s prediction is much worse than C’s prediction. But the difference between the increase in B’s Brier score and the increase in A’s Brier score is 0.36 − 0.25 = 0.11, which is much larger than the corresponding difference for D and C, which is approximately 0.02.
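The four observers’ score increments are easy to check directly (a quick sketch; the dict is just for illustration):

```python
outcome = 1  # event E occurred
# Each observer's Brier-score increment is (assigned probability - outcome)^2.
increments = {name: (p - outcome) ** 2
              for name, p in [("A", 0.5), ("B", 0.4), ("C", 0.01), ("D", 0.000001)]}
# B - A = 0.36 - 0.25 = 0.11, while D - C is only about 0.02,
# even though D's prediction is intuitively far worse than C's.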
Brier score is not constant across mathematically equivalent formulations of the same prediction
Suppose that a basketball player is to make three free throws. Observer A predicts that the player makes each one with probability p; observer B accepts observer A’s estimate, notes that it implies the probability of the player making all three free throws is p^3, and makes that prediction instead.
Then if the player makes all three free throws, observer A’s Brier score increases by
3*(1 - p)^2
while observer B’s Brier score increases by
(1 - p^3)^2
But these two expressions are not equal in general, e.g. for p = 0.9 the first is 0.03 and the second is 0.073441. So changes to Brier score depend on the formulation of a prediction as opposed to the prediction itself.
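A quick numerical check of the two increments for p = 0.9 (variable names are illustrative):

```python
p = 0.9
increment_a = 3 * (1 - p) ** 2   # observer A: three separate free-throw predictions
increment_b = (1 - p ** 3) ** 2  # observer B: one combined "makes all three" prediction
# increment_a is approx. 0.03 and increment_b is approx. 0.073441:
# the same underlying prediction, penalized differently by Brier score.
```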
======
The logarithmic scoring rule handles small probabilities well, and is invariant under changing the representation of a prediction, and so is preferred. I first learned of this from Eliezer’s essay A Technical Explanation of a Technical Explanation.
Minimizing logarithmic score is equivalent to maximizing the likelihood function for logistic regression / binary classification. Unfortunately, the phrase “likelihood boosting” has one more syllable than “Brier boosting” and doesn’t have the same alliterative ring to it, so I don’t have an actionable alternative suggestion :P.
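A small sketch of why the logarithmic score avoids the free-throw inconsistency: the total penalty for three separate correct predictions equals the penalty for the equivalent combined prediction, because log turns products into sums.

```python
import math

p = 0.9
# Three separate predictions of probability p each, all correct:
log_score_separate = sum(-math.log(p) for _ in range(3))
# One combined prediction of probability p^3, also correct:
log_score_combined = -math.log(p ** 3)
# The two penalties coincide, since -log(p^3) = 3 * -log(p).

# Log score also punishes overconfident small probabilities heavily:
# -log(0.000001) is roughly 13.8, versus roughly 4.6 for -log(0.01),
# matching the intuition that D's prediction is much worse than C's.
```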
Good point!
(And thanks for explaining clearly and noting where you learned about logarithmic scoring.)
I would suggest that “helping people think more clearly so that they’ll find truth better, instead of telling them what to believe” already has a name, and it’s “the Socratic method.” It’s unfortunate that this has the connotation of “do everything in a Q&A format”, though.
“Brier scoring” is not a very natural scoring rule (log scoring is better; Jonah and Eliezer already covered the main reasons, and it’s what I used when designing the Credence Game for similar reasons). It also sets off a negative reaction in me when I see someone naming their world-changing strategy after it. It makes me think the people naming their strategy don’t have enough mathematician friends to advise them otherwise… which, as evidenced by these comments, is not the case for CFAR ;) Possible re-naming options that contrast well with “signal boosting”:
Score boosting
Signal filtering
Signal vetting
Got any that contrast with “raising awareness” or “outreach”?
“Accuracy-boosting” or “raising accuracy”?
Brainstormy words in that corner of concept-space:
Raising the sanity waterline
Downstream effects
Giving someone a footstool so that they can see for themselves, instead of you telling them what’s on the other side of the wall
Critical mass / hivemind
Compounding thinktank intelligence
Doing thinks better
[switches framing]
Signal boosting means sending more signal so that it arrives better on the other side. There are more ways of doing so, though:
Noise reduction
(The entire big field of) error correction methods
Specifying the signal’s constraints clearly so that the other side can run a fit to it
Stop sending the signal and instead build the generator on the other side
I don’t think the first problem is a big deal. No-one worries about “I boosted that from a Priority 3 to a Priority 1 bug”.
If CFAR will be discontinuing/de-emphasizing rationality workshops for the general educated public, then I’d like to see someone else take up that mantle, and I’d hope that CFAR would make it easy for such a startup to build on what they’ve learned so far.
We’ll be continuing the workshops, at least for now, with less direct focus, but with probably a similar amount of net development time going into them even if the emphasis is on more targeted programs. This is partly because we value the existence of an independent rationality community (varied folks doing varied things adds to the art and increases its integrity), and partly because we’re still dependent on the workshop revenue for part of our operating budget.
Re: others taking up the mantle: we are working to bootstrap an instructor training; have long been encouraging our mentors and alumni to run their own thingies; and are glad to help others do so. Also, Kaj Sotala seems to be developing some interesting training thingies designed to be shared.
Feedback from someone who really enjoyed your May workshop (and I gave this same feedback then, too): Part of the reason I was willing to go to CFAR was that it is separate (or at least pretends to be separate, even though they share personnel and office space) from MIRI. I am 100% behind rationality as a project but super skeptical of a lot of the AI stuff that MIRI does (although I still follow it because I do find it interesting, and a lot of smart people clearly believe strongly in it so I’m prepared to be convinced.) I doubt I’m the only one in this boat.
Also, I’m super uncomfortable being associated with AI safety stuff on a social level because it has a huge image problem. I’m barely comfortable being associated with “rationality” at all because of how closely associated it is (in my social group, at least) with AI safety’s image problem. (I don’t exaggerate when I say that my most-feared reaction to telling people I’m associated with “rationalists” is “oh, the basilisk people?”)
I had mixed feelings towards this post, and I’ve been trying to process them.
On the positive side:
I think AI safety is important, and that collective epistemology is important for this, so I’m happy to know that there will be some attention going to this.
There may be synergies to doing some of this alongside more traditional rationality work in the same org.
On the negative side:
I think there is an important role for pursuing rationality qua rationality, and that this will be harder to do consistently under an umbrella with AI safety as an explicit aim. For example one concern is that there will be even stronger pressure to accept community consensus that AI safety is important rather than getting people to think this through for themselves. Since I agree with you that the epistemology matters, this is concerning to me.
With a growing community, my first inclination would be that one could support both organisations, and that it would be better to have something new focus on epistemology-for-AI, while CFAR in a more traditional form continues to focus more directly on rationality (just as Open Phil split off from GiveWell rather than replacing the direction of GiveWell). I imagine you thought about this; hopefully you’ll address it in one of the subsequent posts.
There is potential reputational damage by having these things too far linked. (Though also potential reputational benefits. I put this in “mild negative” for now.)
On the confused side:
I thought the post did an interesting job of saying more reasonable things than the implicature. In particular I thought it was extremely interesting that it didn’t say that AI safety was a new focus. Then in the ETA you said “Even though our aim is explicitly AI Safety...”
I think framing matters a lot here. I’d feel much happier about a CFAR whose aim was developing and promoting individual and group rationality in general and particularly for important questions, one of whose projects was focusing on AI safety, than I do about a CFAR whose explicit focus is AI safety, even if the basket of activities they might pursue in the short term would look very similar. I wonder if you considered this?
Thanks for the thoughts; I appreciate it.
I agree with you that framing is important; I just deleted the old ETA. (For anyone interested, it used to read:
I’m curious where our two new docs leave you; I think they make clearer that we will still be doing some rationality qua rationality.
Will comment later re: separate organizations; I agree this is an interesting idea; my guess is that there isn’t enough money and staff firepower to run a good standalone rationality organization in CFAR’s stead, and also that CFAR retains quite an interest in a standalone rationality community and should therefore support it… but I’m definitely interested in thoughts on this.
Julia will be launching a small spinoff organization called Convergence, facilitating double crux conversations between EAs and EA-adjacent people in, e.g., tech and academia. It’ll be under the auspices of CFAR for now but will not have opinions on AI. I’m not sure if that hits any of what you’re after.
Thanks for engaging. Further thoughts:
For what it’s worth, I think even without saying that your aim is explicitly AI safety, a lot of people reading this post will take that away unless you do more to cancel the implicature. Even the title does this! It’s a slightly odd grammatical construction which looks an awful lot like “CFAR’s new focus: AI Safety”; I think without being more up-front about the alternative interpretation, it will sometimes be read that way.
Me too! (I assume that these have not been posted yet, but if I’m just failing to find them please let me know.)
Great. Just to highlight that I think there are two important aspects of doing rationality qua rationality:
Have the people pursuing the activity have this as their goal. (I’m less worried about you failing on this one.)
Have external perceptions be that this is what you’re doing. I have some concern that rationality-qua-rationality activities pursued by an AI safety org will be perceived as having an underlying agenda relating to that. And that this could e.g. make some people less inclined to engage, even relative to if they’re run by a rationality org which has a significant project on AI safety.
I feel pretty uncertain about this, but my guess goes the other way. Also, I think if there are two separate orgs, the standalone rationality one should probably retain the CFAR brand! (as it seems more valuable there)
I do worry about transition costs and losing synergies of working together from splitting off a new org. Though these might be cheaper earlier than later, and even if it’s borderline right now whether there’s enough money and staff to do both I think it won’t be borderline within a small number of years.
This sounds interesting! That’s a specialised enough remit that it (mostly) doesn’t negate my above concerns, but I’m happy to hear about it anyway.
Datapoint: it wasn’t until reading your comment that I realized that the title actually doesn’t read “CFAR’s new focus: AI safety”.
+1
Oh, sorry, the two new docs are posted and were in the new ETA:
http://lesswrong.com/lw/o9h/further_discussion_of_cfars_focus_on_ai_safety/ and http://lesswrong.com/r/discussion/lw/o9j/cfars_new_mission_statement_on_our_website/
Thanks. I’ll dwell more on these. Quick thoughts from a first read:
I generally liked the “further discussion” doc.
I do think it’s important to strongly signal the aspects of cause neutrality that you do intend to pursue (as well as pursuing them). These are unusual and important.
I found the mission statement generally opaque and extremely jargony. I think I could follow what you were saying, but in some cases this required a bit of work and in some cases I felt like it was perhaps only because I’d had conversations with you. (The FAQ at the top was relatively clear, but an odd thing to lead with.)
I was bemused by the fact that there didn’t appear to be a clear mission statement highlighted anywhere on the page!
ETA: Added some more in depth comments on the relevant comment threads: here on “further thoughts”, and here and here on the mission statement.
To get a better idea of your model of what you expect the new focus to do, here’s a hypothetical. Say we have a rationality-qua-rationality CFAR (CFAR-1) and an AI-Safety CFAR (CFAR-2). Each starts with the same team, works independently of each other, and they can’t share work. Two years later, we ask them to write a curriculum for the other organization, to the best of their abilities. This is along the lines of having them do an Ideological Turing Test on each other. How well do they match? In addition, is the newly written version better in any case? Is CFAR-1's CFAR-2 curriculum better than CFAR-2's CFAR-2 curriculum?
I’m treating curriculum quality as a proxy for research progress, and somewhat ignoring things like funding and operations quality. The question is only meant to address worries of research slowdowns.
I support this, whole-heartedly :) CFAR has already created a great deal of value without focusing specifically on AI x-risk, and I think it’s high time to start trading the breadth of perspective CFAR has gained from being fairly generalist for some more direct impact on saving the world.
I am annoyed by this post because it amounts to “we had a really good idea, and then we decided to post this instead of getting to that idea”.
I don’t see the point of building anticipation. I like the quote: “start as close to the end, then go forward”.
To coordinate we need a leader that many of us would sacrifice for. The obvious candidates are Eliezer Yudkowsky, Peter Thiel, and Scott Alexander. Perhaps we should develop a process by which a legitimate, high-quality leader could be chosen.
Edit: I see mankind as walking towards a minefield. We are almost certainly not in the minefield yet, at our current rate we will almost certainly hit the minefield this century, lots of people don’t think the minefield exists or think that fate or God will protect us from the minefield, and competitive pressures (Moloch) make lots of people individually better off if they push us a bit faster towards this minefield.
I disagree. The LW community already has capable high-status people who many others in the community look up to and listen to suggestions from. It’s not clear to me what the benefit is from picking a single leader. I’m not sure what kinds of coordination problems you had in mind, but I’d expect that most such problems that could be solved by a leader issuing a decree could also be solved by high-status figures coordinating with each other on how to encourage others to coordinate. High-status people and organizations in the LW community communicate with each other a fair amount, so they should be able to do that.
And there are significant costs to picking a leader. It creates a single point of failure, making the leader’s mistakes more costly, and inhibiting innovation in leadership style. It also creates PR problems; in fact, LW has already faced PR problems over being perceived as an Eliezer Yudkowsky personality cult.
Also, if we were to pick a leader, Peter Thiel strikes me as an exceptionally terrible choice.
The ten up-votes you have for this post are a signal that either we shouldn’t have a leader, or that if we should, it would be difficult for him/her to overcome the opposition in the rationality movement to having one.
Speaking for myself (one of the upvotes), I think that having a single leader is bad, but having a relatively small group of leaders is good.
With one leader, it means anything they do or say (or did or said years or decades ago) becomes interpreted as “this is what the whole rationalist community is about”. Also, I feel like focusing on one person too much could make others feel like followers, instead of striving to become stronger.
But if we have a small team of people who are highly respected by the community, and publicly acknowledge each other, and can cooperate with each other… then all we need for coordination is if they meet in the same room once in a while, and publish a common statement afterwards.
I don’t want to choose between Eliezer Yudkowsky, Peter Thiel, and Scott Alexander (and other possible candidates, e.g. Anna Salamon and Julia Galef). Each of these people is really impressive in some areas, but none of them is impressive at everything. Choosing one of them feels like deciding which aspects we should sacrifice. Also, some competition is good, and a person who is great today may become less great tomorrow.
Or maybe the leader does not have to be great at everything, as long as they are great at “being a great rationalist leader”, whatever that means. But maybe we actually don’t have this kind of person yet. (Weak evidence: if a person with such skills existed, they would probably already be informally accepted as the leader of rationalists; they wouldn’t wait until a comment on LW tells them to step forward.) Peter Thiel doesn’t seem to communicate with the rationalist community. Eliezer Yudkowsky is hiding on Facebook. Scott Alexander has an unrelated full-time job. Maybe none of them actually has enough time and energy to do the job of the “rationalist leader”, whatever that might be.
Also, I feel like asking for a “leader” is the instinctive, un-narrow, halo-effect approach typically generated by the corrupted human hardware. What specific problem are we trying to solve? Lack of communication and coordination in the rationalist community? I suggest Community Coordinator as a job title, and it doesn’t have to be any of these high-status people, as long as it is a person with good people skills who cooperates with them (uhm, maybe Cat Lavigne?). Maybe even a Media Speaker who would, once a week or once a month, collect information about “what’s new in the rationalist community” and compose an official article.
tl;dr—we don’t need a “leader”, but we need people who will do a few specific things which are missing; coordination of the community being one of them
Part of the advantage of having a leader is that he/she could specialize in leading us and we could pay him/her a full-time salary. “Also, I feel like asking for a “leader” is the instinctive, un-narrow, halo-effect approach typically generated by the corrupted human hardware.” Yes, but this is what works.
Please taboo “leading us”. What is the actual job description for the leader you imagine? What is the expected outcome of having such a leader?
And, depending on your previous answer, could we achieve a similar outcome by simply having a specialist for a given task? I mean, even actual leaders employ specialists, so why not skip the middleman? (Or do you believe that the leader would be better at finding the specialists? That sounds almost like a job description… of a specialist.)
Or is the leader supposed to be a symbol? A speaker for the movement?
Or perhaps a person who chooses an arbitrary goal (a meaningful one, but ultimately it would be an arbitrary choice among a few meaningful candidates) under the assumption that if we all focus on one goal, we are more likely to achieve it than if everyone follows a different goal (i.e. a suboptimal choice is still much better than no choice)?
I want someone who could effectively give orders/strong suggestions saying “give to this cause”, “write to your congressman saying this”, “if you have this skill please do this”, “person A should help person B get this job”, “person C is toxic and should be excluded from our community”, “person D is fantastic, let’s recruit her to our community”, “everyone please read this and discuss”, “person E is great, everyone thank her”, “person F has made great contributions to our community but has suffered some recent bad news so let’s help her out”.
I agree that all of this could be useful in many situations.
I just suspect there may be no person fit for this role and willing to take it, and that choosing an unfit person could be harmful. Essentially, people who are sufficiently sane and uncontroversial are probably not interested in this role, because they believe they have better things to do. Otherwise, they could have already taken it.
All it would take at the beginning would be to privately ask other “rationalist celebrities” whether they think that X is a good idea and whether they are willing to endorse it publicly, and if they say yes, post X in Main with the list of celebrities who endorse it. If the same person did this 5 times in a row, people would automatically start accepting them as the leader. Most wouldn’t notice if, the sixth time, the endorsements from the other “rationalist celebrities” were absent, as long as none of them opposed the post directly.
Telling you what to think and what to do, of course. Without a Glorious Leader you would just wander around, lost and confused.
Who is that “we”?
I agree we shouldn’t pick a leader, but I’m curious why you think this. He’s the only person on the list who’s actually got leadership experience (CEO of PayPal), and he did a pretty good job.
Leading a business and leading a social movement require different skill sets, and Peter Thiel is also the only person on the list who isn’t even part of the LW community. Bringing in someone only tangentially associated with a community as its leader doesn’t seem like a good idea.
The key to deciding if we need a leader is to look at historically similar situations and see if they benefited from having a leader. Given that we would very much like to influence government policy, Peter Thiel strikes me as the best possible choice if he would accept. I read somewhere that when Julius Caesar was going to attack Rome several Senators approached Pompey the Great, handed him a sword, and said “save Rome.” I seriously think we should try something like this with Thiel.
How would the position of leader of the LW community help Peter Thiel do this? Also, Peter Thiel’s policy priorities seem to differ a fair amount from those of the average lesswronger, and I’d be pretty surprised if he agreed to change priorities substantially in order to fit with his role as LW leader.
Is this actually a thing that we would want? It seems to me like this line of reasoning depends on a lot of assumptions that don’t seem all that shared.
(I do think that rationalists should coordinate more, but I don’t think rationalists executing the “just obey authority” action is likely to succeed. That seems like a recipe for losing a lot of people from the ‘rationalist’ label. I think there are other approaches better suited to the range of rationalist personalities that still have enough tradition behind them to be likely to work; the main inspirations here are Norse þings and Quaker meetings.)
At the moment Peter Thiel should spend all his available time recruiting people for the Trump administration to fill those 4,000 open positions. Asking him to spend any time elsewhere is likely not effective.
If I remember correctly, history records Caesar as having been relentlessly successful in that campaign?
If Alyssa Vance is correct that the community is bottlenecked on idea generation, I think this is exactly the wrong way to respond. My current view is that increasing hierarchy has the advantage of helping people coordinate better, but it has the disadvantage that people are less creative in a hierarchical context. Isaac Asimov on brainstorming:
I believe this has already happened to the community through the quasi-deification of people like Eliezer, Scott, and Gwern. It’s odd, because I generally view the LW community as quite nontraditional. But when I look at academia, I get the impression that college professors are significantly closer in status to their students than our intellectual leaders are to the rest of the community.
This is my steelman of people who say LW is a cult. It’s not a cult, but large status differences might be a sociological “code smell” for intellectual communities. Think of the professor who insists that they always be addressed as “Dr. Jones” instead of being called by their first name. This is rarely the sort of earnest, energetic, independent-minded person who makes important discoveries. “The people I know who do great work think that they suck, but that everyone else sucks even more.”
The problem is compounded by the fact that Eliezer, Scott, and Gwern are not actually leaders. They’re high status, but they aren’t giving people orders. This leads to leadership vacuums.
My current guess is that we should work on idea generation at present, then transform into a more hierarchical community when it’s obvious what needs to be done. I don’t know what the best community structure for idea generation is, but I suspect the university model is a good one: have a selective admissions process, while keeping the culture egalitarian for people who are accepted. At least this approach is proven.
I shall preface by saying that I am neither a rationalist nor an aspiring rationalist. Instead, I would classify myself as a “rationality consumer”—I enjoy debating philosophy and reading good competence/insight porn. My life is good enough that I don’t anticipate much subjective value from optimizing my decisionmaking.
I don’t know how representative I am. But I think if you want to reach “people who have something to protect” you need to use different approaches from “people who like competence porn”, and I think while a site like LW can serve both groups we are to some extent running into issues where we may have a population that is largely the latter instead of the former—people admire Gwern, but who wants to be Gwern? Who wants to be like Eliezer or lukeprog? We may not want leaders, but we don’t even have heroes.
I think possibly what’s missing, and this is especially relevant in the case of CFAR, is a solid, empirical, visceral case for the benefit of putting the techniques into action. At the risk of being branded outreach, and at the very real risk of significantly skewing their post-workshop stats gathering, CFAR should possibly put more effort into documenting stories of success through applying the techniques. I think the main focus of research should be full System-1 integration, not just for the techniques themselves but also for CFAR’s advertisement. I believe it’s possible to do this responsibly if one combines it with transparency and System-2 relevant statistics. Contingent, of course, on CFAR delivering the proportionate value.
I realize that there is a chicken-and-egg problem here where for reasons of honesty, you want to use System-1-appealing techniques that only work if the case is solid, which is exactly the thing that System-1 is traditionally bad at! I’m not sure how to solve that, but I think it needs to be solved. To my intuition, rationality won’t take off until it’s value-positive for S1 as well as S2. If you have something to protect you can push against S1 in the short-term, but the default engagement must be one of playful ease if you want to capture people in a state of idle interest.
They do put effort into this; I do wonder how communicable it is, though.
For example, at one point Anna described a series of people all saying something like “well, I don’t know if it had any relationship to the workshop, but I did X, Y, and Z” during followups that, across many followups, seemed obviously due to the workshop. But it might be a vague thing that’s easier to see when you’re actually doing the followups rather than communicating statistics about followups.
Thanks so much for saying this! Thinking about this distinction you made, I feel there may be actually four groups of LW readers, with different needs or expectations from the website:
“Science/Tech Fans”—want more articles about new scientific research and new technologies. “Has anyone recently discovered a new particle, or built a new machine? Give me a popular science article about it!”
“Competence/Insight Consumers”—want more articles about pop psychology theories and life hacks. They feel they are already doing great, and only want to improve small details. “What do you believe is the true source of human motivation, and how do you organize your to-do lists? But first, give me your credentials: are you a successful person?”
“Already Solving a Problem”—want feedback on their progress, and information specifically useful for them. Highly specific; two people in the same category working on completely different problems probably wouldn’t benefit too much from talking to each other. If they achieve critical mass, it would be best to make a subgroup for them (except that LW currently does not support creating subgroups).
“Not Started Yet”—inspired by the Sequences, they would like to optimize their lives and the universe, but… they are stuck in place, or advancing very very slowly. They hope for some good advice that would make something “click”, and help them leave the ground.
Maybe it’s poll time… what do you want to read about?
[pollid:1176]
If anyone’s mind is in a place where they think they’d be more productive or helpful if they sacrificed themselves for a leader, then, with respect, I think the best thing they can do for protecting humanity’s future is to fix that problem in themselves.
The way people normally solve big problems is to have a leader people respect, follow, and are willing to sacrifice for. If there is something in rationalists that prevents us from accepting leadership then the barbarians will almost certainly beat us.
I see two unrelated sub-problems: one of prediction and one of coordination.
We already know that experts are better than laymen, but differentiated groups perform better at prediction than individual experts. Thus, a decision taken by an aggregated prediction market will be better in terms of accuracy; the problem here is that people in general do not coordinate well in a horizontal structure.
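A toy sketch of the aggregation point, with made-up forecasts purely to show the mechanism: for a convex accuracy measure such as the Brier score (mean squared error of probability forecasts, lower is better), the averaged forecast is guaranteed by convexity to score no worse than the forecasters’ average score, and it often beats most of the individuals:

```python
# Made-up binary outcomes and three forecasters' probability estimates.
true_outcomes = [1, 0, 1, 1, 0, 1]

forecasters = [
    [0.9, 0.4, 0.6, 0.7, 0.3, 0.8],
    [0.6, 0.1, 0.9, 0.5, 0.5, 0.6],
    [0.7, 0.3, 0.5, 0.9, 0.2, 0.9],
]

def brier(preds, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(preds, outcomes)) / len(preds)

individual_scores = [brier(f, true_outcomes) for f in forecasters]

# The "market" forecast: a simple average of the individual probabilities.
aggregate = [sum(col) / len(col) for col in zip(*forecasters)]
aggregate_score = brier(aggregate, true_outcomes)

print(individual_scores)
print(aggregate_score)  # never worse than the mean of individual_scores
```

This only shows that averaging helps with accuracy; it says nothing about the coordination problem, which is the commenter’s actual point.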
Hi Anna, could you please explain how CFAR decided to focus on AI safety, as opposed to other plausible existential risks like totalitarian governments or nuclear war?
Coming up. Working on a blog post about it; will probably have it up in ~4 days.
Is this an admission that CFAR cannot effectively help people with problems other than AI safety?
Or an admission that this was their endgame all along, now that they have built a base of people who like them… I’ve been expecting that for quite some time. It fits the modus operandi.
I intend to donate to MIRI this year; do you anticipate that upcoming posts or other reasoning/resources might or should persuade people like myself to donate to CFAR instead?
I think we’ll have it all posted by Dec 18 or so, if you want to wait and see. My personal impression is that MIRI and CFAR are both very good buys this year, and that the best outcome would be for each to receive some donations (collectively, not from each person); I expect the case for MIRI to be somewhat more straightforward, though.
I’d be happy to Skype/email with you or anyone re: the likely effects of donating to CFAR, especially after we get our posts up.
This post makes me very happy. It emphasizes points I wanted to discuss here a while ago (e.g. collective thinking and the change of focus) but didn’t have the confidence to.
In my opinion, we should devote more time to hypothesis testing on both individual and collective rationality. Many suggestions to improve individual rationality have been advanced on LW. The problem is we don’t know how effective these techniques are. Is it possible to test them at CFAR or at LW meetings? I’ve seen posts about rationality drugs—to take an example—and even though some people shared their experience, making a study about it would enable us to collect data and avoid experimental asymmetry and response bias. Of course, one obvious bias is selection bias, but our goal is to produce relevant results for this community. So picking individuals roughly representative of the whole community is the next problem. (Rationality drugs are beside the point.) More importantly, does this general idea seem stupid to you? Is it one of those “good in theory, bad in practice” ideas?
I am very interested in collective problem-solving, but I did not find insightful resources about it; do you know of any?
P.S.: English is not my mother tongue; don’t be surprised if this post is imprecise and full of grammatical mistakes.
I get the impression that ‘new ways of improving thinking skill’ is a task that has mostly been saturated. The reasons people perhaps don’t have great thinking skill might be because
1) Reality provides extremely sparse feedback on ‘the quality of your/our thinking skills’ so people don’t see it as very important.
2) For a human, who represents 1⁄7 billionth of our species, thinking rationally is often a worse option than thinking irrationally in the same way as a particular group of humans, so as to better facilitate group membership via shared opinions. It’s very hard to ‘go it alone’.
3) (related to 2) Most decisions that a human has to make have already been faced by innumerable previous humans and do not require a lot of deep, fundamental-level thought.
These effects seem to present challenges to level-headed, rational thinking about the future of humanity. I see a lot of #2 in bad, broken thinking about AI risk, where the topic is treated as a proxy war for prosecuting various political/tribal conflicts.
Actually it is possible that the worst is yet to come in terms of political/tribal conflict influencing AI risk thinking.
I do guess then that this effort is guided by an ideal that has been already outlined? Do you define “improving” in relation to, e.g., Bayesian reasoning?
What do you mean by “our collective epistemology”?
The catch-22 I would expect with CFAR’s efforts is that anyone buying their services is already demonstrating a willingness to actually improve his/her rationality/epistemology, and is looking for effective tools to do so.
The bottleneck, however, is probably not the unavailability of such tools, but rather the introspectivity (or lack thereof) that results in a desire to actually pursue change, rather than simply virtue-signal the typical “I always try to learn from my mistakes and improve my thinking”.
The latter mindset is the one most urgently needing actual improvements, but its bearers won’t flock to CFAR unless it has gained acceptance as an institution with which you can virtue-signal (which can confer status). While some universities manage to walk that line (providing status affirmation while actually conferring knowledge), CFAR’s mode of operation would optimally entail “virtue-signalling ML students in on one side”, “rationality-improved ML students out on the other side”, which is a hard sell, since signalling an improvement in rationality will always be cheaper than the real thing (as it is quite non-obvious to tell the difference for the uninitiated).
What remains is helping those who have already taken that most important step of effective self-reflection and are looking for further improvement. A laudable service to the community, but probably far from changing general attitudes in the field.
Taking off the black hat, I don’t have a solution to this perceived conundrum.
The self-help industry (as well as, say, gyms or fat farms) mostly sells what I’d call “willpower assists”—motivation and/or structure which will push you to do what you want to do but lack sufficient willpower for.
To the extent that this is true, I would say that they are failing abysmally.
You’re a bit confused: they are selling willpower assists, but what they want is to get money for them. They are not failing at collecting the money, and as to willpower assists turning out to be not quite like the advertisements, well, that’s a standard caveat emptor issue.
Ha, you’re absolutely right!
Do you believe that the Brier score is definitely the best way to model predictive accuracy, or do you just point to it because it’s a good way to model predictive accuracy?
No, it was just a pun. I believe trying to improve predictive accuracy is better than trying to promote view X (for basically any value of “X”); that is what I was hoping the pun of “Brier-boosting” off “signal-boosting” would point to—not the Brier score as such.
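For readers unfamiliar with the term: the Brier score is the mean squared error between stated probabilities and actual 0/1 outcomes, so lower is better, and confidently wrong predictions are penalized heavily. A minimal sketch, with illustrative numbers:

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes.

    0.0 is a perfect score; an always-50% forecaster scores 0.25.
    """
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

outcomes = [1, 0, 1, 1, 0]
calibrated = [0.8, 0.2, 0.7, 0.9, 0.3]      # confident and mostly right
overconfident = [1.0, 0.0, 1.0, 0.0, 1.0]   # certain, and wrong twice

print(brier_score(calibrated, outcomes))     # ≈ 0.054
print(brier_score(overconfident, outcomes))  # 0.4 — heavily penalized
```

“Brier-boosting” in this sense would mean pushing the community’s collective score downward, regardless of which object-level views win.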
Edited “tomorrow’s open house” to “tonight’s open house” to minimize confusion.
Thanks!