1. This doesn’t seem like a crux to me the way you worded it. The way to phrase this so I end up disagreeing: “Very Smart Conventional People in Academia have surprisingly accurate takes (compared to what’s common in the rationality community) on philosophy/big picture/many different topics.” In my view, the rationality community specifically selects for strong interest in that sort of thing, so it’s unsurprising that even very smart successful people outside of it do worse on average.
My model is that strong interest in getting philosophy and big-picture questions right is a key ingredient to being good at getting them right. Similar to how strong interest in mathematical inquiry is probably required for winning the Fields medal – you can’t just do it on the side while spending your time obsessing over other things.
2. We might have some disagreements here, but this doesn’t feel central to my argument, i.e., not like a crux. I’d say “insights” are less important than “ability to properly evaluate what constitutes an insight (early on) or have novel ones yourself.”
3. I agree with you here. My position is that there’s a certain skillset on which I (ideally) want alignment researchers to score really highly (we take what we get, of course, but just as there are vast differences in mathematical ability, the differences in the skillset I have in mind would also span 1,000x).
4. Those are great points. I’m changing my stated position to the following:
Mathematical genius (esp. coming up with new kinds of math) may be quite highly correlated with being a great alignment researcher, but it’s somewhat unclear, and in any case it’s unlikely that people can tap into that potential if they’ve spent an entire career focusing primarily on pure math. (I’m not saying it’s impossible.)
In particular, I notice that people past a certain age are a lot less likely to change their beliefs than younger people. (I didn’t know Tao’s age before checking the Wikipedia article just now. The age point feels like a real crux because I’d rank the proposal quite differently depending on the age I see there.)
5. This feels like a crux. Maybe it reduces to the claim that there’s an identifiable skillset important for alignment breakthroughs (especially at the “pre-paradigmatic” or “disentanglement research” stage) that doesn’t just come with genius-level mathematical abilities. Just as English professors could tell whether or not Terence Tao (or Elon Musk) has writing talent, I’d say alignment researchers can tell after a trial period whether or not someone’s early thoughts on alignment research have potential. Nothing laughable about that, and nothing outrageous about English professors coming to a negative evaluation of someone like Musk or Tao, despite being wildly outclassed wrt mathematical ability or the ability to found and run several companies at once.
---
I know you haven’t mentioned Musk, but I feel like people get this one wrong for reasons that might be related to our discussion. I’ve seen EAs make statements like “If Musk tried to deliberately optimize for aligning AI, we’d be so much closer to success.” I find that cringy because being good at making a trillion dollars is not the same as being good at steering the world through the (arguably) narrow pinhole, in the space of possible AI-related outcomes, where things go well. A lot of the ways of making outsized amounts of money involve all kinds of pivots, or selling out your values to follow the gradients from superficial incentives, that make things worse for everyone in the long run. That’s the primary thing you want to avoid when you want to accomplish some ambitious “far mode” objective (as opposed to easily measurable objectives like shareholder profits). In short, I think good “conventional” CEOs often have good judgment, yes, but also a lot of drive to get people to push ahead, and the latter may be more important to their success than judgment about which exact strategy they start out with. A lot of the ways of making money come with readily available, good feedback cycles. If you want to tackle a goal like “align AI on the first try” or “solve a complicated geopolitical problem without making it worse,” you need to be able to balance drive (“being good at pushing your allies to do things”) with “making sure you do things right” – and that’s not something where I expect conventionally successful CEOs to have undergone super-strong selection pressure.
To bypass the argument about whether pure maths talent is what’s needed, we should generalise “Terry Tao / world’s best mathematicians” to “anyone a panel of top people in AGI Safety would have on their dream team (who otherwise would be unlikely to work on the problem).”
Re Musk, his main goal is making a Mars colony (SpaceX), with lesser goals of reducing climate change (Tesla, SolarCity) and aligning AI (OpenAI, FLI). Making a trillion dollars seems like it’s more of a side effect of using engineering and capitalism as the methodology. Lots of his top-level goals also involve “making sure you do things right” (e.g. making sure the first SpaceX astronauts don’t die). OpenAI was arguably a misstep though.
Did Musk fund research to figure out whether the best way to eventually establish a Mars colony is by working on space technology, as opposed to preventing AI risk / getting AI to colonize Mars for you? My prediction is “no,” which illustrates my point.
Basically all CEOs of public-facing companies like to tell inspiring stories about world-improvement aims, but certainly not all of them prioritize these aims in a dominant sense in their day-to-day thinking. So, observing that people have stated altruistic aims shouldn’t give us all that much information about what actually drives their cognition, i.e., about what aims they can de facto be said to be optimizing for (consciously or subconsciously). Importantly, I think that even if we knew for sure that someone’s stated intentions were “genuine” (which I don’t have any particular reason to doubt in Musk’s case), that still leaves the arguably more important question of “How good is this person at overcoming the ‘Elephant in the Brain’?”
I think we’re unlikely to get good outcomes unless we place careful emphasis on leadership’s ability to avoid mistakes that kill the intended long-term impact even when those mistakes don’t look bad from an “appearance of being successful” standpoint.
Sounds good!