I disagree. Predicting who will make the most progress on AI safety is hard. But the research is very close to existing mathematical/theoretical CS/theoretical physics/AI research. Getting the greatest mathematical minds on the planet to work on this problem seems like an obvious high EV bet.
I might also add that Eliezer Yudkowsky, despite his many other contributions, has made only minor direct contributions to technical AI Alignment research. [His indirect contribution of highlighting & popularising the work of others is high-EV impact.]
I don’t think this is true at all. Like, even prosaic alignment researchers care about things like corrigibility, which is an Eliezer-idea.
That doesn’t update me, but to prevent misunderstandings, let me clarify that I’m not saying it’s a bad idea to offer lots of money to great mathematicians (presumably with some kind of test-of-fit trial project). It might still be worth it given that we’re talent-bottlenecked and the skill does correlate with mathematical ability. I’m just saying that, to me, people seem to overestimate the correlation, that the biggest problem is elsewhere, and that the fact that people don’t seem to realize where the biggest problem lies is itself part of the problem. (Also, you can’t easily exchange money for talent, because to evaluate the output of someone’s test-of-fit trial period you need competent researcher time. You also need competent researcher time to give someone new to alignment research a fair shot at succeeding with the trial, by advising and mentoring them. So everything is costly, and the ideas you want to pursue have to be above a certain bar.)
I’m open to having a high-bandwidth double-crux conversation about this. Would you be up for that?
***************************
I think you are:
1. underestimating how much Very Smart Conventional People in Academia are Generically Smart and how much they know about philosophy/big picture/many different topics;
2. overestimating how novel some of the insights due to prominent people in the rationality community are, and how correlated believing and acting on Weirdo Beliefs is with the ability to find novel solutions to (technical) problems—i.e. the WeirdoPoints=g-factor belief prevalent in Rationalist circles;
3. underestimating how much better a world-class mathematician is than the average researcher: there is the proverbial 10x programmer, and depending on how one measures this, some of the top people might easily be >1000x.
4. “By contrast, mathematical cognition is about exploring an already known domain. Maybe forecasting, especially mid-range political forecasting during times of change, comes closer to measuring the skill.” This jumps out at me. The most famous mathematicians are famous precisely because they came up with novel domains of thought. Although good forecasting is an important skill and an obvious sign of intelligence & competence, it is not necessarily a sign of a highly creative researcher. Much of forecasting is about aggregating data and expert opinion; being “too creative” may even be a detriment. Similarly, many of the famous mathematical minds of the past century had rather naive political views; this is almost completely uncorrelated, even anti-correlated, with their ability to come up with novel solutions to technical problems.
5. “Test-of-fit trial project” also jumps out at me. Nobody has successfully aligned a general artificial intelligence; the field of AGI safety is in its infancy, and many people disagree on the right approach. It is absolutely laughable to me that, in the scenario where after much work we get Terry Tao on board, some group of AI safety researchers (who?) would decide he’s not “a good fit for the team”, or even that the research time of existing AGI safety researchers is so valuable that they couldn’t find the time to evaluate his output.
Sounds good!
1. This doesn’t seem like a crux to me the way you worded it. The way to phrase this so I end up disagreeing: “Very Smart Conventional People in Academia have surprisingly accurate takes (compared to what’s common in the rationality community) on philosophy/big picture/many different topics.” In my view, the rationality community specifically selects for strong interest in that sort of thing, so it’s unsurprising that even very smart successful people outside of it do worse on average.
My model is that strong interest in getting philosophy and big-picture questions right is a key ingredient to being good at getting them right. Similar to how strong interest in mathematical inquiry is probably required for winning the Fields Medal – you can’t just do it on the side while spending your time obsessing over other things.
2. We might have some disagreements here, but this doesn’t feel central to my argument, i.e., not like a crux. I’d say “insights” are less important than “ability to properly evaluate what constitutes an insight (early on) or have novel ones yourself.”
3. I agree with you here. My position is that there’s a certain skillset that I (ideally) want alignment researchers to score really high on (we take what we get, of course, but just as there are vast differences in mathematical abilities, the differences on the skillset I have in mind would also go up to 1,000x).
4. Those are great points. I’m changing my stated position to the following:
Mathematical genius (esp. coming up with new kinds of math) may be quite highly correlated with being a great alignment researcher, but it’s somewhat unclear, and in any case it’s unlikely that people can tap into that potential if they’ve spent an entire career focusing primarily on pure math. (I’m not saying it’s impossible.)
Particularly, I notice that people past a certain age are a lot less likely to change their beliefs than younger people. (I didn’t know Tao’s age before checking the Wikipedia article just now. The age point feels like a real crux because I’d rank the proposal quite differently depending on the age I see there.)
5. This feels like a crux. Maybe it reduces to the claim that there’s an identifiable skillset important for alignment breakthroughs (especially at the “pre-paradigmatic” or “disentanglement research” stage) that doesn’t just come with genius-level mathematical abilities. Just like English professors could tell whether or not Terence Tao (or Elon Musk) has writing talent, I’d say alignment researchers can tell after a trial period whether or not someone’s early thoughts on alignment research have potential. There’s nothing laughable about that, and nothing outrageous about English professors coming to a negative evaluation of someone like Musk or Tao, despite being wildly outclassed by them wrt mathematical ability or the ability to found and run several companies at once.
---
I know you haven’t mentioned Musk, but I feel like people get this one wrong for reasons that might be related to our discussion. I’ve seen EAs make statements like “If Musk tried to deliberately optimize for aligning AI, we’d be so much closer to success.” I find that cringeworthy because being good at making a trillion dollars is not the same as being good at steering the world through the (arguably) narrow pinhole in the space of possible AI-related outcomes where things go well. A lot of the ways of making outsized amounts of money involve all kinds of pivots, or selling out your values to follow the gradients from superficial incentives that make things worse for everyone in the long run. That’s the primary thing you want to avoid when pursuing an ambitious “far mode” objective (as opposed to an easily measurable objective like shareholder profits).

In short, I think good “conventional” CEOs often have good judgment, yes, but also a lot of drive to push people ahead, and the latter may be more important to their success than their judgment about which exact strategy to start out with. A lot of the ways of making money come with fast, reliable feedback cycles. If you want to tackle a goal like “align AI on the first try” or “solve a complicated geopolitical problem without making it worse,” you need to be able to balance drive (“being good at pushing your allies to do things”) with “making sure you do things right” – and that’s not something where I’d expect conventionally successful CEOs to have undergone super-strong selection pressure.
To bypass the argument about whether pure maths talent is what’s needed, we should generalise “Terry Tao / world’s best mathematicians” to “anyone a panel of top people in AGI Safety would have on their dream team (who otherwise would be unlikely to work on the problem)”.
Re Musk, his main goal is establishing a Mars colony (SpaceX), with lesser goals of reducing climate change (Tesla, SolarCity) and aligning AI (OpenAI, FLI). Making a trillion dollars seems more like a side effect of using engineering and capitalism as the methodology. Lots of his top-level goals also involve “making sure you do things right” (e.g. making sure the first SpaceX astronauts don’t die). OpenAI was arguably a misstep, though.
Did Musk fund research to figure out whether the best way to eventually establish a Mars colony is by working on space technology, as opposed to preventing AI risk / getting AI to colonize Mars for you? My prediction is “no,” which illustrates my point.
Basically all CEOs of public-facing companies like to tell inspiring stories about world-improvement aims, but certainly not all of them prioritize these aims in a dominant sense in their day-to-day thinking. So, observing that people have stated altruistic aims shouldn’t give us all that much information about what actually drives their cognition, i.e., about what aims they can de facto be said to be optimizing for (consciously or subconsciously). Importantly, I think that even if we knew for sure that someone’s stated intentions are “genuine” (which I don’t have any particular reason to doubt in Musk’s example), that still leaves the arguably more important question of “How good is this person at overcoming the ‘Elephant in the Brain’?”
I think we’re unlikely to get good outcomes unless we place careful emphasis on leadership’s ability to avoid mistakes that might kill the intended long-term impact without hurting the appearance of success.