But what’s bottlenecking alignment isn’t mathematical cognition. The people contributing interesting ideas to AI alignment, of the sort that Eliezer finds valuable, tend to have a history of deep curiosity about philosophy and big-picture thinking. They have made interesting comments on a number of fields (even if only as laypeople).
To make progress in AI alignment you need to be good at the skill “apply existing knowledge to form mental models that let you predict in new domains.” By contrast, mathematical cognition is about exploring an already known domain. Maybe forecasting, especially mid-range political forecasting during times of change, comes closer to measuring the skill. (If Terence Tao happens to have a forecasting hobby, I’d become more excited about the proposal.)
It’s possible that a super-smart mathematician also excels at coming up with alignment solutions (the likelihood is probably a lot higher than for the typical person), but the fact that they spent their career focused on math, as opposed to building a stronger “polymath profile,” makes me think they probably wouldn’t be close to the very top of the distribution for that particular skill.
Quote by Eliezer:
Similarly, the sort of person who was like “But how do you know superintelligences will be able to build nanotech?” in 2008, will probably not be persuaded by the demonstration of AlphaFold 2, because it was already clear to anyone sensible in 2008, and so anyone who can’t see sensible points in 2008 probably also can’t see them after they become even clearer. There are some people on the margins of sensibility who fall through and change state, but mostly people are not on the exact margins of sanity like that.
I also share the impression that a lot of otherwise smart people fall into this category. If Eliezer is generally right, a big part of the problem is “too many people are too bad at thinking to see it.” When forming opinions based on others’ views, many don’t filter experts by thinking style (asking “does this person seem unusually likely to have the sort of cognition that lets them make accurate predictions in novel domains?”), but rather look for credentials and/or existing status within the larger epistemic community. Costly actions are unlikely without a somewhat broad epistemic consensus. The more we think costly actions are going to be needed, the more important it seems to establish a broad(er) consensus on whose reasoning can be trusted most.
Tao is also great at building mathematical models of messy phenomena—here’s an article where he does a beautiful analysis of sailing: https://terrytao.wordpress.com/2009/03/23/sailing-into-the-wind-or-faster-than-the-wind
I’d be surprised if he didn’t have some good insights about AI and alignment after thinking about it for a while.
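For anyone who hasn’t clicked through, here’s a compressed sketch of the opening observation in that post (my own paraphrase and notation, not a quote from Tao): write the true wind as a vector $\mathbf{w}$ and the boat’s velocity as $\mathbf{v}$, so the apparent wind felt on deck is
$$\mathbf{w}_{\mathrm{app}} = \mathbf{w} - \mathbf{v}.$$
A pure drag sail is pushed along $\mathbf{w}_{\mathrm{app}}$, so sailing straight downwind the thrust dies off as $|\mathbf{v}| \to |\mathbf{w}|$ and true wind speed becomes a hard ceiling. But for a boat moving perpendicular to the wind,
$$|\mathbf{w}_{\mathrm{app}}| = \sqrt{|\mathbf{w}|^2 + |\mathbf{v}|^2} \ge |\mathbf{w}|,$$
so the apparent wind never vanishes, and an idealized lift-generating sail paired with a keel can keep extracting thrust even when $|\mathbf{v}| > |\mathbf{w}|$. The actual post abstracts this much further; the above is just the flavor.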
+1000, that’s one of the main skills I really care about in conceptual alignment research, and Tao is great at it.
I disagree. Predicting who will make the most progress on AI safety is hard. But the research is very close to existing mathematical/theoretical CS/theoretical physics/AI research. Getting the greatest mathematical minds on the planet to work on this problem seems like an obvious high EV bet.
I might also add that Eliezer Yudkowsky, despite his many other contributions, has made only minor direct contributions to technical AI Alignment research. [His indirect contribution of highlighting & popularising the work of others has high EV impact.]
I don’t think this is true at all. Like, even prosaic alignment researchers care about things like corrigibility, which is an Eliezer-idea.
That doesn’t update me, but to prevent misunderstandings let me clarify that I’m not saying it’s a bad idea to offer lots of money to great mathematicians (presumably with some kind of test-of-fit trial project). It might still be worth it given that we’re talent-bottlenecked and the skill does correlate with mathematical ability. I’m just saying that, to me, people seem to overestimate the correlation and that the biggest problem is elsewhere, and the fact that people don’t seem to realize where the biggest problem lies is itself a part of the problem. (Also, you can’t easily exchange money for talent, because evaluating the output of someone’s test-of-fit trial period takes competent researcher time. You also need competent researcher time to give someone new to alignment research a fair shot at succeeding with the trial, by advising and mentoring them. So everything is costly, and the ideas you want to pursue have to be above a certain bar.)
I’m open to having a high-bandwidth double-crux talk about this. Would you be up for that?
***************************
I think you are:
1. underestimating how much Very Smart Conventional People in Academia are Generically Smart and how much they know about philosophy/big picture/many different topics;
2. overestimating how novel some of the insights due to prominent people in the rationality community are, and how correlated believing and acting on Weirdo Beliefs is with the ability to find novel solutions to (technical) problems—i.e. the WeirdoPoints=g-factor belief prevalent in Rationalist circles;
3. underestimating how much better a world-class mathematician is than the average researcher, i.e. there is the proverbial 10x programmer; depending on how one measures this, some of the top people might easily be >1000x.
4. “By contrast, mathematical cognition is about exploring an already known domain. Maybe forecasting, especially mid-range political forecasting during times of change, comes closer to measuring the skill.” This jumps out at me. The most famous mathematicians are famous precisely because they came up with novel domains of thought. Although good forecasting is an important skill and an obvious sign of intelligence & competence, it is not necessarily a sign of a highly creative researcher. Much of forecasting is about aggregating data and expert opinion; being “too creative” may even be a detriment. Similarly, many of the famous mathematical minds of the past century had rather naive political views; this is almost completely uncorrelated, perhaps even anti-correlated, with their ability to come up with novel solutions to technical problems.
5. “Test-of-fit trial project” also jumps out at me. Nobody has successfully aligned a general artificial intelligence. The field of AGI safety is in its infancy, and many people disagree on the right approach. It is absolutely laughable to me that, in the scenario where after much work we get Terry Tao on board, some group of AI safety researchers (who?) would decide he’s not “a good fit for the team”, or even that the research time of existing AGI safety researchers is so valuable that they couldn’t find the time to evaluate his output.
Sounds good!
1. This doesn’t seem like a crux to me the way you worded it. The way to phrase this so I end up disagreeing: “Very Smart Conventional People in Academia have surprisingly accurate takes (compared to what’s common in the rationality community) on philosophy/big picture/many different topics.” In my view, the rationality community specifically selects for strong interest in that sort of thing, so it’s unsurprising that even very smart successful people outside of it do worse on average.
My model is that strong interest in getting philosophy and big-picture questions right is a key ingredient to being good at getting them right. Similar to how strong interest in mathematical inquiry is probably required for winning the Fields medal – you can’t just do it on the side while spending your time obsessing over other things.
2. We might have some disagreements here, but this doesn’t feel central to my argument, i.e., not like a crux. I’d say “insights” are less important than “ability to properly evaluate what constitutes an insight (early on) or have novel ones yourself.”
3. I agree with you here. My position is that there’s a certain skillset that I (ideally) want alignment researchers to rank really highly on (we take what we get, of course, but just like there are vast differences in mathematical abilities, the differences on the skillset I have in mind would also go up to 1,000x).
4. Those are great points. I’m changing my stated position to the following:
Mathematical genius (esp. coming up with new kinds of math) may be quite highly correlated with being a great alignment researcher, but it’s somewhat unclear, and anyway it’s unlikely that people can tap into that potential if they spent an entire career focusing primarily on pure math. (I’m not saying it’s impossible.)
Particularly, I notice that people past a certain age are a lot less likely to change their beliefs than younger people. (I didn’t know Tao’s age before checking the Wikipedia article just now. I think the age point feels like a real crux because I’d rank the proposal quite differently depending on the age I see there.)
5. This feels like a crux. Maybe it reduces to the claim that there’s an identifiable skillset important for alignment breakthroughs (especially at the “pre-paradigmatic” or “disentanglement research” stage) that doesn’t just come with genius-level mathematical abilities. Just like English professors could tell whether or not Terence Tao (or Elon Musk) has writing talent, I’d say alignment researchers can tell after a trial period whether or not someone’s early thoughts on alignment research have potential. There’s nothing laughable about that, and nothing outrageous about English professors coming to a negative evaluation of someone like Musk or Tao, despite being wildly outclassed wrt mathematical ability or the ability to found and run several companies at once.
---
I know you haven’t mentioned Musk, but I feel like people get this one wrong for reasons that might be related to our discussion. I’ve seen EAs make statements like “If Musk tried to deliberately optimize for aligning AI, we’d be so much closer to success.” I find that cringy because being good at making a trillion dollars is not the same as being good at steering the world through the (arguably) narrow pinhole, in the space of possible AI-related outcomes, where things go well. A lot of the ways of making outsized amounts of money involve all kinds of pivots or selling out your values to follow the gradients from superficial incentives that make things worse for everyone in the long run. That’s the primary thing you want to avoid when you want to accomplish some ambitious “far mode” objective (as opposed to easily measurable objectives like shareholder profits). In short, I think good “conventional” CEOs often have good judgment, yes, but also a lot of drive to get people to push ahead, and the latter may be more important to their success than their judgment about which exact strategy to start out with. A lot of the ways of making money come with good, fast feedback cycles. If you want to tackle a goal like “align AI on the first try” or “solve a complicated geopolitical problem without making it worse,” you need to be able to balance drive (“being good at pushing your allies to do things”) with “making sure you do things right” – and that’s not something where I expect conventionally successful CEOs to have undergone super-strong selection pressure.
To bypass the argument about whether pure maths talent is what’s needed, we should generalise “Terry Tao / the world’s best mathematicians” to “anyone a panel of top people in AGI Safety would have on their dream team (who would otherwise be unlikely to work on the problem)”.
Re Musk: his main goal is making a Mars colony (SpaceX), with lesser goals of reducing climate change (Tesla, SolarCity) and aligning AI (OpenAI, FLI). Making a trillion dollars seems more like a side effect of using engineering and capitalism as the methodology. Lots of his top-level goals also involve “making sure you do things right” (e.g. making sure the first SpaceX astronauts don’t die). OpenAI was arguably a misstep, though.
Did Musk fund research to figure out whether the best way to eventually establish a Mars colony is working on space technology, as opposed to preventing AI risk / getting AI to colonize Mars for you? My prediction is “no,” which illustrates my point.
Basically all CEOs of public-facing companies like to tell inspiring stories about world-improvement aims, but certainly not all of them prioritize these aims in a dominant sense in their day-to-day thinking. So, observing that people have stated altruistic aims shouldn’t give us all that much information about what actually drives their cognition, i.e., about what aims they can de facto be said to be optimizing for (consciously or subconsciously). Importantly, I think that even if we knew for sure that someone’s stated intentions are “genuine” (which I don’t have any particular reason to doubt in Musk’s example), that still leaves the arguably more important question of “How good is this person at overcoming the ‘Elephant in the Brain’?”
I think that we’re unlikely to get good outcomes unless we place careful emphasis on leadership’s ability to avoid mistakes that might kill the intended long-term impact while not looking bad from an “appearance of being successful” standpoint.
People disagree about the degree to which formal methods will be effective or arrive quickly enough. I’d like to point out that Paul Christiano, one of the best-known proponents of more non-formal thinking & a focus on existing ML methods, still has a very strong traditional math/CS background (e.g. Putnam Fellow, a series of very solid math/CS papers). His research methods and thinking are also very close to how theoretical physicists might think about problems.
Even a nontraditional thinker like EY did very well on math contests in his youth.