Sure, math is not an example of a hard-to-verify task, but I think you’re getting unnecessarily hung up on these things. It does sound like it may be a new and, in a narrow sense, unexpected technical development, and it’s unclear how significant it is. I wouldn’t read much more into their communications.
I would not characterize Tao’s usual takes on AI as particularly good (unless you compare with a relatively low baseline).
He’s been overall pretty conservative and mostly stuck to reasonable claims about current AI. So there’s not much to criticize in particular, but it has come at the cost of him not appreciating the possible/likely trajectories of where things are going, which I think misses the forest for the trees.
In my experience with math, to be obviously excellent you need to be more like the top 10% of all grad students, possibly even higher, but this might vary a lot by field.
I’d agree that this is to some extent playing the respectability game, but personally I’d be very happy for Eliezer and people to risk doing this too much rather than too little for once.
This is definitely baked in for many people (e.g. me, but also see the discussion here).
The most concerning mundane risks that come to mind are unemployment, concentration of power, and adversarial forms of RL (I’m missing a better phrase here, basically what TikTok/Meta/the recent o4 model were already doing). The problems in education are partially downstream of that (what’s the point if it’s not going to help prepare you for work) and otherwise honestly don’t seem too serious in absolute terms? Granted, the system may completely fail to adapt, but that seems more to be an issue with the system already being broken and not about AI in particular.
“Approaching human level” seems reasonable. I think one should just read this as her updating towards short timelines in general based on what experts say rather than her trying to make a prediction.
In the Sydney case, this was probably less Sydney ending the conversation and more the conversation being terminated in order to hide Sydney going off the rails.
I’m not saying that it’s not worth pursuing as an agenda, but I’m also not convinced it is promising enough to justify pursuing math-related AI capabilities, compared to e.g. creating safety guarantees into which you can plug in AI capabilities once they arise anyway.
I think the “guaranteed safe AI” framework is just super speculative. Enough to basically not matter as an argument given any other salient points.
This leaves us with the baseline, which is that this kind of prize potentially redirects a lot of brainpower from math-adjacent people towards thinking about AI capabilities. Even worse, I expect it’s mostly going to attract the unreflective “full-steam-ahead” type of people.
Mostly, I’m not sure it matters at all, except maybe slightly accelerating some inevitable development before e.g. DeepMind takes another shot at it to finish things off.
Agreed, I would love to see more careful engagement with this question.
You’re putting quite a lot of weight on what “mathematicians say”. Probably these people just haven’t thought very hard about it?
I believe the confusion comes from assuming the current board follows rules rather than doing whatever is most convenient.
The old board was trying to follow the rules, and the people in question were removed (technically were pressured to remove themselves).
I’d agree the OpenAI product line is net positive (though I’m not super hung up on that). Sam Altman demonstrating what kind of actions you can get away with in front of everyone’s eyes seems problematic.
Or simply when scaling becomes too expensive.
There are a lot of problems with linking to Manifold and calling it “the expert consensus”!
It’s not the right source. The survey you linked elsewhere would be better.
Even for the survey, it’s unclear whether these are the “right” experts for the question. This at least needs clarification.
It’s not a consensus; it’s a median or mean of a pretty wide distribution.
I wouldn’t belabor it, but you’re putting quite a lot of weight on this one point.
I mean, it only suggests that they’re highly correlated. I agree that it seems likely they represent the views of the average “AI expert” in this case. (I should take a look to check who was actually sampled.)
My main point regarding this is that we probably shouldn’t be paying this particular prediction market too much attention in place of e.g. the survey you mention. I probably also wouldn’t give the survey too much weight compared to opinions of particularly thoughtful people, but I agree that this needs to be argued.
In general, yes, but see the above (i.e. we don’t have a properly functioning prediction market on the issue).
Who do you consider thoughtful on this issue?
It’s more like “here are some people who seem to have good opinions”, and that would certainly move the needle for me.
Got it! I’m generally more inclined to expect that various half-decent ideas may unlock surprising advances (for no particular reason), so I’m less skeptical that this may be true.
Also, while math is of course easy to verify, if they haven’t significantly used verification in the training process, that makes their claims more reasonable.