Great question! Short answer: I’m optimistic about muddling through with partial alignment combined with AI control and AI governance (limiting peak AI capabilities, global enforcement of anti-rogue-AI, anti-self-improving-AI, and anti-self-replicating-weapons laws). See my post “A Path to Human Autonomy” for more details.
I also don’t have money for big bets; I’m more interested in mostly-reputation wagers about the very near future, so that I might get my reputational returns in time for them to pay off in respectful attention from powerful decision-makers, which in turn I hope might pay off in better outcomes for me, my loved ones, and humanity.
If I am incorrect, then I want not to be given the ear of decision-makers; I want them to instead pay more attention to someone with better models than me. Thus, it seems to me like a fairly win-win situation to be making short-term reputational bets.
Gotcha. I’m happy to offer 600 of my reputation points vs. 200 of yours on your description of 2026–2028 not panning out. (In general, if it becomes obvious[1] that we’re racing toward ASI in the next few years, then people should probably not take me seriously anymore.)
[1] Well, so obvious that I agree, anyway; apparently it’s already obvious to some people.
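For calibration, those stakes encode a break-even probability: assuming reputation points are valued risk-neutrally, the bet is only worth offering if I put at least 75% on winning. A minimal sketch of the arithmetic (the helper name and stakes are just illustrative):

```python
# Break-even probability for a stakes-based bet: I stake `my_stake`
# reputation points against your `your_stake`. The bet is fair for me
# at the win probability p where expected gain equals expected loss:
#   p * your_stake = (1 - p) * my_stake
#   =>  p = my_stake / (my_stake + your_stake)
def break_even_probability(my_stake: float, your_stake: float) -> float:
    return my_stake / (my_stake + your_stake)

# Offering 600 vs. 200 implies I think I win with probability >= 0.75.
print(break_even_probability(600, 200))  # 0.75
```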
I’ll happily accept that bet, but maybe we could also come up with something more specific about the next 12 months?
Example: https://manifold.markets/MaxHarms/will-ai-be-recursively-self-improvi
Not that one; I would not be shocked if that market resolves Yes. I don’t have an alternative operationalization on hand; it would have to be about AI doing serious intellectual work on real problems without any human input. (My model permits AI to be very useful in assisting humans.)
Hmm, yes. I agree that there’s something about self-guiding/self-correcting on complex, lengthy, open-ended tasks where current AIs seem to be at near-zero performance.
I do expect this to improve dramatically in the next 12 months. I think the current shortfall is more a matter of limitations in the training regimes used so far than of limitations in algorithms or architectures.
Contrast this with the persistent difficulty of ARC-AGI, which seems like it might reflect an architectural weakness?
Can we bet karma?
Edit: sarcasm