shared a review in some private channels, might as well share it here:
The book positions itself as a middle ground between optimistic capabilities researchers striding blithely into near-certain catastrophe and pessimistic alignment researchers too concerned with dramatic abstract doom scenarios to address more realistic harms that can still be averted. When addressing the latter, Chapman constructs a hypothetical “AI goes FOOM and unleashes nanomachine death” scenario and argues that while alignment researchers are correct that we have no capacity to prevent this awful scenario, it relies on many leaps (very fast bootstrapped self-optimization, solving physics in seconds, nanomachines) that provoke skepticism. I’m inclined to agree: I know that the common line is that “nanomachines are just one example of how TAI can accomplish its goals, FOOM doom scenarios still work if you substitute it with a more plausible technology”, but I’m not sure that they do! “Superdangerous virus synthesis” is the best substitute I’ve heard, but I’m skeptical of even that causing total human extinction (tho the mass suffering that it’d cause is grounds enough for extreme concern).
Chapman also suggests a doom scenario based on a mild extrapolation of current capabilities, where generative models optimized for engagement provoke humans into political activism that leads to world war. Preventing this scenario is a more tractable problem than preventing the former. Instead of crafting complex game-theoretic theories, we can disincentivize actors at the forefront of capabilities research from developing and deploying general models. Chapman suggests strengthening data collection regulation and framing generative content as a consumer hazard that deserves both legal and social penalty, like putting carcinogens or slave-labor-derived substances in products.
I think that he’s too quick to dismiss alignment theory work as overly abstract and unconcerned with plausibility. This dismissal is rhetorically useful in selling AI safety to readers hesitant to accept extreme pessimism based on heavily deductive arguments, but it doesn’t win points with me because I’m not a fan of strategic distortion of fact. On the other hand, I really like that he proposes an overlooked strategy for addressing AI risk that not only addresses current harms, but is accessible to people with skills disjoint from those required for theoretical alignment work. Consumer protection is a well-established field with a number of historical wins, and adopting its techniques sounds promising.