When you talk about “other sources of risk from misalignment,” these sound like milder, easier-to-tackle versions of the assumptions you’ve listed? Your assumptions sound like they focus on the worst-case scenario. If you can solve the harder version, then I would imagine the easier version would also be solved, no?
Yeah, if you handle scheming, you solve all my safety problems, but not the final bullet point, the “models fail to live up to their potential” problems.
(Though the “fail to live up to potential” problems are probably mostly indirect; see here.)