[deleted] comments on The Waluigi Effect (mega-post)

[deleted] 3 Mar 2023 13:49 UTC
14 points
4
- Gerald Monroe 4 Mar 2023 21:29 UTC
  −1 points
  −1
  It’s impressive most decade-old Lesswrongian AI philosophy is only now starting to show cracks, but now that they are
  This is causing me to wonder if the often cited critical AGI problems:
  (1) optimizer agents that wreck everything to make a number go up
  (2) inner/outer alignment/mesa optimizers
  (3) deception
  are all just false: they won’t happen, and the real problems are much weirder and different (but still dangerous).
  This makes ‘align AI first’ impossible.
  - [deleted] 4 Mar 2023 22:31 UTC
    4 points
    3
    - Gerald Monroe 5 Mar 2023 5:38 UTC
      −1 points
      −3
      The statement you are responding to is: ‘align AI first’ impossible.
      Emphasis added. The reality is that larger and more powerful systems may fail in ways that no theory humans can craft with pre-AGI technology will predict. At all. So the only way to find out how they fail will be to build them, take precautions to limit the damage when they fail, and see what happens.
      For example, we did not develop computational fluid dynamics (CFD) until long after the airplane. Working out by theory alone how to build a wing, rather than building an actual wing and testing it in a wind tunnel, wasn’t going to happen.
      Similarly, we could not have impeded the development of the airplane for fear that it might crash or be used to do harm. CFD was developed through large international collaborations, so its development was itself accelerated by the existence of the airplane (notably jet airliners flying between the various campuses involved).
      - [deleted] 5 Mar 2023 7:53 UTC
        1 point
        0