Second, it is possible to robustly verify the outputs of a superhuman intelligence without possessing superhuman intelligence yourself.
Why do you believe that a superhuman intelligence wouldn’t be able to deceive you by producing outputs that look correct instead of outputs that are correct?
I don’t have the specifics, but this is a natural tendency of many problems: verifying a solution is easier than producing it. We might also build systems that require outputs to be mathematically verified, or that reject solutions whose consequences are hard to understand.
Davidad’s plan involves one plausible way of doing that.
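The "verification is easier than generation" asymmetry above can be sketched with a toy example. Here is a minimal illustration using subset-sum (the problem choice and all names are hypothetical, purely to show the gap): checking a proposed certificate takes time linear in its size, while finding one by brute force takes exponential time in the worst case.

```python
from itertools import combinations

def verify(nums, target, certificate):
    """Checking a proposed solution is cheap: roughly O(len(certificate))."""
    return all(x in nums for x in certificate) and sum(certificate) == target

def solve(nums, target):
    """Finding a solution is expensive: brute force tries up to 2^n subsets."""
    for r in range(len(nums) + 1):
        for subset in combinations(nums, r):
            if sum(subset) == target:
                return list(subset)
    return None

nums = [3, 34, 4, 12, 5, 2]
cert = solve(nums, 9)         # exponential-time search
print(verify(nums, 9, cert))  # linear-time check
```

The hope expressed in the dialogue is analogous: even if a weaker party cannot run `solve` on hard problems, it can still run `verify` on whatever certificate a stronger party hands over, provided the system forces outputs into a checkable form.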