CarlShulman comments on Buck’s Shortform

CarlShulman 11 Apr 2022 1:07 UTC
LW: 14 AF: 11
Agreed, and versions of them exist in human governments trying to maintain control (where non-coordination of revolts is central). A lot of the differences are about exploiting new capabilities like copying and digital neuroscience or changing reward hookups.

In ye olde times of the early 2010s people (such as I) would formulate questions about what kind of institutional setups you’d use to get answers out of untrusted AIs (e.g., asking them separately to point out vulnerabilities in your security arrangement, having multiple AIs face fake opportunities to whistleblow on bad behavior, and using randomized, richer human evaluations to incentivize behavior on a larger scale).
- Buck 11 Apr 2022 10:18 UTC
  LW: 11 AF: 4
  Are any of these ancient discussions available anywhere?