Good question. The major reason I updated so strongly is that once I realized deceptive alignment was much more unlikely than I had thought, I needed to substantially upweight the possibility of alignment by default, since deceptive alignment was my key reason for believing alignment would not be solved by default.
The other update is that as AI capabilities increase, we can point to natural abstractions/categories more readily by default, which neutralizes the pointers problem: we can point our AI at the goal we actually want.
I have now edited the post to be somewhat less confident in the probability that alignment by default succeeds.