I'm on the fence here, and I wonder: what specifically pushed you toward this extremely strong update?
Good question. The major reason I updated so strongly is that once I realized deceptive alignment was much less likely than I had thought, I needed to sharply upweight the possibility of alignment by default, since deceptive alignment was my key crux for alignment not being solved by default.
The other update is that as AI capabilities increase, we can increasingly point to natural abstractions/categories by default, which neutralizes the pointers problem: we can point our AI at the goal we actually want.
I have now edited the post to express somewhat less confidence in alignment by default succeeding.