In some sense this already happens: as we train AI on more and more human-generated text, it gains both more capability and more alignment.
Yes, it does become easier to control and communicate with, but it does not become harder to make malicious. I'm not sure an AI scheme that can't be trivially flipped to evil is possible, but I would like to try to find one.