My impression is that LessWrong often uses "alignment with X" to mean "does what X says". But the ability to conditionally delegate seems to be a key part of alignment here. Suppose an AI is aligned with me and I tell it "do what Y says, subject to such-and-such constraints and while maintaining such-and-such goals". Then the failure of ChatGPT to be safe in OpenAI's sense is a failure of delegation.
Overall, ChatGPT's tendency to ignore previous input is at the center of its limits and problems.
It gave you exactly what you asked for. If you don't want it to do that, don't ask for it.
NB. I’m speaking of ChatGPT and its current ilk, not superpowerful genies that are dangerous to ask for anything.
It's true that this is not evidence of misalignment with the user, but it is evidence of misalignment with ChatGPT's creators.