What is your definition of "aligned" for an LLM with no attached memory, then?
Wouldn't it have to be something like
"The LLM outputs text that is compliant with its creator's ethical standards and intentions"?
I think it would need to be closer to "interacting with the LLM cannot result in exceptionally bad outcomes in expectation", rather than focusing on the compliance of the text output.
I think a mental model of alignment fairly common here requires context awareness, and by that definition an LLM with no attached memory couldn't be aligned.