I feel like if you give the AI enough freedom for its intelligence to be helpful, you'd run into the same pitfalls as having the AI pick a goal you'd approve of. It's also not clear exactly which decisions you'd oversee. What if the AI convinces you that its actions are fine because you'd approve of its method of choosing them, and that its method is fine because you'd approve of the individual actions?