SteveG comments on Recent AI safety work

SteveG 1 Jan 2015 22:23 UTC
0 points
In addition to determining whether an action would be approved using a priori reasoning, an approval-directed AI could also reference a large database of past actions which have either been approved or disapproved.

Alternatively, before ever making a real-world decision, the approval-directed AI could generate example scenarios and, many thousands of times over, propose actions to people deemed effective moral reasoners. Their responses would greatly assist the system in constructing a model of whether an action is approvable, and by whom.

A large amount of approval data could be created fairly readily this way, and the AI could then train on it.
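
To make the "approvable, and by whom" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the rater names, the action labels, the record format, and the choice of a simple smoothed frequency estimate are all hypothetical, and a real system would need far richer representations of scenarios and actions.

```python
from collections import defaultdict

# Hypothetical approval records: (rater, action, approved).
# Names and labels are invented for illustration only.
RECORDS = [
    ("alice", "share_data", False),
    ("alice", "ask_first", True),
    ("bob", "share_data", True),
    ("bob", "ask_first", True),
    ("alice", "share_data", False),
]


def fit_approval_model(records):
    """Count approvals and total judgments for each (rater, action) pair."""
    counts = defaultdict(lambda: [0, 0])  # [approved, total]
    for rater, action, approved in records:
        counts[(rater, action)][0] += int(approved)
        counts[(rater, action)][1] += 1
    return dict(counts)


def approval_probability(model, rater, action):
    """Laplace-smoothed estimate that `rater` approves `action`.

    Pairs never seen in the data fall back to 0.5 (no information).
    """
    approved, total = model.get((rater, action), (0, 0))
    return (approved + 1) / (total + 2)


model = fit_approval_model(RECORDS)
print(approval_probability(model, "alice", "share_data"))
print(approval_probability(model, "bob", "share_data"))
```

The per-rater keying is the point of the sketch: the same action can score differently depending on whose approval is being modeled, and an unseen rater or action yields an uninformative 0.5 instead of a confident guess.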