A good starting point. I’m reminded of an old Kaj Sotala post (which later inspired a somewhat similar post of mine) about trying to ensure that the AI has human-like concepts. If the AI’s concepts are inhuman, it will generalize in an inhuman way, so something like teaching a policy through demonstrations might not work.
But of course, having human-like concepts is tricky and beyond the scope of vanilla IRL.