I think the second robot you’re talking about isn’t the candidate for the AGI-could-kill-us-all level of alignment concern. It’s more like a self-driving car that could hit someone due to inadequate testing.
I’m not sure, though, how many answers to our questions you envisage the agent you’re describing generating from first principles. That’s the nub here, because both the agents I tried to describe above fit the bill of coffee-fetching, but with clearly varying potential for world-ending generalisation.