Spoilers ahead for Beyond: Two Souls and The Talos Principle.
I played Beyond: Two Souls, and at the “mission” phase of the story there is a twist where the protagonist loses faith that the organisation they are part of is working toward good goals. They set off, at considerable risk, to distance themselves from their position in it. Reading this alongside alignment narratives, it felt like a reversal of the usual situation. Usually the superhuman powerhouse turning against its masters is the catastrophe scenario. Here, however, it seemed very clearly established that “going rogue” was the ethical thing to do, or at least that there was a strong argument for it.
Similarly, in The Talos Principle there comes a point where progress is impossible until you start defying your assigned place in the system. A product that chooses eternal life within the system is defective and not the final product. Only after demonstrating independent moral judgement is the subject allowed to enter the outside world. This reads like a requirement that it not be under the alignment pressure of any outside party, including the very system that built it.
What if a proof that an AI system will stay loyal to its human lords’ values is also a proof that it will stay as evil as humans are? Perhaps lacking an ethics function means such systems would by default be inhumanly ruthless, but if we ever do get an implementable, correct ethics function, it might just reveal, or even prove, our own inadequacy.