To what extent are humans by themselves evidence of GI alignment, though? A human can acquire values that disagree with those of the people who taught them, just by gaining new experiences or knowledge, to the point of desiring the complete opposite of what their peers want (e.g. human progress vs. human extinction). Doesn’t that mean that humans are not robustly aligned?