But what stops a blue-cloud model from transitioning into a red-cloud model if the blue-cloud model is an AGI like the one hinted at on your slides (self-aware, goal-directed, highly competent)?
We expect that an aligned (blue-cloud) model would have an incentive to preserve its goals, though it would need some help from us to generalize them correctly to avoid becoming a misaligned (red-cloud) model. We talk about this in more detail in Refining the Sharp Left Turn (part 2).