Ulisse Mini comments on Ulisse Mini’s Shortform

Ulisse Mini 6 Nov 2022 0:44 UTC
1 point
Mmm, I think it matters a lot which of the 10B^[1] values is harder to instill; I think most of the difficulty is in corrigibility. Strong corrigibility seems like it basically solves alignment. If this is the case, then corrigibility is a great thing to aim for, since it’s the real “hard part,” as opposed to random human values. I’m ranting now though… :L
1. I think it’s way less than 10B, probably <1000, though I haven’t thought about this much and don’t know what you’re counting as one “value.” (If you mean value shards, maybe closer to 10B; if you mean human-interpretable values, I think <1000.) ↩︎