😏[1]
[1] Although I don’t expect the analogous human alignment story to go OK as written, even conditional on this story going through: we want a range of values from the AI, not just a single one. “Satisfy humans” would probably be bad as the only human-related shard.
Reminder to self: Always read the footnotes.