Yet every “AI” we’ve built so far has amplified traits of humanity that we consider flaws, as well as those we consider virtues. Do we expect that this would magically stop being the case if it passed a certain threshold?
Ah, what? (I’m reacting to the “every” qualifier here.)
I’d say it comes down to founder effects.
I wouldn’t necessarily call it ‘using AI alignment solutions for human alignment’ though.
Perhaps a better starting point would be: how to discern alignment. And, are there predictable betrayals? Can that situation be improved?
human leaders
That wasn’t the first place I thought of.
How do you tell if a source is trustworthy? (Of information, or a physical good.)
How do you tell if it’s a good idea for someone to join your team?
Overall, human alignment sounds broad, and interesting.
There are also some open-source questions that seem relevant. Less specifically, I read on Twitter that:
Elon Musk wants to release the Twitter algorithm, and to develop encrypted chat for it, or something. (I read the tweet.)
I think the person in charge of Mastodon (which is already open source) said something about working on encrypted chat as well. (I read the blog post.)
Somehow I feel it’s more likely that Mastodon will end up achieving both conditions than Twitter will.
Two announcements, but one doesn’t inspire much confidence. (How often does a closed-source project go open source? Not partially, but fully. I see this as a somewhat general issue (the base rate of open-sourcing), not just one of specific context, or ‘the laws of probability say p(a and b) ≤ min(p(a), p(b)) (if a and b are distinct events), and here p(a) and p(a’) are reasonably similar’.)
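To make the conjunction point concrete, here is a minimal sketch. The probabilities below are made-up placeholders for illustration, not estimates of the actual projects; the point is only the structural asymmetry: Twitter must satisfy two conditions, while Mastodon (already open source) has one remaining.

```python
# Illustrative placeholders only -- these numbers are assumptions, not estimates.
p_open_source = 0.2     # hypothetical chance Twitter fully open-sources its algorithm
p_encrypted_chat = 0.5  # hypothetical chance Twitter ships encrypted chat

# If the two events were independent, the joint probability would be the product:
p_both_independent = p_open_source * p_encrypted_chat

# Regardless of dependence, a conjunction can never exceed either conjunct:
upper_bound = min(p_open_source, p_encrypted_chat)

print(p_both_independent)  # 0.1
print(upper_bound)         # 0.2

# Mastodon is already open source, so its conjunction collapses to a single
# remaining condition (encrypted chat) -- one reason it can look more likely.
```

This is just the rule p(a and b) ≤ min(p(a), p(b)) applied to the two announcements.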