It’s a pretty cool idea, and I think it could make a fun content-discovery (article/music/video/whatever) pseudo-social-media application: you follow some number of people, have some unknown number of followers, and get feedback on how many of your followers liked the things you passed on, but no further information than that.
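A minimal sketch of the data model I'm imagining, purely illustrative (the names and the random "like" stand-in are my own assumptions, not anything from the original idea):

```python
import random
from dataclasses import dataclass, field

@dataclass
class User:
    """A user chooses whom to follow; their own follower list is hidden from them."""
    name: str
    following: set = field(default_factory=set)
    followers: set = field(default_factory=set)   # tracked by the system, never shown to the user

def pass_on(network: dict, sender: str, item: str) -> int:
    """Show an item to the sender's followers and return only the aggregate
    like count -- no identities, no per-person detail."""
    # Stand-in for real feedback: each follower likes the item with some probability.
    return sum(random.random() < 0.5 for _ in network[sender].followers)

# Tiny usage example
alice, bob, carol = User("alice"), User("bob"), User("carol")
network = {u.name: u for u in (alice, bob, carol)}
for follower in (bob, carol):
    follower.following.add("alice")
    alice.followers.add(follower.name)
print(pass_on(network, "alice", "track-of-the-day"))  # prints 0, 1, or 2
```

The only thing the sender ever sees is that returned integer, which is the "no further information than that" property.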
I don’t know whether I’d say it’s super alignment-relevant, but also this isn’t the Alignment Forum and people are allowed to have interests that are not AI alignment, and even to share those interests.
Nice! I actually had a loose version of this idea in the back of my mind for a while: a network of people connected like this, signalling their track of the day to each other, which could be genuinely fun. It also seems like a feasible use case. The underlying reasoning (at least for me) is that I would be more open to adopting an idea from a person with whom I feel a shared sense of collectivity than from an algorithm that thinks it knows me. Intrinsically, I want such an algorithm to be wrong, for the sake of my own autonomy :)
The way I see it, the relevance for alignment is to ask: what do we actually mean when we say that two intelligent agents are aligned? Are you and I aligned if we would make the same decision in a trolley problem? If we motivate our decisions in the same way? If we simply don’t kill each other? None of these are meaningful indicators of two people being aligned, let alone of a human and an AI. And with unreliable indicators, will we ever succeed in solving the issue? I’d say two agents are aligned when the decision that is most rewarding for one agent also benefits the other. Generalizing and scaling that notion to many situations and many agents/people requires a ‘theory of mind’ mechanism, as well as a way to keep certain properties invariant under scaling and translation in complex networks. This is really a physicist’s way of thinking about the problem, and I am only slowly picking up the language that others in the AI/alignment field use.
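A rough way to make that condition concrete (my own notation, just to pin down the intuition, not anything standard): if $R_A$ is agent $A$'s reward over possible actions and $U_B$ is agent $B$'s utility, then

$$a^* = \operatorname*{arg\,max}_a R_A(a), \qquad A \text{ aligned with } B \iff U_B(a^*) \ge U_B(a_0),$$

where $a_0$ is some "do nothing" baseline. The symmetric version, each agent's best action also benefiting the other, would be the two-agent case, and the scaling question is whether this property survives when $B$ is replaced by a whole network of agents.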