Like, maybe the best video for polarizing a given person depends on their viewing history, and the algorithm could learn that. If you follow that line of reasoning, the system starts building better and better models of human behavior and how to influence it, without having to “jump out of the system,” as you say.
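A minimal sketch of the kind of learning I mean, framed as an epsilon-greedy contextual bandit; everything here (the video names, `history_bucket`, the `engagement` signal) is a hypothetical illustration, not the actual recommender:

```python
import random
from collections import defaultdict

VIDEOS = ["calm_explainer", "mild_take", "outrage_clip"]  # hypothetical catalog

# Running average engagement for each (history-bucket, video) pair.
value = defaultdict(float)
count = defaultdict(int)

def history_bucket(watch_history):
    """Crudely summarize a viewer's history: how much charged content they already watch."""
    charged = sum(1 for v in watch_history if v == "outrage_clip")
    return "high_charge" if charged * 2 > len(watch_history) else "low_charge"

def recommend(watch_history, epsilon=0.1):
    """Epsilon-greedy: usually pick the video with the best learned value for this kind of viewer."""
    bucket = history_bucket(watch_history)
    if random.random() < epsilon:
        return random.choice(VIDEOS)
    return max(VIDEOS, key=lambda v: value[(bucket, v)])

def update(watch_history, video, engagement):
    """Fold the observed engagement (the optimized proxy) into the running average."""
    key = (history_bucket(watch_history), video)
    count[key] += 1
    value[key] += (engagement - value[key]) / count[key]
```

Nothing in this loop requires the system to model itself; it just learns, per kind of viewer, which item maximizes whatever proxy it is given.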
Makes sense.
...there’s a lot of content on YouTube about YouTube, so it could become “self-aware” in the sense of understanding the system in which it is embedded.
I think it might be useful to distinguish between being aware of oneself in a literal sense, and the term “self-aware” as it is used colloquially / the connotations the term sneaks in.
Some animals, if put in front of a mirror, will understand that there is some kind of moving animalish thing in front of them. The ones that pass the mirror test are the ones that realize that moving animalish thing is them.
There is a lot of content on YouTube about YouTube, so the system will likely become aware of itself in a literal sense. That’s not the same as our colloquial notion of “self-awareness”.
IMO, it’d be useful to understand the circumstances under which the first one leads to the second one.
My guess is that it works something like this. To help them survive and reproduce, evolution has endowed most animals with an inborn sense of self, which serves self-preservation. (This sense of self isn’t necessary for cognition—if you trip on psychedelics and experience ego death, your brain can still think. Occasionally people hurt themselves in this state because their self-preservation instincts aren’t functioning as normal.)
Colloquial “self-awareness” occurs when an animal looking in the mirror realizes that the thing in the mirror and its inborn sense of self are actually the same thing. Similar to Benjamin Franklin realizing that lightning and electricity are actually the same thing.
If this story is correct, we need not worry much about the average ML system developing “self-awareness” in the colloquial sense, since we aren’t planning to endow it with an inborn sense of self.
That doesn’t necessarily mean I think Predict-O-Matic is totally safe. See this post I wrote, for instance.