The Shannon formula doesn’t define what information is; it quantifies the amount of information. People occasionally point this out as being kind of philosophically funny: we know how to measure the amount of information, but we don’t really have a good definition of what information is. Talking about what information is immediately runs into the question of what the information is about, how the information relates to the thing(s) it’s about, etc.
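(For concreteness, the quantity in question is the standard Shannon entropy, nothing specific to this post:

$$H(X) = -\sum_x p(x) \log_2 p(x)$$

It tells you how many bits it takes, on average, to communicate the value of $X$, while saying nothing about what those bits refer to out in the world.)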
Those are basically similar to the problems one runs into when talking about e.g. an AI’s objective and whether it’s “aligned with” something in the physical world. Like, this mathematical function (the objective) is supposed to talk about something out in the world, so presumably it should relate to those things in the world somehow, etc. I claim it’s basically the same problem: how do we get symbolic information/functions/math-things to reliably “point to” particular things in the world?
(This is what Yudkowsky, IIUC, would call the “pointer problem”.)
Framed as a bits-of-information problem, the difficulty is not so much getting enough bits as getting bits which are actually “about” “human values”. (Presumably that’s why my explanations seem so confusing.)
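(For reference, the closest standard formalization of bits “about” something is mutual information:

$$I(X;Y) = H(X) - H(X \mid Y)$$

i.e. how much knowing $Y$ reduces uncertainty about $X$. But that’s a purely statistical notion; it still doesn’t say what, out in the world, those bits refer to, which is the difficulty here.)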
If natural abstractions are a thing, in what sense is “make this AGI have particular effect X” trying to be about human values, if X is expressed using natural abstractions?
In that case, it’s not about human values, which is one of the very nice things the natural abstraction hypothesis buys us.