First, note that “the Harry Potter in JK Rowling’s head” and “the Harry Potter in the books” can be different. For novels we usually expect those differences to be relatively small, but for a case like evolution authoring a genome authoring a brain authoring values, the difference is probably much more substantial. Then there’s a degree of freedom around which thing we want to talk about, and (I claim) when we talk about “human values” we’re talking about the one embedded in the reward stream, not e.g. the thing which evolution “intended”. So that’s why we didn’t talk about authors in this post: insofar as evolution “intended to write something different”, my values are the things it actually did write, not the things it “intended”.
(Note: if you’re in the habit of thinking about symbol grounding via the teleosemantic story which is standard in philosophy—i.e. symbol-meaning grounds out in what the symbol was optimized for in the ancestral environment—then that previous paragraph may sound very confusing and/or incoherent. Roughly speaking, the standard teleosemantic story does not allow for a difference between the Harry Potter in JK Rowling’s head vs the Harry Potter in the books: insofar as the words in the books were optimized to represent the Harry Potter in JK Rowling’s head, their true semantic meaning is the Harry Potter in JK Rowling’s head, and there is no separate “Harry Potter in the books” which they represent. I view this as a shortcoming of teleosemantics, and discuss an IMO importantly better way to handle teleology (and implicitly semantics) here: rather than “a thing’s purpose is whatever it was optimized for, grounding out in evolutionary optimization”, I say roughly “a thing’s purpose is whatever the thing can be best compressed by modeling it as having been optimized for”.)
If it is a Thing, is it a thing similar to Harry Potter?
And do you think this possible-thing has zero, one, or many Authors?
Off-the-cuff take: yes it’s a thing. An awful lot of different “authors” have created symbolic representations of that particular thing. But unlike Harry Potter, that particular thing does represent some real-world systems—e.g. I’m pretty sure people have implemented the Lorenz attractor in simple analogue circuits before, and probably there are some physical systems which happen to instantiate it.
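To make that concrete, here is a minimal sketch (not from the original comment) of the Lorenz system itself, the abstract "thing" that all those different authors have symbolically represented, using the classic parameter values and a simple Euler integration for illustration:

```python
def lorenz_step(x, y, z, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One Euler step of the Lorenz equations with the classic parameters."""
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return x + dx * dt, y + dy * dt, z + dz * dt

def trajectory(steps=10000, start=(1.0, 1.0, 1.0)):
    """Integrate forward from a starting point, collecting the path."""
    x, y, z = start
    points = []
    for _ in range(steps):
        x, y, z = lorenz_step(x, y, z)
        points.append((x, y, z))
    return points

pts = trajectory()
# The trajectory neither settles to a point nor diverges:
# it wanders chaotically but stays on the bounded attractor.
assert all(abs(x) < 100 and abs(y) < 100 and abs(z) < 100 for x, y, z in pts)
```

Any physical system whose dynamics match these three equations (e.g. the analogue circuits mentioned above) instantiates the same attractor, which is the sense in which the "thing" has real-world referents in a way Harry Potter does not.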
Like… surely in epistemics you can give an agent a “cursed prior” that makes it unable to update epistemically toward the real truth via Bayesian updates alone?
Yup, anti-inductive agent.
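One especially simple instance of a “cursed prior” (a sketch, not from the original exchange, and milder than full anti-induction): a prior assigning probability 0 to the true hypothesis. By Bayes’ rule the posterior is proportional to the prior, so no amount of evidence can ever move it off zero (Cromwell’s rule):

```python
def bayes_update(prior, likelihood_h, likelihood_not_h):
    """Posterior P(H | evidence) from prior P(H) and the two likelihoods."""
    numerator = likelihood_h * prior
    denominator = numerator + likelihood_not_h * (1.0 - prior)
    return numerator / denominator

p = 0.0  # cursed prior: the true hypothesis H gets zero credence
for _ in range(1000):
    # Each observation strongly favours H (likelihood ratio 99:1) ...
    p = bayes_update(p, likelihood_h=0.99, likelihood_not_h=0.01)
# ... yet the posterior never moves.
assert p == 0.0
```

An anti-inductive agent is worse still: rather than merely failing to move toward the truth, its updates systematically move it away.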