the trivial example of a thermostat, which shows a “preference” for reality being a certain way
Doesn’t it rather have a preference for its sensors showing a certain reading? (This doesn’t lead to thermostat wireheading, because nothing the thermostat can do will alter the sensor’s own mechanism.)
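To make the point concrete, here is a minimal sketch (in Python, with hypothetical names) of a bang-bang thermostat’s decision rule: the rule is a function of the sensor reading alone, and the only actions available act on the room, never on the sensor itself.

```python
# A minimal sketch (hypothetical names) of a thermostat's decision rule.
# The rule only ever references the sensor reading; the "actual" room
# temperature never appears anywhere in the computation.

def thermostat_step(sensor_reading: float, setpoint: float = 20.0) -> str:
    """Return the action a bang-bang thermostat takes for one time step."""
    if sensor_reading < setpoint:
        return "heater_on"
    return "heater_off"

# The thermostat's only actions change the room (and hence, indirectly,
# future sensor readings); none of them can rewrite the sensor's mechanism,
# so it cannot "wirehead" by faking the reading.
print(thermostat_step(18.5))  # -> "heater_on"
print(thermostat_step(21.0))  # -> "heater_off"
```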
Really, only systems that can model a scenario where their sensors say X but the situation is actually Y could possibly have preferences that go beyond the future readings of their sensors. If you assert that a thermostat can have preferences about the territory but a human can’t, you are twisting language to an unhelpful degree.