Relatedly, humans are very extensively optimized to predictively model their visual environment. But have you ever, even once in your life, thought anything remotely like “I really like being able to predict the near-future content of my visual field. I should just sit in a dark room to maximize my visual cortex’s predictive accuracy.”?
I added the following to the relevant section:
On reflection, the above discussion overclaims a bit with regard to humans. One complication is that the brain uses internal functions of its own activity as inputs to some of its reward functions, and some of those functions may correspond or correlate with something like “visual environment predictability”. Additionally, humans run an online reinforcement learning process, and human credit assignment isn’t perfect. If periods of low visual predictability correlate with negative reward in the near future, the human may begin to intrinsically dislike being in unpredictable visual environments.
However, I still think it’s rare for people’s values to assign much weight to their long-run visual predictive accuracy, and I think this is evidence against the hypothesis that a system trained to make lots of correct predictions will thereby intrinsically value making lots of correct predictions.
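To make the credit-assignment point concrete, here is a toy sketch of the dynamic I have in mind (my own construction, not anything from the post; the states, reward probabilities, and learning rate are made up for illustration): a per-state value estimate that is nudged toward observed reward ends up assigning negative value to the “unpredictable” state purely because negative reward co-occurs with it more often, even though predictability is never an explicit objective.

```python
import random

# Toy illustration only: two abstract states and a running value estimate per
# state. Negative reward merely co-occurs with the "unpredictable" state, yet
# that state ends up with a clearly negative learned value.
random.seed(0)
ALPHA = 0.1  # learning rate for the value update
values = {"predictable": 0.0, "unpredictable": 0.0}

for _ in range(5000):
    state = random.choice(["predictable", "unpredictable"])
    # Negative reward shows up 70% of the time in the unpredictable state,
    # and never in the predictable one.
    reward = -1.0 if (state == "unpredictable" and random.random() < 0.7) else 0.0
    # Move the state's value estimate a small step toward the observed reward.
    values[state] += ALPHA * (reward - values[state])

print(values)  # roughly {'predictable': 0.0, 'unpredictable': -0.7}
```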
GPT-4 may beat most humans on a variety of challenging exams (page 5 of the GPT-4 paper), but still can’t reliably count the number of words in a sentence.
One small question: should we even think that the number of words is an objective property of a linguistic system (at least in some cases)? It seems to me that there are grounds to doubt that, based on how languages work. Regardless, it still fails to predict our answers, I suppose.
It’s pretty easily definable in English, at least in special cases, and my understanding is that GPT-4 fails in those cases. (I suppose you know this.)
OK, I say this because, from a semantic perspective, it’s not obvious to me that there has to be a natural sense of wordhood. ‘Words’ are often composed of different units of meaning, the composition doesn’t have to preserve the exact original meanings unaltered, and there are many phrases with fixed meanings that can’t be derived from a literal analysis of the meanings of those ‘words’.
It might be arbitrary which things count as words and which don’t, but if you say that it can be “easily defined”, I believe you; I don’t really know myself.
Yeah, I guess I think words are the things with spaces between them. I get that this isn’t very linguistically deep, and there are edge cases (e.g. hyphenated things, initialisms), but there are sentences that have an unambiguous number of words.
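For what it’s worth, here is the space-splitting count I have in mind as a minimal sketch (the example sentences are just illustrative):

```python
# A "word" here is any maximal run of non-space characters -- the
# space-delimited notion of wordhood from the comment above, not a
# linguistically rigorous one.
def count_words(sentence: str) -> int:
    return len(sentence.split())

print(count_words("The quick brown fox jumps over the lazy dog"))  # 9
# Edge cases the simple rule glosses over:
print(count_words("a state-of-the-art model"))  # 3: the hyphenated compound counts once
print(count_words("e.g. NASA launched it"))     # 4: the abbreviation counts once
```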
In particular it seems very plausible that I would respond by actively seeking out a predictable dark room if I were confronted with wildly out-of-distribution visual inputs, even if I’d never displayed anything like a preference for predictability of my visual inputs up until then.
When I had a stroke and was confronted with wildly out-of-distribution visual inputs, one of the first things they did was put me in a dark, predictable room. It was a huge relief, and apparently standard in these kinds of cases. I’m better now.
n=1, but I’ve actually thought this before.