> a close coupling of representation syntax and semantics is necessary for a discovery program to prosper in a given domain
This is a really interesting point; it seems related to the idea that to be an expert in something, you need a vocabulary close to the domain in question.
It also immediately raises the question of what the expert vocabulary of vocabulary formation/acquisition is, i.e. the domain of learning.
> a close coupling of representation syntax and semantics is necessary for a discovery program to prosper in a given domain
>
> This is a really interesting point; it seems related to the idea that to be an expert in something, you need a vocabulary close to the domain in question.
It doesn’t seem that interesting to me: it’s just a restatement that “data compression = data prediction”. Having a vocabulary “close to the domain” simply means that common concepts are compactly expressed. Once you’ve maximally compressed a domain, you have discovered all its regularities, and simply outputting a short random string will decompress into something useful.
How do you find which concepts are common and how do you represent them? Aye, there’s the rub.
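To make the compression-as-prediction equivalence concrete, here is a minimal sketch (mine, not from the paper or the thread; the toy data and probabilities are made up): under ideal arithmetic coding, a model that assigns probability p to the symbol that actually occurs spends -log2 p bits on it, so a vocabulary in which common things are cheap is literally a shorter code.

```python
import math

def code_length_bits(model_probs, data):
    """Ideal code length in bits that an arithmetic coder would achieve
    for `data` under a predictive model with P(symbol) = model_probs[symbol].
    Better prediction of what actually occurs means fewer bits:
    compression and prediction are the same problem."""
    return sum(-math.log2(model_probs[s]) for s in data)

data = "aababaaabaa"  # 'a' is common, 'b' is rare

uniform = {"a": 0.5, "b": 0.5}  # a model that has learned nothing
tuned = {"a": 0.8, "b": 0.2}    # a model whose "vocabulary" fits the domain

print(code_length_bits(uniform, data))  # 11.0 bits
print(code_length_bits(tuned, data))    # ~9.5 bits: regularity exploited
```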
> It also immediately raises the question of what the expert vocabulary of vocabulary formation/acquisition is, i.e. the domain of learning.
So my guess would be that the expert vocabulary of vocabulary formation is the vocabulary of data compression. I don’t know how to make any use of that, though, because the No Free Lunch Theorems seem to say that there’s no general algorithm that is the best across all domains, and so there’s no algorithmic way to find which is the best compressor for this universe.
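A small counting demonstration of the obstacle (my illustration, not from the thread; it is the standard pigeonhole argument against a universal compressor, a relative of the No Free Lunch results invoked above): any lossless compressor that shrinks some inputs must expand others, because there aren’t enough shorter strings to go around.

```python
# Pigeonhole: there are 2**n distinct n-bit strings, but only 2**n - 1
# binary strings of length strictly less than n (lengths 0 .. n-1).
# So no lossless compressor can map every n-bit input to a shorter output;
# a compressor that wins on one domain must lose on another.
n = 16
n_inputs = 2 ** n
n_shorter_outputs = sum(2 ** k for k in range(n))  # = 2**n - 1
print(n_inputs, n_shorter_outputs)  # 65536 65535: some input can't shrink
```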
> This is a really interesting point; it seems related to the idea that to be an expert in something, you need a vocabulary close to the domain in question.
I’m not so sure about this. I am pretty good at understanding visual reality, and I have some words to describe various objects, but my vocabulary is nowhere near as rich as my understanding is (of course, I’m only claiming to be an average member of a race of fantastically powerful interpreters of visual reality).
Let me give you an example. Say you had two pictures of faces of two different people, but the people look alike and the pictures were taken under similar conditions. Now a blind person, who happens to be a Matlab hacker, asks you to explain how you know the pictures are of different people, presumably by making reference to the pixel statistics of certain image regions (which the blind person can verify with Matlab). Is your face recognition vocabulary up to this challenge?
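To sketch what answering the blind hacker would even look like (a hypothetical illustration in Python rather than Matlab; the file names, the crop box, and the choice of statistics are all my inventions), an adequate explicit vocabulary would have to bottom out in verifiable region statistics like these, and introspection hands us nothing of the sort:

```python
import numpy as np
from PIL import Image

def region_stats(path, box):
    """Summary statistics of one grayscale image region: the explicit,
    machine-checkable kind of description the blind hacker could verify."""
    img = np.asarray(Image.open(path).convert("L"), dtype=float)
    top, left, bottom, right = box
    region = img[top:bottom, left:right]
    hist, _ = np.histogram(region, bins=8, range=(0, 255))
    return region.mean(), region.std(), hist

# Hypothetical inputs: two look-alike faces, same crop around the nose bridge.
a_mean, a_std, a_hist = region_stats("face_a.png", (60, 40, 100, 80))
b_mean, b_std, b_hist = region_stats("face_b.png", (60, 40, 100, 80))
print(abs(a_mean - b_mean), abs(a_std - b_std), np.abs(a_hist - b_hist).sum())
```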
I think “vocabulary” in this sense refers to the vocabulary of the bits doing the actual processing. Humans don’t have access to the “vocabulary” of their fusiform gyri, only to the results of their computations.