This is the sort of problem Dennett’s Consciousness Explained addresses. I wish I could summarize it here, but I don’t remember it well enough.
It uses the heterophenomenological method, which means you take a dataset of earnest utterances like “the shadow appears darker than the rest of the image” and “B appears brighter than A”, and come up with a model of perception/cognition to explain the utterances. In practice, as you point out, homunculus models won’t explain the data. Instead, the model will say that different cognitive faculties have access to different pieces of information at different times.