You guys will probably find this Slate Star Codex post interesting:
https://slatestarcodex.com/2017/09/05/book-review-surfing-uncertainty/
Scott summarizes the Predictive Processing theory, explains it in a very accessible way (no math required), and uses it to explain a whole bunch of mental phenomena (attention, imagination, motor behavior, autism, schizophrenia, etc.)
Can someone ELI5/TLDR this paper for me and explain it in a way that's more accessible to a non-technical person?
- How does backprop work if the information can’t flow backwards?
- In Scott's post, he says that when lower-level sense data contradicts high-level predictions, the high-level layers can override the lower-level predictions without you noticing it. But if the low-level sense data has high confidence/precision, the higher levels notice it and you experience "surprise". Which of those is equivalent to the backprop error? Is it the low-level predictions being overridden, the high-level layers noticing the surprise, or something else, like the connections between neurons changing so the network learns from the error?
TLDR for this paper: there is a separate set of 'error' neurons that communicate backwards, and their values converge on the appropriate backpropagation error terms.
A large error at the top levels corresponds to ‘surprise’, while a large error at the lower levels corresponds more to the ‘override’.
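Since the question was where the backprop error term actually lives: here is a minimal numerical sketch of that correspondence, assuming a tiny 4 -> 5 -> 3 network with tanh units and hand-picked step sizes (none of these specifics come from the paper, and the target is deliberately chosen near the network's output so the errors stay small). It clamps the input and the target, relaxes the hidden "value" nodes, and then compares the settled error-neuron activities with the ordinary backprop deltas.

```python
# Minimal predictive-coding sketch (my own toy, not the paper's code).
# Assumptions: a 4 -> 5 -> 3 network, tanh units, squared prediction errors,
# and step sizes / iteration counts chosen only to make the demo converge.
import numpy as np

rng = np.random.default_rng(0)
f = np.tanh

def df(a):
    # derivative of tanh
    return 1.0 - np.tanh(a) ** 2

W0 = rng.normal(0.0, 0.5, (5, 4))             # input  -> hidden weights
W1 = rng.normal(0.0, 0.5, (3, 5))             # hidden -> output weights
x0 = rng.normal(size=4)                       # input (clamped)

# Ordinary feed-forward pass and backprop deltas, for comparison.
a1 = W0 @ f(x0)                               # hidden pre-activations
a2 = W1 @ f(a1)                               # network output
y  = a2 + 0.05 * rng.normal(size=3)           # target near the output, so errors stay small
d2 = y - a2                                   # backprop delta at the output
d1 = df(a1) * (W1.T @ d2)                     # backprop delta at the hidden layer

# Predictive coding: clamp the output nodes to the target and relax the hidden
# value nodes by gradient descent on the summed squared prediction errors.
x1, x2 = a1.copy(), y.copy()
for _ in range(500):
    e1 = x1 - W0 @ f(x0)                      # hidden-layer error neurons
    e2 = x2 - W1 @ f(x1)                      # output-layer error neurons
    x1 += 0.1 * (-e1 + df(x1) * (W1.T @ e2))  # only x1 is free to move

e1 = x1 - W0 @ f(x0)                          # recompute errors at the fixed point
e2 = x2 - W1 @ f(x1)
print("backprop deltas :", d1, d2)
print("PC error neurons:", e1, e2)            # should roughly match the deltas
# The weight updates would then be purely local:
#   dW1 proportional to np.outer(e2, f(x1)),  dW0 proportional to np.outer(e1, f(x0))
```

If I've set this up right, the two printouts should roughly agree, and the agreement gets tighter the smaller the output error is, which is the sense in which the error neurons "converge on" the backprop terms without any information having to flow backwards through the forward weights themselves.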