A few points where clarification would help, if you don’t mind (feel free to skip some):
What are the capabilities of the “generative model”? In general, the term seems to be used in various ways, e.g.:
Sampling from the learned distribution (analogous to GPT-3 at temp=1)
Evaluating the probability of a given point
Producing the predicted most likely point (analogous to GPT-3 at temp=0)
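The three query modes above can be made concrete with a toy example. This is a hypothetical sketch, not anything from the original discussion: a “generative model” that is just a 1-D Gaussian fit to data, which supports all three operations.

```python
# Toy "generative model": a 1-D Gaussian fit to data, illustrating the
# three query modes (sample, evaluate probability, most likely point).
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=1000)

# "Train" the model: fit the mean and standard deviation to the data.
mu, sigma = data.mean(), data.std()

# 1. Sampling from the learned distribution (the temp=1 analogue).
sample = rng.normal(mu, sigma)

# 2. Evaluating the probability density of a given point.
def density(x):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# 3. Producing the most likely point (the temp=0 analogue).
#    For a Gaussian the mode is the mean.
mode = mu

print(sample, density(5.0), mode)
```

Note that a model class can support some of these operations and not others (e.g. GANs sample easily but don’t give densities), which is part of why the term gets used in different ways.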
Is what we’re predicting the input at the next time step? (Sometimes “predict” can be used to mean filling in missing information, but that doesn’t seem to make sense in this context.) Also, I’m not sure what I mean by “time step” here.
The “input signal” here is coming from whatever is wired into the cortex, right? Does it work to think of this as a vector in ℝⁿ?
Is the contextual information just whatever is the current input, plus whatever signals are still bouncing around?
Also, the capability described may be a bit too broad, since there are some predictions that the cortex seems to be bad at. Consider predicting the sum of two 8-digit integers. Digital computers compute that easily, so it’s fundamentally an easy problem, but for humans to do it requires effort. Yet for some other predictions, the cortex easily outperforms today’s digital computers. What characterizes the prediction problems that the cortex does well?
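To make the contrast concrete (this is just an illustration of the point above, not part of the original comment): the 8-digit-addition task that takes a human pencil-and-paper effort is a single machine instruction for a computer.

```python
# Adding two 8-digit integers: effortful for a human, trivial for a computer.
a, b = 12345678, 87654321
print(a + b)  # 99999999
```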
Think of a generative model as something like “This thing I’m looking at is a red bouncy ball”. Just looking at it you can guess pretty well how much it would weigh if you lifted it, how it would feel if you rubbed it, how it would smell if you smelled it, and how it would bounce if you threw it. Lots of ways to query these models! Powerful stuff!
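One way to picture “lots of ways to query these models” is a single latent hypothesis (“red bouncy ball”) that supports many distinct predictions. The sketch below is entirely hypothetical, with made-up toy values; the point is only the shape of the interface, one model, many queries.

```python
# Hypothetical sketch: one "this is a red bouncy ball" hypothesis
# supports many different queries. All values are invented toy numbers.
class BouncyBallModel:
    def predicted_weight_g(self):
        return 50  # assumed: light, hollow rubber

    def predicted_feel(self):
        return "smooth, slightly tacky rubber"

    def predicted_bounce_height_m(self, drop_height_m):
        # Assume a coefficient of restitution around 0.8 for rubber;
        # rebound height scales with its square.
        return 0.8 ** 2 * drop_height_m

model = BouncyBallModel()
print(model.predicted_weight_g(), model.predicted_bounce_height_m(1.0))
```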
some predictions that the cortex seems to be bad at
If a model is trained to minimize a loss function L, that doesn’t mean that, after training, it winds up with a very low value of L in every possible case. Right? I’m confused about why you’re confused. :-P
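The point that minimizing L in training doesn’t mean L is low on every case can be shown in a few lines. A hypothetical illustration: a least-squares fit drives the *average* loss down, yet an unusual data point still incurs a large loss after training.

```python
# Minimizing average loss does not mean low loss on every case.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=200)
y = 2 * x + rng.normal(scale=0.1, size=200)
y[0] += 5.0  # one outlier the trained model will not fit well

# Closed-form least-squares fit of y ≈ w * x (minimizes mean squared loss).
w = (x @ y) / (x @ x)

losses = (w * x - y) ** 2
print(losses.mean())  # small: training succeeded on average
print(losses[0])      # large: the hard case still has high loss
```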
Thanks for your reply!