Jon Garcia answers Research ideas (AI Interpretability & Neurosciences) for a 2-months project

Jon Garcia 8 Jan 2023 20:55 UTC
Two months is not much time for a research project, and I don’t know what lab resources or datasets you have access to, so it’s a bit difficult to answer this.

It would be great if you could do something like build a model of human value formation based on the interactions between the hypothalamus, VTA, nucleus accumbens, vmPFC, etc. For example, how does the brain generalize its preferences from its genetically encoded heuristic value functions? Could this inform how you might design RL systems that are more robust against reward misspecification?

Again, I doubt you can get beyond a toy model in two months, but maybe you can think of something you can do related to the above.
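
To give a rough sense of what "toy model" could mean here, below is a minimal sketch, not a model of any actual brain circuit. It assumes a hard-wired reward that fires only on reaching one "innately good" state at the end of a short chain, and uses standard TD(0) learning (my choice for illustration) to show how a learned value function can spread from that innate signal to earlier, predictive states. The function name `learn_values`, the chain environment, and all parameter values are hypothetical.

```python
def learn_values(n_states=6, episodes=500, alpha=0.1, gamma=0.9):
    """Tabular TD(0) on a left-to-right chain of states.

    Assumption: the only "innate" reward is 1.0 on entering the final
    state; every other transition gives 0. The final state is terminal,
    so its own value is never updated and stays at 0.
    """
    values = [0.0] * n_states
    terminal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != terminal:
            next_state = state + 1
            # Hard-wired heuristic reward: fires only at the terminal state.
            reward = 1.0 if next_state == terminal else 0.0
            # TD(0) update: move the estimate toward reward + discounted next value.
            target = reward + gamma * values[next_state]
            values[state] += alpha * (target - values[state])
            state = next_state
    return values


if __name__ == "__main__":
    for state, value in enumerate(learn_values()):
        print(f"state {state}: learned value {value:.3f}")
```

States closer to the innately rewarded one end up with higher learned value even though they are never rewarded directly. A project along these lines might then ask what happens when the hard-wired signal is a poor proxy for what you actually want the agent to value.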