No matter how smart you are, looking at the data is essential. Cognitive scientists have spent a long time looking at data on how humans think and behave, and can probably appreciate subtleties that would be missed by even the cleverest mathematicians (unless those mathematicians looked at the same data).
I believe Vladimir is thinking in terms of a general theory which could, say, take an arbitrary computational state-machine, interpret it as a decision-theoretic agent, and deduce the “state-machine it would want to be”, according to its “values”, where the phrases in quotes represent imprecise or even misleading designations for rigorous concepts yet to be identified. This would be a form of the long-sought “reflective decision theory” that gets talked about.
From this perspective, the coherent extrapolation of human volition is a matter of reconstructing the human state machine through first-principles physical and computational analysis of the human brain, identifying what type of agent it is, and reflectively idealizing it according to its type and its traits. (An example of type-and-traits analysis would be 1) identifying an agent as an expected-utility maximizer, which is its “type”, and 2) identifying its specific utility function, which is a “trait”. But the cognitive architecture underlying human decision-making is expected to be a lot more complicated to specify.)
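To make the type/trait distinction concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than anything proposed in this discussion: the class name, the lotteries, and the payoff numbers are all made up. The class plays the role of the "type" (expected-utility maximizer), and the utility function passed to it plays the role of the "trait".

```python
from typing import Callable, Dict

# All names and numbers below are hypothetical, chosen only for illustration.
Outcome = str
Lottery = Dict[Outcome, float]  # maps each outcome to its probability


class ExpectedUtilityMaximizer:
    """The agent's "type": it picks the action with the highest expected utility."""

    def __init__(self, utility: Callable[[Outcome], float]):
        # The agent's "trait": its specific utility function.
        self.utility = utility

    def expected_utility(self, lottery: Lottery) -> float:
        # Probability-weighted sum of the utilities of the possible outcomes.
        return sum(p * self.utility(o) for o, p in lottery.items())

    def choose(self, actions: Dict[str, Lottery]) -> str:
        # Return the name of the action whose lottery has the highest expected utility.
        return max(actions, key=lambda a: self.expected_utility(actions[a]))


# Two available actions, each leading to a lottery over outcomes.
actions = {
    "safe": {"small_prize": 1.0},
    "gamble": {"big_prize": 0.5, "nothing": 0.5},
}

# Made-up payoffs for each outcome.
payoff = {"nothing": 0.0, "small_prize": 40.0, "big_prize": 100.0}

# Same type, different traits: linear utility versus concave (square-root) utility.
risk_neutral = ExpectedUtilityMaximizer(lambda o: payoff[o])
risk_averse = ExpectedUtilityMaximizer(lambda o: payoff[o] ** 0.5)

print(risk_neutral.choose(actions))
print(risk_averse.choose(actions))
```

The two agents share a type but differ in a trait, and so choose differently when given the same options. The hard part described above, recovering the type and traits from a physical system in the first place, is not modeled here at all.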
So the paradigm really is one in which one hopes to skip over all the piecemeal ideas and empirical analysis that cognitive scientists have produced, by coming up with an analytical and extrapolative method of perfect rigor and great generality. In my opinion, people trying to develop this perfect a-priori method can still derive inspiration and knowledge from science that has already been done. But the idea is not “we can neglect existing science because our team will be smarter”; the idea is that a universal method, in the spirit of Solomonoff induction but tractable, can be identified, which will then allow the problem to be solved with a minimum of prior knowledge.
From an outside view, such a plan seems unlikely to succeed. Science moves forward by data; engineering moves forward by trying things out. This is just intuition, though. I would guess there is a reasonable amount of empirical evidence to be gained by looking at theoretical work and seeing how often it runs afoul of unexpected facts about the world (I’m embarrassingly unsure of what the answer would be here; I have added it to my list of things to try to figure out).