I said that “it seems that upon reflection I would embrace an extrapolation of the model-based system’s preferences as representing ‘my values’.”
OK, I didn’t notice that; I was referring more to the opening dialog. Still, “extrapolation” doesn’t seem to fit, because brain “modules” are not the same kind of thing as goals. A two-step process where you first extract “current preferences” and then “extrapolate” them is likely not how this works, so positing only that you somehow get the final preferences starting from the brains is weaker (and correspondingly better, in the absence of knowledge of how this is done).
I agree that the two-step process may very well not work. This is an extremely weak and preliminary result. There’s a lot more hacking at the edges to be done.
What are you referring to by “this” in the second sentence? I don’t think there is a good reason to posit the two-step process, so if this is what you refer to, what’s the underlying result, however weak and preliminary?
By “this” I meant the content of the OP about the three systems that contribute to choice.
OK, in that case I’m confused, since I don’t see any connection between the first and the second sentences...
Let me try again:
Two-step process = (1) Extract preferences, (2) Extrapolate preferences. This may not work, which is one reason the discovery of three valuation systems in the brain is such a weak and preliminary result for the purposes of CEV. I’m not sure it will turn out to be relevant to CEV at all.
I see, so the two-step thing acts as a precondition. Is it right that you are thinking of descriptive idealization/analysis of the human brain as a path that might lead to a definition of “current” (extracted) preferences, which is then to be corrected by “extrapolation”? If so, that would clarify for me your motivation for hoping to get anything FAI-relevant out of neuroscience: the extrapolation step would correct the fatal flaws of the extraction step.
(I think the extrapolation step (in this context) is magic that can’t work, and that instead analysis of the human brain must extract/define the right decision problem “directly”, that is, formally/automatically, without losing information during the descriptive idealization performed by humans, which any object-level study of neuroscience requires.)
Extraction + extrapolation is one possibility, though at this stage in the game it still looks incoherent to me. But sometimes things look incoherent before somebody smart comes along and makes them coherent and tractable.
Another possibility is that an FAI uploads some subset of humans and has them reason through their own preferences for a million subjective years and does something with their resulting judgments and preferences. This might also be basically incoherent.
Another possibility is that a single correct response to preferences falls out of game theory and decision theory, as Drescher attempts in Good and Real. This might also be incoherent.
In these terms, the plan I see as the most promising is that the correct way of extracting preferences from humans that doesn’t require further “extrapolation” falls out of decision theory.
(Not sure what you meant by Drescher’s option (what’s “response to preferences”?): does the book suggest that it’s unnecessary to use humans as utility definition material? In any case, this doesn’t sound like something he would currently believe.)
As I recall, Drescher still used humans as utility definition material but thought that there might be a single correct response to these utilities — one which falls out of decision theory and game theory.
What’s a “response to utilities” (in the grandparent you used “response to preferences”, which I also didn’t understand)? A response of what, for what purpose? (Perhaps the right question is about what you mean by “utilities” here, as in extracted/descriptive or extrapolated/normative.)
Yeah, I don’t know. It’s kind of like asking what “should” or “ought” means. I don’t know.
No, it’s not a clarifying question about subtleties of that construction; I have no inkling of what you mean (seriously, no irony), and hence fail to parse what you wrote (related to “response to utilities” and “response to preferences”) at the most basic level. This is what I see in the grandparent:

Drescher still used humans as utility definition material but thought that there might be a single correct borogove — one which falls out of decision theory and game theory.
For our purposes, how about:

Drescher still used humans as utility definition material but thought that there might be a single, morally correct way to derive normative requirements from values — one which falls out of decision theory and game theory.
Still no luck. What’s the distinction between “normative requirements” and “values”? In what way are these two ideas (as intended) not the same?
Suppose that by “values” in that sentence I meant something similar to the firing rates of certain populations of neurons, and by “normative requirements” I meant what I’d mean if I had solved metaethics.
Then that would refer to the “extrapolation” step (falling out of decision theory, as opposed to something CEV-esque), and assume that the results of an “extraction” step are already available, right? Does (did) Drescher hold this view?
From what I meant, it needn’t assume that the results of an extraction step are already available, and I don’t recall Drescher talking in so much detail about it. He just treats humans as utility material, however that might work.
OK, thanks! That would agree with my plan then.
(In general, it’s not clear in what way a descriptive “utility” can be more useful than the original humans, or in what sense it is a “utility”, unless it’s already normative preference, in which case it can’t be “extrapolated” any further. “Extrapolation” makes more sense as a way of constructing normative preference from something more like an algorithm that specifies behavior, which seems to be CEV’s purpose, and it could then be seen as a particular method of extraction-without-need-for-extrapolation.)
I think you’ve also missed the possibility that all three “systems” might just be the observably inconsistent behavior of one system in different edge cases, or at least that the systems are far more entangled and far less independent than they seem.
(I think you may have also ignored the part where, to the extent that the model-based system has values, they are often more satisficing than maximizing.)