I’d like to add the connection between the notions of “meta-ethics” and “decision theory” (of the kind we’d want an FAI/CEV to start out with). For the purpose of solving FAI, these seem to be the same, with “decision theory” emphasizing the outline of the target, and “meta-ethics” the source, in human intuition, of the correctness criteria for such a theory.
Hmm. I thought metaethics was about specifying a utility function, and decision theory was about algorithms for achieving the optimum of a given utility function. Or do you have a different perspective on this?
Even if we assume that “utility function” has anything to do with FAI-grade decision problems, you’d agree that the prior is also part of the specification of which decisions should be made. Then there’s the way in which one should respond to observations, the way one handles logical uncertainty and decides that a given amount of reflection is sufficient to suspend an ethical injunction (such as “don’t act yet”), the way one finds particular statements first in thinking about counterfactuals (which determines agent-provability), which can be generalized to non-standard inference systems, and on and on this list goes. The list is as long as morality, and it is morality, but it parses morality in a specific way, one that extracts the outline of its architecture and not just individual pieces of data.
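To make the first point concrete with a toy sketch (my own illustration, with invented names, not anything proposed in this thread): two agents that share a utility function but hold different priors already choose different actions, so the prior is inseparably part of the specification of the decision problem.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class ToyDecisionProblem:
    """A deliberately naive expected-utility framing; every field below,
    not just `utility`, helps determine which action gets picked."""
    actions: List[str]
    worlds: List[str]
    prior: Dict[str, float]               # P(world)
    utility: Callable[[str, str], float]  # U(action, world)

    def best_action(self) -> str:
        def eu(a: str) -> float:
            return sum(self.prior[w] * self.utility(a, w) for w in self.worlds)
        return max(self.actions, key=eu)

# Same utility function, different priors, different decisions.
u = lambda action, world: 1.0 if (action, world) in {("umbrella", "rain"), ("sunscreen", "sun")} else 0.0
rainy = ToyDecisionProblem(["umbrella", "sunscreen"], ["rain", "sun"], {"rain": 0.8, "sun": 0.2}, u)
sunny = ToyDecisionProblem(["umbrella", "sunscreen"], ["rain", "sun"], {"rain": 0.2, "sun": 0.8}, u)
print(rainy.best_action(), sunny.best_action())  # "umbrella" vs. "sunscreen"
```

The same move generalizes: the update rule, the stopping criterion for reflection, and the handling of logical uncertainty are further slots in the same specification.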
When you consider methods of solving a decision problem more optimally, how do you set the criteria of optimality? Some things are intuitively obvious, and very robust to further reflection, but ultimately you’d want the decision problem itself to decide what counts as an improvement in the methods of solving it. For example, obtaining a superintelligent ability to generate convincing arguments for a wrong statement can easily ruin your day. So efficient algorithms, too, are a subject of meta-ethics, but of course only in the same sense in which we can conclude that an “action-definition” can be included as part of general decision problems, or that “more computational resources” is an improvement. And as you know from agent-simulates-predictor, that last one is not universally the case.
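A toy rendering of that last point (my own construction, with made-up numbers, not a canonical formalization): in a Newcomb-like setup where the predictor can only simulate agents up to a fixed budget, an agent that burns more compute than the predictor can follow gets the pessimistic prediction, so its extra resources make its payoff worse.

```python
def predictor(agent_cost: int, agent_policy: str, budget: int = 10) -> str:
    """Predict the agent's choice by simulating it, if that fits the budget;
    otherwise fall back to the pessimistic prediction 'two-box'."""
    if agent_cost <= budget:
        return agent_policy           # faithful simulation
    return "two-box"                  # can't simulate; assume defection

def payoff(prediction: str, actual: str) -> int:
    box_b = 1_000_000 if prediction == "one-box" else 0
    return box_b + (1_000 if actual == "two-box" else 0)

# A cheap agent the predictor can simulate, committed to one-boxing:
print(payoff(predictor(agent_cost=5, agent_policy="one-box"), "one-box"))   # 1000000
# An agent that uses more compute than the predictor can simulate,
# even though it also one-boxes:
print(payoff(predictor(agent_cost=50, agent_policy="one-box"), "one-box"))  # 0
```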
I think it is important to keep in mind that the approach currently favored here, in which your choice of meta-ethics guides your choice of decision theory, and your decision theory in turn justifies your meta-ethics (in a kind of ouroborean epiphany of reflective equilibrium), is only one possible research direction.
There are other approaches that might be fruitful. In fact, it is far from clear to many people that the problem of preventing uFAI involves moral philosophy at all. (ETA: Or decision theory.)
To a small group, it sometimes appears that the only way of making progress is to maintain a narrow focus and to ruthlessly prune research subtrees as soon as they fall out of favor. But pruning in this way is gambling—it is an act of desperation by people who are made frantic by the ticking of the clock.
My preference (which may turn out to be a gamble too) is to ignore the ticking and to search the tree carefully with the help of a large, well-trained army of researchers.
Much depends, of course, on the quantity of time we have available. If the market progresses to AGI on its own in 10 years, our energies are probably best spent focused on a narrow set of practical alternatives.
If we have a hundred years, then perhaps we can afford to entertain several new generations of philosophers.
If the market progresses to AGI on its own in 10 years, our energies are probably best spent focused on a narrow set of practical alternatives.
But the problem itself seems to suggest that if you don’t solve it on its own terms, and instead try to mitigate the practical difficulties, you still lose completely. AGI is a universe-exploding A-Bomb which the mad scientists are about to test experimentally in a few decades; you can’t improve the outcome by building better shelters (or a better casing for the bomb).
Yudkowsky apparently counsels ignoring the ticking as well—here:
Until you can turn your back on your rivals and the ticking clock, blank them completely out of your mind, you will not be able to see what the problem itself is asking of you. In theory, you should be able to see both at the same time. In practice, you won’t.
I have argued repeatedly that the ticking is a fundamental part of the problem—and that if you ignore it, you just lose (with high probability) to those who are paying their clocks more attention. The “blank them completely out of your mind” advice seems to be an obviously-bad way of approaching the whole area.
It is unfortunate that getting more time looks very challenging. If we can’t do that, we can’t afford to dally around very much.
Yudkowsky apparently counsels ignoring the ticking as well
Yes, and that comment may be the best thing he has ever written. It is a dilemma. Go too slow and the bad guys may win. Go too fast, and you may become the bad guys. For this problem, the difference between “good” and “bad” has nothing to do with good intentions.
Another analysis is that there are at least two types of possible problem:
One is the “runaway superintelligence” problem—which the SIAI seems focused on;
Another type of problem involves the preferences of only a small subset of humans being respected.
The former problem has potentially more severe consequences (astronomical waste), but an engineering error like that seems pretty unlikely—at least to me.
The latter problem could still have some pretty bad consequences for many people, and seems much more probable—at least to me.
In a resource-limited world, too much attention on the first problem could easily contribute to running into the second problem.
Vladimir_Nesov,
Gary Drescher has an interesting way of deriving a deontological normative theory from his meta-ethics, coupled with decision theory and game theory. He explains it in Good and Real, though I haven’t had time to evaluate it much yet.
Given your interest, you probably should read it (if not having read it is included in what you mean by not having had time to evaluate it). Although I still haven’t, I know Gary is right on most things he talks about, and expresses himself clearly.
Right; you can extend decision theory to include reasoning about which computations the decision theoretic ‘agent’ is situated in and how that matters for which decisions to make, shaper/anchor semantics-style.
Meta-ethics per se is just the set of (hopefully mathematical-ish) intuitions we draw on, the ones that guide how humans reason about what is right, and that we kind of expect to align somewhat with what a good situated decision theory would do, at least before the AI starts trading with other superintelligences that represent different contexts. If meta-level contextual/situated decision theory is convergent, the only differences between superintelligences are differences about what kind of world they’re in. Meta-ethics is thus kind of superfluous except as a vague source of intuitions that should probably be founded in math, whereas practical axiology (fueled by evolutionary psychology, evolutionary game theory, social psychology, etc.) picks out the parts of humanity that (arguably) aren’t just filled in by the decision theory.