Actually, this notion of consequentialism gives the only clue I know of, and a new one, about how to infer agent goals, or how to constrain the kinds of considerations that should count as goals, as distinguished from the other things that move your action incidentally, such as psychological drives or the laws of physics. I wonder if Eliezer had this insight before, given that he wrote a similar comment in this thread. I wasn’t ready to see this idea on my own until a few weeks ago, and this thread is the first time I thought about the question within the new framework and saw the now-obvious construction. This deserves more than a comment, so I’ll be working on a two-post sequence to write it up intelligibly. Or maybe it’s actually just stupid; I’ll try to figure that out.
(A summary from my notes, in case I get run over by a bus; this uses a notion of “dependence” for which a toy example is described in my post on ADT, but which is much more general:)
The idea of consequentialism, of goal-directed control, can be modeled as follows. If a fact A is controlled by (can be explained or predicted from) a dependence F: A->O, then we say that A is a decision (an action) driven by the consequentialist consideration F, which in turn looks at how A controls the morally relevant fact O.
For a given decision A, there could be many different morally relevant facts O such that the dependence A->O has explanatory power about A. The more a dependence A->O can explain about A, the more morally relevant O is. Finding highly relevant facts O essentially captures A’s goals.
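(To make the comparison of explanatory power concrete, here is a minimal toy sketch, and it is my own gloss rather than the formalism hinted at above: the real notion of dependence is logical and far more general. The proxy here bundles each candidate fact O with a hypothesized preferred value, and says a dependence “explains” the observed action to the extent that “choose A so that it maps to the preferred value of O” narrows down the action space while staying consistent with what the agent actually did. All names and numbers are illustrative assumptions.)

```python
from math import log2

# Toy proxy for ranking candidate morally relevant facts O by how much the
# dependence A -> O (plus a hypothesized preferred value of O) explains an
# observed action A.  Everything here is an illustrative assumption.

ACTIONS = ["cooperate", "defect", "wave_left_hand"]

# Each candidate fact O: (dependence mapping actions to values of O, preferred value of O).
CANDIDATES = {
    "total_welfare": ({"cooperate": 10, "defect": 6, "wave_left_hand": 6}, 10),
    "own_payoff":    ({"cooperate": 3,  "defect": 5, "wave_left_hand": 3}, 5),
    "hand_position": ({"cooperate": 0,  "defect": 0, "wave_left_hand": 1}, 1),
}

def explanatory_power(dependence, preferred, observed_action):
    """Bits of action-space narrowed down by the consideration
    'choose A so that the dependence sends it to the preferred value of O',
    or 0 if that consideration is inconsistent with the observed action."""
    compatible = [a for a in ACTIONS if dependence[a] == preferred]
    if observed_action not in compatible:
        return 0.0
    return log2(len(ACTIONS)) - log2(len(compatible))

observed = "cooperate"
for name, (dep, pref) in CANDIDATES.items():
    print(f"{name}: {explanatory_power(dep, pref, observed):.2f} bits explained")
```

On this crude scoring, only “total_welfare” accounts for the observed choice, so it comes out as the best candidate for the goal; the proxy obviously misses the ambient, logical character of the real dependence notion.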
This model has two good properties. First, logical omniscience (in particular, mere knowledge of the actual action) renders the construction unusable, since we need to see the dependencies A->O as ambient concepts explaining A, so both A and A->O need to remain potentially unknown. (This is the confusing part. It also lends motivation to the study of the complete collection of moral arguments and of the nature of the agent-provable collection of moral arguments.)
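(A worked illustration of the first point, under my own provability-style reading of “dependence” rather than anything asserted above: if the agent’s theory proves which action is actually taken, then the implications from every other action become vacuously provable, since a false antecedent proves anything:

\[ \vdash A = a \;\Longrightarrow\; \vdash (A = b \rightarrow O = o) \quad \text{for every } b \neq a \text{ and every value } o, \]

so the candidate dependencies collapse into trivialities that single out no genuine way in which A controls O, and so can’t explain A.)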
Second, the action (decision) itself, and many other facts that control the action but aren’t morally relevant, are distinguished by this model from the facts that are. For example, A itself can’t be morally relevant, for that would require the trivial identity dependence A->A to explain A, which it can’t, since it’s too simple. Similarly for other facts in a simple relationship with A: for a fact to be morally relevant, the relationship between A and that fact must be in tune with A; it’s not enough for the fact itself to be in tune with A.
This approach doesn’t require a fixed definition of a goal concept; instead, it shows how various concepts can be regarded as goals, and how their suitability for this purpose can be compared. The search for better morally relevant facts is left open-ended.
I very much look forward to your short sequence on this. I hope you will also explain your notion of dependence in detail.
For the record, I mostly completed a draft of a prerequisite post (the first of the two I had in mind) a couple of weeks ago, and it’s just no good: not much better than what one would take away from reading the previously published posts, and not particularly helpful in clarifying the intuition expressed in the above comments. So I’m focusing on improving my math skills, which I expect will help with the formalization/communication problem (given a few months), as well as with moving forward. I might post some version of the post, but it seems it won’t be able to serve the previously intended purpose.
Bummer.
As for communication, it would help me (at least) if you used words in their normal senses unless they are a standard LW term of art (e.g. ‘rationalist’ means LW rationalist, not Cartesian rationalist) or unless you specify that you’re using the term in an uncommon sense.
I don’t see how this is related to this thread, nor, correspondingly, what kinds of word misuse you have in mind.
It isn’t related to this thread. I was thinking of past confusions between us over ‘metaethics’ and ‘motivation’ and ‘meaning’, where I didn’t realize until pretty far into the discussion that you were using these terms to mean something different from what they normally mean. I’d generally like to avoid that kind of thing; that’s all I meant.
Well, I’m mostly not interested in the concepts corresponding to how these words are “normally” used. The miscommunication problems resulted both from my misinterpreting your usage and from your misinterpreting mine. I won’t be misinterpreting your usage in similar cases in the future, as I now know better what you mean by which words; and in my own usage, as we discussed a couple of times, I’ll be clearer by using disambiguating qualifiers, which in most cases amounts to writing the word “normative” more frequently.
(Still unclear/strange why you brought it up in this particular context, but no matter...)
Yup, that sounds good.
I brought it up because you mentioned communication, and your comment showed up in my LW inbox today.