Unfortunately, I think you’ll have to familiarize yourself more with the existing decision theory literature here on LW, or on the decision theory mailing list, in order to understand what I’m getting at. I’m already rather familiar with the standard arguments for FAI. If you’re already a member of the decision theory list, the most relevant thing to read would be Nesov’s discussion of decision processes splitting off into coordinated subagents upon making observations. That at least hints in the right direction.
(I have no idea what Will is talking about; I don’t even see which things I wrote on the list he is referring to.)
Edit: Both issues now resolved, with Paul clarifying Will’s point and Will explicitly linking to the decision theory list post.
(“A note on observation and logical uncertainty”, January 20, 2011.)
(I am mildly surprised that you have no idea what I’m talking about even after having read the thread I linked to that hints at the intuitions behind a creatorless decision theory. It’s not a very complicated idea, even if it might look uncomfortably like some hidden agenda promoting values deathism.)
(I still don’t see how that note could be recognized from the information you provided. Thank you for some clarity; I only wish you’d respect it more. I also remain ignorant of how the note relates to what you were discussing, but here’s an excuse to revisit that construction.)
The note in question mostly talks about a way in which an observation can shift an agent’s focus of attention without changing its decision problem or its potential state of knowledge. The agent’s preference stays the same.
The decision in question is the one where an agent focuses on seeing the implications of a particular observation (that is, it gets to infer more in a particular direction, using the original premises) while mostly ignoring the implications of the alternative observations (that is, inferring less from the same premises in other directions). It thus mostly loses track of the counterfactual worlds where the observation turns out differently, leaving those worlds to its alternative versions. In doing so, the agent loses coordination with its versions in those alternative worlds, so its decisions will now be more about its own individual actions and not (or less) about the strategy coordinating it with the counterfactual versions of itself. In return, it gains more computational resources to devote to its particular subproblem.
This is one sense in which observations can act like knowledge (something to update on, something whose implications to focus on) without getting more directly involved in the agent’s reasoning algorithm, so that we can keep the agent updateless and, in principle, able to take counterfactuals into account. In this case, the agent is rather more computationally restricted than what UDT plays with, and it is this restriction that motivates using observations in an updating-like manner, which can be done in this way without actually updating away the counterfactuals.
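To make the tradeoff concrete, here is a minimal toy sketch of this focus-of-attention idea. It is my own illustration rather than the construction in the note: the two-branch coordination game, the ten-element action set, and the crude “number of candidate policies evaluated” compute budget are all assumptions introduced for the example.

```python
# Toy sketch (assumptions mine, not the note's construction):
# an "updateless" agent picks a joint policy (one action per possible
# observation), paying compute for every branch it reasons about, while a
# "focused" agent spends its whole budget on the branch it actually observed,
# holding the counterfactual branch at a cheap default action.

import itertools
import random

random.seed(0)

ACTIONS = list(range(10))  # actions available in each branch

# Utility depends on BOTH branches' actions: the agent's versions in the two
# counterfactual worlds are effectively playing a coordination game, with a
# bonus for choosing the same action.
TABLE = {(a_h, a_t): random.random() + (1.0 if a_h == a_t else 0.0)
         for a_h, a_t in itertools.product(ACTIONS, ACTIONS)}

def utility(a_heads, a_tails):
    return TABLE[(a_heads, a_tails)]

def updateless_agent(budget):
    """Search over joint policies (action-for-heads, action-for-tails).

    Coordination across counterfactual branches is preserved, but the search
    space is quadratic, so a small budget covers only a sliver of it."""
    candidates = list(itertools.product(ACTIONS, ACTIONS))[:budget]
    return max(candidates, key=lambda p: utility(*p))

def focused_agent(observation, budget, default_action=0):
    """After observing one branch, spend the whole budget on that branch's
    subproblem, treating the counterfactual branch as fixed at a default.

    Coordination with the counterfactual version is (partly) lost, but the
    same budget now covers the local action space much more thoroughly."""
    candidates = ACTIONS[:budget]
    if observation == "heads":
        best = max(candidates, key=lambda a: utility(a, default_action))
        return best, default_action
    best = max(candidates, key=lambda a: utility(default_action, a))
    return default_action, best

if __name__ == "__main__":
    for budget in (5, 20, 100):
        joint = updateless_agent(budget)
        split = focused_agent("heads", budget)
        print(f"budget={budget:3d}  updateless {joint} -> {utility(*joint):.2f}   "
              f"focused-on-heads {split} -> {utility(*split):.2f}")
```

With a small budget the focused agent explores its own branch far more thoroughly, but it can only coordinate with whatever default its counterfactual version is assumed to play; with a large enough budget the updateless agent finds the jointly optimal, coordinated policy. That is the sense in which the observation buys local computational depth at the cost of cross-branch coordination.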
It is a compulsion of mine that, given a choice between giving zero information and giving a small amount of information, I must give the small amount or feel guilty for not even having tried to do the right thing. This likely leads to Goodhartian problems. I don’t have introspective access to the utility calculus that resulted in this compulsion.
E.g., in this case (gesturing rather than explaining): additive utility versus multiplicative-ish “belief”, self-coordination versus coordination with others, computational complexity, and so on. On the philosophy side, the PSR (Principle of Sufficient Reason) and CFAI-style causal validity, with Markovian causality taken to include formal/final causes. Extracting bits of Chaitin’s constant from the “environment”. And I don’t know whether we’re at equilibrium with respect to optimization after infinite time, so it’s unclear whether to act as if the stars are a twenty-dollar bill lying on a busy street or not.
Re Friendliness: Löbian problems might cause recursive Bayesian AI architectures to collapse via wireheading and so on; Gödel machines are limited by the strength of their axioms, and stronger axiom sets run into self-reference problems. If true, this would change singularity strategy: we wouldn’t have to worry as much about scary AIs unless they can solve the Löbian problems indirectly.
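(A gloss on the jargon, as I read it: the “Löbian problem” here is the obstacle raised by Löb’s theorem, which in its internalized form says that for any theory $T$ extending Peano Arithmetic, with $\Box$ denoting provability in $T$,

$$T \vdash \Box(\Box P \rightarrow P) \rightarrow \Box P \quad \text{for every sentence } P.$$

So $T$ can assert “a proof of $P$ guarantees $P$” only for sentences $P$ it already proves outright, which is why an architecture that must prove the soundness of its own, or its successor’s, reasoning before trusting it tends to get stuck.)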
ETA: Accidentally hit comment before editing/finishing, but I’ll accept that as a sign from God.
It is a compulsion of mine that, given a choice between giving zero information and giving a small amount of information, I must give the small amount or feel guilty for not even having tried to do the right thing.

False dichotomy. In the same number of words you could be communicating much more clearly.