Wouldn’t the same job be done by the agent using proper counterfactuals instead of logical ones, which seems like something that would also be needed for other purposes anyway?
I don’t know who (if anyone) has done any work on this, but when a human considers a counterfactual statement like “If Gore won in 2000”, that statement is very underspecified: the implicit assumption is that contradicting knowledge gets discarded, but exactly how to do that is left open. Humans just know that they should assume something like “Bush didn’t win Florida” rather than “266 > 271” (Gore’s and Bush’s electoral vote totals).
If an example agent needs to be able to use precisely defined proper counterfactuals, I think it might be possible to do that with an ordering function for its current knowledge. The agent would start with the counterfactual under consideration, add items from its knowledge base in the order specified for that counterfactual, test after each item whether it can find a contradiction, and discard the current item from consideration whenever it finds one.
For the example I think the order would look like this: A=a, S, U except for A, the source code of A.
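Here is a minimal sketch of the procedure I have in mind, in Python. counterfactual_premises and find_contradiction are just made-up names; the contradiction test stands in for whatever bounded proof search the agent would actually use.

    def counterfactual_premises(counterfactual, ordered_items, find_contradiction):
        """Collect the premises used to evaluate `counterfactual`.

        `ordered_items` is the agent's knowledge base, already sorted by the
        ordering function chosen for this counterfactual, highest priority
        first. Each item is tested together with everything kept so far; an
        item that lets the agent find a contradiction is discarded. The
        counterfactual itself is never discarded.
        """
        kept = [counterfactual]
        for item in ordered_items:
            if find_contradiction(kept + [item]):
                continue  # drop the current item, keep what was already accepted
            kept.append(item)
        return kept

    # For the example the call would look roughly like
    #     counterfactual_premises("A = a",
    #                             ["S", "U except for A", "the source code of A"],
    #                             find_contradiction)
    # so the source code of A is the first thing to go if it contradicts A = a.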
That would do much the same thing as “playing chicken with the universe” with respect to not being impressed by proofs about its output, no?
More generally, I think the items would be split up further; in particular, “A behaves like a program with this source code in all other cases” would come before “A behaves like a program with this source code in the case under consideration”. Other instances of A would also have to be treated as instances of A rather than as programs with that source code (i.e. statements like “A’ has the same source code as A” would come before either source code in the ordering).
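Concretely, the refined ordering for the example might look something like this (only a sketch; the exact granularity is up for grabs):

    # One possible refined ordering, highest priority first (illustrative only):
    refined_ordering = [
        "A = a",                               # the counterfactual itself (never discarded)
        "S",
        "U except for A and other instances of A",
        "A' has the same source code as A",    # identity of other instances comes first
        "A behaves like a program with this source code in all other cases",
        "A behaves like a program with this source code in the case under consideration",
    ]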
Does that make sense?
The idea of using an ordering function for knowledge is new to me, thanks!
The hard part is getting “U except for A”. Given the source code of U and the source code for A, we don’t know how to “factor” one program by another (or equivalently, find all correlates of the agent in the universe). If we knew how to do that, it would help a lot with UDT in general.
I guess you would actually have various knowledge items about the world, some of them implying things about A, and any item that, in conjunction with the items accepted so far, lets the agent find a contradiction with A=a would be discarded. Maybe that would already be enough; I’m not sure.
What considerations should be used to order the knowledge items?
That’s a really difficult question. It’s hard to say what principles humans follow when evaluating counterfactuals, and even harder to say to what extent that’s a reasonable example to follow.
I think higher-level observational laws should usually have a higher priority than the concrete data points they are based on, and, all else equal, items should be in descending order of generality and confidence. That the US president can veto US federal legislation, and that the person who can veto US federal legislation is the same person as the commander in chief of the US military forces, should both have a higher priority than that George W. Bush could veto US federal legislation.
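As a toy illustration, reusing counterfactual_premises from the sketch above (the contradiction check is hard-coded here, where a real agent would use proof search):

    GENERAL_LAW = "the candidate with the most electoral votes wins"
    FLORIDA_LINK = "whoever won Florida in 2000 had the most electoral votes"
    DATA_POINT = "Bush won Florida in 2000"

    def find_contradiction(statements):
        # Stand-in: the only inconsistency this toy checker can detect is that
        # the three items above together imply that Bush, not Gore, won in 2000.
        return ({GENERAL_LAW, FLORIDA_LINK, DATA_POINT} <= set(statements)
                and "Gore won in 2000" in statements)

    # General laws first, the concrete data point last:
    premises = counterfactual_premises("Gore won in 2000",
                                       [GENERAL_LAW, FLORIDA_LINK, DATA_POINT],
                                       find_contradiction)
    # DATA_POINT is what gets discarded, matching the intuition that one should
    # assume "Bush didn't win Florida" rather than give up the more general laws.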
It would also depend on what the counterfactual is used for. For counterfactuals concerning the past, timing would obviously be extremely important.
In the case of the counterfactual implications of a decision the agent makes, you could maybe try, as a secondary criterion, ascending order of the items’ strength as Bayesian evidence about the agent? Or perhaps the ratio of that strength to general importance instead? (Which would probably require nested counterfactuals? Are we concerned with computability yet?)
EDIT: I think the knowledge items would have some redundancy, so that even if the agent can derive itself directly from the laws of physics and needs to reject (one of) them, it can reconstruct almost-normal physics from various observational laws. It also seems that redundancy could reduce the importance of the initial order somewhat.