So, if I’m reading this correctly, the trouble is that you try to select actions by searching for proofs, instead of an algorithm known to halt quickly, and thus you only behave correctly if it’s possible to find proofs (which it basically isn’t)?
At the risk of sounding falsely prescient, I thought that was a really fishy way to do things, but I can’t claim I had the mathematical understanding to say why.
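To make sure I'm picturing the contrast right, here's a toy sketch (every name in it is mine, not anyone's actual agent): one player acts only when a budgeted proof search succeeds, so a true-but-hard-to-prove fact about its opponent still leaves it defecting, while a bounded-simulation player always halts with some answer.

```python
# Toy contrast (illustrative only): proof-search action selection vs. an
# agent guaranteed to halt. try_prove stands in for search in a formal
# theory; running out of budget is silence, not a disproof.

COOPERATE, DEFECT = "C", "D"

def try_prove(claim, budget=10_000):
    """Return True iff a 'proof' of claim is found within budget steps.
    Here that's faked by direct evaluation; a real prover could fail to
    find a proof that exists, which is the failure mode at issue."""
    try:
        return bool(claim())
    except Exception:
        return False

def proof_search_agent(opponent_cooperates):
    # Cooperate only if we can PROVE the opponent cooperates; if the
    # proof is too long to find (or doesn't exist), defect by default.
    return COOPERATE if try_prove(opponent_cooperates) else DEFECT

def bounded_agent(simulate_opponent, depth=3):
    # Always halts: run the opponent to a bounded depth and respond.
    return COOPERATE if simulate_opponent(depth) == COOPERATE else DEFECT
```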
Semi-related, I know this subthread still requires Löbian objects, but I’m still confused by why you think CDT can’t think this way with the proper causal graph (and access to an inference module).
How do you auto-generate the proper causal graph, given the source code of a game and an opponent? (The reason that this algorithm appealed to me is because, if it had worked the way I originally thought, it would automatically do the right thing against a lot of agents in a lot of different games. I don’t think that anything I could actually pseudocode in the CDT direction achieves that.)
I was imagining generating a causal graph of the game from the code of the game, where the opponent’s code would be one of the nodes in the graph. I agree that determining what will happen from arbitrary code is arbitrarily difficult, but that’s not a decision theory problem so much as a computer science problem. (I trust my decision theories more when they consider easy problems easy and difficult problems difficult; it’s part of a sanity check that they’re faithfully representing reality.)
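To make that concrete, here's a minimal sketch of the shape I have in mind (the graph, payoffs, and names are all made up for illustration): the game compiles down to a tiny causal DAG, the opponent's code gets evaluated at one node, and the CDT step is just intervening on my action node and reading off the payoff node.

```python
# Minimal sketch (illustrative structure, not a general compiler): a
# one-shot Prisoner's Dilemma as a three-node causal graph. The opponent's
# code sits at one node; CDT intervenes on my action node and propagates.

PAYOFFS = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def opponent_node(opponent_code, my_source):
    """Evaluate the opponent node by running their code. In general this
    is the arbitrarily-hard computer-science part; here it's one call."""
    return opponent_code(my_source)

def cdt_choose(opponent_code, my_source):
    # Causal surgery: my action node has no parents, so setting it by
    # fiat leaves the opponent node untouched. Pick the action whose
    # downstream payoff node is largest.
    their_action = opponent_node(opponent_code, my_source)
    return max(("C", "D"), key=lambda a: PAYOFFS[(a, their_action)])

# Against an unconditional cooperator, the graph says: defect.
always_cooperate = lambda my_source: "C"
print(cdt_choose(always_cooperate, my_source=None))  # -> "D"
```

The point of the sketch is where the hardness lives: everything except opponent_node is trivial, which is the easy-problems-easy property I want.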
I think it’s better to be thinking like Nash than like Turing here; Nash’s approach is the one that turns an infinite regress into a mixed strategy, rather than an impossibility.
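For instance (the standard textbook move, nothing original): in a 2x2 zero-sum game like matching pennies, instead of climbing the "I predict that he predicts..." ladder, you solve for the mix that makes the opponent indifferent, and the regress closes.

```python
# The Nash move in miniature: solve for the mix that makes the opponent
# indifferent, instead of trying to out-predict them.

def row_mix_2x2_zero_sum(a, b, c, d):
    """Row player's payoff matrix [[a, b], [c, d]]. Returns p, the
    equilibrium probability of the first row, from the indifference
    condition p*a + (1-p)*c == p*b + (1-p)*d. Assumes no pure-strategy
    saddle point, so the denominator is nonzero."""
    return (d - c) / ((a - b) + (d - c))

# Matching pennies: [[1, -1], [-1, 1]] -> mix 50/50, regress dissolved.
print(row_mix_2x2_zero_sum(1, -1, -1, 1))  # -> 0.5
```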
That is, I can think of at least one way to restate the tournament which allows for about as much meaningful diversity in strategies, but doesn’t allow infinite regress. The winner is still the TDT-inspired mask, though it no longer requires arbitrary proofs.
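Here's the flavor of restatement I mean, as a toy (my own construction, not the actual tournament's rules): hand every agent a depth budget that shrinks each time it simulates its opponent, so mutual simulation bottoms out instead of regressing, and the mask-style agent still wins its matches without ever touching a proof.

```python
# Toy restatement (my construction, not the original tournament): each
# agent gets a depth budget; simulating the opponent spends one unit, so
# the regress bottoms out at depth zero instead of going on forever.

COOPERATE, DEFECT = "C", "D"

def mask_bot(opponent, depth):
    # TDT-flavored mask: cooperate iff the opponent, simulated with the
    # remaining budget, cooperates back against this very agent.
    if depth == 0:
        return COOPERATE  # base of the regress: an optimistic default
    return COOPERATE if opponent(mask_bot, depth - 1) == COOPERATE else DEFECT

def defect_bot(opponent, depth):
    return DEFECT

def play(a, b, depth=3):
    return a(b, depth), b(a, depth)

print(play(mask_bot, mask_bot))    # ('C', 'C'): cooperation, no proofs needed
print(play(mask_bot, defect_bot))  # ('D', 'D'): the mask isn't exploited
```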
This might not be interesting to you, of course; if your end goal is something like studying the intertemporal decision dynamics of self-modifying AI, then you want something like a proof that “I can judge the desirability of any modification to myself.” My guess is that if you go down that route you’re going to have a bad time, and I’d instead ask “are there any modifications to myself whose desirability I can prove in a reasonable length of time?”
We can have different mathematical aesthetics, sure. I got excited about these decision theories in the first place when I saw that Löbian cooperation worked without any black boxes, and that’s the standard to which I’m holding my thinking. If you can find a way to automatically generate better causal diagrams for a wide range (even if not a complete one) of games and opponents, then I’ll start finding that approach really fascinating too.