A larger set of symbols for rewards makes no difference, since the reward signal is a scalar. Compare that with an animal, which has millions of pain sensors operating in parallel. The animal is onto something there: something to do with a priori knowledge about the common causes of pain. Having lots of pain sensors has positive aspects, e.g. it saves you from experimenting to figure out what hurts.
You can encode 16 64 bit integers in a 1024 bit integer. The scalar/parallel distinction is bogus.
(Edit: I originally wrote “5 32 bit integers” when I meant “2**5 32 bit integers”. Changed to “16 64 bit integers” because “32 32 bit integers” looked too much like a typo.)
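For concreteness, here is a minimal sketch of the packing being described: 16 unsigned 64-bit values stored in one integer of at most 1024 bits. The names `pack` and `unpack` and the lowest-field-first layout are assumptions made for illustration, not anything from AIXI or Hutter's papers.

```python
FIELD_BITS = 64                      # width of each packed value
FIELD_COUNT = 16                     # 16 * 64 = 1024 bits in total
FIELD_MASK = (1 << FIELD_BITS) - 1   # mask selecting one 64-bit field


def pack(values):
    """Pack FIELD_COUNT unsigned 64-bit values into one integer.

    The first value occupies the lowest 64 bits (an arbitrary choice).
    """
    if len(values) != FIELD_COUNT:
        raise ValueError("expected exactly %d values" % FIELD_COUNT)
    packed = 0
    for index, value in enumerate(values):
        if not 0 <= value <= FIELD_MASK:
            raise ValueError("value does not fit in 64 bits")
        packed |= value << (index * FIELD_BITS)
    return packed


def unpack(packed):
    """Recover the FIELD_COUNT values stored by pack()."""
    return [(packed >> (index * FIELD_BITS)) & FIELD_MASK
            for index in range(FIELD_COUNT)]
```

The round trip is lossless, so as a pure encoding the claim holds; whether the encoding survives what the agent then does with the number is a separate question.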
Not very serious unless you are making claims about your agent being “the most intelligent unbiased agent possible”. Then this kind of thing starts to make a difference...
Strawman argument. The only claim made is that it’s the most intelligent up to a constant factor, and a bunch of other conditions are thrown in. When Hutter’s involved, you can bet that some of the constant factors are large compared to the size of the universe.
You can encode 5 32 bit integers in a 1024 bit integer. The scalar/parallel distinction is bogus.
Er, not if you are adding the rewards together and maximising the results, you can’t! That is exactly what happens to the rewards used by AIXI.
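One way to read this objection, as a rough sketch: once packed rewards are summed and the totals compared as plain numbers, the separate channels stop behaving like separate channels. The two-channel layout and the reward histories below are made up for illustration.

```python
FIELD_BITS = 64  # width of each packed channel


def pack(channels):
    """Pack channel values into one integer, first channel in the lowest bits."""
    return sum(value << (index * FIELD_BITS)
               for index, value in enumerate(channels))


# Two hypothetical reward histories, each step carrying (low, high) channels.
history_a = [pack([1000, 0]), pack([1000, 0])]   # lots of low-channel reward
history_b = [pack([0, 1]), pack([0, 0])]         # one unit of high-channel reward

total_a = sum(history_a)
total_b = sum(history_b)

# Comparing the summed scalars: a single unit in the high channel outweighs
# any amount that fits in the low channel.
prefers_b = total_b > total_a

# Carries also leak between channels: two half-range low values add up to
# exactly one unit in the high channel.
carry_leaks = pack([1 << 63, 0]) + pack([1 << 63, 0]) == pack([0, 1])
```

So the packed number can still be decoded, but an agent that only adds and maximises it is effectively ranking outcomes by the highest-order channel first.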
Not very serious unless you are making claims about your agent being “the most intelligent unbiased agent possible”. Then this kind of thing starts to make a difference...
Strawman argument. The only claim made is that it’s the most intelligent up to a constant factor, and a bunch of other conditions are thrown in.
Actually, Hutter says this sort of thing all over the place (I was quoting him above), and it seems pretty irritating and misleading to me. I’m not saying the claims he makes in the fine print are wrong, but rather that the marketing headlines are misleading.
You can encode 5 32 bit integers in a 1024 bit integer. The scalar/parallel distinction is bogus.
Er, not if you are adding the rewards together and maximising the results, you can’t! That is exactly what happens to the rewards used by AIXI.
You’re right there; I was confusing AIXI with another design I’ve been working with in a similar idiom. For AIXI to work, you have to combine all the environmental signals into a single utility, treat the code that does the combining as part of the environment (not the AI), and then use the resulting utility as the input to AIXI.
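The arrangement just described might be sketched as follows. Everything here is hypothetical: `make_scalar_reward_env`, `raw_step`, and `combine` are illustrative names, and the agent itself is left out; the point is only that the combining code sits on the environment side of the interface.

```python
def make_scalar_reward_env(raw_step, combine):
    """Wrap an environment so the agent only ever sees a scalar reward.

    raw_step(action) -> (observation, signals), where signals is a sequence
    of parallel sensor readings (for example, many pain sensors).
    combine(signals) -> a single number, the utility handed to the agent.
    """
    def step(action):
        observation, signals = raw_step(action)
        # The combining happens here, inside the environment wrapper,
        # not inside the agent.
        return observation, combine(signals)
    return step


def toy_raw_step(action):
    """Toy stand-in environment: fixed observation, three sensor readings."""
    return "obs", [-1.0, -2.0, 0.5]


# Summing is just one possible combining rule, chosen for the example.
step = make_scalar_reward_env(toy_raw_step, combine=sum)
observation, reward = step(0)
```

Any a priori knowledge about which sensors matter would then live in `combine`, which the agent has to learn about like any other part of its environment.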