Monte-Carlo AIXI compiles down to 3MB. That is almost entirely due to the fast math libraries compiled into it, but it's still a reasonable measure of its real-world Kolmogorov complexity, or at least an upper bound on it.
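(If you want to tighten that bound: the Kolmogorov complexity of a program is bounded above by the length of any description from which it can be reconstructed, so compressing the binary with an off-the-shelf compressor gives a better estimate than the raw file size, up to the constant-sized decompressor. A minimal sketch; the binary name here is hypothetical, point it at the actual executable.)

    import lzma

    path = "mc-aixi"  # hypothetical name for the compiled binary
    raw = open(path, "rb").read()
    packed = lzma.compress(raw, preset=9)
    print(f"raw size:        {len(raw):>10,} bytes")
    print(f"lzma-compressed: {len(packed):>10,} bytes  (upper bound, up to O(1))")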
Although, that is MC-AIXI(FAC-CTW). It may not have the same strong optimality guarantees as computing power goes to infinity. It might be worth checking the paper about it to make sure.
My own experience testing this implementation of AIXI is that it only functions on toy domains, and even then only when the data has been curated so heavily that it is essentially limited to recognizing the simplest of patterns. The one genuinely cool thing it does is recognize those patterns in a general way. But in practice it doesn't scale to doing anything interesting on real data, no matter how much computing power you throw at it. A hundred years of Moore's Law won't make this implementation of AIXI able to distinguish two faces from each other given two JPEG images.
I can look into it more if you want; I'm not sure I'm clear on what upper bound you're after.
Hi.
A couple of comments from the author of mc-aixi(fac-ctw).
There is a massive difference in generalisation ability between Solomonoff Induction and the scaled-down Bayes mixture used by mc-aixi(fac-ctw). The mixture used in this approximation scales with more CPU power, but it will never get anywhere near the full power of Solomonoff Induction. The class of environments the agent supports is described in more detail in the paper Louie references.
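To give a feel for what that scaled-down mixture looks like: below is a minimal sketch of plain binary Context Tree Weighting (CTW), which mixes Krichevsky-Trofimov estimators over all suffix-tree models up to a fixed depth. This is illustrative only; the actual agent uses an action-conditional, factored variant (FAC-CTW) as described in the paper, and all names here are my own.

    import math

    class CTWNode:
        def __init__(self):
            self.a = 0          # count of 0s seen in this context
            self.b = 0          # count of 1s seen in this context
            self.log_kt = 0.0   # log KT-estimator block probability
            self.log_pw = 0.0   # log weighted (mixture) block probability
            self.children = [None, None]

    class CTW:
        """Bayes mixture over all binary suffix trees up to `depth`."""
        def __init__(self, depth):
            self.depth = depth
            self.root = CTWNode()

        def _update(self, node, context, bit, d):
            # KT sequential update: P(next=1) = (b + 1/2) / (a + b + 1)
            num = (node.b if bit else node.a) + 0.5
            node.log_kt += math.log(num / (node.a + node.b + 1.0))
            if bit:
                node.b += 1
            else:
                node.a += 1
            if d == self.depth:
                node.log_pw = node.log_kt      # leaf: no deeper split
                return
            c = context[-(d + 1)]              # next-oldest context bit
            if node.children[c] is None:
                node.children[c] = CTWNode()
            self._update(node.children[c], context, bit, d + 1)
            # Internal node: P_w = 1/2 P_KT + 1/2 P_w(child0) P_w(child1)
            log_kids = sum(ch.log_pw for ch in node.children if ch is not None)
            m = max(node.log_kt, log_kids)
            node.log_pw = m + math.log(0.5 * math.exp(node.log_kt - m) +
                                       0.5 * math.exp(log_kids - m))

        def update(self, context, bit):
            self._update(self.root, context, bit, 0)

    # CTW locks onto a simple alternating pattern within a few bits.
    model = CTW(depth=3)
    history = [0, 0, 0]                        # padded initial context
    for bit in [0, 1, 0, 1, 0, 1, 0, 1]:
        before = model.root.log_pw
        model.update(history, bit)
        p = math.exp(model.root.log_pw - before)
        print(f"P(observed bit {bit}) = {p:.3f}")
        history.append(bit)

The key point is the depth cap: the mixture only contains context-tree models, so it can find short-range regularities like the alternation above, but nothing resembling the arbitrary computable structure Solomonoff Induction can exploit.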
Let's be clear though: 100 years of Moore's Law would significantly improve the performance of even the current implementation of the agent. It is important, however, not to forget about data efficiency. mc-aixi(fac-ctw) can only exploit very simple structure given reasonable amounts of data; if that structure doesn't exist, performance will not be much better than learning by rote. So Louie's comment about image recognition is spot on. Future versions will be able to generalise more effectively for certain kinds of data, but expect progress to be very incremental.
For what it's worth, the next version of mc-aixi will contain about 3x as many lines of source code. It will still be restricted to toy domains, but they will be significantly more interesting than what is currently published.
Happy to answer any other questions.
Cheers, Joel