It seems to me that using a combination of execution time, memory use and program length mostly kills this set of arguments.
Something like a game-of-life initial configuration that leads to the eventual evolution of intelligent game-of-life aliens who then strategically feed outputs into GoL in order to manipulate you may have very good complexity performance, but both the speed and memory are going to be pretty awful. The fixed cost in memory and execution steps of essentially simulating an entire universe is huge.
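To make the fixed cost concrete, here is a minimal Game of Life stepper (my own illustrative sketch, not anything from the original argument). The point is that every generation costs time and memory proportional to the number of active cells, and a pattern rich enough to evolve "aliens" would need an astronomically large active region stepped an astronomical number of times:

```python
# Minimal Game of Life step over a sparse set of live cells. Work and
# memory per generation scale with the size of the live pattern -- a
# per-step cost the pure complexity prior is blind to.
from collections import Counter

def gol_step(live):
    """One Game of Life generation; `live` is a set of (x, y) cells."""
    # Count live neighbours of every cell adjacent to a live cell.
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Standard rules: a cell is born with 3 neighbours, survives with 2 or 3.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

# A 3-cell blinker oscillates with period 2; even this toy pattern is
# touched on every step.
blinker = {(0, 0), (1, 0), (2, 0)}
assert gol_step(gol_step(blinker)) == blinker
```

The program *describing* the initial configuration can be tiny while the simulation it unfolds into is enormous, which is exactly the gap between description length and execution cost.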
But yes, the pure complexity prior certainly has some perverse and unsettling properties.
Complexity indeed matters: the universe seems to be bounded in both time and space, so running anything like the Solomonoff prior algorithm (in one of its variants) or AIXI may be outright impossible for any non-trivial model. For me, this significantly weakens or changes some of the implications.
A Fermi upper bound for the direct Solomonoff/AIXI algorithm trying TMs in order of increasing complexity: even if checking one TM took one Planck time on one atom, you could only check ca. 10^241 ≈ 2^800 machines within the lifetime of the universe (~10^110 years until heat death), so the machines you could even look at have a description complexity of a meager ~800 bits.
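The estimate above can be checked with a few lines of arithmetic. All inputs are rough order-of-magnitude assumptions (atom count, heat-death timescale), not precise physics:

```python
# Back-of-the-envelope version of the bound: one TM check per atom
# per Planck time, over the remaining lifetime of the universe.
import math

PLANCK_TIME_S = 5.4e-44        # seconds, rough value
SECONDS_PER_YEAR = 3.2e7
ATOMS_IN_UNIVERSE = 1e80       # common ~10^80 estimate
YEARS_TO_HEAT_DEATH = 1e110

ops = (ATOMS_IN_UNIVERSE
       * YEARS_TO_HEAT_DEATH * SECONDS_PER_YEAR / PLANCK_TIME_S)
bits = math.log2(ops)
print(f"~10^{math.log10(ops):.0f} machine-checks, ~{bits:.0f} bits")
```

With these inputs the total comes out around 10^241 checks, i.e. roughly 800 bits of reachable description length.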
You could likely speed up the exhaustive search, but note that most algorithmic speedups do not have a large effect on the exponent (even multiplying the exponent by a constant is not very helpful).
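This is easy to quantify: a constant-factor speedup of S lets you check S times more machines, which extends the reachable description length by only log2(S) bits. A quick sketch:

```python
# Even enormous constant-factor speedups buy only logarithmically
# many extra bits of reachable description length.
import math

for speedup in (1e6, 1e12, 1e30):
    print(f"{speedup:.0e}x faster -> +{math.log2(speedup):.0f} bits")
```

So even a 10^30-fold speedup adds only about 100 bits on top of the ~800 above.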
Significantly narrowing down the space of TMs to a narrow subclass may help, but then we need to take a look at that particular (small) class of TMs rather than rely on intuitions about all TMs. (And the class would need to be really narrow—see below.)
Due to the Church-Turing thesis, any limiting of the scope of the search is likely not very effective: you can embed arbitrary programs (and thus arbitrary complexity) in anything that is strong enough to be a TM interpreter (which the universe is, in multiple ways).
It may be hypothetically possible to search for the “right” TMs without examining them individually (with some future tech, e.g. the way sci-fi imagined quantum computing), but if such a speedup is possible, any TM modelling the universe would need to be able to contain it. This would increase the evaluation complexity of the TMs, making them significantly more costly than the one Planck time I assumed above (this would need a finer Fermi estimate with more complex assumptions).
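As one concrete data point for such a search speedup: Grover's algorithm, the known quantum speedup for unstructured search, is only quadratic, so it effectively squares the number of candidates you can sift through, which merely doubles the reachable description length. A sketch, reusing the ~10^241 operation budget from the Fermi estimate above:

```python
# A quadratic (Grover-style) search speedup squares the number of
# searchable candidates, i.e. doubles the reachable bit budget.
import math

OPS = 10 ** 241                    # operation budget from the Fermi estimate

classical_bits = math.log2(OPS)    # check OPS machines one by one
grover_bits = math.log2(OPS ** 2)  # Grover searches N items in ~sqrt(N) steps
print(round(classical_bits), round(grover_bits))
```

Going from ~800 to ~1600 bits does not change the qualitative picture, and per the point above, a TM rich enough to contain such a search mechanism would itself be costlier to evaluate.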
I am not so convinced that penalizing more stuff will make these arguments weak enough that we don’t have to worry about them. For an example of why I think this, see Are minimal circuits deceptive?. Also, adding execution/memory constraints penalizes all hypotheses, and I don’t think universes with consequentialists are asymmetrically penalized.
I agree about this being a special case of mesa-optimization.
adding execution/memory constraints penalizes all hypotheses
In reality these constraints do exist, so the question of “what happens if you don’t care about efficiency at all?” is really not important. In practice, efficiency is absolutely critical and everything that happens in AI is dominated by efficiency considerations.
I think that mesa-optimization will be a problem. It probably won’t look like aliens living in the Game of Life though.
It’ll look like an internal optimizer that just “decides” that the minds of the humans who created it are another part of the environment to be optimized for its not-correctly-aligned goal.
EDIT: This is really a special case of Mesa-Optimizers being dangerous. (See, e.g. https://www.lesswrong.com/posts/XWPJfgBymBbL3jdFd/an-58-mesa-optimization-what-it-is-and-why-we-should-care). The set of dangerous Mesa-Optimizers is obviously bigger than just “simulated aliens” and even time- and space-efficient algorithms might run into them.