By “predictor” I don’t mean something that produces exact predictions; I mean something that produces probabilistic predictions of given quantities. Maybe we should call it an “inductor” to avoid conflation with optimal predictors (even though the concepts are closely related). As I said before, I think that an agent has to have a reasonable model of humans to follow human values. Moreover, an agent that doesn’t have a reasonable model of humans is probably much less dangerous, since it won’t be able to manipulate humans (although I guess the risk is still non-negligible).
The question is what kinds of inductors are complexity-theoretically feasible and what class of models these inductors correspond to. Bounded Solomonoff induction using Λ works on the class of samplable models. In machine-learning language, inductors using samplable models are feasible since it is possible to train them by sampling random such models (i.e. by sampling the bounded Solomonoff ensemble). On the other hand, it’s not clear what broader classes of models, if any, are admissible.
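As a concrete toy illustration of training by sampling (a sketch, not the actual construction: a uniform prior over a small explicit family of samplable models stands in for the bounded Solomonoff ensemble with Λ, and all names below are illustrative):

```python
# A minimal sketch: a toy uniform prior over a small family of samplable
# models stands in for the bounded Solomonoff ensemble with Λ. The "inductor"
# is a Bayesian mixture trained purely by observing data sampled from a
# randomly drawn member of the ensemble.
import random

class MarkovModel:
    """An order-1 Markov chain over bits: a simple samplable model with an
    explicit conditional law."""
    def __init__(self, p0, p1):
        self.p = (p0, p1)  # P(next bit = 1 | previous bit = 0 or 1)

    def cond_prob_one(self, prev_bit):
        return self.p[prev_bit]

    def sample(self, n, rng):
        bits, prev = [], 0
        for _ in range(n):
            prev = int(rng.random() < self.cond_prob_one(prev))
            bits.append(prev)
        return bits

def make_ensemble(grid=(0.1, 0.3, 0.5, 0.7, 0.9)):
    models = [MarkovModel(a, b) for a in grid for b in grid]
    weights = [1.0 / len(models)] * len(models)  # uniform toy prior
    return models, weights

def predict_next(models, weights, history):
    """Mixture probability that the next bit is 1 given the history."""
    prev = history[-1] if history else 0
    return sum(w * m.cond_prob_one(prev) for m, w in zip(models, weights))

def update(models, weights, history, bit):
    """Bayesian update of the mixture weights on one observed bit."""
    prev = history[-1] if history else 0
    new = [w * (m.cond_prob_one(prev) if bit else 1 - m.cond_prob_one(prev))
           for m, w in zip(models, weights)]
    z = sum(new)
    return [w / z for w in new]

# "Training by sampling the ensemble": draw a random model, feed its output
# to the inductor, and compare the learned conditional to the truth.
rng = random.Random(0)
models, weights = make_ensemble()
truth = models[rng.randrange(len(models))]
history = []
for bit in truth.sample(2000, rng):
    weights = update(models, weights, history, bit)
    history.append(bit)
print("learned P(1 | prev):", predict_next(models, weights, history))
print("true    P(1 | prev):", truth.cond_prob_one(history[-1]))
```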
That said, it seems plausible that if it’s feasible to construct inductors for a broader class, my procedure will remain efficient.
Model 1: The “superinductor” works by finding an efficient transformation q of the input sequence x and a good samplable model for q(x). For example, q(x) contains the results of substituting the observed SAT solutions into the observed SAT instances. In this model, we can apply my procedure by running utility inference on q(x) instead of x.
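Here is a minimal sketch of the transformation q for this example, under an assumed encoding of the stream (CNF instances paired with claimed assignments); the helper names are hypothetical:

```python
# A minimal sketch of q, under an assumed encoding: the raw stream x is taken
# to interleave CNF instances with claimed solutions, and q(x) replaces each
# (instance, assignment) pair with the bit obtained by substituting the
# assignment into the instance. Names are illustrative.
from typing import Dict, List

Clause = List[int]   # nonzero ints: sign = polarity, abs value = variable id
CNF = List[Clause]

def satisfies(instance: CNF, assignment: Dict[int, bool]) -> bool:
    """Check whether a claimed solution satisfies a SAT instance."""
    return all(
        any(assignment.get(abs(lit), False) == (lit > 0) for lit in clause)
        for clause in instance
    )

def q(stream: List[tuple]) -> List[int]:
    """Map a sequence of (instance, claimed solution) pairs to the sequence
    of substitution results; the latter may admit a samplable model even if
    the raw stream does not."""
    return [int(satisfies(inst, sol)) for inst, sol in stream]

# (x1 or not x2) and (x2 or x3), with the claimed solution x1=1, x2=1, x3=0
example = [([[1, -2], [2, 3]], {1: True, 2: True, 3: False})]
print(q(example))  # -> [1]
```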
Model 2: The superinductor works via an optimal predictor for SAT*. I think it should be relatively straightforward to show that, given an optimal predictor for SAT together with the ability to design AGIs for target utility functions relative to a SAT oracle, there is an optimal predictor for the total utility function my utility inference procedure defines (after including external agents that run relative to a SAT oracle). Therefore it is possible to maximize the latter.
*A (poly,log)-optimal predictor for SAT cannot exist unless all sparse problems in NP have efficient heuristic algorithms in some sense, which is unlikely. On the other hand, there is no reason I know of why a (poly,0)-optimal predictor for SAT cannot exist.
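To illustrate the shape of Model 2 (purely an interface sketch, not a construction of an optimal predictor; the toy heuristic and the action-to-query encoding below are assumptions made for illustration):

```python
# A rough interface sketch of Model 2: the optimal predictor is modeled as a
# routine returning a probability that a SAT instance is satisfiable, and
# "maximizing the latter" is caricatured as picking the action whose
# utility-relevant SAT query is rated most likely satisfiable.
from typing import Callable, List

SatPredictor = Callable[[List[List[int]]], float]  # CNF -> P(satisfiable)

def toy_predictor(cnf: List[List[int]]) -> float:
    """Placeholder heuristic standing in for a (poly,0)-optimal predictor:
    the probability that a uniformly random assignment satisfies the
    formula, under a (false in general) independence assumption."""
    p = 1.0
    for clause in cnf:
        p *= 1.0 - 2.0 ** (-len(clause))
    return p

def choose_action(actions: List[str],
                  query_for: Callable[[str], List[List[int]]],
                  predictor: SatPredictor) -> str:
    """Pick the action whose associated SAT query (hypothetically encoding
    'this action achieves high total utility') the predictor rates most
    likely satisfiable."""
    return max(actions, key=lambda a: predictor(query_for(a)))

# Toy usage: two actions, each mapped to a CNF query by some encoder.
queries = {"a": [[1], [-1]],        # unsatisfiable: x1 and not x1
           "b": [[1, 2], [-1, 2]]}  # satisfiable: take x2 = true
print(choose_action(["a", "b"], queries.get, toy_predictor))  # -> "b"
```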