(I don’t think that is really possible: if the function is sophisticated enough to actually work in general, it must have a lot of internal sub-structure, and the smaller things it does inside itself could be treated as “decisions” that aren’t being made using the whole function, which contradicts the original premise.)
Even if the decision function has a lot of sub-structure, I think that, in the context of AGI:
(less important point) It is unlikely that we will be able to directly separate sub-structures of the function from the whole function. This is because I’m assuming the function uses some heuristic approximation of logical induction to think about itself, and that heuristic is used across basically every aspect of the function.
(more important point) It doesn’t matter whether it’s a sub-structure or not. The point is that some part of the decision function is already capable of reasoning about improving itself or about improving other parts of the decision function. So whatever method it uses to anticipate whether it should attempt self-improvement is already baked in, in some way.
Re: Q/A/A1, I guess I agree that these things are (as best I can tell) inevitably pragmatic. And that, as EY says in the post you link, “I’m managing the recursion to the best of my ability” can mean something better than just “I work on exactly N levels and then my decisions at level N+1 are utterly arbitrary.” But then this seems to threaten the Embedded Agency programme, because it would mean we can’t make theoretically grounded assessments or comparisons involving agents as strong as ourselves or stronger.
So “I work on exactly N levels and then my decisions at level N+1 are utterly arbitrary” is not exactly true, because in all relevant scenarios we’re the ones who build the AI. It’s more like “I work on exactly N levels, and my decisions at level N+1 were deemed irrelevant by the selection pressures that created me and handed me a decision function that deems further levels irrelevant.”
If we’re okay with leveraging normative or empirical assumptions about the world, we should be able to assess AGI (or have the AGI assess itself) with methods that we’re comfortable with.
In some sense, we have practical examples of what this looks like. N, the level of meta, can be viewed as a hyperparameter of our learning system. In data science, hyperparameters perform differently on different problems, so people often use Bayesian optimization to iteratively pick good hyperparameters. But, you might say, the Bayesian hyperparameter optimization process requires its own priors: it too has hyperparameters!
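To make that concrete, here is a minimal sketch (my own toy example; `validation_loss` is a made-up stand-in for actually training a model) of a Bayesian-optimization-style tuning loop. The thing to notice is that the tuner itself has knobs: the kernel’s length scale and the exploration weight sit one level of meta above the hyperparameter being tuned.

```python
# Toy sketch: tune one hyperparameter (log learning rate) with a
# Bayesian-optimization-style loop. The tuner's own knobs -- the kernel's
# length_scale and the exploration weight kappa -- are level-(N+1) choices.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def validation_loss(log_lr):
    """Made-up stand-in for training a model at learning rate 10**log_lr."""
    return (log_lr + 3.0) ** 2 + 0.1 * np.sin(5 * log_lr)

# Hyperparameters of the hyperparameter optimizer:
kernel = RBF(length_scale=1.0)  # prior assumption about smoothness
kappa = 2.0                     # explore/exploit trade-off in the acquisition

observed_x = [-5.0, 0.0]
observed_y = [validation_loss(x) for x in observed_x]
candidates = np.linspace(-6.0, 1.0, 200).reshape(-1, 1)

for _ in range(10):
    # optimizer=None keeps the kernel's length_scale fixed at our prior choice.
    gp = GaussianProcessRegressor(kernel=kernel, optimizer=None).fit(
        np.array(observed_x).reshape(-1, 1), observed_y)
    mean, std = gp.predict(candidates, return_std=True)
    # Lower-confidence-bound acquisition: optimistic by kappa standard deviations.
    next_x = float(candidates[np.argmin(mean - kappa * std), 0])
    observed_x.append(next_x)
    observed_y.append(validation_loss(next_x))

print("best log learning rate found:", observed_x[int(np.argmin(observed_y))])
```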
But no one really bothers to optimize these, for a couple of reasons:
#1. As we increase the level of meta in a particular optimization process, we tend to see diminishing returns in model performance.
#2. Meta-optimization is prohibitively expensive: each N-level meta-optimizer generally needs to consider multiple candidate (N-1)-level optimizers in order to pick the best one. Inductively, this means an N-level meta-optimizer’s computational cost is around x^N, where x is the number of (N-1)-level optimizers each optimizer needs to consider (see the sketch below).
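As a rough illustration of where that exponent comes from (my own toy cost model, assuming the branching factor x is constant across levels):

```python
# Toy cost model: an N-level meta-optimizer tries x candidate (N-1)-level
# optimizers, each of which tries x candidates one level down, and so on,
# so the number of base-level runs grows like x**N.
def meta_optimization_cost(level: int, branching: int) -> int:
    """Base-level optimizer runs performed by a `level`-level meta-optimizer."""
    if level == 0:
        return 1  # just run the base optimizer once
    return branching * meta_optimization_cost(level - 1, branching)

for n in range(5):
    print(n, meta_optimization_cost(n, branching=3))  # 1, 3, 9, 27, 81 == 3**n
```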
But #1 can’t actually be proved. It’s just an assumption that we think is true because we have a strong observational prior for it being true. Maybe we should question how human brains generate their priors but, at the end of the day, the way we do this questioning is still determined by our hard-coded algorithms for dealing with probability.
The upshot is that, when we look at problems similar to the one we face with embedded agency, we still use the Eliezer-ian approach. We just happen to be very confident in our boundary for reasons that cannot be rigorously justified.