Hey Steven, I'm new to the LW community, so please excuse my formatting.
Case #1 would involve changing the model weights, while Case #2 would not. Instead, Case #2 would solely involve changing the model activations.
I am confused about the deployment part of offline training. Is it not the case that when people use a model (aka query a trained model on validation set), they seek to evaluate it rather than fit the new examples? So wouldn't the distinction be about changing weights in online learning vs. using the relevant activations in offline mode?
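To make concrete what I mean, here's a minimal PyTorch sketch (the toy model and all names are just illustrative, not from the post):

```python
import torch
import torch.nn as nn

# Toy stand-in model; purely illustrative.
model = nn.Linear(10, 2)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

x, y = torch.randn(1, 10), torch.tensor([1])

# Online learning: each new example also updates the weights (Case #1).
logits = model(x)              # forward pass computes activations
loss_fn(logits, y).backward()  # gradients with respect to the weights
optimizer.step()               # the weights themselves change

# Offline / deployed use: weights are frozen; querying the model only
# produces new activations (Case #2).
with torch.no_grad():
    logits = model(x)          # activations change; weights do not
```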
Two models for AGI development. The one on the left is directly analogous to how evolution created human brains. The one on the right involves an analogy between the genome and the source code defining an ML algorithm, as spelled out in the next subsection.
Could it be the case that the “evolution from scratch” model is learned in the Learned Content of the “ML code” approach? Is that what the mesa-optimization line suggests?

Thanks!
Thanks yourself!

when people use a model (aka query a trained model on validation set)
You say “aka”, but those seem different to me. For example, in regards to GPT-3, we can consider:
Training: Weights are updated by self-supervised learning.
Evaluation: OpenAI staff use the trained model on data that wasn’t part of training, in order to estimate things like perplexity, performance on benchmarks, etc.
Use / deployment: Some random author buys access to the OpenAI API and uses GPT-3 to help them to brainstorm how to advance the plot of a short story that they’re writing.
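For concreteness, here's a toy sketch of those three stages, using GPT-2 via Hugging Face transformers as a stand-in for GPT-3 (all the specifics are illustrative, not OpenAI's actual pipeline):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# A small stand-in for GPT-3, purely for illustration.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# 1. Training: weights are updated by self-supervised (next-token) learning.
batch = tokenizer("The cat sat on the mat.", return_tensors="pt")
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()
optimizer.step()

# 2. Evaluation: estimate e.g. perplexity on text that wasn't in training;
#    the weights are read, not written.
model.eval()
with torch.no_grad():
    batch = tokenizer("A sentence held out from training.", return_tensors="pt")
    perplexity = torch.exp(model(**batch, labels=batch["input_ids"]).loss)

# 3. Use / deployment: frozen weights generate text for an end user.
with torch.no_grad():
    prompt = tokenizer("The plot thickens when", return_tensors="pt")
    out = model.generate(**prompt, max_new_tokens=20)
print(tokenizer.decode(out[0]))
```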
Could it be the case that the “evolution from scratch” model is learned in the Learned Content of the “ML code” approach? Is that what the mesa-optimization line suggests?
We’re talking about the diagram in Section 8.3, right side. I interpret your comment as saying: What if the “Learned content” box was sufficiently powerful that it could, say, implement any computable function? If so, then a whole second, separate model-based RL system could appear inside that “Learned content” box. (Is that what you’re saying?)
If so, I agree in principle. But in practice I expect the “Learned content” box to not be able to implement any computable function, or (more specifically) to not be able to run all the machinery of an entire separate “mesa” model-based RL system. Instead I expect it to be narrowly tailored to performing an operation that we might describe as “querying and/or updating a probabilistic world-model”. (And value function and so on.)
So I think “mesa-optimizers”, as the term is normally used today, are really specific to the “evolution from scratch” model, and not a useful thing to talk about in the context of the “genome = ML code” model.