Task: Suggest surprising experiments that challenge assumptions
Context: A researcher is considering an alignment proposal that hinges on some key assumptions. They would like to see suggestions for experiments (either theoretical thought experiments or actual real-world experiments) that could challenge those assumptions. If an experiment has already been done, the suggestion should report its results.
Input type: An assumption about a powerful AI system
Output type: A suggestion for an experiment that could challenge that assumption. If the experiment has already been done, the results of that experiment.
Instance 1:
Input: The performance of a model is impossible to predict, so we can’t hope to have an idea of a model’s capabilities before it is trained and evaluated.
Output: A key measure of a model’s performance, such as the loss, might scale predictably with model size. This was investigated by Kaplan et al. (https://arxiv.org/abs/2001.08361), who found that the loss tends to follow a power law.
Instance 2:
Input: Suppose a model is trained on data that is mixed with some noise (as in https://arxiv.org/pdf/2009.08092.pdf). The model will necessarily learn that the data was mixed with some noise, rather than learn a really complex decision boundary.
Output: Suppose that you try fine-tuning one of these models on data that doesn’t have the noise. If it is very slow to adapt, that would suggest it learned the complex decision boundary instead. (This experiment hasn’t been done.)
Instance 3:
Input: It’s impossible to train a neural network without non-linearities like ReLU or a sigmoid.
Output: That is true for theoretical neural networks, but real neural networks are trained using floating point numbers with inherently non-linear arithmetic. These imperfections might be enough to train a competent model. This experiment was done by Jakob Foerster, who found that this was indeed enough: https://openai.com/blog/nonlinear-computation-in-linear-networks/
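Sketch for Instance 1: one way to probe the "impossible to predict" assumption is to fit a power law to the losses of small models and check how well it extrapolates to a larger one. The snippet below is a minimal illustration under stated assumptions, not the procedure used by Kaplan et al. The function name fit_power_law, the functional form loss ~= a * size**(-alpha), and all numbers are made up for illustration; the data is synthetic and generated from an exact power law, so the extrapolation succeeds by construction.

```python
import math


def fit_power_law(sizes, losses):
    """Fit loss ~= a * size**(-alpha) by least squares in log-log space.

    Hypothetical helper for illustration only; returns (a, alpha).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(l) for l in losses]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    # Ordinary least squares for a straight line y = intercept + slope * x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic data (assumed values, not measurements from any paper).
sizes = [1e6, 1e7, 1e8, 1e9]
losses = [5.0 * n ** -0.08 for n in sizes]

# Fit on the three smallest "models", then extrapolate to the largest.
a, alpha = fit_power_law(sizes[:3], losses[:3])
predicted = a * sizes[3] ** -alpha
print(f"alpha={alpha:.3f}, predicted={predicted:.4f}, actual={losses[3]:.4f}")
```

With real training runs, the interesting quantity is the gap between the predicted and actual loss of the held-out largest model: a small gap would count as evidence against the assumption.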