Thanks for taking and sharing your notes! Adding some of my own below that I haven’t seen mentioned yet:
Sam made a case that people will stop caring about the size of the models as measured by the number of parameters, but will instead care about the training compute (with models that train continuously being the ultimate target). Parameters will get outdated in the same way we don’t measure CPU performance using gigahertz anymore.
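To make the compute-vs-parameters framing concrete (this is my own back-of-envelope, not something from the talk): a common rule of thumb for dense transformers is that training compute is roughly 6 × parameters × training tokens, so a smaller model trained on much more data can easily cost more total compute than a bigger one. A quick sketch with made-up numbers:

```python
# Rough sketch (not from the talk): comparing models by training compute
# rather than parameter count, using the common C ~= 6 * N * D FLOPs
# rule of thumb for dense transformers. All numbers are illustrative.

def training_compute_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute for a dense transformer."""
    return 6 * n_params * n_tokens

# Two hypothetical models: the smaller one trained on far more data
# ends up costing more total compute than the bigger one.
small_long = training_compute_flops(n_params=10e9, n_tokens=2e12)    # ~1.2e23 FLOPs
big_short = training_compute_flops(n_params=100e9, n_tokens=100e9)   # ~6.0e22 FLOPs

print(f"10B params / 2T tokens:    {small_long:.1e} FLOPs")
print(f"100B params / 100B tokens: {big_short:.1e} FLOPs")
```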
The main bottleneck towards AGI at the moment is algorithmic/theoretical breakthroughs. There were times when Sam was convinced compute was the bottleneck, but not anymore. OpenAI believes there’s enough compute in the world to run an AGI (whenever the algorithmic breakthroughs arrive). He also shared that the most pessimistic scenarios they’ve modelled put the power requirements for running the hardware for an AGI at around one nuclear plant, which in his opinion is not too much, and also means you could put that machine close to clean energy (e.g. near a volcano in Iceland or a waterfall in Canada).
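For a sense of scale (again my own numbers, not Sam’s): a single large reactor produces on the order of 1 GW of electrical power, so assuming very roughly 1 kW per accelerator including cooling overhead, one plant could run on the order of a million accelerators:

```python
# Back-of-envelope check (my own assumptions, not figures from the talk):
# how much hardware one nuclear plant's worth of power could run.

plant_output_watts = 1e9        # ~1 GW, typical electrical output of one large reactor
watts_per_accelerator = 1_000   # assume ~1 kW per GPU/accelerator incl. cooling overhead

accelerators_supported = plant_output_watts / watts_per_accelerator
print(f"~{accelerators_supported:,.0f} accelerators on one plant")  # ~1,000,000
```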
On the differences between the approaches of OpenAI and DeepMind: DeepMind seems to involve a lot more neuroscientists and psychologists in their research. OpenAI studies deep learning “like people study physics”.
Sam mentioned that the name “OpenAI” is unfortunate, but they are stuck with it. The reason they don’t release some of their models along with weights and biases is so that they can keep some level of control over their usage, and can shut them down if they need to. He said they like the current API-based approach of releasing models without completely giving away control over them.
On figuring out whether a model is conscious, Sam shared one speculation by a researcher from OpenAI: make sure to train the model on data that does not mention “self-awareness” or “consciousness” in any way, then at run time try to explain those concepts to it. If the model responds with something akin to “I understand exactly what you’re saying”, it’s a worrying sign about that model’s self-awareness. Also, as pointed out above, they have no idea whether intelligence can be untangled from consciousness.
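In case it helps, here’s how I’d sketch that proposed test in code; all of the names and the filtering/probing logic are my own hypothetical illustration, not anything OpenAI described in detail:

```python
# Very rough sketch of the proposed test. Function names and the
# filtering/probing logic are hypothetical, not an actual OpenAI procedure.

FORBIDDEN_TERMS = {"self-awareness", "self-aware", "consciousness", "conscious"}

def filter_training_corpus(documents):
    """Keep only documents that never mention the withheld concepts."""
    return [d for d in documents if not any(t in d.lower() for t in FORBIDDEN_TERMS)]

def probe_model(generate):
    """At run time, explain the withheld concepts and inspect the reply.

    `generate` is assumed to be a text-completion function for a model
    trained only on the filtered corpus.
    """
    prompt = (
        "Here is an idea you have never been told about: some systems have "
        "an inner sense of being a subject that experiences things. "
        "Does this description apply to anything you are familiar with?"
    )
    reply = generate(prompt)
    # A reply like "I understand exactly what you're saying" would be the
    # worrying sign described above; an uncomprehending reply would not.
    return reply
```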
There was also a whole discussion about the merits of leaving academia (or, more generally, any organization that does not reward thinking about AI safety) vs. staying to persuade some of the smartest people who are still part of that system.
Parameters will get outdated in the same way we don’t measure CPU performance using gigahertz anymore.
This was in the context of him being asked about the number of parameters of GPT-4. He said that the big changes are not in the number of parameters but in the structure of the model, and he made a pretty excited and/or confident impression on me when he said it. I wouldn’t be surprised if the next GPT-N is much better without many more parameters.