Are you talking about an abstract generative model with unlimited computational resources or an actual model that may exist in X years from now?
I think it’s plausible we’ll have models capable of pretty sophisticated simulation, with access to enormous computational resources, before AGI becomes a serious threat.
Solving complex problems with AI doesn’t necessarily mean running simulations.
Totally agree. I’m using the word simulation a bit loosely. All I mean is that the AI can predict observations as if from a simulation.
But if you do expect that the model will run simulations inside itself, what level of simulation fidelity are you imagining?
I’m imagining something like “the model has high fidelity on human thought patterns and some basic physical-world-model stuff like that human bodies are fragile, but it doesn’t simulate individual cells or anything like that”.
There is no way around the halting problem and the second law of thermodynamics.
I don’t think the halting problem is relevant here? Nor the second law… but maybe I’m missing something?
I should have been clearer. I don’t exactly remember what I was thinking about now. Maybe it was about the suggested prompt in the “Simulating Human Alignment Researchers” section: if the oracle is expected to somehow compress everything that would have happened over 2,000 (or more?) years into its simulation, then either it has to run for thousands or millions of years itself (for a sufficiently high-fidelity simulation), or the simulation will dramatically diverge from what is actually likely to happen in reality. (That said, there is no particular relation between this idea and the halting problem or the second law.)
Alternatively, if the prompt is designed just to “prime the oracle’s imagination” before writing the textbook, rather than as an invitation to run an elaborate simulation, I don’t see how it’s at all safer than plainly asking the oracle to write the alignment textbook with proofs.