The theory based on anchoring to the total “computation done by evolution in the history of life” completely fails to postdict the success of deep learning, and is therefore so wildly improbable as to not be worth any further consideration; its inclusion seriously undermines the entire report.
I don’t think the point is that you need that much compute; rather, that’s an upper bound on how much compute you might need. So I don’t understand your argument; it’s not like the report takes this as its central estimate. I don’t think the scaling in performance we’ve seen in the past 10 years, in which training compute got scaled up by 6-7 OOM in total, is strong evidence against training requirements for AGI being around 10^40 FLOP. That question just looks mostly uncertain to me.
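For concreteness, here’s a rough sketch of the arithmetic behind that gap; the frontier-compute figure below is just a placeholder assumption of mine, not a number from the report.

```python
import math

# Placeholder assumption: a recent frontier training run on the order of 1e25 FLOP.
frontier_flop = 1e25

# Candidate training requirements discussed in this thread.
candidates = {"1e40 FLOP": 1e40, "1e45 FLOP": 1e45}

for label, requirement in candidates.items():
    gap_oom = math.log10(requirement / frontier_flop)
    print(f"{label} sits ~{gap_oom:.0f} OOM above the assumed frontier run")

# Both gaps dwarf the ~6-7 OOM of scaling seen over the past decade, which is the
# sense in which past scaling is argued above to be weak evidence either way.
```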
The point of that post was to refute statements like that, which are based on naive, oversimplified layman models (models any good semiconductor engineer would scoff at, though the wisdom from that field is not widely enough distributed).
Again, I don’t think this is particularly relevant to the post. I agree with you that the Landauer limit bound is very loose; that’s the entire reason I cited your post to begin with. I’m not sure why you felt that your message had not been properly communicated. I’ve edited this part of the question to clarify what I actually meant.
However, it’s much easier to justify this bound for a physical computation you don’t understand very well than to justify something that’s tighter, and all you get after correcting for that is probably ~ 5 OOM of difference in the final answer, which I already incorporate in my 10^45 FLOP figure and which is also immaterial to the question I’m trying to ask here.
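To make the looseness of that kind of bound concrete, here’s a minimal sketch of the Landauer-style arithmetic; the energy budget is a placeholder assumption for illustration, not a figure from the report or from your post.

```python
import math

K_B = 1.380649e-23                      # Boltzmann constant, J/K
T = 300.0                               # assumed operating temperature, K
landauer_j = K_B * T * math.log(2)      # ~2.9e-21 J minimum to erase one bit

# Placeholder assumption: total energy budget of the physical computation being
# bounded, in joules. This is NOT a figure from the report or the cited post.
energy_budget_j = 1e30

max_bit_erasures = energy_budget_j / landauer_j
print(f"Landauer-style upper bound: ~10^{math.log10(max_bit_erasures):.1f} bit erasures")

# Real computation runs many OOM above the Landauer limit, so assuming a more
# realistic energy per operation lowers the bound by those same OOM; that is the
# kind of ~5 OOM correction discussed above.
```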
If you start doing anthropic adjustments, you don’t just stop there. The vast majority of our measure will be in various simulations, which dramatically shifts everything around to favor histories with life leading to singularities.
I strongly disagree with this way of applying anthropic adjustments. I think these should not, in principle, be different from Bayesian updates: you start with some prior (it could be something like a simplicity prior) over all the subjective universes you could have observed and update based on what you actually observe. In that case there’s a trivial sense in which the simulation hypothesis is true, because you could always have a simulator that simulates every possible program that halts, or something like this, but that doesn’t help you actually reduce the entropy of your own observations or predict anything about the future, so it is not functional.
I think for this to go through you need to do anthropics using SIA or something similar, and I don’t think that’s justifiable, so I also think this whole argument is illegitimate.
So I don’t understand your argument; it’s not like the report takes this as its central estimate. I don’t think the scaling in performance we’ve seen in the past 10 years, in which training compute got scaled up by 6-7 OOM in total, is strong evidence against training requirements for AGI being around 10^40 FLOP.
My argument is that the report does not use the correct procedure, which is to develop one or a few simple models that best postdict the relevant observed history. Most of a correct report would then consist of comparing the postdictions of those simple models to the relevant history (AI progress) in order to adjust hyperparameters and do model selection.
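A minimal sketch of the loop I have in mind, with entirely made-up placeholder numbers rather than real compute estimates:

```python
import numpy as np

# Placeholder history of (year, log10 training FLOP); these values are made up
# purely to illustrate the postdict-then-select loop, not real compute estimates.
history = np.array([[2012, 17.5], [2015, 19.0], [2018, 21.0], [2021, 23.0], [2024, 25.0]])
years, log_flop = history[:, 0], history[:, 1]
x = years - years[0]                        # center years to keep the fit well conditioned

train, test = slice(0, 3), slice(3, None)   # fit on early history, postdict the rest

def postdiction_error(degree):
    """Fit a simple trend model of the given degree, score it on the held-out history."""
    coeffs = np.polyfit(x[train], log_flop[train], degree)
    postdicted = np.polyval(coeffs, x[test])
    return float(np.mean((postdicted - log_flop[test]) ** 2))

# Model selection: keep the simple model whose postdictions best match what happened.
for degree in (1, 2):
    print(f"degree-{degree} trend: postdiction MSE = {postdiction_error(degree):.3f}")
```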
However, it’s much easier to justify this bound for a physical computation you don’t understand very well than to justify something that’s tighter, and all you get after correcting for that is probably ~ 5 OOM of difference in the final answer, which is immaterial to the question I’m trying to ask here.
Fair.
If you start doing anthropic adjustments, you don’t just stop there. The vast majority of our measure will be in various simulations, which dramatically shifts everything around to favor histories with life leading to singularities.
I strongly disagree with this way of applying anthropic adjustments. I think these should not, in principle, be different from Bayesian updates: you start with some prior (it could be something like a simplicity prior) over all the subjective universes you could have observed and update based on what you actually observe. In that case there’s a trivial sense in which the simulation hypothesis is true, because you could always have a simulator that simulates every possible program that halts, or something like this, but that doesn’t help you actually reduce the entropy of your own observations or predict anything about the future, so it is not functional.
The optimal inference procedure (Solomonoff induction in binary logic form, equivalent to full Bayesianism) is basically what you describe: form a predictive distribution from all computable theories ranked by total entropy (posterior fit plus complexity prior). I agree that this probably does lead to accepting the simulation hypothesis, because most of the high-fit submodels based on extensive physics sims will likely locate observers in simulations rather than root realities.
The anthropic update is then a shift from approximate predictive models that don’t feature future sims to ones that do.
I don’t understand what you mean by “doesn’t help you actually reduce the entropy of your own observations”, as that’s irrelevant here. The anthropic update to include the sim hypothesis is not an update to the core ideal predictive models themselves (as those are just physics); it’s an update to the approximations we inevitably have to use to predict the far future.
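As a toy sketch of the scoring I mean (the hypothesis class and bit string are made-up placeholders, and of course real Solomonoff induction ranges over all programs and is uncomputable):

```python
import math

observations = "0101010101"          # placeholder observed bit string

def alternating(prefix):
    """P(next bit = 1) under a 'bits alternate 0,1,0,1,...' theory."""
    return 0.99 if len(prefix) % 2 == 1 else 0.01

def fair_coin(prefix):
    """P(next bit = 1) under an 'independent fair coin' theory."""
    return 0.5

# Toy stand-in for "all computable theories": each entry is (predictor, description
# length in bits), the latter playing the role of the complexity prior.
hypotheses = {
    "alternating bits": (alternating, 12),
    "fair coin":        (fair_coin, 4),
}

def total_bits(predict, description_bits):
    """Complexity prior + data fit, both in bits; lower total means higher weight."""
    fit_bits = 0.0
    for i, bit in enumerate(observations):
        p_one = predict(observations[:i])
        p = p_one if bit == "1" else 1.0 - p_one
        fit_bits += -math.log2(p)
    return description_bits + fit_bits

for name, (predict, desc_bits) in hypotheses.items():
    print(f"{name}: total = {total_bits(predict, desc_bits):.1f} bits")
```

In this toy run the more complex “alternating” theory wins because its extra description length is more than repaid by how much better it compresses the observations.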
I think for this to go through you need to do anthropics using SIA or something similar
I don’t see the connection to SIA, and regardless of that philosophical confusion, there is only one inference method known to be universally correct in the limit, so the question is always just: what would a computationally unbounded Solomonoff inducer infer?