I’m not truly impressed with GiveWell’s general optimization since they never made a good case that malaria was connected to astronomical benefits or, indeed, seem to have realized that such a case is necessary for effective altruism.
Well, but I’m not sure MIRI can be said to have “made a good case” that its own work is well-connected to astronomical benefits, either. Presumably the argument for that looks something like the FAI Research as Effective Altruism argument, but that argument hasn’t been made in much detail, with the key assumptions clearly identified and argued for with clarity and solid evidential backing. E.g.:
Beckstead’s 2013 thesis is the first document I’m aware of that clearly lays out all the assumptions baked into the argument for the overwhelming importance of the far future.
My 2013 post When Will AI Be Created? is (I think) the best available piece for capturing the enormous difficulties of predicting AI — with reference to lots of relevant empirical data — while also (barely) making the case for assigning a good chunk of one’s probability mass to getting AI this century. But it’s still pretty inadequate, and the part making the case for the plausibility of AI this century could be substantially improved if more time was invested. (Compare to Bostrom 1998, which I find inadequate. I also think it will now look naively timelines-optimistic to most observers.)
Moreover, it’s not that GiveWell (well, Holden) hasn’t “realized” that recommended altruistic interventions (e.g. bednets) need to be connected via argument to astronomical benefits. Rather, Holden has been aware of astronomical waste arguments for a long time, and has reasons for rejecting them. He also discussed astronomical waste arguments many times with Beckstead while Beckstead was writing his dissertation. Unfortunately, Holden has struggled to clearly express his reasons for rejecting astronomical waste arguments. He tried to explain his reasons to me in person once but I couldn’t make sense of what he was saying. He also tried to explain his point in the last three paragraphs of this comment, but I, at least, still don’t understand quite what he’s saying. Explaining is hard.
Also, Holden has spent a lot of time working up to an explanation of why he (currently) thinks that (1) “generic good work” (which may indirectly produce astronomical benefits via ripple effects) has higher expected value than (2) narrow interventions aimed more directly at astronomical benefit. His two latest posts in this thread are Flow-through effects and Possible global catastrophic risks, and he has promised that “a future post will discuss how I think about the overall contribution of economic/technological development to our odds of having a very bright, as opposed to very problematic, future.”
And all this during the early years in which GiveWell mostly hasn’t been investigating trickier issues like how different interventions connect to potential astronomical benefits, because GiveWell (wisely, I think) decided to start under the streetlight.
Well, but I’m not sure MIRI can be said to have “made a good case” that its own work is well-connected to astronomical benefits, either.
False modesty. The ‘good case’ already made for FAI being (optimally) related to astronomical benefits and the ‘good case’ already made for malaria reduction being (optimally) related to astronomical benefits are not of the same order of magnitude of already madeness.
I’m not sure “false modesty” applies, at least given my views about the degree to which the FAI case has been made.
For my own idea of “good case made,” anyway, I’d say the “malaria nets near-optimally connected to astronomical benefits” case is close to 0% of the way to “good case made,” and the “FAI research near-optimally connected to astronomical benefits” case is more like 10% of the way to “good case made.”
I don’t think that MIRI has made a case for the particular FAI research that it’s doing having non-negligible relevance to AI safety. See my “Chinese Economy” comments here.
Unfortunately, Holden has struggled to clearly express his reasons for rejecting astronomical waste arguments.
It looks to me like he is using a bounded utility function with a really low bound. See this passage:
I feel that humanity’s future may end up being massively better than its past, and unexpected new developments (particularly technological innovation) may move us toward such a future with surprising speed. Quantifying just how much better such a future would be does not strike me as a very useful exercise, but very broadly, it’s easy for me to imagine a possible future that is at least as desirable as human extinction is undesirable. In other words, if I somehow knew that economic and technological development were equally likely to lead to human extinction or to a brighter long-term future, it’s easy for me to imagine that I could still prefer such development to stagnation.
If the best possible future that Holden can imagine (which the rest of the post makes clear does include space colonization) doesn’t have much more than twice the utility of stagnation (setting extinction to be the zero point), then “astronomical waste” obviously isn’t very astronomical in terms of Holden’s utility function.
He gave a lower bound, sufficient to motivate the view that we should not seek stagnation, which is what he seems to be talking about there. Why reinterpret a lower bound (one that is “easy” to imagine, is all that is needed to establish the point, and is less controversial) as a near-upper bound?
Stagnation on Earth means astronomical waste almost exactly as much as near-term extinction (and also cuts us off from very high standards of living that might be achieved). Holden is saying that the conclusion that growth with plausible risk levels beats permanent stagnation is robust. Talking about 100:1 tradeoffs would be less robust.
I guess I was doing a Bayesian update based on what he wrote. Yes, technically he gave a lower bound, but while someone who thinks that the best possible future is 10 times better than stagnation (relative to extinction) might still write “Quantifying just how much better such a future would be does not strike me as a very useful exercise, but very broadly, it’s easy for me to imagine a possible future that is at least as desirable as human extinction is undesirable”, someone who thinks it’s at least a thousand or a billion times better probably wouldn’t.
Moreover, it’s not that GiveWell (well, Holden) hasn’t “realized” that recommended altruistic interventions (e.g. bednets) need to be connected via argument to astronomical benefits. Rather, Holden has been aware of astronomical waste arguments for a long time, and has reasons for rejecting them. He also discussed astronomical waste arguments many times with Beckstead while Beckstead was writing his dissertation. Unfortunately, Holden has struggled to clearly express his reasons for rejecting astronomical waste arguments. He tried to explain his reasons to me in person once but I couldn’t make sense of what he was saying. He also tried to explain his point in the last three paragraphs of this comment, but I, at least, still don’t understand quite what he’s saying. Explaining is hard.
Speaking for myself:
One response is that the argument for acting on the astronomical waste argument is only one relatively strong argument that should be weighed against more prosaic ethical considerations in order to account for model uncertainty.
Here is a concrete argument against giving shaping the far future dominant consideration in one’s philanthropic decision making, within the astronomical waste framework.
The doomsday argument suggests that the human race is going to go extinct in the relatively near term with very high probability. This is strange, because there doesn’t seem to be any other reason for thinking this.
A reconciliation of the doomsday argument with the absence of other evidence for extinction that is sometimes offered is the theory that we’re living in one of many simulations that were created by past humans who underwent a singularity scenario, and that our simulation is going to be turned off soon.
If we’re in one of many simulations with other humans, and the humans in these simulations are sufficiently correlated, timeless decision theory suggests that ordinary helping has astronomical benefits.
Those who subscribe to this view often believe that despite this consideration, shaping the far future nevertheless dominates ordinary helping in expected value. But they might be wrong about this. It appears that they would have to be wrong with awfully high probability in order to overturn the expected value of focusing on shaping the far future. But maybe this appearance is illusory, and for some reason that people haven’t recognized yet, the benefits of ordinary helping mediated through timeless decision theory swamp the expected value of focusing on shaping the far future.
A nontrivial chance of this being true would establish a lower bound on how good a potential opportunity to shape the far future has to be in order to overcome opportunities for ordinary helping.
Thanks, Jonah. I think skepticism about the dominance of the far future is actually quite compelling, such that I’m not certain that focusing on the far future dominates (though I think it’s likely that it does on balance, but much less than I naively thought).
The strongest argument is just that believing we are in a position to influence astronomical numbers of minds runs contrary to Copernican intuitions that we should be typical observers. Isn’t it a massive coincidence that we happen to be among a small group of creatures that can most powerfully affect our future light cone? Robin Hanson’s resolution of Pascal’s mugging relied on this idea.
The simulation-argument proposal is one specific way to hash out this Copernican intuition. The sim arg is quite robust and doesn’t depend on the self-sampling assumption the way the doomsday argument does. We have reasonable a priori reasons for thinking there should be lots of sims—not quite as strong as the arguments for thinking we should be able to influence the far future, but not vastly weaker.
Let’s look at some sample numbers. We’ll work in units of “number of humans alive in 2014,” so that the current population of Earth is 1. Let’s say the far future contains N humans (or human-ish sentient creatures), and a fraction f of those are sims that think they’re on Earth around 2014. The sim arg suggests that Nf >> 1, i.e., we’re probably in one of those sims. The probability we’re not in such a sim is 1/(Nf+1), which we can approximate as 1/(Nf). Now, maybe future people have a higher intensity of experience i relative to that of present-day people. Also, it’s much easier to affect the near future than the far future, so let e represent the amount of extra “entropy” that our actions face if they target the far future. For example, e = 10^-6 says there’s a factor-of-a-million discount for how likely our actions are to actually make the difference we intend for the far future vs. if we had acted to affect the near term. This entropy can come from uncertainty about what the far future will look like, failures of goal preservation, or intrusion of black swans.
Now let’s consider two cases—one assuming no correlations among actors (CDT) and one assuming full correlations (TDT-ish).
CDT case:
If we help in the short run, we can affect something like 1 unit of people (recalling that, in our units, “1” means the 2014 population of about 7 billion).
If we help in the long run, if we’re not in a sim, we can affect N people, with an i experience-intensity multiple, with a factor of e for uncertainty/entropy in our efforts. But the probability we’re not in a sim is 1/(Nf), so the overall expected value is (1/(Nf)) × N × i × e = ie/f.
It’s not obvious that ie/f > 1. For instance, if f = 10^-4, i = 10^2, and e = 10^-6, this would equal 1. Hence it wouldn’t be clear that targeting the far future is better than targeting the near term.
TDT-ish case:
There are Nf+1 copies of people (who think they’re) on Earth in 2014, so if we help in the short run, we help all of those Nf+1 people because our actions are mirrored across our copies. Since Nf >> 1, we can approximate this as Nf.
If we help by taking far-future-targeting actions, even if we’re in a sim, our actions can timelessly affect what happens in the basement, so we can have an impact regardless of whether we’re in a sim or not. The future contains N people with i intensity factor, and there’s e entropy on actions that try to do far-future stuff relative to short-term stuff. The expected value is Nie.
The ratio of long-term helping to short-term helping is Nie/(Nf) = ie/f, exactly the same as before. Hence, the uncertainty about whether the near- or far-future dominates persists.
I’ve tried these calculations with a few other tweaks, and something close to ie/f continues to pop out.
Now, this point is again of the “one relatively strong argument” variety, so I’m not claiming this particular elaboration is definitive. But it illustrates the types of ways that far-future-dominance arguments could be neglecting certain factors.
Note also that even if you think ie/f >> 1, it’s still less than the 10^30 or whatever factor a naive far-future-dominance perspective might assume. Also, to be clear, I’m ignoring flow-through effects of short-term helping on the far future and just talking about the intrinsic value of the direct targets of our actions.
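The back-of-envelope calculation above is easy to reproduce numerically. In this sketch the values of N, f, i, and e are arbitrary illustrative guesses (only f, i, and e matter for the final ratio, which is the point):

```python
# Toy reproduction of the CDT vs. TDT-ish expected-value comparison above.
# Units: 1 = the 2014 world population (~7 billion people).
# All four parameter values are illustrative, not estimates.

N = 1e15   # far-future population, in units of the 2014 population
f = 1e-4   # fraction of far-future minds that are sims who think it's 2014
i = 1e2    # experience-intensity multiplier for far-future minds
e = 1e-6   # "entropy" discount on far-future-targeted actions

# CDT case: no correlations among copies.
p_not_sim = 1 / (N * f + 1)       # probability we're in the basement
cdt_short = 1                     # help ~1 unit of people now
cdt_long = p_not_sim * N * i * e  # ≈ (1/(Nf)) * N * i * e = ie/f

# TDT-ish case: actions are mirrored across all Nf + 1 copies.
tdt_short = N * f + 1             # every copy helps its local 2014
tdt_long = N * i * e              # timeless influence on the basement future

print(cdt_long / cdt_short)       # ≈ ie/f
print(tdt_long / tdt_short)       # also ≈ ie/f
```

With these particular numbers ie/f = 1, so neither near-term nor far-future helping dominates; changing any of f, i, or e shifts both ratios identically, which is the sense in which the Nf >> 1 assumption makes N cancel out.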
In the long-run CDT case, why the assumption that people in a sim can’t affect people in the far future? At the very least, if we’re in a sim, we can affect people in the far future of our sim; and probably indirectly in baseline too, insofar as if we come up with a really good idea in the sim, then those who are running the sim may take notice of the idea and implement it outside said sim.
As for the figures; I have a few thoughts about f. Let us assume that the far future consists of one base world, which runs a number of simulations, which in turn run sub-simulations (and those run sub-sub-simulations, and so on). Let us assume that, at any given moment, each simulation’s internal clock is set to a randomly determined year. Let us further assume that our universe is fairly typical in terms of population.
The number of humans who have ever lived, up until 2011, has been estimated at 107 billion. This means that, if all simulations are constrained to run up until 2014 only, the fraction of people in simulations (at any given moment) who believe that they are alive in 2014 will be approximately 7⁄107 (the baseline will not significantly affect this figure if the number of simulations is large). If the simulations are permitted to run longer (and I see no reason why they wouldn’t be), then that figure will of course be lower, and possibly significantly lower.
I can therefore conclude that, in all probability, f < 7⁄107.
At the same time, Nf >> 1 means that f > 1/N. Of course, since N can be arbitrarily large, this tells us little; but it does imply, at least, that f>0.
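The two bounds just discussed can be combined in a quick numeric sketch (the value of N here is arbitrary and purely illustrative):

```python
# Bounds on f, the fraction of far-future minds that are sims who
# believe they are alive around 2014.

# Upper bound: if sims are constrained to run history only up to ~2014,
# the fraction of simulated people who think it's 2014 is roughly
# (2014 population) / (everyone who ever lived) ≈ 7 billion / 107 billion.
f_upper = 7 / 107

# Lower bound: the simulation argument's premise Nf >> 1 forces f > 1/N.
N = 1e15          # illustrative far-future population, in 2014-population units
f_lower = 1 / N

print(f_lower, "<", "f", "<", f_upper)
```

This leaves an enormous range, which is consistent with the point that the bound f > 1/N "tells us little" beyond f > 0.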
Simulating humans near the singularity may be more interesting than simulating hunter-gatherers, so it may be that the fraction of sims around now is more than 7⁄107.
One reason not to expect the sims to go into the far future is that any far future with high altruistic import will have high numbers of computations, which would be expensive to simulate. It’s cheaper to simulate a few billion humans who have only modest computing power. For the same reason, it’s not clear that we’d have lots of sims within sims within sims, because those would get really expensive—unless computing power is so trivially cheap in the basement that it doesn’t matter.
That said, you’re right there could be at least a reasonable future ahead of us in a sim, but I’m doubtful many sims run the whole length of galactic history—again, unless the basement is drowning in computing power that it doesn’t know what to do with.
Interesting point about coming up with a really good idea. But one would tend to think that the superintelligent AIs in the basement would be much better at that. Why would they bother creating dumb little humans who go on to create their own superintelligences in the sim when they could just use superintelligences in the basement? If the simulators are interested in cognitive/evolutionary diversity, maybe that could be a reason.
Simulating humans near the singularity may be more interesting than simulating hunter-gatherers, so it may be that the fraction of sims around now is more than 7⁄107.
Possibly, but every 2014 needs to have a history; we can find evidence in our universe that around 107 billion people have existed, and I’m assuming that we’re fairly typical so far as universes go.
...annnnnd I’ve just realised that there’s no reason why someone in the future couldn’t run a simulation up to (say) 1800, save that, and then run several simulations from that date forwards, each with little tweaks (a sort of a Monte Carlo approach to history).
One reason not to expect the sims to go into the far future is that any far future with high altruistic import will have high numbers of computations, which would be expensive to simulate. It’s cheaper to simulate a few billion humans who have only modest computing power.
I question the applicability of this assertion to our universe. Yes, a game like Sid Meier’s Civilisation is a whole lot easier to simulate than (say) a crate of soil at the level of individual grains—because there’s a lot of detail being glossed over in Civilisation. The game does not simulate every grain of soil, every drop of water.
Our universe—whether it’s baseline or a simulation—seems to be running right down to the atomic level. That is, if we’re being simulated, then every individual atom, every electron and proton, is being simulated. Simulating a grain of sand at that level of detail is quite a feat of computing—but simulating a grain-of-sand-sized computer would be no harder. In each case, it’s the individual atoms that are being simulated, and atoms follow the same laws whether in a grain of sand or in a CPU. (They have to, or we’d never have figured out how to build the CPU).
So I don’t think there’s been any change in the computing power required to simulate our universe with the increase in human population and computing power.
For the same reason, it’s not clear that we’d have lots of sims within sims within sims, because those would get really expensive—unless computing power is so trivially cheap in the basement that it doesn’t matter.
Sub-sims just need to be computationally simpler by a few orders of magnitude than their parent sims. If we create a sim, then computing power in that universe will be fantastically expensive as compared to ours; if we are a sim, then computing power in our parent universe must be sufficient to run our universe (and it is therefore fantastically cheap as compared to our universe). I have no idea how to tell whether we’re in a top-end one-of-a-kind research lab computer, or the one-universe-up equivalent of a smartphone.
That said, you’re right there could be at least a reasonable future ahead of us in a sim, but I’m doubtful many sims run the whole length of galactic history—again, unless the basement is drowning in computing power that it doesn’t know what to do with.
You have a good point. If we’re a sim, we could be terminated unexpectedly at any time. Presumably as soon as the conditions of the sim are fulfilled.
Of course, the fact that our sim (if we are a sim) is running at all implies that the baseline must have the computing power to run us; in comparison with which, everything that we could possibly do with computing power is so trivial that it hardly even counts as a drain on resources. Of course, that doesn’t mean that there aren’t equivalently computationally expensive things that they might want to do with our computing resources (like running a slightly different sim, perhaps)...
Interesting point about coming up with a really good idea. But one would tend to think that the superintelligent AIs in the basement would be much better at that. Why would they bother creating dumb little humans who go on to create their own superintelligences in the sim when they could just use superintelligences in the basement?
Maybe we’re the sim that the superintelligence is using to test its ideas before introducing them to the baseline? If our universe fulfills its criteria better than any other, then it acts in such a way as to make baseline more like our universe. (Whatever those criteria are...)
there’s no reason why someone in the future couldn’t run a simulation up to (say) 1800, save that, and then run several simulations from that date forwards, each with little tweaks
Yep, exactly. That’s how you can get more than 7⁄107 of the people in 2014.
That is, if we’re being simulated, then every individual atom, every electron and proton, is being simulated.
Probably not, though. In Bostrom’s simulation-argument paper, he notes that you only need the environment to be accurate enough that observers think the sim is atomically precise. For instance, when they perform quantum experiments, you make those experiments come out right, but that doesn’t mean you actually have to simulate quantum mechanics everywhere. Because superficial sims would be vastly cheaper, we should expect vastly more of them, so we’d probably be in one of them.
Many present-day computer simulations capture high-level features of a system without delving into all the gory details. Probably most sims could suffice to have intermediate levels of detail for physics and even minds. (E.g., maybe you don’t need to simulate every neuron, just their higher-level aggregate behaviors, except when neuroscientists look at individual neurons.)
Of course, the fact that our sim (if we are a sim) is running at all implies that the baseline must have the computing power to run us; in comparison with which, everything that we could possibly do with computing power is so trivial
This is captured by the N term in my rough calculations above. If the basement has gobs of computing power, that means N is really big. But N cancels out from the final action-relevant ie/f expression.
Probably not, though. In Bostrom’s simulation-argument paper, he notes that you only need the environment to be accurate enough that observers think the sim is atomically precise.
Hmmm. It’s a fair argument, but I’m not sure how well it would work out in practice.
To clarify, I’m not saying that the sim couldn’t be run like that. My claim is, rather, that if we are in a sim being run with varying levels of accuracy as suggested, then we should be able to detect it.
Consider, for the moment, a hill. That hill consists of a very large number of electrons, protons and neutrons. Assume for the moment that the hill is not the focus of a scientific experiment. Then, it may be that the hill is being simulated in some computationally cheaper manner than simulating every individual particle.
There are two options. Either the computationally cheaper manner is, in every single possible way, indistinguishable from simulating every individual particle. In this case, there is no reason to use the more computationally expensive method when a scientist tries to run an experiment which includes the hill; all hills can use the computationally cheaper method.
The alternative is that there is some way, however slight or subtle, in which the behaviour of the atoms in the hill differs from the behaviour of those same atoms when under scientific investigation. If this is the case, then it means that the scientific laws deduced from experiments on the hill will, in some subtle way, not match the behaviour of hills in general. In this case, there must be a detectable difference; in effect, under certain circumstances hills are following a different set of physical laws and sooner or later someone is going to notice that.

(Note that this can be avoided, to some degree, by saving the sim at regular intervals; if someone notices the difference between the approximation and a hill made out of properly simulated atoms, then the simulation is reloaded from a save just before that difference happened and the approximation is updated to hide that detail. This can’t be done forever—after a few iterations, the approximation’s computational complexity will begin to approach the computational complexity of the atomic hill in any case, plus you’ve now wasted a lot of cycles running sims that had no purpose other than refining the approximation—but it could stave off discovery for a period, at least.)
Having said that, though, another thought has occurred to me. There’s no guarantee (if we are in a sim) that the laws of physics are the same in our universe as they are in baseline; we may, in fact, have laws of physics specifically designed to be easier to compute.

Consider, for example, the uncertainty principle. Now, I’m no quantum physicist, but as I understand it, the more precisely a particle’s position can be determined, the less precisely its momentum can be known—and, at the same time, the more precisely its momentum is known, the less precisely its position can be found. Now, in terms of a simulation, the uncertainty principle means that the computer running the simulation need not keep track of the position and momentum of every particle at full precision. It may, instead, keep track of some single combined value (a real quantum physicist might be able to guess at what that value is, and how position and/or momentum can be derived from it). And given the number of atoms in the observable universe, the data storage saved by this is massive (and suggests that Baseline’s storage space, while immense, is not infinite).
Of course, like any good simplification, the Uncertainty Principle is applied everywhere, whether a scientist is looking at the data or not.
What is and isn’t simulated to a high degree of detail can be determined dynamically. If people decide they want to investigate a hill, some system watching the sim can notice that and send a signal that the sim needs to make the hill observations correspond with quantum/etc. physics. This shouldn’t be hard to do. For instance, if the theory predicts observation X +/- Y, you can generate some random numbers centered around X with std. dev. Y. Or you can make them somewhat different if the theory is wrong and to account for model uncertainty.
If the scientists would do lots of experiments that are connected in complex ways such that consistency requires them to come out with certain complex relationships, you’d need to get somewhat more fancy with faking the measurements. Worst case, you can actually do a brute-force sim of that part of physics for the brief period required. And yeah, as you say, you can always revert to a previous state if you screw up and the scientists find something amiss, though you probably wouldn’t want to do that too often.
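The “random numbers centered around X with std. dev. Y” idea above can be sketched in a few lines. The function name, the Gaussian noise model, and the offset parameter are my own illustrative choices, not anything from Bostrom’s paper:

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def fake_measurement(x_pred, y_err, model_offset=0.0):
    # One simulated reading for a theory predicting x_pred +/- y_err.
    # model_offset stands in for "the theory is wrong / model uncertainty".
    return random.gauss(x_pred + model_offset, y_err)

# The simulators only pay this cost when a scientist actually looks:
readings = [fake_measurement(9.81, 0.05) for _ in range(1000)]
mean = sum(readings) / len(readings)
# mean lands near 9.81, within roughly 0.05 / sqrt(1000) of it
```

The cheapness is the point: faking a thousand consistent lab readings on demand costs almost nothing compared to simulating the underlying physics everywhere, all the time.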
There’s no guarantee (if we are in a sim) that the laws of physics are the same in our universe as they are in baseline; we may, in fact, have laws of physics specifically designed to be easier to compute.
Worst case, you can actually do a brute-force sim of that part of physics for the brief period required.
This is kind of where the trouble starts to come in. What happens when the scientist, instead of looking at hills in the present, turns instead to look at historical records of hills a hundred years in the past?
If he has actually found some complex interaction that the simplified model fails to cover, then he has a chance of finding evidence of living in a simulation; yes, the simulation can be rolled back a hundred years and then re-run from that point onwards, but is that really more computationally efficient than just running the full physics all the time? (Especially if you have to regularly keep going back to update the model).
This is where his fellow scientists call him a “crackpot” because he can’t replicate any of his experimental findings. ;)
More seriously, the sim could modify his observations to make him observe the right things. For instance, change the photons entering his eyes to be in line with what they should be, change the historical records a la 1984, etc. Or let him add an epicycle to his theory to account for the otherwise unexplainable results.
In practice, I doubt atomic-level effects are ever going to produce clearly observable changes outside of physics labs, so 99.99999% of the time the simulators wouldn’t have to worry about this as long as they simulated macroscopic objects to enough detail.
In practice, I doubt atomic-level effects are ever going to produce clearly observable changes outside of physics labs, so 99.99999% of the time the simulators wouldn’t have to worry about this as long as they simulated macroscopic objects to enough detail.
Well, yes, I’m not saying that this would make it easy to discover evidence that we are living in a simulation. It would simply make it possible to do so.
it’s much easier to affect the near future than the far future, so let e represent the amount of extra “entropy” that our actions face if they target the far future. For example, e = 10^-6 says there’s a factor-of-a-million discount for how likely our actions are to actually make the difference we intend for the far future vs. if we had acted to affect the near-term.
In the past, when I expressed worries about the difficulties associated with far-future meme-spreading, which you favor as an alternative to extinction-risk reduction, you said you thought there was a significant chance of a singleton-dominated future. Such a singleton, you argued, would provide the necessary causal stability for targeted meme-spreading to successfully influence our distant descendants. But now you seem to be implying that, other things equal, far-future meme-spreading is several orders of magnitude less likely to succeed than short-term interventions (including interventions aimed at reducing near-term risk of extinction, which plausibly represents a significant fraction of total extinction risk). I find these two views hard to reconcile.
Well, but I’m not sure MIRI can be said to have “made a good case” that its own work is well-connected to astronomical benefits, either. Presumably the argument for that looks something like the FAI Research as Effective Altruism argument, but that argument hasn’t been made in much detail, with the key assumptions clearly identified and argued for with clarity and solid evidential backing. E.g.:
I’m not aware of a thorough, empirical (written) investigation of whether elites will handle AI just fine.
I don’t think that MIRI has made a case for the particular FAI research that it’s doing having non-negligible relevance to AI safety. See my “Chinese Economy” comments here.
Ah, I’d heard a rumor you’d updated away from that, guess that was mistaken. I’ve replied to that comment.
Thanks
It looks to me like he is using a bounded utility function with a really low bound. See this passage:
If the best possible future that Holden can imagine (which the rest of the post makes clear does include space colonization) doesn’t have much more than twice the utility of stagnation (setting extinction to be the zero point), then “astronomical waste” obviously isn’t very astronomical in terms of Holden’s utility function.
He gave a lower bound, sufficient to motivate the view that we should not seek stagnation, which is what he seems to be talking about there. Why reinterpret an “easy” lower bound (which is all that is needed to establish the point, and less controversial) as a near-upper bound?
Stagnation on Earth means astronomical waste almost exactly as much as near-term extinction (and also cuts us off from very high standards of living that might be achieved). Holden is saying that the conclusion that growth with plausible risk levels beats permanent stagnation is robust. Talking about 100:1 tradeoffs would be less robust.
I guess I was doing a Bayesian update based on what he wrote. Yes, technically he gave a lower bound, but while someone who thinks that the best possible future is 10 times better than stagnation (relative to extinction) might still write “Quantifying just how much better such a future would be does not strike me as a very useful exercise, but very broadly, it’s easy for me to imagine a possible future that is at least as desirable as human extinction is undesirable”, someone who thinks it’s at least a thousand or a billion times better probably wouldn’t.
Speaking for myself:
One response is that the argument for acting on the astronomical waste argument is only one relatively strong argument that should be weighed against more prosaic ethical considerations in order to account for model uncertainty.
Here is a concrete argument against giving shaping the far future dominant consideration in one’s philanthropic decision making, within the astronomical waste framework.
The doomsday argument suggests that the human race is going to go extinct in the relatively near term with very high probability. This is strange, because there doesn’t seem to be any other reason for thinking this.
A reconciliation of the doomsday argument with the absence of other evidence for extinction that is sometimes offered is the theory that we’re living in one of many simulations that were created by past humans who underwent a singularity scenario, and that our simulation is going to be turned off soon.
If we’re in one of many simulations with other humans, and the humans in these simulations are sufficiently correlated, timeless decision theory suggests that ordinary helping has astronomical benefits.
Those who subscribe to this view often believe that despite this consideration, shaping the far future nevertheless dominates ordinary helping in expected value. But they might be wrong about this. It appears that they would have to be wrong with awfully high probability in order to overturn the expected value of focusing on shaping the far future. But maybe this appearance is illusory, and for some reason that people haven’t recognized yet, the benefits of ordinary helping mediated through timeless decision theory swamp the expected value of focusing on shaping the far future.
A nontrivial chance of this being true would establish a lower bound on how good a potential opportunity to shape the far future has to be in order to overcome opportunities for ordinary helping.
Thanks, Jonah. I think skepticism about the dominance of the far future is actually quite compelling, such that I’m not certain that focusing on the far future dominates (though I think it’s likely that it does on balance, but much less than I naively thought).
The strongest argument is just that believing we are in a position to influence astronomical numbers of minds runs contrary to Copernican intuitions that we should be typical observers. Isn’t it a massive coincidence that we happen to be among a small group of creatures that can most powerfully affect our future light cone? Robin Hanson’s resolution of Pascal’s mugging relied on this idea.
The simulation-argument proposal is one specific way to hash out this Copernican intuition. The sim arg is quite robust and doesn’t depend on the self-sampling assumption the way the doomsday argument does. We have reasonable a priori reasons for thinking there should be lots of sims—not quite as strong as the arguments for thinking we should be able to influence the far future, but not vastly weaker.
Let’s look at some sample numbers. We’ll work in units of “number of humans alive in 2014,” so that the current population of Earth is 1. Let’s say the far future contains N humans (or human-ish sentient creatures), and a fraction f of those are sims that think they’re on Earth around 2014. The sim arg suggests that Nf >> 1, i.e., we’re probably in one of those sims. The probability we’re not in such a sim is 1/(Nf+1), which we can approximate as 1/(Nf). Now, maybe future people have a higher intensity of experience i relative to that of present-day people. Also, it’s much easier to affect the near future than the far future, so let e represent the amount of extra “entropy” that our actions face if they target the far future. For example, e = 10^-6 says there’s a factor-of-a-million discount for how likely our actions are to actually make the difference we intend for the far future vs. if we had acted to affect the near term. This entropy can come from uncertainty about what the far future will look like, failures of goal preservation, or intrusion of black swans.
Now let’s consider two cases—one assuming no correlations among actors (CDT) and one assuming full correlations (TDT-ish).
CDT case:
If we help in the short run, we can affect something like 1 people (where “1” means “7 billion”).
If we help in the long run, if we’re not in a sim, we can affect N people, with an i experience-intensity multiple, with a factor of e for uncertainty/entropy in our efforts. But the probability we’re not in a sim is 1/(Nf), so the overall expected value is (1/(Nf)) × N × i × e = ie/f.
It’s not obvious that ie/f > 1. For instance, if f = 10^-4, i = 10^2, and e = 10^-6, this would equal 1. Hence it wouldn’t be clear that targeting the far future is better than targeting the near term.
TDT-ish case:
There are Nf+1 copies of people (who think they’re) on Earth in 2014, so if we help in the short run, we help all of those Nf+1 people because our actions are mirrored across our copies. Since Nf >> 1, we can approximate this as Nf.
If we help by taking far-future-targeting actions, even if we’re in a sim, our actions can timelessly affect what happens in the basement, so we can have an impact regardless of whether we’re in a sim or not. The future contains N people with i intensity factor, and there’s e entropy on actions that try to do far-future stuff relative to short-term stuff. The expected value is Nie.
The ratio of long-term helping to short-term helping is Nie/(Nf) = ie/f, exactly the same as before. Hence, the uncertainty about whether the near- or far-future dominates persists.
I’ve tried these calculations with a few other tweaks, and something close to ie/f continues to pop out.
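To make the two cases above concrete, here’s a minimal numeric sketch of the toy model. The parameter values (N, f, i, e) are purely illustrative — f, i, and e match the example figures given for the CDT case, and N is an arbitrary large number:

```python
# Toy expected-value model of short- vs. long-run helping.
# Units: "1" = everyone alive in 2014 (~7 billion people).
N = 1e15   # far-future population (illustrative)
f = 1e-4   # fraction of far-future people who are sims of ~2014 Earth
i = 1e2    # experience-intensity multiplier for future minds
e = 1e-6   # "entropy" discount on far-future-targeting actions

# CDT case: no correlation among copies.
cdt_short = 1                           # help the 1 unit of people around us
cdt_long = (1 / (N * f)) * N * i * e    # P(not in a sim) * N * i * e = ie/f

# TDT-ish case: full correlation across all N*f + 1 copies.
tdt_short = N * f + 1    # short-run helping is mirrored across all copies
tdt_long = N * i * e     # timeless influence on the basement's far future

print(cdt_long / cdt_short)   # ie/f
print(tdt_long / tdt_short)   # ≈ ie/f, since Nf >> 1
```

With these particular numbers both ratios come out at (approximately) ie/f = 1, illustrating the claim that the same expression pops out in both cases and that far-future dominance isn’t obvious.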
Now, this point is again of the “one relatively strong argument” variety, so I’m not claiming this particular elaboration is definitive. But it illustrates the types of ways that far-future-dominance arguments could be neglecting certain factors.
Note also that even if you think ie/f >> 1, it’s still less than the 10^30 or whatever factor a naive far-future-dominance perspective might assume. Also, to be clear, I’m ignoring flow-through effects of short-term helping on the far future and just talking about the intrinsic value of the direct targets of our actions.
In the long-run CDT case, why the assumption that people in a sim can’t affect people in the far future? At the very least, if we’re in a sim, we can affect people in the far future of our sim; and probably indirectly in baseline too, insofar as if we come up with a really good idea in the sim, then those who are running the sim may take notice of the idea and implement it outside said sim.
As for the figures; I have a few thoughts about f. Let us assume that the far future consists of one base world, which runs a number of simulations, which in turn run sub-simulations (and those run sub-sub-simulations, and so on). Let us assume that, at any given moment, each simulation’s internal clock is set to a randomly determined year. Let us further assume that our universe is fairly typical in terms of population.
The number of humans who have ever lived, up until 2011, has been estimated at 107 billion. This means that, if all simulations are constrained to run up until 2014 only, the fraction of people in simulations (at any given moment) who believe that they are alive in 2014 will be approximately 7⁄107 (the baseline will not significantly affect this figure if the number of simulations is large). If the simulations are permitted to run longer (and I see no reason why they wouldn’t be), then that figure will of course be lower, and possibly significantly lower.
I can therefore conclude that, in all probability, f < 7⁄107.
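As a quick back-of-the-envelope check of that bound, using the figures from the comment (the 107 billion estimate is the one cited above; this is a sketch under the comment’s assumptions, not a precise demographic claim):

```python
# If every sim runs only up to 2014, the fraction of simulated people who
# believe they are alive in 2014 is at most:
#   (population alive in 2014) / (humans who have ever lived)
alive_2014 = 7e9     # ~7 billion alive in 2014
ever_lived = 107e9   # ~107 billion humans ever, per the estimate above

f_upper = alive_2014 / ever_lived
print(f_upper)  # ≈ 0.065, i.e. roughly 7/107
```

Simulations permitted to run past 2014 only push f below this bound.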
At the same time, Nf >> 1 means that f > 1/N. Of course, since N can be arbitrarily large, this tells us little; but it does imply, at least, that f>0.
Thanks, CCC. :)
Simulating humans near the singularity may be more interesting than simulating hunter-gatherers, so it may be that the fraction of sims around now is more than 7⁄107.
One reason not to expect the sims to go into the far future is that any far future with high altruistic import will have high numbers of computations, which would be expensive to simulate. It’s cheaper to simulate a few billion humans who have only modest computing power. For the same reason, it’s not clear that we’d have lots of sims within sims within sims, because those would get really expensive—unless computing power is so trivially cheap in the basement that it doesn’t matter.
That said, you’re right there could be at least a reasonable future ahead of us in a sim, but I’m doubtful many sims run the whole length of galactic history—again, unless the basement is drowning in computing power that it doesn’t know what to do with.
Interesting point about coming up with a really good idea. But one would tend to think that the superintelligent AIs in the basement would be much better at that. Why would they bother creating dumb little humans who go on to create their own superintelligences in the sim when they could just use superintelligences in the basement? If the simulators are interested in cognitive/evolutionary diversity, maybe that could be a reason.
Possibly, but every 2014 needs to have a history; we can find evidence in our universe that around 107 billion people have existed, and I’m assuming that we’re fairly typical so far as universes go.
...annnnnd I’ve just realised that there’s no reason why someone in the future couldn’t run a simulation up to (say) 1800, save that, and then run several simulations from that date forwards, each with little tweaks (a sort of a Monte Carlo approach to history).
I question the applicability of this assertion to our universe. Yes, a game like Sid Meier’s Civilisation is a whole lot easier to simulate than (say) a crate of soil at the level of individual grains—because there’s a lot of detail being glossed over in Civilisation. The game does not simulate every grain of soil, every drop of water.
Our universe—whether it’s baseline or a simulation—seems to be running right down to the atomic level. That is, if we’re being simulated, then every individual atom, every electron and proton, is being simulated. Simulating a grain of sand at that level of detail is quite a feat of computing—but simulating a grain-of-sand-sized computer would be no harder. In each case, it’s the individual atoms that are being simulated, and atoms follow the same laws whether in a grain of sand or in a CPU. (They have to, or we’d never have figured out how to build the CPU).
So I don’t think there’s been any change in the computing power required to simulate our universe with the increase in human population and computing power.
Sub-sims just need to be computationally simpler by a few orders of magnitude than their parent sims. If we create a sim, then computing power in that universe will be fantastically expensive as compared to ours; if we are a sim, then computing power in our parent universe must be sufficient to run our universe (and it is therefore fantastically cheap as compared to our universe). I have no idea how to tell whether we’re in a top-end one-of-a-kind research lab computer, or the one-universe-up equivalent of a smartphone.
You have a good point. If we’re a sim, we could be terminated unexpectedly at any time. Presumably as soon as the conditions of the sim are fulfilled.
Of course, the fact that our sim (if we are a sim) is running at all implies that the baseline must have the computing power to run us; in comparison with which, everything that we could possibly do with computing power is so trivial that it hardly even counts as a drain on resources. Of course, that doesn’t mean that there aren’t equivalently computationally expensive things that they might want to do with our computing resources (like running a slightly different sim, perhaps)...
Maybe we’re the sim that the superintelligence is using to test its ideas before introducing them to the baseline? If our universe fulfills its criteria better than any other, then it acts in such a way as to make baseline more like our universe. (Whatever those criteria are...)
Hi CCC :)
Yep, exactly. That’s how you can get more than 7⁄107 of the people in 2014.
Probably not, though. In Bostrom’s simulation-argument paper, he notes that you only need the environment to be accurate enough that observers think the sim is atomically precise. For instance, when they perform quantum experiments, you make those experiments come out right, but that doesn’t mean you actually have to simulate quantum mechanics everywhere. Because superficial sims would be vastly cheaper, we should expect vastly more of them, so we’d probably be in one of them.
Many present-day computer simulations capture high-level features of a system without delving into all the gory details. Probably most sims could suffice to have intermediate levels of detail for physics and even minds. (E.g., maybe you don’t need to simulate every neuron, just their higher-level aggregate behaviors, except when neuroscientists look at individual neurons.)
This is captured by the N term in my rough calculations above. If the basement has gobs of computing power, that means N is really big. But N cancels out from the final action-relevant ie/f expression.
Hmmm. It’s a fair argument, but I’m not sure how well it would work out in practice.
To clarify, I’m not saying that the sim couldn’t be run like that. My claim is, rather, that if we are in a sim being run with varying levels of accuracy as suggested, then we should be able to detect it.
Consider, for the moment, a hill. That hill consists of a very large number of electrons, protons and neutrons. Assume for the moment that the hill is not the focus of a scientific experiment. Then, it may be that the hill is being simulated in some computationally cheaper manner than simulating every individual particle.
There are two options. Either the computationally cheaper manner is, in every single possible way, indistinguishable from simulating every individual particle. In this case, there is no reason to use the more computationally expensive method when a scientist tries to run an experiment which includes the hill; all hills can use the computationally cheaper method.
The alternative is that there is some way, however slight or subtle, in which the behaviour of the atoms in the hill differs from the behaviour of those same atoms when under scientific investigation. If this is the case, then it means that the scientific laws deduced from experiments on the hill will, in some subtle way, not match the behaviour of hills in general. In this case, there must be a detectable difference; in effect, under certain circumstances hills are following a different set of physical laws and sooner or later someone is going to notice that. (Note that this can be avoided, to some degree, by saving the sim at regular intervals; if someone notices the difference between the approximation and a hill made out of properly simulated atoms, then the simulation is reloaded from a save just before that difference happened and the approximation is updated to hide that detail. This can’t be done forever—after a few iterations, the approximation’s computational complexity will begin to approach the computational complexity of the atomic hill in any case, plus you’ve now wasted a lot of cycles running sims that had no purpose other than refining the approximation—but it could stave off discovery for a period, at least).
Having said that, though, another thought has occurred to me. There’s no guarantee (if we are in a sim) that the laws of physics are the same in our universe as they are in baseline; we may, in fact, have laws of physics specifically designed to be easier to compute. Consider, for example, the uncertainty principle. Now, I’m no quantum physicist, but as I understand it, the more precisely a particle’s position can be determined, the less precisely its momentum can be known—and, at the same time, the more precisely its momentum is known, the less precisely its position can be found. Now, in terms of a simulation, the uncertainty principle means that the computer running the simulation need not keep track of the position and momentum of every particle at full precision. It may, instead, keep track of some single combined value (a real quantum physicist might be able to guess at what that value is, and how position and/or momentum can be derived from it). And given the number of atoms in the observable universe, the data storage saved by this is massive (and suggests that Baseline’s storage space, while immense, is not infinite).
Of course, like any good simplification, the Uncertainty Principle is applied everywhere, whether a scientist is looking at the data or not.
What is and isn’t simulated to a high degree of detail can be determined dynamically. If people decide they want to investigate a hill, some system watching the sim can notice that and send a signal that the sim needs to make the hill observations correspond with quantum/etc. physics. This shouldn’t be hard to do. For instance, if the theory predicts observation X +/- Y, you can generate some random numbers centered around X with std. dev. Y. Or you can make them somewhat different if the theory is wrong and to account for model uncertainty.
If the scientists would do lots of experiments that are connected in complex ways such that consistency requires them to come out with certain complex relationships, you’d need to get somewhat more fancy with faking the measurements. Worst case, you can actually do a brute-force sim of that part of physics for the brief period required. And yeah, as you say, you can always revert to a previous state if you screw up and the scientists find something amiss, though you probably wouldn’t want to do that too often.
SMBC
This is kind of where the trouble starts to come in. What happens when the scientist, instead of looking at hills in the present, turns instead to look at historical records of hills a hundred years in the past?
If he has actually found some complex interaction that the simplified model fails to cover, then he has a chance of finding evidence of living in a simulation; yes, the simulation can be rolled back a hundred years and then re-run from that point onwards, but is that really more computationally efficient than just running the full physics all the time? (Especially if you have to regularly keep going back to update the model).
This is where his fellow scientists call him a “crackpot” because he can’t replicate any of his experimental findings. ;)
More seriously, the sim could modify his observations to make him observe the right things. For instance, change the photons entering his eyes to be in line with what they should be, change the historical records a la 1984, etc. Or let him add an epicycle to his theory to account for the otherwise unexplainable results.
In practice, I doubt atomic-level effects are ever going to produce clearly observable changes outside of physics labs, so 99.99999% of the time the simulators wouldn’t have to worry about this as long as they simulated macroscopic objects to enough detail.
Well, yes, I’m not saying that this would make it easy to discover evidence that we are living in a simulation. It would simply make it possible to do so.
In the past, when I expressed worries about the difficulties associated with far-future meme-spreading, which you favor as an alternative to extinction-risk reduction, you said you thought there was a significant chance of a singleton-dominated future. Such a singleton, you argued, would provide the necessary causal stability for targeted meme-spreading to successfully influence our distant descendants. But now you seem to be implying that, other things equal, far-future meme-spreading is several orders of magnitude less likely to succeed than short-term interventions (including interventions aimed at reducing near-term risk of extinction, which plausibly represents a significant fraction of total extinction risk). I find these two views hard to reconcile.