Do you have preferred arguments (or links to preferred arguments) for/against these claims? From where I stand:
Point 1 looks to be less a positive claim and more a policy criticism (for which I’d need to know what specifically you dislike about the policy in question to respond in more depth), points 2 and 3 are straightforwardly true statements on my model (albeit I’d somewhat weaken my phrasing of point 3; I don’t necessarily think agency is “automatic”, although I do consider it quite likely to arise by default), point 4 seems likewise true, because the argmax function is only sensitive to the sign of the difference in magnitude, not the difference itself, point 5 is the kind of thing that would benefit immensely from liberal usage of hyperlinks, point 6 is again a policy criticism in need of corresponding explanation, point 7 seems ill-supported and would benefit from more concrete analysis (both numerically i.e. where are you getting your numbers, and probabilistically i.e. how are you assigning your likelihoods), and point 8 again seems like the kind of thing where links would be immensely beneficial.
On the whole, I think your comment generates more heat than light, and I think there were significantly better moves available to you if your aim was to open a discussion (several of which I predict would have resulted in comments I would counterfactually have upvoted). As it is, however, your comment does not meet the bar for discourse quality I would like to see for comments on LW, which is why I have given it a strong downvote (and a weak disagree-vote).
1. One is straightforwardly true. Aging is going to kill every living creature. Aging is caused by complex interactions between biological systems and bad evolved code. An agent able to analyze thousands of simultaneous interactions, across millions of patients, and essentially decompile the bad code (by modeling all proteins/ all binding sites in a living human) is likely required to shut it off, but it is highly likely with such an agent and with such tools you can in fact save most patients from aging. A system with enough capabilities to consider all binding sites and higher level system interactions at the same time (this is how a superintelligence could perform medicine without unexpected side effects) is obviously far above human level.
2. This is not possible per the laws of physics. Intelligence isn’t the only factor. I don’t think we can have a reasonable discussion if you are going to maintain a persistent belief in magic. Note by foom I am claiming you believe in a system that solely based on a superior algorithm will immediately take over the planet. It is not affected by compute, difficulty in finding a recursively better algorithm, diminishing returns on intelligence in most tasks, or money/robotics. I claim each of these obstacles takes time to clear. (time = decades)
3. Who says the system needs to be agentic at all or long running? This is bad design. EY is not a SWE.
4. This is an extension of (3).
5. https://www.lesswrong.com/posts/HByDKLLdaWEcA2QQD/applying-superintelligence-without-collusion https://www.lesswrong.com/posts/5hApNw5f7uG8RXxGS/the-open-agency-model
6. This is irrational because no discount rate. Risking a nuclear war raises the pkill of millions of people now. The quadrillions of people this could ‘save’ may never exist because of many unknowns, hence there needs to be a large discount rate.
7. This is also 6.
8. CAIS is an extension of stateless microservices, and is how all reliable software built now works. Giving the machines self modification or a long running goal is not just bad because it’s AI, it’s generally bad practice.
One is straightforwardly true. Aging is going to kill every living creature. Aging is caused by complex interactions between biological systems and bad evolved code. An agent able to analyze thousands of simultaneous interactions, across millions of patients, and essentially decompile the bad code (by modeling all proteins/ all binding sites in a living human) is likely required to shut it off, but it is highly likely with such an agent and with such tools you can in fact save most patients from aging. A system with enough capabilities to consider all binding sites and higher level system interactions at the same time (this is how a superintelligence could perform medicine without unexpected side effects) is obviously far above human level.
To be clear: I am straightforwardly in favor of longevity research—and, separately, I am agnostic on the question of whether superhuman general intelligence is necessary to crack said research; that seems like a technical challenge, and one that I presently see no reason to consider unsolvable at current levels of intelligence. (I am especially skeptical of the part where you seemingly think a solution will look like “analyzing thousands of simultaneous interactions across millions of patients and model all binding sites in a living human”—especially as you didn’t argue for this claim at all.) As a result, the dichotomy you present here seems clearly unjustified.
(You are, in fact, justified in arguing that doing longevity research without increased intelligence of some kind will cause the process to take longer, but (i) that’s a different argument from the one you’re making, with accordingly different costs/benefits, and (ii) even accepting this modified version of the argument, there are more ways to get to “increased intelligence” than AI research—human intelligence enhancement, for example, seems like another viable road, and a significantly safer one at that.)
This is not possible per the laws of physics. Intelligence isn’t the only factor. I don’t think we can have a reasonable discussion if you are going to maintain a persistent belief in magic. Note by foom I am claiming you believe in a system that solely based on a superior algorithm will immediately take over the planet. It is not affected by compute, difficulty in finding a recursively better algorithm, diminishing returns on intelligence in most tasks, or money/robotics. I claim each of these obstacles takes time to clear. (time = decades)
I dispute that FOOM-like scenarios are ruled out by laws of physics, or that this position requires anything akin to a belief in “magic”. (That I—and other proponents of this view—would dispute this characterization should have been easily predictable to you in advance, and so your choice to adopt this phrasing regardless speaks ill of your ability to model opposing views.)
The load-bearing claim here (or rather, set of claims) is, of course, located within the final parenthetical: (“time = decades”). You appear to be using this claim as evidence to justify your previous assertions that FOOM is physically impossible/“magic”, but this ignores that the claim that each of the obstacles you listed represents a decades-long barrier is itself in need of justification.
(Additionally, if we were to take your model as fact—and hence accept that any possible AI systems would require decades to scale to a superhuman level of capability—this significantly weakens the argument from aging-related costs you made in your point 1, by essentially nullifying the point that AI systems would significantly accelerate longevity research.)
Who says the system needs to be agentic at all or long running? This is bad design. EY is not a SWE.
Agency does not need to be built into the system as a design property, on EY’s model or on mine; it is something that tends to naturally arise (on my model) as capabilities increase, even from systems whose inherent event/runtime loop does not directly map to an agent-like frame. You have not, so far as I can tell, engaged with this model at all; and in the absence of such engagement “EY is not a SWE” is not a persuasive counterargument but a mere ad hominem.
(Your response folded point 4 into point 3, so I will move on to point 5.)
Thank you very much for the links! For the first post you link, the top comment is from EY, in direct contradiction to your initial statement here:
He has ignored reasonable and buildable AGI systems proposed by Eric fucking Drexler himself, on this very site, and seems to pretend the idea doesn’t exist.
Given the factual falsity of this claim, I would request that you explicitly acknowledge it as false, and retract it; and (hopefully) exercise greater moderation (and less hyperbole) in your claims about other people’s behavior in the future.
In any case—setting aside the point that your initial allegation was literally false—EY’s comment on that post makes [what looks to me like] a reasonably compelling argument against the core of Drexler’s proposal. There follows some back-and-forth between the two (Yudkowsky and Drexler) on this point. It does not appear to me from that thread that there is anything close to a consensus that Yudkowsky was wrong and Drexler was right; both commenters received large amounts of up- and agree-votes throughout.
Given this, I think the takeaway you would like for me to derive from these posts is less clear than you would like it to be, and the obvious remedy would be to state specifically what it is you think is wrong with EY’s response(s). Is it the argument you made in this comment? If so, that seems essentially to be a restatement of your point 2, phrased interrogatively rather than declaratively—and my objection to that point can be considered to apply here as well.
This is irrational because no discount rate. Risking a nuclear war raises the pkill of millions of people now. The quadrillions of people this could ‘save’ may never exist because of many unknowns, hence there needs to be a large discount rate.
P(doom) is unacceptably high under the current trajectory (on EY’s model). Do you think that the people who are alive today will not be counted towards the kill count of a future unaligned AGI? The value that stands to be destroyed (on EY’s model) consists, not just of these quadrillions of future individuals, but each and every living human who would be killed in a (hypothetical) nuclear exchange, and then some.
You can dispute EY’s model (though I would prefer you do so in more detail than you have up until now—see my replies to your other points), but disputing his conclusion based on his model (which is what you are doing here) is a dead-end line of argument: accepting that ASI presents an unacceptably high existential risk makes the relevant tradeoffs quite stark, and not at all in doubt.
(As was the case with points 4/5, point 7 was folded into point 6, and so I will move on to the final point.)
CAIS is an extension of stateless microservices, and is how all reliable software built now works. Giving the machines self modification or a long running goal is not just bad because it’s AI, it’s generally bad practice.
Setting aside that you (again) didn’t provide a link, my current view is that Richard Ngo has provided some reasonable commentary on CAIS as an approach; my own view largely accords with his on this point and so I think claiming this as the one definitive approach to end all AI safety approaches (or anything similar) is massively overconfident.
And if you don’t think that—which I would hope you don’t!—then I would move to asking what, exactly, you would like to convey by this point. “CAIS exists” is true, and not helpful; “CAIS seems promising to me” is perhaps a weaker but more defensible claim than the outlandish one given above, but nonetheless doesn’t seem strong enough to justify your initial statement:
Alignment proposals he has described are basically impossible, while CAIS is just straightforward engineering and we don’t need to delay anything; it’s the default approach.
So, unfortunately, I’m left at present with a conclusion that can be summarized quite well by taking the final sentence of your great-grandparent comment, and performing a simple replacement of one name with another:
Unfortunately I have to start to conclude [Gerald Monroe] is not rational or worth paying attention to, which is ironic.
At the end of the day, either robot doubling times, machinery production rates, real-world chip production rates, the time for robots to collect scientific data, and the time for compute to search the algorithm space take decades, or they do not.
At the end of the day, either EY continues to internalize CAIS in future arguments or he does not. It was not a false claim: I am saying he pretends it doesn’t exist now, in talks about alignment he made after Drexler’s post.
Either you believe in ground truth reality or you do not. I don’t have the time or interest to get sucked into a wordcel fight over the definitions of words. Either ground truth reality supports the following claims or it does not:
1. EY and you continue to factor in CAIS, which is modern software engineering, or you don’t.
2. The worst of 4 factors (data, compute, algorithms, robotics/money) takes decades to foom or it doesn’t.
If ground truth reality supports 1 and 2 I am right, if it does not I am wrong. Note foom means “become strong enough to conquer the planet”. Slowing down aging enough for LEV is a far lesser goal and thus your argument there is also false.
Pinning my beliefs to falsifiable things is rational.
You continue to assert things without justification, which is fine insofar as your goal is not to persuade others. And perhaps this isn’t your goal! Perhaps your goal is merely to make it clear what your beliefs are, without necessarily providing the reasoning/evidence/argumentation that would convince a neutral observer to believe the same things you do.
But in that case, you are not, in fact, licensed to act surprised, and to call others “irrational”, if they fail to update to your position after merely seeing it stated. You haven’t actually given anyone a reason they should update to your position, and so—if they weren’t already inclined to agree with you—failing to agree with you is not “irrational”, “wordcel”, or whatever other pejorative you are inclined to use, but merely correct updating procedure.
So what are we left with, then? You seem to think that this sentence says something meaningful:
If ground truth reality supports 1 and 2 I am right, if it does not I am wrong.
but it is merely a tautology: “If I am right I am right, whereas if I am wrong I am wrong.” If there is additional substance to this statement of yours, I currently fail to see it. This statement can be made for any set of claims whatsoever, and so to observe it being made for a particular set of claims does not, in fact, serve as evidence for that set’s truth or falsity.
Of course, the above applies to your position, and also to my own, as well as to EY’s and to anyone else who claims to have a position on this topic. Does this thereby imply that all of these positions are equally plausible? No, I claim, no more so than, for example, “either I win the lottery or I don’t” implies a 50/50 spread on the outcome space. This, I claim, is structurally isomorphic to the sentence you emitted, and equally as invalid.
Arguing that a particular possibility ought to be singled out as likelier than the others requires more than just stating it and thereby privileging it with all of your probability mass. You must do the actual hard work of coming up with evidence, and interpreting that evidence so as to favor your model over competing models. This is work that you have not yet done, despite being many comments deep into this thread, and that is therefore substantial evidence in my view that it is work you cannot do (else you could easily win this argument, or at the very least advance it substantially, by doing just that)!
Of course, you claim you are not here to do that. Too “wordcel”, or something along those lines. Well, good for you—but in that case I think the label “irrational” applies squarely to one participant in this conversation, and the name of that participant is not “Eliezer Yudkowsky”.
You’ve done an excellent job of arguing your points. It doesn’t mean they are correct, however.
Would you agree that if you made a perfect argument against the theory of relativity (numerous contemporary physicists did) it was still a waste of time?
In this context, let’s break open the object level argument. Because only the laws of physics get a vote—you don’t and I don’t.
The object level argument is that the worst of the below determines if foom is possible:
1. Compute. Right now there is a shortage of compute, and with a bit of rough estimating the shortage is actually pretty severe. Nvidia makes approximately 60 million GPUs per year, of which 500k-1000k are A/H100s. This is based on taking their data center revenue (source: WSJ) and dividing by an estimated cost per chipset of $10k-$20k. Compute production can be increased, but the limit would be all the world’s 14nm or better silicon dedicated to producing AI compute, and reaching that limit takes time. Let’s estimate how many humans’ worth of labor an AI system with access to all new compute could supply (old compute doesn’t matter due to a lack of interconnect bandwidth). If a GPT-4 instance requires a full DGX “supercompute” node, which is 8 H100s with 80 GB of memory each (so approximately 1T weights in fp16), how much would it require for realtime multimodal operation? Let’s assume 4x the compute, which may be a gross underestimate. So 8 more cards are running at least 1 robot in real time, 8 more are processing images for vision, and 8 more handle audio I/O and helper systems for longer-duration memory context.
So then if all new cards are used for inference, 1m/32 = 31,250 “instances” worth of labor. Since they operate 24 hours a day this is equivalent to perhaps 100k humans? If all of the silicon Nvidia has the contract rights to build is going into H100s, this scales by about 30 times, or 3m humans. And most of those instances cannot be involved in world takeover efforts; they have to be collecting revenue for their owners. If Nvidia gets all the silicon in the world (this may happen as it can outbid everyone else) it gives them approximately another OOM. Still not enough. There are bottlenecks on increasing chip production. This also links to my next point:
2. Algorithm search space. Every search of a possible AGI design that is better than what you have requires a massive training run. Each training run occupies tens of thousands of GPUs for around 1 month, give or take. (source: the LLaMA paper, which was sub GPT-4 in perf. They needed 2048 A100s for 3 weeks for 65b). Presumably searching this space is a game of diminishing returns: to find an algorithm better than the best you currently have requires increasingly large numbers of searches and compute. Compute that can’t be spent on exploiting the algorithm you have right now.
3. Robotics/money: for an AGI to actually take over, it has to redirect resources to itself. And this assumes humans don’t simply use CAIS and have thousands of stateless AI systems separately handling these real world tasks. Robotics is especially problematic: you know and I know how poor the current hardware is, and there are budget cuts and layoffs in many of the cutting edge labs. The best robotics hardware company, Boston Dynamics, keeps getting passed around as each new owner can’t find a way to make money from it. So it takes time: time to develop new robotics hardware. Time to begin mass production. Time for the new robotics produced by the first round of production to begin assisting with the manufacture of itself. Time for the equipment in the real world to begin to fail from early failures after a few thousand hours, then the design errors to be found and fixed. This puts years on the clock, likely decades. It requires humans to both build massive amounts of robotic equipment, and then put it all under the control of either insecure narrow AI task performing systems, or to stupidly centralize control to large AGIs.
4. Data. This is explained much better by https://www.lesswrong.com/posts/qpgkttrxkvGrH9BRr/superintelligence-is-not-omniscience . The chaos means that in order for any system to develop new tools for novel tasks, the system needs sufficiently high quality information about the task domain or the tool building is not possible. This prevents blind nanoforge building (what EY is talking about when he says someone could bootstrap to diamondoid nanotechnology from wet protein biology) or blind species killing bioweapon construction.
“blind” means ‘without sufficient direct observation and a large number of domain experiments’. Meaning I am claiming it is flat impossible to accomplish these things without a large amount of automated equipment or human lab technicians acting as paid willing stooges to investigate the domain in question and empirically develop step by step a path to a solution.
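The arithmetic behind the compute estimate in point 1 above can be sketched in a few lines of Python. Every input is the rough figure stated in that point, not measured data, and the function name and the 3.2 humans-per-instance ratio are assumptions introduced here (3.2 is simply the ratio implied by going from 31,250 instances to "perhaps 100k humans").

```python
# Rough reproduction of the labor estimate; every input is the comment's own guess.

def estimate_human_equivalents(
    h100s_per_year=1_000_000,   # upper end of the 500k-1000k estimate
    cards_per_node=8,           # one DGX node per GPT-4-class instance
    realtime_multiplier=4,      # assumed 4x compute for realtime multimodal use
    humans_per_instance=3.2,    # assumption: ratio implied by 31,250 -> ~100k
):
    """Return (instances, human-equivalents) for one year of new cards."""
    cards_per_instance = cards_per_node * realtime_multiplier  # 32 cards
    instances = h100s_per_year // cards_per_instance
    return instances, instances * humans_per_instance


instances, humans = estimate_human_equivalents()
all_nvidia_silicon = humans * 30             # "scales by about 30 times"
all_world_silicon = all_nvidia_silicon * 10  # "approximately another OOM"

print(f"instances:           {instances:,}")
print(f"human-equivalents:   {humans:,.0f}")
print(f"all Nvidia silicon:  {all_nvidia_silicon:,.0f}")
print(f"all world silicon:   {all_world_silicon:,.0f}")
```

Changing any single default (for example the realtime multiplier) shows how sensitive the final figure is to each guess.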
Speaking in probabilities, the above is a joint probability. Meaning that so long as the AGI system is limited by any of the 4 terms, it is throttled and unable to proceed with foom. Putting some terms on it:
90% for compute (we have direct evidence), 95% for robotics (direct evidence), 99% for data (near certain evidence), 50% for searching AGI spaces (quite possible there are findable much stronger algorithms in the search space)
So the probability of foom is the series probability of all 4, as all 4 must take their “not” form (i.e. none of the four is a binding limit):
pfoom = 0.1 x 0.05 x 0.01 x 0.5 = 0.000025
Notice that you can put quite a lot of doubt on my estimates. Say you do not believe me, and believe the chance I am right on each term is only 25%. Then each “not” term is 0.75, and:
pfoom = 0.75^4 ≈ 0.32
Not enough to be as confident we’re all dead as EY is. Though I concede a 32% chance of the end of humanity isn’t super great.
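For anyone who wants to plug in their own numbers, here is a minimal sketch of the calculation above. It assumes, as the multiplication does, that the four bottlenecks are independent; the function and dictionary names are made up for illustration.

```python
from math import prod


def p_foom(p_bottleneck_holds):
    """Probability that no bottleneck holds, assuming the terms are independent."""
    return prod(1.0 - p for p in p_bottleneck_holds)


# Probabilities that each factor throttles foom, as stated above.
stated = {"compute": 0.90, "robotics": 0.95, "data": 0.99, "algorithm_search": 0.50}

# The skeptical case: only a 25% chance that each bottleneck holds.
skeptical = {name: 0.25 for name in stated}

print(f"stated estimates:    {p_foom(stated.values()):.6f}")
print(f"skeptical estimates: {p_foom(skeptical.values()):.2f}")
```

If the bottlenecks are correlated rather than independent, the product no longer applies and the result can move in either direction.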
As for the last part :
CAIS means you use small, isolated AI systems that tested extremely well on the training set. At each timestep, you check with an autoencoder how well the current input fed to the agent compresses into the state space of the training set. What you are doing is validating that it is impossible for the AI system to know that the current input is not from the training set. This both makes deception difficult and makes the AI system’s probability of success on the current task similar to how it benchmarked on the training set.
If the real world input is outside that space, you transfer control to a different agent, which in the worst case is an embedded control system using classic control techniques.
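As a toy illustration of the gating idea in the last two paragraphs (not the actual system described, and not anything from Drexler’s CAIS report), the sketch below stands in for the autoencoder with a 1-D linear one: the training mean plus the top principal direction, found by power iteration. Inputs whose reconstruction error exceeds a threshold calibrated on the training set are handed to a fallback controller. All names, the toy data, and the 1.5x threshold margin are assumptions.

```python
import math


def fit_line_autoencoder(train, iterations=100):
    """Fit a 1-D linear 'autoencoder': the mean plus the top principal direction."""
    n, d = len(train), len(train[0])
    mean = [sum(row[j] for row in train) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in train]
    direction = [1.0] * d
    for _ in range(iterations):
        # Power iteration: multiply by X^T X without forming the matrix.
        new = [0.0] * d
        for row in centered:
            dot = sum(r * v for r, v in zip(row, direction))
            for j in range(d):
                new[j] += dot * row[j]
        norm = math.sqrt(sum(x * x for x in new))
        direction = [x / norm for x in new]
    return mean, direction


def reconstruction_error(x, model):
    """Encode x to a single number, decode it, and return the residual norm."""
    mean, direction = model
    centered = [xi - m for xi, m in zip(x, mean)]
    code = sum(c * v for c, v in zip(centered, direction))
    recon = [code * v for v in direction]
    return math.sqrt(sum((c - r) ** 2 for c, r in zip(centered, recon)))


def choose_controller(x, model, threshold):
    """Use the learned policy only when the input looks like training data."""
    if reconstruction_error(x, model) <= threshold:
        return "learned_policy"
    return "fallback_controller"


# Toy training set: points close to the line y = 2x.
train = [(float(t), 2.0 * t + 0.05 * (-1) ** t) for t in range(10)]
model = fit_line_autoencoder(train)
threshold = 1.5 * max(reconstruction_error(x, model) for x in train)

print(choose_controller((4.5, 9.0), model, threshold))    # near the training line
print(choose_controller((4.5, -9.0), model, threshold))   # far from the training line
```

A linear model this small only flags inputs that leave the training line; it does not flag points far along it, so a real gate would need a richer model and a density or range check as well.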
This is currently how autonomous cars more or less work (depends on the team and the project).
I have several years of experience actually working on embedded ML systems, and many more years on embedded controls. The above is correct. Eliezer Yudkowsky was wrong to dismiss it.
Note that Eliezer has mentioned that ML teams are going to need to find “some way” to get from roughly an 80% chance (his estimate, as I recall) that a GPT-3 style agent is correct on a question to the many 9s of real world reliability.
Stateless, well-isolated systems are one of the few ways human engineers know how to accomplish that. So we may get a significant amount of AI safety by default simply to meet requirements.
One is straightforwardly true. Aging is going to kill every living creature. Aging is caused by complex interactions between biological systems and bad evolved code. An agent able to analyze thousands of simultaneous interactions, across millions of patients, and essentially decompile the bad code (by modeling all proteins/ all binding sites in a living human) is likely required to shut it off, but it is highly likely with such an agent and with such tools you can in fact save most patients from aging. A system with enough capabilities to consider all binding sites and higher level system interactions at the same time (this is how a superintelligence could perform medicine without unexpected side effects) is obviously far above human level.
There are alternative mitigations to the problem:
Anti aging research
Cryonics
I agree that it’s bad that most people currently alive are apparently going to die. However I think that since mitigations like that are much less risky we should pursue them rather than try to rush AGI.
I think the odds of success (epistemic status: I went to medical school but dropped out) are low if you mean “humans without help from any system more capable than current software” are researching aging and cryonics alone.
They are both extremely difficult problems.
So the tradeoff is “everyone currently alive and probably their children” vs “future people who might exist”.
I obviously lean one way but this is what the choice is between. Certain death for everyone alive (by not improving AGI capabilities) in exchange for preventing possible death for everyone alive sooner and preventing the existence of future people who may never exist no matter the timeline.
one is straightforwardly true. Aging is going to kill every living creature. Aging is caused by complex interactions between biological systems and bad evolved code. An agent able to analyze thousands of simultaneous interactions, cross millions of patients, and essentially decompile the bad code (by modeling all proteins/ all binding sites in a living human) is likely required to shut it off, but it is highly likely with such an agent and with such tools you can in fact save most patients from aging. A system with enough capabilities to consider all binding sites and higher level system interactions at the same (this is how a superintelligence could perform medicine without unexpected side effects) is obviously far above human level.
This is not possible per the laws of physics. Intelligence isn’t the only factor. I don’t think we can have a reasonable discussion if you are going to maintain a persistent belief in magic. Note by foom I am claiming you believe in a system that solely based on a superior algorithm will immediately take over the planet. It is not affected by compute, difficulty in finding a recursively better algorithm, diminishing returns on intelligence in most tasks, or money/robotics. I claim each of these obstacles takes time to clear. (time = decades)
Who says the system needs to be agentic at all or long running? This is bad design. EY is not a SWE.
This is an extension of (3)
https://www.lesswrong.com/posts/HByDKLLdaWEcA2QQD/applying-superintelligence-without-collusion https://www.lesswrong.com/posts/5hApNw5f7uG8RXxGS/the-open-agency-model
This is irrational because no discount rate. Risking a nuclear war raises the pkill of millions of people now. The quadrillions of people this could ‘save’ may never exist because of many unknowns, hence there needs to be a large discount rate.
This is also 6.
CAIS is an extension of stateless microservices, and is how all reliable software built now works. Giving the machines self modification or a long running goal is not just bad because it’s AI, it’s generally bad practice.
So, unfortunately, I’m left at present with a conclusion that can be summarized quite well by taking the final sentence of your great-grandparent comment, and performing a simple replacement of one name with another:
Well argued but wrong.
At the end of the day, either robot doubling times, machinery production rates, real-world chip production rates, the time for robots to collect scientific data, and the time for compute to search the algorithm space take decades, or they do not.
At the end of the day, EY continues to internalize CAIS in future arguments or he does not. It was not a false claim: I am saying he pretends it doesn’t exist now, in the talks about alignment he has given since Drexler’s post.
Either you believe in ground truth reality or you do not. I don’t have the time or interest to get sucked into a wordcel fight over the definitions of words. Either ground truth reality supports the following claims or it does not:
1. EY and you continue to factor in CAIS, which is modern software engineering, or you don’t.
2. The worst of 4 factors (data, compute, algorithms, robotics/money) takes decades to foom or it doesn’t.
If ground truth reality supports 1 and 2 I am right, if it does not I am wrong. Note foom means “become strong enough to conquer the planet”. Slowing down aging enough for LEV is a far lesser goal and thus your argument there is also false.
Pinning my beliefs to falsifiable things is rational.
You continue to assert things without justification, which is fine insofar as your goal is not to persuade others. And perhaps this isn’t your goal! Perhaps your goal is merely to make it clear what your beliefs are, without necessarily providing the reasoning/evidence/argumentation that would convince a neutral observer to believe the same things you do.
But in that case, you are not, in fact, licensed to act surprised, and to call others “irrational”, if they fail to update to your position after merely seeing it stated. You haven’t actually given anyone a reason they should update to your position, and so—if they weren’t already inclined to agree with you—failing to agree with you is not “irrational”, “wordcel”, or whatever other pejorative you are inclined to use, but merely correct updating procedure.
So what are we left with, then? You seem to think that this sentence says something meaningful:
but it is merely a tautology: “If I am right I am right, whereas if I am wrong I am wrong.” If there is additional substance to this statement of yours, I currently fail to see it. This statement can be made for any set of claims whatsoever, and so to observe it being made for a particular set of claims does not, in fact, serve as evidence for that set’s truth or falsity.
Of course, the above applies to your position, and also to my own, as well as to EY’s and to anyone else who claims to have a position on this topic. Does this thereby imply that all of these positions are equally plausible? No, I claim—no more so than, for example, “either I win the lottery or I don’t” implies a 50⁄50 spread on the outcome space. This, I claim, is structurally isomorphic to the sentence you emitted, and equally as invalid.
Arguing that a particular possibility ought to be singled out as likelier than the others requires more than just stating it and thereby privileging it with all of your probability mass. You must do the actual hard work of coming up with evidence, and interpreting that evidence so as to favor your model over competing models. This is work that you have not yet done, despite being many comments deep into this thread, and that is therefore substantial evidence in my view that it is work you cannot do (else you could easily win this argument, or at the very least advance it substantially, by doing just that)!
Of course, you claim you are not here to do that. Too “wordcel”, or something along those lines. Well, good for you—but in that case I think the label “irrational” applies squarely to one participant in this conversation, and the name of that participant is not “Eliezer Yudkowsky”.
You’ve done an excellent job of arguing your points. It doesn’t mean they are correct, however.
Would you agree that if you made a perfect argument against the theory of relativity (numerous contemporary physicists did) it was still a waste of time?
In this context, let’s break open the object level argument. Because only the laws of physics get a vote—you don’t and I don’t.
The object level argument is that the worst of the below determines if foom is possible:
1. Compute. Right now there is a shortage of compute, and with a bit of rough estimating the shortage is actually pretty severe. Nvidia makes approximately 60 million GPUs per year, of which 500k-1000k are A/H100s. This is based on taking their data center revenue (source: WSJ) and dividing by an estimated cost per chipset of 10k to 20k. Compute production can be increased, but that takes time, and the limit would be all the world’s 14nm or better silicon dedicated to producing AI compute.
Let’s estimate how many humans’ worth of labor an AI system with access to all new compute could supply (old compute doesn’t matter due to a lack of interconnect bandwidth). If a GPT-4 instance requires a full DGX “supercompute” node, which is 8 H100s with 80 GB of memory each (so approximately 1T weights in fp16), how much would it require for realtime multimodal operation? Let’s assume 4x the compute, which may be a gross underestimate. So 8 more cards are running at least 1 robot in real time, 8 more are processing images for vision, and 8 more are handling audio I/O and helper systems for longer-duration memory context.
So then if all new cards are used for inference, 1m/32 = 31,250 “instances” worth of labor. Since they operate 24 hours a day, this is equivalent to perhaps 100k humans? If all of the silicon Nvidia has the contract rights to build is going into H100s, this scales by about 30 times, or 3m humans. And most of those instances cannot be involved in world takeover efforts; they have to be collecting revenue for their owners. If Nvidia gets all the silicon in the world (this may happen, as it can outbid everyone else), it gives them approximately another order of magnitude. Still not enough. There are bottlenecks on increasing chip production. This also links to my next point:
2. Algorithm search space. Every search of a possible AGI design that is better than what you have requires a massive training run. Each training run occupies tens of thousands of GPUs for around 1 month, give or take. (Source: the LLaMA paper, whose model was sub-GPT-4 in performance; they needed 2048 A100s for 3 weeks for the 65B model.) Presumably searching this space is a game of diminishing returns: to find an algorithm better than the best you currently have requires increasingly large numbers of searches and increasing amounts of compute. That is compute that can’t be spent on exploiting the algorithm you have right now.
3. Robotics/money: for an AGI to actually take over, it has to redirect resources to itself. And this assumes humans don’t simply use CAIS and have thousands of stateless AI systems separately handling these real world tasks. Robotics is especially problematic: you know and I know how poor the current hardware is, and there are budget cuts and layoffs in many of the cutting edge labs. The best robotics hardware company, Boston Dynamics, keeps getting passed around as each new owner can’t find a way to make money from it. So it takes time: time to develop new robotics hardware. Time to begin mass production. Time for the new robots produced by the first round of production to begin assisting with their own manufacture. Time for the equipment in the real world to begin to fail from early failures after a few thousand hours, then for the design errors to be found and fixed. This puts years on the clock, likely decades. It requires humans both to build massive amounts of robotic equipment and then either to put it all under the control of insecure narrow AI task-performing systems or to stupidly centralize control in large AGIs.
4. Data. This is explained much better by https://www.lesswrong.com/posts/qpgkttrxkvGrH9BRr/superintelligence-is-not-omniscience . The chaos means that in order for any system to develop new tools for novel tasks, the system needs sufficiently high quality information about the task domain or the tool building is not possible. This prevents blind nanoforge building (what EY is talking about when he says someone could bootstrap to diamondoid nanotechnology from wet protein biology) or blind species killing bioweapon construction.
“blind” means ‘without sufficient direct observation and a large number of domain experiments’. Meaning I am claiming it is flat impossible to accomplish these things without a large amount of automated equipment or human lab technicians acting as paid willing stooges to investigate the domain in question and empirically develop step by step a path to a solution.
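The back-of-the-envelope arithmetic in point 1 can be written out as a small sketch. Every input below is one of the rough figures quoted in that point, not measured data, and the 8-hour human workday is an assumption introduced here to reproduce the "roughly 3x" step from 31,250 instances to about 100k humans.

```python
# Rough labor-equivalent estimate using the figures quoted in point 1.
# All constants are the comment's own rough assumptions, not measurements.

NEW_H100S_PER_YEAR = 1_000_000  # upper end of the 500k-1000k estimate
GPUS_PER_NODE = 8               # one DGX node, assumed to host one GPT-4-class instance
REALTIME_MULTIPLIER = 4         # assumed 4x compute for realtime multimodal operation
AI_HOURS_PER_DAY = 24
HUMAN_HOURS_PER_DAY = 8         # hypothetical workday, introduced for this sketch


def labor_equivalent(gpus: int) -> tuple[float, float]:
    """Return (instances, human-equivalents) for a given number of GPUs."""
    gpus_per_instance = GPUS_PER_NODE * REALTIME_MULTIPLIER  # 32 cards per instance
    instances = gpus / gpus_per_instance
    humans = instances * AI_HOURS_PER_DAY / HUMAN_HOURS_PER_DAY
    return instances, humans


instances, humans = labor_equivalent(NEW_H100S_PER_YEAR)
print(f"instances: {instances:,.0f}")
print(f"human-equivalents: {humans:,.0f}")

# Scale by ~30x for the case where all contracted silicon goes into H100s.
_, humans_30x = labor_equivalent(NEW_H100S_PER_YEAR * 30)
print(f"human-equivalents at 30x silicon: {humans_30x:,.0f}")
```

Changing any single constant (for example the realtime multiplier) shows how sensitive the final headcount is to that one guess.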
Speaking in probabilities, the above is a joint probability. Meaning that so long as the AGI system is limited by any of the 4 terms, it is throttled and unable to proceed with foom. Putting some terms on it:
90% for compute (we have direct evidence), 95% for robotics (direct evidence), 99% for data (near certain evidence), 50% for searching AGI spaces (quite possible there are findable much stronger algorithms in the search space)
So the probability of foom is the series probability of all 4, as all 4 must take their negated form: foom requires that none of the four factors is limiting.
So pfoom = 0.1 x 0.05 x 0.01 x 0.5 = 0.000025.
Notice that you can put quite a lot of doubt on my estimates. Say you do not believe me, and believe the chance I am right on each term is 25%. Then each term contributes a 0.75 chance of not being limiting, and:
pfoom = 0.75^4 ≈ 0.32
Not enough to be as confident we’re all dead as EY is. Though I concede a 32% chance of the end of humanity isn’t super great.
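The two pfoom figures above can be reproduced with a short sketch. It uses only the probabilities stated in the comment; multiplying the terms treats the four factors as independent, which is an assumption of the calculation rather than something established here. The names `p_limited` and `p_foom` are illustrative.

```python
from math import prod

# Stated probability that each factor throttles foom (the comment's estimates).
p_limited = {
    "compute": 0.90,
    "robotics": 0.95,
    "data": 0.99,
    "algorithm_search": 0.50,
}


def p_foom(limits: dict[str, float]) -> float:
    """Probability that no factor is limiting, ASSUMING the factors are independent."""
    return prod(1.0 - p for p in limits.values())


# Case 1: take the stated estimates at face value.
print(f"pfoom (stated estimates): {p_foom(p_limited):.6f}")

# Case 2: the skeptical reader who gives each limit only a 25% chance of holding.
skeptical = {name: 0.25 for name in p_limited}
print(f"pfoom (25% per term): {p_foom(skeptical):.2f}")
```

If the factors are correlated (for example, more compute also loosening the algorithm-search limit), the simple product no longer applies and the result could move in either direction.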
As for the last part :
CAIS means you use small, isolated AI systems that tested extremely well on the training set. At each timestep, you use an autoencoder to check how well the current inference input fed to the agent compresses into the state space of the training set. What you are doing is validating that it is impossible for the AI system to know that the current input is not from the training set. This both makes deception difficult and makes the AI system’s probability of success on the current task similar to how it benchmarked on the training set.
If the real world input is outside that space, you transfer control to a different agent, which in the worst case is an embedded control system using classic control techniques.
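The gating scheme described above can be illustrated with a minimal sketch. It is not a description of any production system: the "autoencoder" here is a toy linear projection onto a single direction, swapped in only so the example runs without dependencies, and the threshold value and the names `make_linear_autoencoder`, `choose_controller`, `ml_agent`, and `fallback_controller` are all hypothetical.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def make_linear_autoencoder(direction: Vector) -> Callable[[Vector], list[float]]:
    """Toy stand-in for a trained autoencoder: project onto one fixed direction.

    A real system would use a learned (typically nonlinear) autoencoder fitted
    to the training set; this projection only makes the sketch runnable.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]

    def reconstruct(x: Vector) -> list[float]:
        code = sum(xi * ui for xi, ui in zip(x, unit))  # encode to one latent value
        return [code * ui for ui in unit]               # decode back to input space

    return reconstruct


def reconstruction_error(x: Vector, reconstruct: Callable[[Vector], list[float]]) -> float:
    """Euclidean distance between an input and its reconstruction."""
    x_hat = reconstruct(x)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)))


def choose_controller(x: Vector, reconstruct: Callable[[Vector], list[float]],
                      threshold: float) -> str:
    """Pick the ML agent for in-distribution inputs, else the fallback controller."""
    if reconstruction_error(x, reconstruct) <= threshold:
        return "ml_agent"
    return "fallback_controller"


# Pretend the training data all lay along the direction (1, 1).
reconstruct = make_linear_autoencoder([1.0, 1.0])
THRESHOLD = 0.1  # hypothetical; would be calibrated on held-out training data

print(choose_controller([2.0, 2.0], reconstruct, THRESHOLD))   # lies on the training direction
print(choose_controller([2.0, -2.0], reconstruct, THRESHOLD))  # far from the training direction
```

The design choice being illustrated is that the hand-off decision depends only on how well the input reconstructs, not on anything the agent itself reports.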
This is currently how autonomous cars more or less work (depends on the team and the project).
I have several years of experience actually working on embedded ML systems, and many more years on embedded controls. The above is correct. Eliezer Yudkowsky was wrong to dismiss it.
Note that Eliezer has mentioned that ML teams are going to need to find “some way” to get from roughly an 80% chance (I think that was his estimate) that a GPT-3 style agent is correct on a question to the many 9s of real world reliability.
Stateless, well-isolated systems are one of the few ways human engineers know how to accomplish that. So we may get a significant amount of AI safety by default, simply to meet requirements.
Of course, Eliezer knows about CAIS. He just thinks that it is a clever idea that has no chance to work.
It’s very funny that you think AI can solve the very complex problem of aging, but don’t believe that AI can solve the much simpler problem of “kill everyone”.
There are alternative mitigations to the problem:
Anti aging research
Cryonics
I agree that it’s bad that most people currently alive are apparently going to die. However I think that since mitigations like that are much less risky we should pursue them rather than try to rush AGI.
I think the odds of success (epistemic status: I went to medical school but dropped out) are low if you mean that “humans without help from any system more capable than current software” are researching aging and cryonics alone.
They are both extremely difficult problems.
So the tradeoff is “everyone currently alive and probably their children” vs “future people who might exist”.
I obviously lean one way but this is what the choice is between. Certain death for everyone alive (by not improving AGI capabilities) in exchange for preventing possible death for everyone alive sooner and preventing the existence of future people who may never exist no matter the timeline.