I’m an AGI safety / AI alignment researcher in Boston with a particular focus on brain algorithms. Research Fellow at Astera. See https://sjbyrnes.com/agi.html for a summary of my research and sorted list of writing. Physicist by training. Email: steven.byrnes@gmail.com. Leave me anonymous feedback here. I’m also at: RSS feed, X/Twitter, Bluesky, Mastodon, Threads, GitHub, Wikipedia, Physics-StackExchange, LinkedIn
Steven Byrnes
How is it that some tiny number of man made mirror life forms would be such a threat to the millions of naturally occurring life forms, but those millions of naturally occurring life forms would not be an absolutely overwhelming symmetrical threat to those few man made mirror forms?
Can’t you ask the same question for any invasive species? Yet invasive species exist. “How is it that some people putting a few Nile perch into Lake Victoria in the 1950s would cause ‘the extinction or near-extinction of several hundred native species’, but the native species of Lake Victoria would not be an absolutely overwhelming symmetrical threat to those Nile perch?”
If I’m not mistaken, you’ve already changed the wording
No, I haven’t changed anything in this post since Dec 11, three days before your first comment.
valid EA response … EA forum … EA principles …
This isn’t EA forum. Also, you shouldn’t equate “EA” with “concerned about AGI extinction”. There are plenty of self-described EAs who think that AGI extinction is astronomically unlikely and a pointless thing to worry about. (And also plenty of self-described EAs who think the opposite.)
prevent spam/limit stupid comments without causing distracting emotions
If Hypothetical Person X tends to write what you call “stupid comments”, and if they want to be participating on Website Y, and if Website Y wants to prevent Hypothetical Person X from doing that, then there’s an irreconcilable conflict here, and it seems almost inevitable that Hypothetical Person X is going to wind up feeling annoyed by this interaction. Like, Website Y can do things on the margin to make the transaction less unpleasant, but it’s surely going to be somewhat unpleasant under the best of circumstances.
(Pick any popular forum on the internet, and I bet that either (1) there’s no moderation process and thus there’s a ton of crap, or (2) there is a moderation process, and many of the people who get warned or blocked by that process are loudly and angrily complaining about how terrible and unjust and cruel and unpleasant the process was.)
Anyway, I don’t know why you’re saying that here-in-particular. I’m not a moderator, I have no special knowledge about running forums, and it’s way off-topic. (But if it helps, here’s a popular-on-this-site post related to this topic.)
[EDIT: reworded this part a bit.]
what would be a valid EA response to the arguments coming from people fitting these bullets:
Some are over-optimistic based on mistaken assumptions about the behavior of humans;
Some are over-optimistic based on mistaken assumptions about the behavior of human institutions;
That’s off-topic for this post so I’m probably not going to chat about it, but see this other comment too.
I think of myself as having high ability and willingness to respond to detailed object-level AGI-optimist arguments, for example:
Response to Dileep George: AGI safety warrants planning ahead
Response to Blake Richards: AGI, generality, alignment, & loss functions
LeCun’s “A Path Towards Autonomous Machine Intelligence” has an unsolved technical alignment problem
…and more.
I don’t think this OP involves “picturing AI optimists as stubborn simpletons not being able to get persuaded finally that AI is a terrible existential risk”. (I do think AGI optimists are wrong, but that’s different!) At least, I didn’t intend to do that. I can potentially edit the post if you help me understand how you think I’m implying that, and/or you can suggest concrete wording changes etc.; I’m open-minded.
Yeah, the word “consummatory” isn’t great in general (see here), maybe I shouldn’t have used it. But I do think walking is an “innate behavior”, just as sneezing and laughing and flinching and swallowing are. E.g. decorticate rats can walk. As for human babies, they’re decorticate-ish in effect for the first months but still have a “walking / stepping reflex” from day 1 I think.
There can be an innate behavior, but also voluntary cortex control over when and whether it starts—those aren’t contradictory, IMO. This is always true to some extent—e.g. I can voluntarily suppress a sneeze. Intuitively, yeah, I do feel like I have more voluntary control over walking than I do over sneezing or vomiting. (Swallowing is maybe the same category as walking?) I still want to say that all these “innate behaviors” (including walking) are orchestrated by the hypothalamus and brainstem, but that there’s also voluntary control coming via cortex→hypothalamus and/or cortex→brainstem motor-type output channels.
I’m just chatting about my general beliefs. :) I don’t know much about walking in particular, and I haven’t read that particular paper (paywall & I don’t have easy access).
Oh I forgot, you’re one of the people who seems to think that the only conceivable reason that anyone would ever talk about AGI x-risk is because they are trying to argue in favor of, or against, whatever AI government regulation was most recently in the news. (Your comment was one of the examples that I mockingly linked in the intro here.)
If I think AGI x-risk is >>10%, and you think AGI x-risk is 1-in-a-gazillion, then it seems self-evident to me that we should be hashing out that giant disagreement first; and discussing what if any government regulations would be appropriate in light of AGI x-risk second. We’re obviously not going to make progress on the latter debate if our views are so wildly far apart on the former debate!! Right?
So that’s why I think you’re making a mistake whenever you redirect arguments about the general nature & magnitude & existence of the AGI x-risk problem into arguments about certain specific government policies that you evidently feel very strongly about.
(If it makes you feel any better, I have always been mildly opposed to the six month pause plan.)
I’ve long had a tentative rule-of-thumb that:
medial hypothalamus neuron groups are mostly “tracking a state variable”;
lateral hypothalamus neuron groups are mostly “turning on a behavior” (especially a “consummatory behavior”).
(…apart from the mammillary areas way at the posterior end of the hypothalamus. They’re their own thing.)
State variables are things like hunger, temperature, immune system status, fertility, horniness, etc.
I don’t have a great proof of that, just some indirect suggestive evidence. (Orexin, contiguity between lateral hypothalamus and PAG, various specific examples of people studying particular hypothalamus neurons.) Anyway, it’s hard to prove directly because changing a state variable can lead to taking immediate actions. And it’s really just a rule of thumb; I’m sure there’s exceptions, and it’s not really a bright-line distinction anyway.
The literature on the lateral hypothalamus is pretty bad. The main problem IIUC is that LH is “reticular”, i.e. when you look at it under the microscope you just see a giant mess of undifferentiated cells. That appearance is probably deceptive—appropriate stains can reveal nice little nuclei hiding inside the otherwise-undifferentiated mess. But I think only one or a few such hidden nuclei are known (the example I’m familiar with is “parvafox”).
Yup! I think discourse with you would probably be better focused on the 2nd or 3rd or 4th bullet points in the OP—i.e., not “we should expect such-and-such algorithm to do X”, but rather “we should expect people / institutions / competitive dynamics to do X”.
I suppose we can still come up with “demos” related to the latter, but it’s a different sort of “demo” than the algorithmic demos I was talking about in this post. As some examples:
Here is a “demo” that a leader of a large active AGI project can declare that he has a solution to the alignment problem, specific to his technical approach, but where the plan doesn’t stand up to a moment’s scrutiny.
Here is a “demo” that a different AGI project leader can declare that even trying to solve the alignment problem is already overkill, because misalignment is absurd and AGIs will just be nice, again for reasons that don’t stand up to a moment’s scrutiny.
(And here’s a “demo” that at least one powerful tech company executive might be fine with AGI wiping out humanity anyway.)
Here is a “demo” that if you give random people access to an AI, one of them might ask it to destroy humanity, just to see what would happen. Granted, I think this person had justified confidence that this particular AI would fail to destroy humanity …
… but here is a “demo” that people will in fact do experiments that threaten the whole world, even despite a long track record of rock-solid statistical evidence that the exact thing they’re doing is indeed a threat to the whole world, far out of proportion to its benefit, and that governments won’t stop them, and indeed that governments might even fund them.
Here is a “demo” that, given a tradeoff between AI transparency (English-language chain-of-thought) and AI capability (inscrutable chain-of-thought but the results are better), many people will choose the latter, and pat themselves on the back for a job well done.
Every week we get more “demos” that, if next-token prediction is insufficient to make a powerful autonomous AI agent that can successfully pursue long-term goals via out-of-the-box strategies, then many people will say “well so much the worse for next-token prediction”, and they’ll try to figure out some other approach that is sufficient for that.
Here is a “demo” that companies are capable of ignoring or suppressing potential future problems when they would interfere with immediate profits.
Here is a “demo” that it’s possible for there to be a global catastrophe causing millions of deaths and trillions of dollars of damage, and then immediately afterwards everyone goes back to not even taking trivial measures to prevent similar or worse catastrophes from recurring.
Here is a “demo” that the arrival of highly competent agents with the capacity to invent technology and to self-reproduce is a big friggin’ deal.
Here is a “demo” that even small numbers of such highly competent agents can maneuver their way into dictatorial control over a much much larger population of humans.
I could go on and on. I’m not sure your exact views, so it’s quite possible that none of these are crux-y for you, and your crux lies elsewhere. :)
Thanks!
I feel like the actual crux between you and OP is with the claim in post #2 that the brain operates outside the neuron doctrine to a significant extent.
I don’t think that’s quite right. Neuron doctrine is pretty specific IIUC. I want to say: when the brain does systematic things, it’s because the brain is running a legible algorithm that relates to those things. And then there’s a legible explanation of how biochemistry is running that algorithm. But the latter doesn’t need to be neuron-doctrine. It can involve dendritic spikes and gene expression and astrocytes etc.
All the examples here are real and important, and would impact the algorithms of an “adequate” WBE, but are mostly not “neuron doctrine”, IIUC.
Basically, it’s the thing I wrote a long time ago here: “If some [part of] the brain is doing something useful, then it’s humanly feasible to understand what that thing is and why it’s useful, and to write our own CPU code that does the same useful thing.” And I think “doing something useful” includes as a special case everything that makes me me.
I don’t get what you mean when you say stuff like “would be conscious (to the extent that I am), and it would be my consciousness (to a similar extent that I am),” since afaik you don’t actually believe that there is a fact of the matter as to the answers to these questions…
Just, it’s a can of worms that I’m trying not to get into right here. I don’t have a super well-formed opinion, and I have a hunch that the question of whether consciousness is a coherent thing is itself a (meta-level) incoherent question (because of the (A) versus (B) thing here). Yeah, just didn’t want to get into it, and I haven’t thought too hard about it anyway. :)
Right, what I actually think is that a future brain scan with future understanding could enable a WBE to run on a reasonable-sized supercomputer (e.g. <100 GPUs), and it would be capturing what makes me me, and would be conscious (to the extent that I am), and it would be my consciousness (to a similar extent that I am), but it wouldn’t be able to reproduce my exact train of thought in perpetuity, because it would be able to reproduce neither the input data nor the random noise of my physical brain. I believe that OP’s objection to “practical CF” is centered around the fact that you need an astronomically large supercomputer to reproduce the random noise, and I don’t think that’s relevant. I agree that “abstraction adequacy” would be a step in the right direction.
Causal closure is just way too strict. And it’s not just because of random noise. For example, suppose that there’s a tiny amount of crosstalk between my neurons that represent the concept “banana” and my neurons that represent the concept “Red Army”, just by random chance. And once every 5 years or so, I’m thinking about bananas, and then a few seconds later, the idea of the Red Army pops into my head, and if not for this cross-talk, it counterfactually wouldn’t have popped into my head. And suppose that I have no idea of this fact, and it has no impact on my life. This overlap just exists by random chance, not part of some systematic learning algorithm. If I got magical brain surgery tomorrow that eliminated that specific cross-talk, and didn’t change anything else, then I would obviously still be “me”, even despite the fact that maybe some afternoon 3 years from now I would fail to think about the Red Army when I otherwise might. This cross-talk is not randomness, and it does undermine “causal closure” interpreted literally. But I would still say that “abstraction adequacy” would be achieved by an abstraction of my brain that captured everything except this particular instance of cross-talk.
Yeah duh I know you’re not talking about MCMC. :) But MCMC is a simpler example to ensure that we’re on the same page on the general topic of how randomness can be involved in algorithms. Are we 100% on the same page about the role of randomness in MCMC? Is everything I said about MCMC super duper obvious from your perspective? If not, then I think we’re not yet ready to move on to the far-less-conceptually-straightforward topic of brains and consciousness.
I’m trying to get at what you mean by:
But imagine instead that (for sake of argument) it turned out that high-resolution details of temperature fluctuations throughout the brain had a causal effect on the execution of the algorithm such that the algorithm doesn’t do what it’s meant to do if you just take the average of those fluctuations.
I don’t understand what you mean here. For example:
If I run MCMC with a PRNG given random seed 1, it outputs 7.98 ± 0.03. If I use a random seed of 2, then the MCMC spits out a final answer of 8.01 ± 0.03. My question is: does the random seed entering MCMC “have a causal effect on the execution of the algorithm”, in whatever sense you mean by the phrase “have a causal effect on the execution of the algorithm”?
My MCMC code uses a PRNG that returns random floats between 0 and 1. If I replace that PRNG with “return 0.5”, i.e. the average of the 0-to-1 interval, then the MCMC now returns a wildly-wrong answer of 942. Is that replacement the kind of thing you have in mind when you say “just take the average of those fluctuations”? If so, how do you reconcile the fact that “just take the average of those fluctuations” gives the wrong answer, with your description of that scenario as “what it’s meant to do”? Or if not, then what would “just take the average of those fluctuations” mean in this MCMC context?
I’m confused by your comment. Let’s keep talking about MCMC.
The following is true: The random inputs to MCMC have “a causal effect on the execution of the algorithm such that the algorithm doesn’t do what it’s meant to do if you just take the average of those fluctuations”.
For example, let’s say the MCMC accepts a million inputs in the range (0,1), typically generated by a PRNG in practice. If you replace the PRNG by the function “return 0.5” (“just take the average of those fluctuations”), then the MCMC will definitely fail to give the right answer.
The following is false: “the signals entering…are systematic rather than random”. The random inputs to MCMC are definitely expected and required to be random, not systematic. If the PRNG has systematic patterns, it screws up the algorithm—I believe this happens from time to time, and people doing Monte Carlo simulations need to be quite paranoid about using an appropriate PRNG. Even very subtle long-range patterns in the PRNG output can screw up the calculation.
The MCMC will do a highly nontrivial (high-computational-complexity) calculation and give a highly non-arbitrary answer. The answer does depend to some extent on the stream of random inputs. For example, suppose I do MCMC, and (unbeknownst to me) the exact answer is 8.00. If I use a random seed of 1 in my PRNG, then the MCMC might spit out a final answer of 7.98 ± 0.03. If I use a random seed of 2, then the MCMC might spit out a final answer of 8.01 ± 0.03. Etc. So the algorithm run is dependent on the random bits, but the output is not totally arbitrary.
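To make sure we’re picturing the same thing, here’s a minimal toy sketch in Python (my own illustrative code, with made-up names, not anything from an actual codebase): random-walk Metropolis targeting a distribution whose true mean is 8.00, run with seed 1, seed 2, and then with the PRNG replaced by the constant 0.5.

```python
import math, random

def metropolis_mean(n_steps, rand01, x0=0.0, step=1.0):
    """Random-walk Metropolis estimate of the mean of a N(8,1) target.
    rand01 is any zero-argument callable returning floats in (0,1)."""
    def log_p(x):                     # unnormalized log-density, true mean 8.00
        return -0.5 * (x - 8.0) ** 2
    x, total = x0, 0.0
    for _ in range(n_steps):
        proposal = x + step * (2.0 * rand01() - 1.0)   # symmetric proposal
        if rand01() < math.exp(min(0.0, log_p(proposal) - log_p(x))):
            x = proposal                               # accept
        total += x
    return total / n_steps

print(metropolis_mean(200_000, random.Random(1).random))  # ~7.99 (seed-dependent)
print(metropolis_mean(200_000, random.Random(2).random))  # ~8.01 (seed-dependent)

# "Just take the average of those fluctuations": feed in the constant 0.5.
# The proposal offset is then always zero, the chain never moves, and the
# "answer" is just the starting point -- wildly wrong.
print(metropolis_mean(200_000, lambda: 0.5))              # 0.0
```

Different seeds change the exact trajectory but not the non-arbitrary answer; replacing the randomness with its average breaks the algorithm outright.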
All this is uncontroversial background, I hope. You understand all this, right?
executions would branch conditional on specific charge trajectories, and it would be a rubbish computer.
As it happens, almost all modern computer chips are designed to be deterministic, by putting every signal extremely far above the noise floor. This has a giant cost in terms of power efficiency, but it has the benefit of making the design far simpler and more flexible for the human programmer. You can write code without worrying about bits randomly flipping—except for SEUs (single-event upsets), but those are rare enough that programmers can basically ignore them for most purposes.
(Even so, such chips can act non-deterministically in some cases—for example as discussed here, some ML code is designed with race conditions where sometimes (unpredictably) the chip calculates (a+b)+c and sometimes a+(b+c), which are ever-so-slightly different for floating point numbers, but nobody cares, the overall algorithm still works fine.)

But more importantly, it’s possible to run algorithms in the presence of noise. It’s not how we normally do things in the human world, but it’s totally possible. For example, I think an ML algorithm would basically work fine if a small but measurable fraction of bits randomly flipped as you ran it. You would need to design it accordingly, of course—e.g. don’t use floating point representation, because a bit-flip in the exponent would be catastrophic. Maybe some signals would be more sensitive to bit-flips than others, in which case maybe put an error-correcting code on the super-sensitive ones. But for lots of other signals, e.g. the lowest-order bit of some neural net activation, we can just accept that they’ll randomly flip sometimes, and the algorithm still basically accomplishes what it’s supposed to accomplish—say, image classification or whatever.
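To spell out that floating-point aside: float addition isn’t associative, so the two orderings really can produce (very slightly) different results. A quick check in Python:

```python
a, b, c = 0.1, 0.2, 0.3
print((a + b) + c)                  # 0.6000000000000001
print(a + (b + c))                  # 0.6
print((a + b) + c == a + (b + c))   # False: float addition isn't associative
```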
I agree that certain demos might change the mind of certain people. (And if so, they’re worthwhile.) But I also think other people would be immune. For example, suppose someone has the (mistaken) idea: “Nobody would be so stupid as to actually press go on an AI that would then go on to kill lots of people! Or even if theoretically somebody might be stupid enough to do that, management / government / etc. would never let that happen.” Then that mistaken idea would not be disproven by any demo, except a “demo” that involved lots of actual real-life people getting killed. Right?
Hmm, I wasn’t thinking about that because that sentence was nominally in someone else’s voice. But you’re right. I reworded, thanks.
In a perfect world, everyone would be concerned about the risks for which there are good reasons to be concerned, and everyone would be unconcerned about the risks for which there are good reasons to be unconcerned, because everyone would be doing object-level checks of everyone else’s object-level claims and arguments, and coming to the correct conclusion about whether those claims and arguments are valid.
And those valid claims and arguments might involve demonstrations and empirical evidence, but also might be more indirect.
I do think Turing and von Neumann reached correct object-level conclusions via sound reasoning, but obviously I’m stating that belief without justifying it.
I’m not sure what argument you think I’m making.
In a perfect world, I think people would not need any concrete demonstration to be very concerned about AGI x-risk. Alan Turing and John von Neumann were very concerned about AGI x-risk, and they obviously didn’t need any concrete demonstration for that. And I think their reasons for concern were sound at the time, and remain sound today.
But many people today are skeptical that AGI poses any x-risk. (That’s unfortunate from my perspective, because I think they’re wrong.) The point of this post is to suggest that we AGI-concerned people might not be able to win over those skeptics via concrete demonstrations of AI doing scary (or scary-adjacent) things, either now or in the future—or at least, not all of the skeptics. It’s probably worth trying anyway—it might help for some of the skeptics. Regardless, understanding the exact failure modes is helpful.
These algorithms are useful maps of the brain and mind. But is computation also the territory? Is the mind a program? Such a program would need to exist as a high-level abstraction of the brain that is causally closed and fully encodes the mind.
I said it in one of your previous posts but I’ll say it again: I think causal closure is patently absurd, and a red herring. The brain is a machine that runs an algorithm, but algorithms are allowed to have inputs! And if an algorithm has inputs, then it’s not causally closed.
The most obvious examples are sensory inputs—vision, sounds, etc. I’m not sure why you don’t mention those. As soon as I open my eyes, everything in my field of view has causal effects on the flow of my brain algorithm.
Needless to say, algorithms are allowed to have inputs. For example, the mergesort algorithm has an input (namely, a list). But I hope we can all agree that mergesort is an algorithm!
The other example is: the brain algorithm has input channels where random noise enters in. Again, that doesn’t prevent it from being an algorithm. Many famous, central examples of algorithms have input channels that accept random bits—for example, MCMC.
And in regards to “practical CF”, if I run MCMC on my computer while sitting outside, and I use an anemometer attached to the computer as a source of the random input bits entering the MCMC run, then it’s true that you need an astronomically complex hyper-accurate atmospheric simulator in order to reproduce this exact run of MCMC, but I don’t understand your perspective wherein that fact would be important. It’s still true that my computer is implementing MCMC “on a level of abstraction…higher than” atoms and electrons. The wind flowing around the computer is relevant to the random bits, but is not part of the calculations that comprise MCMC (which involve the CPU instruction set etc.). By the same token, if thermal noise mildly impacts my train of thought (as it always does), then it’s true that you need to simulate my brain down to the jiggling atoms in order to reproduce this exact run of my brain algorithm, but this seems irrelevant to me, and in particular it’s still true that my brain algorithm is “implemented on a level of abstraction of the brain higher than biophysics”. (Heck, if I look up at the night sky, then you’d need to simulate the entire Milky Way to reproduce this exact run of my brain algorithm! Who cares, right?)
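If it helps, here’s a toy sketch of that point (plain Monte Carlo rather than MCMC, just to keep it short, and the anemometer function is of course hypothetical): the algorithm consumes random floats, and nothing at the algorithm level of abstraction depends on whether those floats come from a PRNG or from the wind.

```python
import random

def estimate_pi(n, rand01):
    """Monte Carlo estimate of pi. rand01 is any callable returning floats
    in [0, 1); the algorithm is the same abstraction regardless of where
    those numbers physically come from."""
    hits = sum(rand01() ** 2 + rand01() ** 2 <= 1.0 for _ in range(n))
    return 4.0 * hits / n

# Source 1: an ordinary PRNG.
print(estimate_pi(1_000_000, random.Random(42).random))   # ~3.14

# Source 2 (hypothetical): floats digitized from an anemometer next to the
# computer. Reproducing this exact run would require reproducing the wind,
# but the wind only supplies inputs; it isn't part of the algorithm.
# print(estimate_pi(1_000_000, anemometer_rand01))
```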
I agree!! (if I understand correctly). See https://www.lesswrong.com/posts/RrG8F9SsfpEk9P8yi/robin-hanson-s-grabby-aliens-model-explained-part-1?commentId=wNSJeZtCKhrpvAv7c
Huh, this is helpful, thanks, although I’m not quite sure what to make of it and how to move forward.
I do feel confused about how you’re using the term “equanimity”. I sorta have in mind a definition kinda like: neither very happy, nor very sad, nor very excited, nor very tired, etc. Google gives the example: “she accepted both the good and the bad with equanimity”. But if you’re saying “apply equanimity to positive sensations and it makes them better”, you’re evidently using the term “equanimity” in a different way than that. More specifically, I feel like when you say “apply equanimity to X”, you mean something vaguely like “do a specific tricky learned attention-control maneuver that has something to do with the sensory input of X”. That same maneuver could contribute to equanimity, if it’s applied to something like anxiety. But the maneuver itself is not what I would call “equanimity”. It’s upstream. Or sorry if I’m misunderstanding.
Also, I also want to distinguish two aspects of an emotion. In one, “duration of an emotion” is kinda like “duration of wearing my green hat”. I don’t have to be thinking about it the whole time, but it’s a thing happening with my body, and if I go to look, I’ll see that it’s there. Another aspect is the involuntary attention. As long as it’s there, I can’t not think about it, unlike my green hat. I expect that even black-belt PNSE meditators are unable to instantly turn off anger / anxiety / etc. in the former sense. I think these things are brainstem reactions that can be gradually unwound but not instantly. I do expect that those meditators would be able to more instantly prevent the anger / anxiety / etc. from controlling their thought process. What do you think?
Also, just for context, do you think you’ve experienced PNSE? Thanks!
I don’t think any of the challenges you mentioned would be a blocker to aliens that have infinite compute and infinite time. “Is the data big-endian or little-endian?” Well, try it both ways and see which one is a better fit to observations. If neither seems to fit, then do a combinatorial listing of every one of the astronomical number of possible encoding schemes, and check them all! Spend a trillion years studying the plausibility of each possible encoding before moving on to the next one, just to make sure you don’t miss any subtlety. Why not? You can do all sorts of crazy things with infinite compute and infinite time.
The main insight of the post (as I understand it) is this:
In the context of a discussion of whether we should be worried about AGI x-risk, someone might say “LLMs don’t seem like they’re trying hard to autonomously accomplish long-horizon goals—hooray, why were people so worried about AGI risk?”
In the context of a discussion among tech people and VCs about how we haven’t yet made an AGI that can found and run companies as well as Jeff Bezos, someone might say “LLMs don’t seem like they’re trying hard to autonomously accomplish long-horizon goals—alas, let’s try to fix that problem.”
One sounds good and the other sounds bad, but there’s a duality connecting them. They’re the same observation. You can’t get one without the other.
This is an important insight because it helps us recognize the fact that people are trying to solve the second-bullet-point problem (and making nonzero progress), and to the extent that they succeed, they’ll make things worse from the perspective of the people in the first bullet point.
This insight is not remotely novel! (And OP doesn’t claim otherwise.) …But that’s fine, nothing wrong with saying things that many readers will find obvious.
(This “duality” thing is a useful formula! Another related example that I often bring up is the duality between positive-coded “the AI is able to come up with out-of-the-box solutions to problems” versus the negative-coded “the AI sometimes engages in reward hacking”.)
(…and as comedian Mitch Hedberg sagely noted, there’s a duality between positive-coded “cheese shredder” and negative-coded “sponge ruiner”.)
The post also chats about two other (equally “obvious”) topics:
Instrumental convergence: “the AI seems like it’s trying hard to autonomously accomplish long-horizon goals” involves the AI routing around obstacles, and one might expect that to generalize to “obstacles” like programmers trying to shut it down
Goal (mis)generalization: If “the AI seems like it’s trying hard to autonomously accomplish long-horizon goal X”, then the AI might actually “want” some different Y which partly overlaps with X, or is downstream from X, etc.
But the question on everyone’s mind is: Are we doomed?
In and of itself, nothing in this post proves that we’re doomed. I don’t think OP ever explicitly claimed it did? In my opinion, there’s nothing in this post that should constitute an update for the many readers who are already familiar with instrumental convergence, and goal misgeneralization, and the fact that people are trying to build autonomous agents. But OP at least gives a vibe of being an argument for doom going beyond those things, which I think was confusing people in the comments.
Why aren’t we necessarily doomed? Now this is my opinion, not OP’s, but here are three pretty-well-known outs (at least in principle):
The AI can “want” to autonomously accomplish a long-horizon goal, but also simultaneously “want” to act with integrity, helpfulness, etc. Just like it’s possible for humans to do. And if the latter “want” is strong enough, it can outvote the former “want” in cases where they conflict. See my post Consequentialism & corrigibility.
The AI can behaviorist-“want” to autonomously accomplish a long-horizon goal, but where the “want” is internally built in such a way as to not generalize OOD to make treacherous turns seem good to the AI. See e.g. my post Thoughts on “Process-Based Supervision”, which is skeptical about the practicalities, but I think the idea is sound in principle.
We can in principle simply avoid building AIs that autonomously accomplish long-horizon goals, notwithstanding the economic and other pressures—for example, by keeping humans in the loop (e.g. oracle AIs). This one came up multiple times in the comments section.
There are plenty of challenges in these approaches, and interesting discussions to be had, but the post doesn’t engage with any of these topics.
Anyway, I’m voting strongly against including this post in the 2023 review. It’s not crisp about what it’s arguing for and against (and many commenters seem to have gotten the wrong idea about what it’s arguing for), it’s saying obvious things in a meandering way, and it’s not refuting or even mentioning any of the real counterarguments / reasons for hope. It’s not “best of” material.