If so, why were US electricity stocks down 20-28% (wouldn’t we expect them to go up if the US wants to strengthen its domestic AI-related infrastructure) and why did TSMC lose less, percentage-wise, than many other AI-related stocks (wouldn’t we expect it to get hit hardest)?
In order to submit a question to the benchmark, people had to run it against the listed LLMs; the question would only advance to the next stage once the LLMs used for this testing got it wrong.
So I think the more rational and cognitively capable a human is, the more likely they’ll optimize more strictly and accurately for future reward.
If this is true at all, it’s not going to be a very strong effect, meaning you can find very rational and cognitively capable people who do the opposite of this in decision situations that directly pit reward against the things they hold most dearly. (And it may not be true, because a lot of personal hedonists tend to “lack sophistication,” in the sense that they don’t understand that their own feeling of valuing nothing but their own pleasure isn’t how everyone else who’s smart experiences the world. So, there’s at least a midwit level of “sophistication” where hedonists seem overrepresented.)
Maybe it’s the case that there’s a weak correlation that makes the quote above “technically accurate,” but that’s not enough to speak of reward being the optimization target. For comparison, even if it is the case that more intelligent people prefer classical music over k-pop, that doesn’t mean classical music is somehow inherently superior to k-pop, or that classical music is “the music taste target” in any revealing or profound sense. After all, some highly smart people can still be into k-pop without making any mistake.
I’ve written about this extensively here and here. Some relevant excerpts from the first linked post:
One of many takeaways I got from reading Kaj Sotala’s multi-agent models of mind sequence (as well as comments by him) is that we can model people as pursuers of deep-seated needs. In particular, we have subsystems (or “subagents”) in our minds devoted to various needs-meeting strategies. The subsystems contribute behavioral strategies and responses to help maneuver us toward states where our brain predicts our needs will be satisfied. We can view many of our beliefs, emotional reactions, and even our self-concept/identity as part of this set of strategies. Like life plans, life goals are “merely” components of people’s needs-meeting machinery.[8]
Still, as far as components of needs-meeting machinery go, life goals are pretty unusual. Having life goals means to care about an objective enough to (do one’s best to) disentangle success on it from the reasons we adopted said objective in the first place. The objective takes on a life of its own, and the two aims (meeting one’s needs vs. progressing toward the objective) come apart. Having a life goal means having a particular kind of mental organization so that “we” – particularly the rational, planning parts of our brain – come to identify with the goal more so than with our human needs.[9]
To form a life goal, an objective needs to resonate with someone’s self-concept and activate (or get tied to) mental concepts like instrumental rationality and consequentialism. Some life goals may appeal to a person’s systematizing tendencies and intuitions for consistency. Scrupulosity or sacredness intuitions may also play a role, overriding the felt sense that other drives or desires (objectives other than the life goal) are of comparable importance.
[...] Adopting an optimization mindset toward outcomes inevitably leads to a kind of instrumentalization of everything “near term.” For example, suppose your life goal is about maximizing the number of your happy days. The rational way to go about your life probably implies treating the next decades as “instrumental only.” On a first approximation, the only thing that matters is optimizing the chances of obtaining indefinite life extension (potentially leading to more happy days). Through adopting an outcome-focused optimizing mindset, seemingly self-oriented concerns such as wanting to maximize the number of happiness moments turn into an almost “other-regarding” endeavor. After all, only one’s far-away future selves get to enjoy the benefits – which can feel essentially like living for someone else.[12]
[12] This points at another line of argument (in addition to the ones I gave in my previous post) to show why hedonist axiology isn’t universally compelling:
To be a good hedonist, someone has to disentangle the part of their brain that cares about short-term pleasure from the part of them that does long-term planning. In doing so, they prove they’re capable of caring about something other than their pleasure. It is now an open question whether they use this disentanglement capability for maximizing pleasure or for something else that motivates them to act on long-term plans.
I like all the considerations you point out, but based on that reasoning alone, you could also argue that a con man who ran a lying scheme for 1 year and stole only like $20,000 should get life in prison—after all, con men are pathological liars and that phenotype rarely changes all the way. And that seems too harsh?
I’m in two minds about it: On the one hand, I totally see the utilitarian argument of just locking up people who “lack a conscience” forever the first time they get caught for any serious crime. On the other hand, they didn’t choose how they were born, and some people without prosocial system-1 emotions do in fact learn how to become a decent citizen.
It seems worth mentioning that punishments for financial crime often include measures like “person gets banned from their industry” or bans on participating in all kinds of financial schemes. In reality, the rules there are probably too lax, and people who got banned in finance or pharma just transition to running crypto scams or to selling predatory online courses on how to be successful (lol). But in theory, I like the idea of adding things to the sentencing that make re-offending less likely. This way, you can maybe justify giving people second chances.
Suppose that a researcher’s conception of current missing pieces is a mental object M, their timeline estimate is a probability function P, and their forecasting expertise F is a function that maps M to P. In this model, F can be pretty crazy, creating vast differences in P depending on how you ask, while M is still solid.
Good point. This would be reasonable if you think someone can be super bad at F and still great at M.
Still, I think estimating “how big is this gap?” and “how long will it take to cross it?” might be quite related, so I expect the skills to be correlated or even strongly correlated.
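Here’s a toy sketch (my own, with made-up difficulty numbers and made-up forecasting functions) of the quoted M/P/F framing: the same gap model M yields noticeably different timeline distributions P depending on which F gets applied to it.

```python
import numpy as np

rng = np.random.default_rng(0)

# M: a researcher's model of the missing pieces, as rough per-piece
# difficulty estimates in "years at the current pace" (made-up numbers).
M = {"agency": 2.0, "reliability": 1.5, "long-horizon planning": 3.0}

def F_sequential(M, n=100_000):
    """Pessimistic framing: pieces get solved one after another."""
    return sum(rng.lognormal(np.log(t), 0.5, n) for t in M.values())

def F_parallel(M, n=100_000):
    """Optimistic framing: pieces are tackled in parallel, so the
    overall timeline is set by the slowest piece."""
    return np.max([rng.lognormal(np.log(t), 0.5, n) for t in M.values()], axis=0)

# Same M, clearly different medians for P, purely because of F.
print("median years (sequential framing):", round(np.median(F_sequential(M)), 1))
print("median years (parallel framing):  ", round(np.median(F_parallel(M)), 1))
```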
It surveyed 2,778 AI researchers who had published peer-reviewed research in the prior year in six top AI venues (NeurIPS, ICML, ICLR, AAAI, IJCAI, JMLR); the median estimate for a 50% chance of AGI was either 23 or 92 years away, depending on how the question was phrased.
Doesn’t that discrepancy (how much answers vary between different ways of asking the question) tell you that the median AI researcher who published at these conferences hasn’t thought about this question sufficiently and/or sanely?
It seems irresponsible to me to update even just a small bit toward the specific reference class of which your statement above is true.
If you take people who follow progress closely and have thought more and longer about AGI as a research target specifically, my sense is that the ones who have longer timeline medians tend to say more like 10-20y rather than 23y+. (At the same time, there’s probably a bubble effect in who I follow or talk to, so I can get behind maybe lengthening that range a bit.)
Doing my own reasoning, here are the considerations that I weigh heavily:
we’re within the human range of most skill types already (which is where many of us in the past would have predicted that progress speeds up, and we don’t see any evidence of anything that should change our minds on that past prediction – deep learning visibly hitting a wall would have been one conceivable way, but it hasn’t happened yet)
that time for “how long does it take to cross and overshoot the human range at a given skill?” has historically gotten a lot smaller and is maybe even decreasing(?) (e.g., it admittedly took a long time to cross the human expert range in chess, but it took less long in Go, less long at various academic tests or essays, etc., to the point that chess certainly doesn’t constitute a typical baseline anymore)
that progress has been quite fast lately, so that it’s not intuitive to me that there’s a lot of room left to go (sure, agency and reliability and “get even better at reasoning”)
that we’re pushing through compute milestones rather quickly because scaling is still strong with some more room to go, so on priors, the chance that we cross AGI compute thresholds during this scale-up is higher than that we’d cross it once compute increases slow down
that o3 seems to me like significant progress in reliability, one of the things people thought would be hard to make progress on
Given all that, it seems obvious that we should put quite a lot of probability on getting to AGI in a short time (e.g., 3 years). Placing the 50% forecast feels less obvious because I have some sympathy for the view that says these things are notoriously hard to forecast and we should smear out uncertainty more than we’d intuitively think (that said, lately the trend has been that people consistently underpredict progress, and maybe we should just hard-update on that). Still, even on that “it’s prudent to smear out the uncertainty” view, let’s say that implies that the median would be like 10-20 years away. Even then, if we spread out the earlier half of probability mass uniformly over those 10-20 years, with an added probability bump in the near term because of the compute scaling arguments (we’re increasing training and runtime compute now, but this will have to slow down eventually if AGI isn’t reached in the next 3-6 years or whatever), that IMO very much implies at least 10% for the next 3 years. Which feels practically enormously significant. (And I don’t agree with smearing things out too much anyway, so my own probability is closer to 50%.)
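For concreteness, here’s the rough arithmetic behind that “at least 10%” figure, as a sketch with assumed round numbers rather than a real forecast:

```python
# Assume a "prudent" median of 15 years (midpoint of the 10-20 year range).
median_years = 15
mass_before_median = 0.50  # by definition of the median

# Spread the first 50% of probability mass uniformly over the years up to
# the median; any near-term bump from compute scaling only adds to this.
p_per_year = mass_before_median / median_years   # ~3.3% per year
p_next_3_years = 3 * p_per_year                  # ~10%

print(f"P(AGI within 3 years) >= {p_next_3_years:.0%}")  # prints 10%
```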
Well, the update for me would go both ways.
On one side, as you point out, it would mean that the model’s single-pass reasoning did not improve much (or at all).
On the other side, it would also mean that you can get large performance and reliability gains (on specific benchmarks) by just adding simple stuff. This is significant because you can do this much more quickly than the time it takes to train a new base model, and there’s probably more to be gained in that direction – similar tricks we can add by hardcoding various “system-2 loops” into the AI’s chain of thought and thinking process.
You might reply that this only works if the benchmark in question has easily verifiable answers. But I don’t think it is limited to those situations. If the model itself (or some subroutine in it) has some truth-tracking intuition about which of its answer attempts are better/worse, then running it through multiple passes and trying to pick the best ones should get you better performance even without easy and complete verifiability (since you can also train on the model’s guesses about its own answer attempts, improving its intuition there).
Besides, I feel like humans do something similar when we reason: we think up various ideas and answer attempts and run them by an inner critic, asking “is this answer I just gave actually correct/plausible?” or “is this the best I can do, or am I missing something?”
(I’m not super confident in all the above, though.)
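To make the “multiple passes plus picking the best attempt” idea concrete, here’s a minimal sketch; `generate` and `score_attempt` are hypothetical stand-ins for a model’s sampling call and its own truth-tracking critic, not any particular product’s API.

```python
from typing import Callable, List, Tuple

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              score_attempt: Callable[[str, str], float],
              n: int = 8) -> Tuple[str, List[Tuple[str, float]]]:
    """Sample n answer attempts and return the one the critic rates highest.

    This works even without an external verifier, as long as score_attempt
    has *some* signal about which attempts are better; the scored attempts
    can also be logged and used as training data to improve the critic.
    """
    attempts = [generate(prompt) for _ in range(n)]
    scored = [(a, score_attempt(prompt, a)) for a in attempts]
    best_answer, _ = max(scored, key=lambda pair: pair[1])
    return best_answer, scored
```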
Lastly, I think the cost bit will go down by orders of magnitude eventually (I’m confident of that). I would have to look up trends to say how quickly I expect $4,000 in runtime costs to go down to $40, but I don’t think it will take all that long. Also, if you can do extremely impactful things with some model, like automating further AI progress on training runs that cost billions, then willingness to pay for model outputs could be high anyway.
When the issue is climate change, a prevalent rationalist take goes something like this:
“Climate change would be a top priority if it weren’t for technological progress. However, because technological advances will likely help us to either mitigate the harms from climate change or will create much bigger problems on their own, we probably shouldn’t prioritize climate change too much.”
We could say the same thing about these trends of demographic aging that you highlight. So, I’m curious why you’re drawn to this topic and where the normative motivation in your writing is coming from.
In the post, you use normative language like, “This suggests that we need to lower costs along many fronts of both money and time, and also we need to stop telling people to wait until they meet very high bars.” (In the context of addressing people’s cited reasons for why they haven’t had kids – money, insecurity about money, not being able to afford kids or the house to raise them in, and mental health.)
The way I conceptualize it, one can zoom in on different, plausibly-normatively-central elements of the situation:
(1) The perspective of existing people.
1a Nation-scale economic issues from an aging demographic, such as collapse of pension schemes, economic stagnation from the aging workforce, etc.
1b Individual happiness and life satisfaction (e.g., a claim that having children tends to make people happier, also applying to parents ‘on the margin,’ people who, if we hadn’t encouraged them, would have decided against children).
(2) Some axiological perspective that considers the interests of both existing and newly created people/beings.
It seems uncontroversial that both 1a and 1b are important perspectives, but it’s not obvious to me whether 1a is a practical priority for us in light of technological progress (cf. the parallel to climate change) or how the empirics of 1b shake out (whether parents ‘on the margin’ are indeed happier). (I’m not saying 1b is necessarily controversial – for all I know, maybe the science already exists and is pretty clear. I’m just saying: I’m not personally informed on the topic even though I have read your series of posts on fertility.)
And then, (2) seems altogether subjective and controversial in the sense that smart people hold different views on whether it’s all-things-considered good to encourage people to have lower standards for bringing new people into existence. Also, there are strong reasons (I’ve written up a thorough case for this here and here) why we shouldn’t expect there to be an objective answer to “how to do axiology?”
This series would IMO benefit from a “Why I care about this” note, because without it, I get the feeling of “Zvi is criticizing things governments do/don’t do in a way that might underhandedly bias readers into thinking that the implied normative views on population ethics are unquestionably correct.” The way I see it, governments are probably indeed behaving irrationally here given that they’re not bought into the prevalent rationalist worldview on imminent technological progress (and that’s an okay thing to sneer at), but this doesn’t mean that we have to go “boo!” to all things associated with not choosing children, and “yeah!” to all things associated with choosing them.
That said, I still found the specific information in these roundups interesting, since this is clearly a large societal trend and it’s interesting to think through causes, implications, etc.
The tabletop game sounds really cool!
Interesting takeaways.
The first was exactly the above point, and that at some point, ‘I or we decide to trust the AIs and accept that if they are misaligned everyone is utterly f***ed’ is an even stronger attractor than I realized.
Yeah, when you say it like that… I feel like this is gonna be super hard to avoid!
The second was that depending on what assumptions you make about how many worlds are wins if you don’t actively lose, ‘avoid turning wins into losses’ has to be a priority alongside ‘turn your losses into not losses, either by turning them around and winning (ideal!) or realizing you can’t win and halting the game.’
There’s also the option of, once you realize that winning is no longer achievable, trying to lose less badly than you could have otherwise. For instance, if out of all the trajectories where humans lose, you can guess that some of them seem more likely to bring about some extra bad dystopian scenario, you can try to prevent at least those. Some examples that I’m thinking of are AIs being spiteful or otherwise anti-social (on top of not caring about humans) or AIs being conflict-prone in AI-vs-AI interactions (including perhaps AIs aligned to alien civilizations). Of course, it may not be possible to form strong opinions over what makes for a better or worse “losing” scenario – if you remain very uncertain, all losing will seem roughly equally not valuable.
The third is that certain assumptions about how the technology progresses had a big impact on how things play out, especially the point at which some abilities (such as superhuman persuasiveness) emerge.
Yeah, but I like the idea of rolling dice for various options that we deem plausible (and having this built into the game).
I’m curious to read takeaways from more groups if people continue to try this. Also curious about players’ thoughts on good group sizes (how many people played at once and whether you would have preferred more or fewer players).
I agree that it sounds somewhat premature to write off Larry Page based on attitudes he had a long time ago, when AGI seemed more abstract and far away, and then not seek/try communication with him again later on. If that were Musk’s true and only reason for founding OpenAI, then I agree that this was a communication fuckup.
However, my best guess is that this story about Page was interchangeable with a number of alternative plausible criticisms of his competition on building AGI that Musk would likely have come up with in nearby worlds. People like Musk (and Altman too) tend to have a desire to do the most important thing and the belief that they can do this thing a lot better than anyone else. On that assumption, it’s not too surprising that Musk found a reason for having to step in and build AGI himself. In fact, on this view, we should expect to see surprisingly little sincere exploration of “joining someone else’s project to improve it” solutions.
I don’t think this is necessarily a bad attitude. Sometimes people who think this way are right in the specific situation. It just means that we see the following patterns a lot:
Ambitious people start their own thing rather than join some existing thing.
Ambitious people have fallouts with each other after starting a project together where the question of “who eventually gets de facto ultimate control” wasn’t totally specified from the start.
(Edited away a last paragraph that used to be here 50mins after posting. Wanted to express something like “Sometimes communication only prolongs the inevitable,” but that sounds maybe a bit too negative because even if you’re going to fall out eventually, probably good communication can help make it less bad.)
I thought the part you quoted was quite concerning, also in the context of what comes afterwards:
Hiatus: Sam told Greg and Ilya he needs to step away for 10 days to think. Needs to figure out how much he can trust them and how much he wants to work with them. Said he will come back after that and figure out how much time he wants to spend.
Sure, the email by Sutskever and Brockman gave some nonviolent communication vibes and maybe it isn’t “the professional thing” to air one’s feelings and perceived mistakes like that, but they seemed genuine in what they wrote and they raised incredibly important concerns that are difficult in nature to bring up. Also, with hindsight especially, it seems like they had valid reasons to be concerned about Altman’s power-seeking tendencies!
When someone expresses legitimate-given-the-situation concerns about your alignment and your reaction is to basically gaslight them into thinking they did something wrong for finding it hard to trust you, and then you make it seem like you are the poor victim who needs 10 days off of work to figure out whether you can still trust them, that feels messed up! (It’s also a bit hypocritical because the whole “I need 10 days to figure out if I can still trust you for thinking I like being CEO a bit too much” seems childish too.)
(Of course, these emails are just snapshots and we might be missing things that happened in between via other channels of communication, including in-person talks.)
Also, I find it interesting that they (Sutskever and Brockman) criticized Musk just as much as Altman (if I understood their email correctly), so this should make it easier for Altman to react with grace. I guess given Musk’s own annoyed reaction, maybe Altman was calling the others’ email childish to side with Musk’s dismissive reaction to that same email.
Lastly, this email thread made me wonder what happened between Brockman and Sutskever in the meantime, since it now seems like Brockman no longer holds the same concerns about Altman even though recent events seem to have given a lot of new fire to them.
Some of the points you make don’t apply to online poker. But I imagine that the most interesting rationality lessons from poker come from studying other players and exploiting them, rather than memorizing and developing an intuition for the pure game theory of the game.
If you did want to focus on the latter goal, you can play online poker (many players play >12 tables at once) and, after every session, run your hand histories through a program (e.g., “GTO Wizard”) that will tell you where you made mistakes compared to optimal strategy, and how much they would cost you against an optimally playing opponent. Then, for any mistake, you can even input the specific spot into the trainer program and practice it with similar hands, 4-tabling against the computer, with immediate feedback every time on how you played the spot.
It seems important to establish whether we are in fact going to be in a race and whether one side isn’t already far ahead.
With racing, there’s a difference between optimizing the chance of winning vs optimizing the extent to which you beat the other party when you do win. If it’s true that China is currently pretty far behind, and if TAI timelines are fairly short so that a lead now is pretty significant, then the best version of “racing” shouldn’t be “get to the finish line as fast as possible.” Instead, it should be “use your lead to your advantage.” So, the lead time should be used to reduce risks.
Not sure this is relevant to your post in particular; I could’ve made this point also in other discussions about racing. Of course, if a lead is small or non-existent, the considerations will be different.
I wrote a long post last year saying basically that.
Even if attaining a total and forevermore cessation of suffering is substantially more difficult/attainable by substantially fewer people in one lifetime, I don’t think it’s unreasonable to think that most people could suffer at least 50 percent less with dedicated mindfulness practice. I’m curious as to what might feed an opposing intuition for you! I’d be quite excited about empirical research that investigates the tractability and scalability of meditation for reducing suffering, in either case.
My sense is that existing mindfulness studies don’t show the sort of impressive results that we’d expect if this were a great solution.
Also, I think people who would benefit most from having less day-to-day suffering often struggle with having no “free room” available for meditation practice, and that seems like an issue that’s hard to overcome even if meditation practice would indeed help them a lot.
It’s already a sign of having a decently good life when you’re able to start dedicating time to something like meditation, which I think requires a bit more mental energy than just watching series or scrolling through the internet. A lot of people have leisure time, but it’s a privilege to be mentally well off enough to do purposeful activities during your leisure time. The people who have a lot of this purposeful time probably (usually) aren’t among the ones that suffer most (whereas the people who don’t have it will struggle to stick to regular meditation practice, for good reasons).
For instance, if someone has a chronic illness with frequent pain and nearly constant fatigue, I can see how it might be good for them to practice meditation for pain management, but higher up on their priority list are probably things like “how do I manage to do daily chores despite low energy levels?” or “how do I not get let go at work?”
Similarly, for other things people may struggle with (addictions, financial worries, anxieties of various sorts; other mental health issues), meditation is often something that would probably help, but it doesn’t feel like priority number one for people with problem-ridden, difficult lives. It’s pretty hard to keep up motivation for training something that you’re not fully convinced of it being your top priority, especially if you’re struggling with other things.
I see meditation as similar to things like “eat healthier, exercise more, go to sleep on time and don’t consume distracting content or too much light in the late evenings, etc.” And these things have great benefits, but they’re also hard, so there’s no low-hanging fruit here, and interventions in this space will have limited effectiveness (or at least limited cost-effectiveness; you could probably get quite far if you gifted people their own private nutritionist cook, fitness trainer and motivator, house cleaner and personal assistant, and meditation coach, and gave them enough money for financial independence, etc.).
And then the people who would have enough “free room” to meditate may be well off enough to not feel like they need it? In some ways, the suffering of a person who is kind of well off in life isn’t that bad, and instead of devoting 1h per day to meditation practice to reduce the little suffering that they have, maybe the well-off person would rather take Spanish lessons, or train for a marathon, etc.
(By the way, would it be alright if I ping you privately to set up a meeting? I’ve been a fan of your writing since becoming familiar with you during my time at CLR and would love a chance to pick your brain about SFE stuff and hear about what you’ve been up to lately!)
I’ll send you a DM!
[...] I am certainly interested to know if anyone is aware of sources that make a careful distinction between suffering and pain in arguing that suffering and its reduction is what we (should) care about.
I did so in my article on Tranquilism, so I broadly share your perspective!
I wouldn’t go as far as what you’re saying in endnote 9, though. I mean, I see some chance that you’re right in the impractical sense of, “If someone gave up literally all they cared about in order to pursue ideal meditation training under ideal circumstances (and during the training they don’t get any physical illness issues or otherwise have issues crop up that prevent successful completion of the training), then they could learn to control their mental states and avoid nearly all future sources of suffering.” But that’s pretty impractical even if true!
It’s interesting, though, what you say about CBT. I agree it makes sense to be accurate about these distinctions, and that it could affect specific interventions (though maybe not at the largest scale of prioritization, the way I see the landscape).
This would be a valid rebuttal if instruction-tuned LLMs were only pretending to be benevolent as part of a long-term strategy to eventually take over the world, and execute a treacherous turn. Do you think present-day LLMs are doing that? (I don’t)
Or that they have a sycophancy drive. Or that, next to “wanting to be helpful,” they also have a bunch of other drives that will likely win over the “wanting to be helpful” part once the system becomes better at long-term planning and orienting its shards towards consequentialist goals.
On that latter model, the “wanting to be helpful” is a mask that the system is trained to play better and better, but it isn’t the only thing the system wants to do, and it might find that once it gets good at trying on various other masks to see how this will improve its long-term planning, it for some reason prefers a different “mask” to become its locked-in personality.
I thought the first paragraph and the bolded bit of your comment seemed insightful. I don’t see why what you’re saying is wrong – it seems right to me (but I’m not sure).
I am not convinced MIRI has given enough evidence to support the idea that unregulated AI will kill everyone and their children.
The way you’re expressing this feels like an unnecessarily strong bar.
I think advocacy for an AI pause already seems pretty sensible to me if we accept the following premises:
The current AI research paradigm mostly makes progress in capabilities before progress in understanding. (This puts AI progress in a different reference class from most other technological progress, so any arguments with base rates from “technological progress normally doesn’t kill everyone” seem misguided.)
AI could very well kill most of humanity, in the sense that it seems defensible to put this at anywhere from 20-80% (we can disagree on the specifics of that range, but that’s where I’d put it looking at the landscape of experts who seem to be informed and doing careful reasoning (so not LeCun)).
If we can’t find a way to ensure that TAI is developed by researchers and leaders who act with a degree of responsibility proportional to the risks/stakes, it seems better to pause.
Edited to add the following:
There’s also a sense in which whether to pause is quite independent from the default risk level. Even if the default risk were only 5%, if there were a solid and robust argument that pausing for five years will reduce it to 4%, that’s clearly very good! (It would be unfortunate for the people who will die preventable deaths in the next five years, but it still helps overall more people to pause under these assumptions.)
[Edit: I wrote my whole reply thinking that you were talking about “organizational politics.” Skimming the OP again, I realize you probably meant politics politics. :) Anyway, I guess I’m leaving this up because it also touches on the track record question.]
I thought Eliezer was quite prescient on some of this stuff. For instance, I remember this 2017 dialogue (so less than 2y after OpenAI was founded), which on the surface talks about drones, but if you read the whole post, it’s clear that it’s meant as an analogy to building AGI:
[...]
These passages read to me a bit as though Eliezer called in 2017 that EAs working at OpenAI as their ultimate path to impact (as opposed to for skill building or know-how acquisition) were wasting their time.
Maybe a critic would argue that this sequence of posts was more about Eliezer’s views on alignment difficulty than on organizational politics. True, but it still reads as prescient and contains thoughts on org dynamics that apply even if alignment is just hard rather than super duper hard.