(Self-review.) I claim that this post is significant for articulating a solution to the mystery of disagreement (why people seem to believe different things, in flagrant violation of Aumann’s agreement theorem): much of the mystery dissolves if a lot of apparent “disagreements” are actually disguised conflicts. The basic idea isn’t particularly original, but I’m proud of the synthesis and writeup. Arguing that the distinction between deception and bias is less decision-relevant than commonly believed seems like an improvement over hand-wringing over where the boundary is.
Some have delusional optimism about [...]
I’m usually not a fan of tone-policing, but in this case, I feel motivated to argue that this is more effective if you drop the word “delusional.” The rhetorical function of saying “this demo is targeted at them, not you” is to reassure the optimist that pessimists are committed to honestly making their case point by point, rather than relying on social proof and intimidation tactics to push a predetermined “AI == doom” conclusion. That’s less credible if you imply that you have warrant to dismiss all claims of the form “Humans and institutions will make reasonable decisions about how to handle AI development and deployment because X” as delusional regardless of the specific X.
I don’t think Vance is e/acc. He has said positive things about open source, but consider that the context was specifically about censorship and political bias in contemporary LLMs (bolding mine):
There are undoubtedly risks related to AI. One of the biggest:
A partisan group of crazy people use AI to infect every part of the information economy with left wing bias. Gemini can’t produce accurate history. ChatGPT promotes genocidal concepts.
The solution is open source
If Vinod really believes AI is as dangerous as a nuclear weapon, why does ChatGPT have such an insane political bias? If you wanted to promote bipartisan efforts to regulate for safety, it’s entirely counterproductive.
Any moderate or conservative who goes along with this obvious effort to entrench insane left-wing businesses is a useful idiot.
I’m not handing out favors to industrial-scale DEI bullshit because tech people are complaining about safety.
The words I’ve bolded indicate that Vance is at least peripherally aware that the “tech people [...] complaining about safety” are a different constituency than the “DEI bullshit” he deplores. If future developments or rhetorical innovations persuade him that extinction risk is a serious concern, it seems likely that he’d be on board with “bipartisan efforts to regulate for safety.”
The next major update can be Claude 4.0 (and Gemini 2.0) and after that we all agree to use actual normal version numbering rather than dating?
Date-based versions aren’t the most popular, but they’re not an unheard-of thing that Anthropic just made up: see CalVer, as contrasted with SemVer. (For things that change frequently in small ways, it’s convenient to just slap the date on it rather than having to soul-search about whether to increment the second or the third number.)
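A minimal sketch of the contrast (hypothetical version strings, not anything Anthropic actually uses): under CalVer the version falls out of the release date mechanically, whereas SemVer requires a judgment call about which component a given change “deserves.”

```python
from datetime import date

# CalVer: the version is derived from the release date; no judgment call needed.
calver = date(2024, 10, 22).strftime("%Y.%m.%d")  # -> "2024.10.22"

# SemVer: a human decides whether a change is a minor bump (new feature)
# or a patch bump (bug fix).
major, minor, patch = 3, 5, 2
semver_minor_bump = f"{major}.{minor + 1}.0"        # -> "3.6.0"
semver_patch_bump = f"{major}.{minor}.{patch + 1}"  # -> "3.5.3"
```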
‘You acted unwisely,’ I cried, ‘as you see
By the outcome.’ He calmly eyed me:
‘When choosing the course of my action,’ said he,
‘I had not the outcome to guide me.’
The claim is pretty clearly intended to be about relative material, not absolute number of pawns: in the end position of the second game, you have three pawns left and Stockfish has two; we usually don’t describe this as Stockfish having given up six pawns. (But I agree that it’s easier to obtain resources from an adversary that values them differently, like if Stockfish is trying to win and you’re trying to capture pawns.)
This is a difficult topic (in more ways than one). I’ll try to do a better job of addressing it in a future post.
Was my “An important caveat” parenthetical paragraph sufficient, or do you think I should have made it scarier?
Thanks, I had copied the spelling from part of the OP, which currently says “Arnalt” eight times and “Arnault” seven times. I’ve now edited my comment (except the verbatim blockquote).
if there’s a bunch of superintelligences running around and they don’t care about you—no, they will not spare just a little sunlight to keep Earth alive.
Yes, I agree that this conditional statement is obvious. But while we’re on the general topic of whether Earth will be kept alive, it would be nice to see some engagement with Paul Christiano’s arguments (which Carl Shulman “agree[s] with [...] approximately in full”) that superintelligences might care about what happens to you a little bit, articulated in a comment thread on Soares’s “But Why Would the AI Kill Us?” and another thread on “Cosmopolitan Values Don’t Come Free”.
The reason I think this is important is because “[t]o argue against an idea honestly, you should argue against the best arguments of the strongest advocates”: if you write 3000 words inveighing against people who think comparative advantage means that horses can’t get sent to glue factories, that doesn’t license the conclusion that superintelligence Will Definitely Kill You if there are other reasons why superintelligence Might Not Kill You that don’t stop being real just because very few people have the expertise to formulate them carefully.
(An important caveat: the possibility of superintelligences having human-regarding preferences may or may not be comforting: as a fictional illustration of some relevant considerations, the Superhappies in “Three Worlds Collide” cared about the humans to some extent, but not in the specific way that the humans wanted to be cared for.)
Now, you are on the record stating that you “sometimes mention the possibility of being stored and sold to aliens a billion years later, which seems to [you] to validly incorporate most all the hopes and fears and uncertainties that should properly be involved, without getting into any weirdness that [you] don’t expect Earthlings to think about validly.” If that’s all you have to say on the matter, fine. (Given the premise of AIs spending some fraction of their resources on human-regarding preferences, I agree that uploads look a lot more efficient than literally saving the physical Earth!)
But you should take into account that if you’re strategically dumbing down your public communication in order to avoid topics that you don’t trust Earthlings to think about validly—and especially if you have a general policy of systematically ignoring counterarguments that it would be politically inconvenient for you to address—you should expect that Earthlings who are trying to achieve the map that reflects the territory will correspondingly attach much less weight to your words, because we have to take into account how hard you’re trying to epistemically screw us over by filtering the evidence.
No more than Bernard Arnalt, having $170 billion, will surely give you $77.
Bernard Arnault has given eight-figure amounts to charity. Someone who reasoned, “Arnault is so rich, surely he’ll spare a little for the less fortunate” would in fact end up making a correct prediction about Bernard Arnault’s behavior!
Obviously, it would not be valid to conclude ”… and therefore superintelligences will, too”, because superintelligences and Bernard Arnault are very different things. But you chose the illustrative example! As a matter of local validity, it doesn’t seem like a big ask for illustrative examples to in fact illustrate what they purport to.
Arguments from moral realism, fully robust alignment, that ‘good enough’ alignment is good enough in practice, and related concepts.
What is moral realism doing in the same taxon with fully robust and good-enough alignment? (This seems like a huge, foundational worldview gap; people who think alignment is easy still buy the orthogonality thesis.)
Arguments from good outcomes being so cheap the AIs will allow them.
If you’re putting this below the Point of No Return, then I don’t think you’ve understood the argument. The claim isn’t that good outcomes are so cheap that even a paperclip maximizer would implement them. (Obviously, a paperclip maximizer kills you and uses the atoms to make paperclips.)
The claim is that it’s plausible for AIs to have some human-regarding preferences even if we haven’t really succeeded at alignment, and that good outcomes for existing humans are so cheap that AIs don’t have to care about the humans very much in order to spend a tiny fraction of their resources on them. (Compare to how some humans care enough about animal welfare to spend a tiny fraction of our resources helping nonhuman animals that already exist, in a way that doesn’t seem like it would be satisfied by killing existing animals and replacing them with artificial pets.)
There are lots of reasons one might disagree with this: maybe you don’t think human-regarding preferences are plausible at all, maybe you think accidental human-regarding preferences are bad rather than good (the humans in “Three Worlds Collide” didn’t take the Normal Ending lying down), maybe you think it’s insane to have such a scope-insensitive concept of good outcomes—but putting it below arguments from science fiction or blind faith (!) is silly.
in a world where the median person is John Wentworth [...] on Earth (as opposed to Wentworld)
Who? There’s no reason to indulge this narcissistic “Things would be better in a world where people were more like meeeeeee, unlike stupid Earth [i.e., the actually existing world containing all actually existing humans]” meme when the comparison relevant to the post’s thesis is just “a world in which humans have less need for dominance-status”, which is conceptually simpler, because it doesn’t drag in irrelevant questions of who this Swentworth person is and whether they have an unusually low need for dominance-status.
(The fact that I feel motivated to write this comment probably owes to my need for dominance-status being within the normal range; I construe statements about an author’s medianworld being superior to the real world as a covert status claim that I have an interest in contesting.)
2019 was a more innocent time. I grieve what we’ve lost.
It’s a fuzzy Sorites-like distinction, but I think I’m more sympathetic to trying to route around a particular interlocutor’s biases in the context of a direct conversation with a particular person (like a comment or Tweet thread) than I am in writing directed “at the world” (like top-level posts), because the more something is directed “at the world”, the more you should expect that many of your readers know things that you don’t, such that the humility argument for honesty applies forcefully.
Just because you don’t notice when you’re dreaming, doesn’t mean that dream experiences could just as well be waking experiences. The map is not the territory; Mach’s principle is about phenomena that can’t be told apart, not just anything you happen not to notice the differences between.
When I was recovering from a psychotic break in 2013, I remember hearing the beeping of a crosswalk signal, and thinking that it sounded like some sort of medical monitor, and wondering briefly if I was actually on my deathbed in a hospital, interpreting the monitor sound as a crosswalk signal and only imagining that I was healthy and outdoors—or perhaps, both at once: the two versions of reality being compatible with my experiences and therefore equally real. In retrospect, it seems clear that the crosswalk signal was real and the hospital idea was just a delusion: a world where people have delusions sometimes is more parsimonious than a world where people’s experiences sometimes reflect multiple alternative realities (exactly when they would be said to be experiencing delusions in at least one of those realities).
(I’m interested (context), but I’ll be mostly offline the 15th through 18th.)
Here’s the comment I sent using the contact form on my representative’s website.
Dear Assemblymember Grayson:
I am writing to urge you to consider voting Yes on SB 1047, the Safe and Secure Innovation for Frontier Artificial Intelligence Models Act. How our civilization handles machine intelligence is of critical importance to the future of humanity (or lack thereof), and from what I’ve heard from sources I trust, this bill seems like a good first step: experts such as Turing Award winners Yoshua Bengio and Stuart Russell support the bill (https://time.com/7008947/california-ai-bill-letter/), and Eric Neyman of the Alignment Research Center described it as “narrowly tailored to address the most pressing AI risks without inhibiting innovation” (https://x.com/ericneyman/status/1823749878641779006). Thank you for your consideration. I am,
Your faithful constituent,
Zack M. Davis
This is awful. What do most of these items have to do with acquiring the map that reflects the territory? (I got 65, but that’s because I’ve wasted my life in this lame cult. It’s not cool or funny.)
On the one hand, I also wish Shulman would go into more detail on the “Supposing we’ve solved alignment and interpretability” part. (I still balk a bit at “in democracies” talk, but less so than I did a couple years ago.) On the other hand, I also wish you would go into more detail on the “Humans don’t benefit even if you ‘solve alignment’” part. Maybe there’s a way to meet in the middle??
“[A] common English expletive which may be shortened to the euphemism bull or the initialism B.S.”