yams
Is the crux that the more optimistic folks plausibly agree (2) is cause for concern, but believe that mundane utility can be reaped with (1), and they don’t expect us to slide from (1) into (2) without noticing?
I don’t think the mainline doom arguments claim to be rooted in deep learning?
Mostly they’re rigorized intuitive models about the nature of agency/intelligence/goal-directedness, which may go some way toward explaining certain phenomena we see in the behavior of LLMs (e.g., the Palisade Stockfish experiment). They’re theoretical arguments related to a broad class of intuitions and in many cases predate deep learning as a paradigm.
We can (and many do) argue over whether our lens ought to be top-down or bottom-up, but leaning toward the top-down approach isn’t the same thing as relying on unrigorous anxieties of the kind some felt 100 years ago.
This strikes me as straightforwardly not the purpose of the book. This is a general-audience book that makes the case, as Nate and Eliezer see it, for both the claim in the title and the need for a halt. This isn’t inside baseball on the exact probability of doom, whether the risks are acceptable given the benefits, whether someone should work at a lab, or any of the other favorite in-group arguments. This is For The Out Group.
Many people (like >100 is my guess), with many different viewpoints, have read the book and offered comments. Some of those comments can be shared publicly and some can’t, as is normal in the publishing industry. Some of those comments shaped the end result, some didn’t.
Oh, this is all familiar to me and I have my reservations about democracy (although none of them are race-flavored).
The thing I’m curious about is the story that makes the voting habits of 2-3 percent of the population The Problem.
What’s the model here?
Then your opponent can counter-argue that your statements are true but cherry-picked, or that your argument skips logical steps xyz and those steps are in fact incorrect. If your opponent instead chooses to say that for you to make those statements is unacceptable behavior, then it’s unfortunate that your opposition is failing to represent its side well. As an observer, depending on my purposes and what I think I already know, I have many options, ranging from “evaluating the arguments presented” to “researching the issue myself”.
My entire point is that logical steps in the argument are being skipped, because they are, and that the facts are cherry-picked, because they are. My comment says as much, and it points out a single (admittedly non-comprehensive) example of an inconvenient (and obvious!) fact left out of the discussion altogether, as a proof of concept, precisely to avoid arguing the object-level point (which is irrelevant to whether or not Cremieux’s statement has features that might lead one to reasonably dis-prefer being associated with him).
We move into ‘this is unacceptable’ territory when someone shows themselves to have a habit of forcefully representing their side using these techniques in order to motivate their conclusion, which many have testified Cremieux does, and which is evident from his having been banned in a variety of (not especially leftist, not especially IQ-and-genetics-hostile) spaces. If your rhetorical policies fail to defend against transparently adversarial tactics predictably peddled in the spirit of denying people their rights, you have a big hole in your map.
OP didn’t use the word “threat”. He said he was “very curious about aboriginals” and asked how do you live with them.
You quoted a section that has nothing to do with any of what I was saying. The exact line I’m referring to is:
How do you have a peaceable democracy (or society in general) with a population composed of around 3% (and growing) mentally-retarded people whose vote matters just as much as yours?
The whole first half of your comment is only referencing the parenthetical ‘society in general’ case, and not the voting case. I assume this is accidental on your part and not a deliberate derailment. To be clear about the stakes:
This is the conclusion of the statement. This is the whole thrust he is working up to. These facts are selected in service of an argument to deny people voting rights on the basis of their race. If the word ‘threat’ was too valenced for you, how about ‘barrier’ or ‘impediment’ to democracy? This is the clear implication of the writing. This is the hypothesis he’s asking us to entertain: Australia would be a better country if Aborigines were banned from voting. Not just because their IQs are low, or because their society is regressive, but because they are retarded.
He’s not expressing curiosity in this post. He’s expressing bald-faced contempt (“Uncouth… dullards”). I’m not a particularly polite person, and this is language I reserve for my enemies. His username is a transphobic slur. Why are you wasting your charity on this person?
Decoupling isn’t ignoring all relevant context within a statement to read it in the most generous possible light; decoupling is distinguishing the relevant from the irrelevant to better see the truth. Cremieux has displayed a pattern of abhorrent bigotry, and I am personally ashamed that my friends and colleagues would list him as an honored guest at their event.
“How do you have a peaceable democracy (or society in general) with a population...?”
Easy: we already do this. Definitionally, roughly 2 percent of people have IQs below 70. I don’t think we would commonly identify this as one of the biggest problems with democracy.
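(For reference, a minimal sketch of where that ~2 percent figure comes from, under the standard assumption that IQ scores are normed to a normal distribution with mean 100 and standard deviation 15; the scipy call below is just one way to compute the percentile.)

```python
# Share of the population below IQ 70, assuming scores are normed to a
# normal distribution with mean 100 and standard deviation 15 (treated
# here as an assumption for illustration).
from scipy.stats import norm

share_below_70 = norm.cdf(70, loc=100, scale=15)
print(f"{share_below_70:.1%}")  # ~2.3%
```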
I think this demonstrates a failure mode of the ‘is it true?’ heuristic as a comprehensive methodology for evaluating statements. I can string together true premises (and omit others) to support a much broader range of conclusions than are supported by the actual preponderance of the evidence. (i.e., even if we accept all the premises presented here, the suggestion that letting members of a certain racial group vote is a threat to democracy completely dissolves with the introduction of one additional observation).
[for transparency: my actual belief here is that IQ is a very crude implement with results mediated by many non-genetic population-level factors, but I don’t think I need to convince you of this in order to update you toward believing the author is engaged in motivated reasoning!]
Oh man — I sure hope making ‘defectors’ and lab safety staff walk the metaphorical plank isn’t on the table. Then we’re really in trouble.
This looks closer to 2 to me?
Also, from the outside, can you describe how an observer would distinguish between [any of the items on the list] and the situation you lay out in your comment / what the downsides are to treating them similarly? I think Michael’s point is that it’s not useful/worth it to distinguish.
Whether someone is dishonest, incompetent, or underweighting x-risk (by my lights) mostly doesn’t matter for how I interface with them, or how I think the field ought to regard them, since I don’t think we should browbeat people or treat them punitively. Bottom line is I’ll rely (as an unvalenced substitute for ‘trust’) on them a little less.
I think you’re right to point out the valence of the initial wording, fwiw. I just think taxonomizing apparent defection isn’t necessary if we take as a given that we ought to treat people well and avoid claiming special knowledge of their internals, while maintaining the integrity of our personal and professional circles of trust.
Importantly, the conclusion of the above paper is ‘x-risk concerns don’t diminish near-term risk concerns’, and not ‘near-term concerns don’t diminish x-risk concerns.’ There’s no strong reason to assume the relationship is symmetric, and I’d want to see some research going the other way around before claiming the converse.
I want to talk a bit about how I receive the kind of thing from Eliezer that you linked to above. There’s something like a fallacy of composition that occurs when talking about ‘the problem with AI.’ Like, if we accidentally make an AI that is racist, that is very bad! If we make an AI that breaks markets, without some other mechanisms in place, that is also very bad! If we make an AI that enables authoritarianism — again — bad!
Fortunately, a sufficiently powerful aligned intelligence wouldn’t do any of those things, and for the ones it would do, it would actually put mechanisms in place to dissolve the downside. This is ~definitionally true of an aligned ASI. The current solutions to the imminent threat of doom (‘don’t fucking build it’) also address many of these other concerns inherently (or, at least, give you extra time to figure out what to do about them), making the positions truly synergistic, conditional on prioritizing x-risk.
However, the inverse is not true. If someone thinks the ‘real problem with AI’ is one of the short-term issues above, then they’re tempted to come up with Clever Solutions, and even to siphon funding/talent away from, e.g., technical governance, alignment research, and comms (setting aside for now that these directions are complicated and one might plausibly not have much credence in them working out), thus feeding into a different goal entirely (the goal of ‘no, just build it, as long as it doesn’t demonstrate this particular failure mode’). Alignment is (probably) not an elephant you can eat one bite at a time (or, at least, we don’t have reason to believe it is), and trying to eat it one bite at a time via the current paradigm largely does more acceleration than it does safety-ification.
But instead of saying this kind of complicated thing, Eliezer says the true short-hand version that looks like he’s just calling people with short-term concerns stupid and, most ungenerously, like he’s actually just FINE with the world where we get the racist/economy-breaking/authoritarianism-enabling AI that manifests some awful world for all current and future humans (this is not true; I think I can go out on a limb and say ‘Eliezer wants good things, and racist AIs, mass unemployment with no fundamental structural change or mitigation of what it means to be unemployed, or God-King Sama are all bad things Eliezer doesn’t want’).
I think, historically, coalition-building has been less important (if the Good Story you’re trying to tell is ‘we’re just gonna align the damn thing’ or ‘we’re just gonna die while shouting the truth as loud as possible’), and so saying the short version of the point probably didn’t appear especially near-term costly at the time. Now it’s much more costly, since The Thing We [here meaning ‘the part of the ai safety ecosystem I identify with’; not ‘everyone on lw’ or ‘MIRI’] Are Trying To Do is get a wide range of people to understand that halting development is what’s best for their interests, not only because of x-risk (although this is the most important part), but also because it averts/mitigates near-term dystopias. I really hope people start saying the long version, or just say the short version of “Yes, that also matters, and the solution I have for you also goes a long way toward addressing that particular concern.”
(I’ll admit that this explanation is somewhat motivated; I have some probability on AI winter, and think those worlds still look really bad if we’re not doing anything to mitigate short-term risks and societal upheaval; i.e. “Good luck enforcing your x-risk mitigating governance regime through economic/political/social transformation.” Fortunately, halting / building the off switch seem to me like great first-line solutions to this kind of problem, and getting the issue in the Overton window, in the ways required for such policies to pass, would create opportunities to more robustly address these short-term risks as they come up.)
Rather than make things worse as a means of compelling others to make things better, I would rather just make things better.
Brinksmanship and accelerationism (in the Marxist sense) are high variance strategies ill-suited to the stakes of this particular game.
[one way this makes things worse is stimulating additional investment on the frontier; another is attracting public attention to the wrong problem, which will mostly just generate action on solutions to that problem, and not to the problem we care most about. Importantly, the contingent of people-mostly-worried-about-jobs are not yet our allies, and it’s likely their regulatory priorities would not address our concerns, even though I share in some of those concerns.]
Ah, I think this just reads like you don’t think of romantic relationships as having any value proposition beyond the sexual, other than those you listed (which are Things but not The Thing, where The Thing is some weird discursive milieu). Also, your tone in describing the other Things treats them as traps that convince one, incorrectly, to ‘settle’, rather than as things that could actually plausibly outweigh sexual satisfaction.
Different people place different weight on sexual satisfaction (for a lot of different reasons, including age).
I’m mostly just trying to explain all the disagree votes. I think you’ll get the most satisfying answer to your actual question by having a long chat with one of your asexual friends (as something like a control group, since the value of sex to them is always 0 anyway, so whatever their cause is for having romantic relationships is probably the kind of thing that you’re looking for here).
I read your comment as conflating ‘talking about the culture war at all’ and ‘agreeing with / invoking Curtis Yarvin’, which also conflates ‘criticizing Yarvin’ with ‘silencing discussion of the culture war’.
This reinforces a false binary between totally mind-killed wokists and people (like Yarvin) who just literally believe that some folks deserve to suffer, because it’s their genetic destiny.
This kind of tribalism is exactly what fuels the culture war, and not what successfully sidesteps, defuses, or rectifies it. NRx, like the Cathedral, is a mind-killing apparatus, and one can cautiously mine individual ideas presented by either side, on the basis of the merits of that particular idea, while understanding that there is, in fact, very little in the way of a coherent model underlying those claims. Or, to the extent that there is such a model, it doesn’t survive (much) contact with reality.
[it feels useful for me to point out that Yarvin has ever said things I agree with, and that I’m sympathetic to some of the main-line wokist positions, to avoid the impression that I’m merely a wokist cosplaying centrism; in fact, the critiques of wokism I find most compelling are the critiques that come from the left, but it’s also true that Yarvin has some views here that are more in contact with reality]
edit: I agree that people should say things they believe and be engaged with in good faith (conditional on they, themselves, are engaging in good faith)
I think you’re saying something here but I’m going to factor it a bit to be sure.
“not exactly hard-hitting”
“not… at all novel”
“not… even interesting”
“not even criticisms of the humanities”
One and three I’m just going to call ‘subjective’ (and I think I would just agree with you if the Wikipedia article were actually representative of the contents of the book, which it is not).
Re 4: The book itself is actually largely about his experiences as a professor, being subjected to the forces of elite coordination and bureaucracy, and reads a lot like Yarvin’s critiques of the Cathedral (although Fisher identifies these as representative of a pseudo-left).
Re 2: The novelty comes from the contemporaneity of the writing. Fisher is doing a very early-20th-century Marxist thing of actually talking about one’s experience of the world, and relating that back to broader trends, in plain language. The world has changed enough that the work has become tragically dated, and I personally wouldn’t recommend it to people who aren’t already somewhat sympathetic to his views, since its strength around the time of its publication (that contemporaneity) has, predictably, become its weakness.
The work that does more of the thing testingthewaters is gesturing toward, imo, is Exiting the Vampire Castle. The views expressed in this work are directly upstream of his death: his firm (and early) rebuke of cancel culture and identity politics precipitated rejection and bullying from other leftists on Twitter, deepening his depression. He later killed himself.
Important note if you actually read the essay: he’s setting his aim at phenomena similar to those Yarvin targets, but identifies the cause differently // he is a leftist talking to other leftists, so is using terms like ‘capital’ in a valenced way. I think the utility of this work, for someone who is not part of the audience he is critiquing, is that it shows that the left has any answer at all to the phenomena Yarvin and Ngo are calling out; that they’re not, wholesale, oblivious to these problems and that, in fact, the principal divide in the contemporary left is between those who reject the Cathedral and those who seek to join it.
(obligatory “Nick Land was Mark Fisher’s dissertation advisor.”)
(I basically endorse Daniel and Habryka’s comments, but wanted to expand the ‘it’s tricky’ point about donation. Obviously, I don’t know what they think, and they likely disagree on some of this stuff.)
There are a few direct-work projects that seem robustly good (METR, Redwood, some others) based on track record, but afaict they’re not funding constrained.
Most incoming AI safety researchers are targeting working at the scaling labs, which doesn’t feel especially counterfactual or robust against value drift, from my position. For this reason, I don’t think prosaic AIS field-building should be a priority investment (and Open Phil is prioritizing this anyway, so marginal value per dollar is a good deal lower than it was a few years ago).
There are various governance things happening, but much of that work is pretty behind the scenes.
There are also comms efforts, but the community as a whole has just been spinning up capacity in this direction for ~a year, and hasn’t really had any wild successes, beyond a few well-placed op-eds (and the jury’s out on whether / in which direction these moved the needle).
Comms is a devilishly difficult thing to do well, and many fledgling efforts I’ve encountered in this direction are not in the hands of folks whose strategic capacities I especially trust. I could talk at length about possible comms failure modes if anyone has questions.
I’m very excited about Palisade and Apollo, which are both, afaict, somewhat funding constrained in the sense that they have fewer people than they should, and the people currently working there are working for less money than they could get at another org, because they believe in the theory of change over other theories of change. I think they should be better supported than they are currently, on a raw dollars level (but this may change in the future, and I don’t know how much money they need to receive in order for that to change).
I am not currently empowered to make a strong case for donating to MIRI using only publicly available information, but that should change by the end of this year, and the case to be made there may be quite strong. (I say this because you may click my profile and see I work at MIRI, and so it would seem a notable omission from my list if I didn’t mention why it’s omitted; reasons for donating to MIRI exist, but they’re not public, and I wouldn’t feel right trying to convince anyone of that, especially when I expect it to become pretty obvious later).
I don’t know how much you know about AI safety and the associated ecosystem but, from my (somewhat pessimistic, non-central) perspective, many of the activities in the space are likely (or guaranteed, in some instances) to have the opposite of their stated intended impact. Many people will be happy to take your money and tell you it’s doing good, but knowing that it is doing good by your own lights (as opposed to doing evil or, worse, doing nothing*) is the hard part. There is ~no consensus view here, and no single party that I would trust to make this call with my money without my personal oversight (which I would also aim to bolster through other means, in advance of making this kind of call).
*this was a joke. Don’t Be Evil.
Preliminary thoughts from Ryan Greenblatt on this here.
[errant thought pointing a direction, low-confidence musing, likely retreading old ground]
There’s a disagreement that crops up in conversations about changing people’s minds. Sides are roughly:

1. You should explain things by walking someone through your entire thought process, as it actually unfolded. Changing minds is best done by offering an account of how your own mind was changed.
2. You should explain things by back-chaining the most viable (valid) argument from your conclusions, with respect to your specific audience.
The first strategy invites framing your argument around the question “How did I come to change my mind?”, and the second invites framing your argument around the question “How might I change my audience’s mind?”. I am sometimes characterized as advocating for approach 2, and have never actually taken that to be my position.

I think there’s a third approach here, which will look to advocates of approach 1 as if it were approach 2, and look to advocates of approach 2 as if it were approach 1. That is, you should frame the strategy around the question “How might my audience come to change their mind?”, and then not even try to change it yourself.
This third strategy is about giving people handles and mechanisms that empower them to update based on evidence they will encounter in the natural course of their lives, rather than trying to do all of the work upfront. Don’t frame your own position as some competing argument in the marketplace of ideas; hand your interlocutor a tool, tell them what they might expect, and let their experience confirm your predictions.

I think this approach differs from the other two in a few major ways, from the perspective of its impact:
It requires much less authority. (Strength!)
It can be executed in a targeted, light-weight fashion. (Strength!)
It’s less likely to slip into deception than option 2, and less confrontational than option 1. (Strength!)
Even if it works, they won’t end up thinking exactly what you think (Weakness?)
….but they’ll be better equipped to make sense of new evidence (Strength!)
Plausibly more memetically fit than option 1 or 2 (a failure mode of 1 is that your interlocutor won’t be empowered to stand up to criticism while spreading the ideas, even to people very much like themselves, and for option 2 it’s that they will ONLY be successful in spreading the idea to people who are like themselves, since they only know the argument that works on them).
I think Eliezer has talked about some version of this in the past, and this is part of why people like predictions in general, but I think pasting a prediction at the end of an argument built around strategy 1 or 2 isn’t actually Doing The Thing I mean here.
Friends report Logan’s writing strongly has this property.
Do you think of rationality as a similar sort of ‘object’ or ‘discipline’ to philosophy? If not, what kind of object do you think of it as being?
(I am no great advocate for academic philosophy; I left that shit way behind ~a decade ago after going quite a ways down the path. I just want to better understand whether folks consider Rationality as a replacement for philosophy, a replacement for some of philosophy, a subset of philosophical commitments, a series of cognitive practices, or something else entirely. I can model it, internally, as aiming to be any of these things, without other parts of my understanding changing very much, but they all have ‘gaps’, where there are things that I associate with Rationality that don’t actually naturally fall out of the core concepts as construed as any of these types of category [I suppose this is the ‘being a subculture’ x-factor]).
Question for Ben:
Are you inviting us to engage with the object level argument, or are you drawing attention to the existence of this argument from a not-obviously-unreasonable-source as a phenomenon we are responsible for (and asking us to update on that basis)?
On my read, he’s not saying anything new (concerns around military application are why ‘we’ mostly didn’t start going to the government until ~2-3 years ago), but that he’s saying it, while knowing enough to paint a reasonable-even-to-me picture of How This Thing Is Going, is the real tragedy.
To the extent that you’re saying “I’d like to have more conversations about why creating powerful agentic systems might not go well by default; for others this seems like a given, and I just don’t see it”, I applaud you and hope you get to talk about this a whole bunch with smart people in a mutually respectful environment. However, I do not believe analogizing the positions of those who disagree with you to those of 19th-century Luddites (in particular when thousands of pages of publicly available writings, with which you are familiar, exist) is the best way to invite those conversations.
Quoting the first page of a book as though it contained a detailed roadmap of the central (60,000-word) argument’s logical flow (which to you is apparently the same as a rigorous historical account of how the authors came to believe what they believe) — while it claims to do nothing of the sort — simply does not parse at all. If you read the book (which I recommend, based on your declared interests here), or modeled the pre-existing knowledge of the median book website reader, you would not think “anything remotely like current techniques” meant “we are worried exclusively about deep learning for deep learning-exclusive reasons; trust us because we know so much about deep learning.”
If you find evidence of Eliezer, Nate, or similar saying “The core reason I am concerned about AI safety is [something very specific about deep learning]; otherwise I would not be concerned”, I would take your claims about MIRI’s past messaging very seriously. As is, no evidence exists before me that I may consider in support of this claim.
Based on what you’ve said so far, you seem to think that all of the cruxes (or at least the most important ones) must either be purely intuitive or purely technical. If they’re purely intuitive, then you dismiss them as the kind of reactionary thinking someone from the 19th century might have come up with. If they’re purely technical, you’d be well-positioned to propose clever technical solutions (or else to discredit your interlocutor on the basis of their credentials).
Reality’s simply messier than that. You likely have both intuitive and technical cruxes, as well as cruxes with irreducible intuitive and technical components (that is, what you see when you survey the technical evidence is shaped by your prior, and your motivations, as is true for anyone; as was true for you when interpreting that book excerpt).
I think you’re surrounded by smart people who would be excited to pour time into talking to you about this, conditional on not opening that discussion with a straw man of their position.