Sorry, to amend my statement about “wasn’t aimed at raising the sanity waterline of eg millions of people, only at teaching smaller sets”:
Way back when Eliezer wrote that post, we really were thinking of trying to raise the rationality of millions, or at least of hundreds of thousands, via clubs and schools and things. It was in the initial mix of visions. Eliezer spent time trying to write a sunk-costs unit that someone who didn’t understand much rationality themselves could read aloud to a meetup, and that could cause the meetup to learn skills. We imagined maybe finding the kinds of donors who donated to art museums and getting them to donate to us instead, so that we could eg nudge legislation they cared about by causing the citizenry to have better thinking skills.
However, by the time CFAR ran our first minicamps in 2012, or conducted our first fundraiser, our plans had mostly moved to “teach those who are unusually easy to teach via being willing and able to pay for workshops, practice, care, etc”. I preferred this partly because I liked getting the money from the customers we were trying to teach, so that they’d be who we were responsible to (fewer principal-agent problems, compared to if someone with a political agenda wanted us to make other people think better; though I admit this is ironic given I now think there were some problems around us helping MIRI and being funded by AI risk donors while teaching some rationality hobbyists who weren’t necessarily looking for that). I also preferred it because I thought we knew how to run minicamps that would be good, and I didn’t have many good ideas for raising the sanity waterline more broadly.
We did make nonzero attempts at raising the sanity waterline more broadly: Julia’s book, as mentioned elsewhere, but also collaborating a bit on a rationality class at UC Berkeley, trying to prioritize workshop applicants who seemed likely to teach others well (including giving them more financial aid), etc.
I agree, although I also think we ran with this where it was convenient instead of hashing it out properly (like, we asked “what can we say that’ll sound good and be true” when writing fundraiser posts, rather than “what are we up for committing to in a way that will build a high-integrity relationship with whichever community we actually want to serve, and will let any other communities who we don’t want to serve realize that and stop putting their hopes in us.”)
But I agree re: Julia.
I think many of us, during many intention-minutes, had fairly sincere goals of raising the sanity of those who came to events, and took many actions backchained from these goals in a fairly sensible fashion. I also think I and some of us worked to: (a) bring to the event people who were unusually likely to help the world, such that raising their capability would help the world; (b) influence people who came to be more likely to do things we thought would help the world; and (c) draw people into particular patterns of meaning-making that made them easier to influence and control in these ways, although I wouldn’t have put it that way at the time, and I now think this was in tension with sanity-raising in ways I didn’t realize at the time.
I would still tend to call the sentence “we were trying to raise the sanity waterline of smart rationality hobbyists who were willing and able to pay for workshops and do practice and so on” basically true.
I also think we actually helped a bunch of people get a bunch of useful thinking skills, in ways that were hard and required actual work/iteration/attention/curiosity/etc (which we put in, over many years, successfully).
IMO, our goal was to raise the sanity of particular smallish groups who attended workshops, but wasn’t very much to have effects on millions or billions (we would’ve been in favor of that, but most of us mostly didn’t think we had enough shot to try backchaining from that). Usually when people say “raise the sanity waterline” I interpret them as discussing stuff that happens to millions.
I agree the “tens of thousands” in the quoted passage is more than was attending workshops, and so pulls somewhat against my claim.

I do think our public statements were deceptive, in a fairly common but nevertheless bad way: we had many conflicting visions, tended to avoid contradicting people who thought we were gonna do all the good things that at least some of us had at least some desire/hope to do, and in our public statements/fundraisers we tended to avoid alienating any of those hopes, as opposed to the higher-integrity / more honorable approach of coming to a coherent view of which priorities we prioritized how much, and trying to help people not have unrealistic hopes in us or inaccurate views of our priorities.
I agree
I agree with the sentence you quote from Vervaeke (“[myths] are symbolic stories of perennial patterns that are always with us”) but mostly-disagree with “myths … encapsulate some eternal and valuable truths” (your paraphrase).
As an example, let’s take the story of Cain and Abel. IMO, it is a symbolic story containing many perennial patterns:
- When one person is praised, the not-praised will often envy them
- Brothers often envy each other
- Those who envy often act against those they envy
- Those who envy, or do violence, often lie about it (“Am I my brother’s keeper?”)
- Those who have endured strange events sometimes have a “mark of Cain” that leads others to stay at a distance from them and leave them alone
I suspect this story and its patterns (especially back when there were few stories passed down and held in common) helped many to make conscious sense of what they were seeing, and to share their sense with those around them (“it’s like Cain and Abel”). But this help (if I’m right about it) would’ve been similar to the way words in English (or other natural languages) help people make conscious sense of what they’re seeing, and communicate that sense—myths helped people have short codes for common patterns, helped make those patterns available for including in hypotheses and discussions. But myths didn’t much help with making accurate predictions in one shot, the way “eternal and valuable truths” might suggest.
(You can say that useful words are accurate predictions, a la “cluster structures in thingspace”. And this is technically true, which is why I am only mostly disagreeing with “myths encapsulate some eternal and valuable truths”. But a good word helps differently than a good natural law or something does).
To take a contemporary myth local to our subculture: I think HPMOR is a symbolic story that helps make many useful patterns available to conscious thought/discussion. But it’s richer as a place to see motifs in action (e.g.
the way McGonagall initially acts out the picture of herself that lives in her head; the way she learns to break her own bounds
) than as a source of directly stateable truths.
A friend recently complained to me about this post: he said most people do much nonsense under the heading “belief”, and that this post doesn’t acknowledge this adequately. He might be right!
Given his complaint, perhaps I ought to say clearly:
1) I agree — there is indeed a lot of nonsense out there masquerading as sensible/useful cognitive patterns. Some aimed to wirehead or mislead the self; some aimed to deceive others for local benefit; lots of it simple error.
2) I agree also that a fair chunk of nonsense adheres to the term “belief” (and the term “believing in”). This is because there’s a real, useful pattern of possible cognition near our concepts of “belief”, and because nonsense (/lies/self-deception/etc) likes to disguise itself as something real.
3) But — to sort sense from nonsense, we need to understand what the real (useful, might be present in the cogsci books of alien intelligences) pattern is, that is near our “beliefs”. If we don’t:
a) We’ll miss out on a useful way to think. (This is the biggest one.)
b) The parts of the {real, useful way to think} that fall outside our conception of “beliefs” will be practiced noisily anyway, sometimes; sometimes in a true fashion, sometimes mixed (intentionally or accidentally) with error or local manipulations. We won’t be able to excise these deceptions easily or fully, because it’ll be kinda clear there’s something real nearby that our concept of “beliefs” doesn’t do justice to, and so people (including us) will not wish to adhere entirely to our concept of “beliefs” in lieu of the so-called “nonsense” that isn’t entirely nonsense. So it’ll be harder to expel actual error.
4) I’m pretty sure that LessWrong’s traditional concept of “beliefs” as “accurate Bayesian predictions about future events” is only half-right, and that we want the other half too, both for (3a) type reasons, and for (3b) type reasons.
a) “Beliefs” as accurate Bayesian predictions is exactly right for beliefs/predictions about things unaffected by the belief itself — beliefs about tomorrow’s weather, or organic chemistry, or the likely behavior of strangers.
b) But there’s a different “belief-math” (or “believing-in math”) that’s relevant for coordinating pieces of oneself in order to take a complex action, and for coordinating multiple people so as to run a business or community or other collaborative endeavor. I think I lay it out here (roughly — I don’t have all the math), and I think it matters.
The old LessWrong Sequences-reading crowd *sort of* knew about this — folks talked about how beliefs about matters directly affected by the beliefs could be self-fulfilling or self-undermining prophecies, and how the Bayes-math wasn’t well-defined there. But when I read those comments, I thought they were discussing an uninteresting edge case. The idioms by which we organize complex actions (within a person, and between people) are part of the bread and butter of how intelligence works; they are not an uninteresting edge case.
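(A toy sketch of the fixed-point structure I think those old threads were gesturing at; the symbols $p$ and $f$ are my own illustration, not anything anyone wrote back then. Let $p$ be my credence that some venture succeeds, and let $f(p)$ be the actual probability of success given that I hold and act on credence $p$. A belief that accounts for its own effects has to satisfy

$$p = f(p)$$

and when $f$ is increasing, as when confidence recruits effort and investment, there can be several consistent fixed points with nothing in ordinary conditioning to say which one to occupy; when $f$ is decreasing, e.g. $f(p) = 1 - p$, only $p = 1/2$ is consistent. My guess is that choosing which fixed point to live in is part of what the “believing in” machinery is for.)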
Likewise, people sometimes talked (on LW, in the past) about how they were intentionally holding false beliefs about their start-ups’ success odds; they were advised not to be clever, and some commenters dissented from this advice. But IMO the “believing in” concept lets us distinguish:
(i) the useful thing such CEOs were up to (holding a target, in detail, that they and others can coordinate action around);
(ii) how to do this without having or requesting false predictions at the same time; and
(iii) how sometimes such action on the part of CEOs/etc is basically “lying” (and “demanding lies”), in the sense that it is designed to extract more work/investment/etc from “allies” than said allies would volunteer if they understood the process generating the CEO’s behavior (and to demand that their team members be similarly deceptive/extractive). And sometimes it’s not. And there are principles for telling the difference.
All of which is sort of to say that I think this model of “believing in” has substance we can use for the normal human business of planning actions together, and isn’t merely propaganda to mislead people into thinking human thinking is less buggy than it is. Also I think it’s as true to the normal English usage of “believing in” as the historical LW usage of “belief” is to the normal English usage of “belief”.
Elaborating Plex’s idea: I imagine you might be able to buy into participation as an SFF speculation granter with $400k. Upsides:
(a) Can see a bunch of people who’re applying to do things they claim will help with AI safety;
(b) Can talk to ones you’re interested in, as a potential funder;
(c) Can see discussion among the (small dozens?) of people who can fund SFF speculation grants, see what people are saying they’re funding and why, ask questions, etc.
So it might be a good way to get the lay of the land, find lots of people and groups, hear peoples’ responses to some of your takes and see if their responses make sense on your inside view, etc.
I’m tempted to argue with / comment on some bits of the argument about “Instrumental goals are almost-equally as tractable as terminal goals.” But when I click on the “comment” button, it removes the article from view and prompts me with “Discuss the wikitag on this page. Here is the place to ask questions and propose changes.”
Is there a good way to comment on the article, rather than the tag?
I got to the suggestion by imagining: suppose you were about to quit the project and do nothing. And now suppose that instead of that, you were about to take a small amount of relatively inexpensive-to-you actions, and then quit the project and do nothing. What’re the “relatively inexpensive-to-you actions” that would most help?
Publishing the whole list, without precise addresses or allegations, seems plausible to me.
I guess my hope is: maybe someone else (a news story, a set of friends, something) would help some of those on the list to take it seriously and take protective action, maybe after a while, after others on the list were killed or something. And maybe it’d be more parsable to people if it had been hanging out on the internet for a long time, as a pre-declared list of what to worry about, with visibly no one being there to try to collect payouts or something.
Maybe some of those who received the messages were more alert to their surroundings after receiving it, even if they weren’t sure it was real and didn’t return the phone/email/messages?
I admit this sounds like a terrible situation.
Gotcha. No idea if this is a good or bad idea, but: what are your thoughts on dumping an edited version of it onto the internet, including names, photos and/or social media links, and city/country but not precise addresses or allegations?
Can you notify the intended victims? Or at least the more findable intended victims?
A man being deeply respected and lauded by his fellow men, in a clearly authentic and lasting way, seems to be a big female turn-on. Way way way bigger effect size than physique, as best I can tell.
…but the symmetric thing is not true! Women cheering on one of their own doesn’t seem to make men want her more. (Maybe something else is analogous, the way female “weight lifting” is beautification?)
My guess at the analogous thing: women being kind/generous/loving seems to me like a thing many men have found attractive across times and cultures, and seems to me far more viable if a woman is embedded in a group who recognize her, tell her she is cared about and will be protected by a network of others, who in fact shield her from some kinds of conflict/exploitation, who help there be empathy for her daily cares and details to balance out the attention she gives to others’ cares, etc. So the group plays a support role in a woman being able to have/display the quality.
Steven Byrnes wrote:
“For example, I expect that AGIs will be able to self-modify in ways that are difficult for humans (e.g. there’s no magic-bullet super-Adderall for humans), which impacts the likelihood of your (1a).”
My (1a) (and related (1b)), for reference:
(1a) “You” (the decision-maker process we are modeling) can choose anything you like, without risk of losing control of your hardware. (Contrast case: if the ruler of a country chooses unpopular policies, they are sometimes ousted. If a human chooses dieting/unrewarding problems/social risk, they sometimes lose control of themselves.)
(1b) There are no costs to maintaining control of your mind/hardware. (Contrast case: if a company hires some brilliant young scientists to be creative on its behalf, it often has to pay a steep overhead if it additionally wants to make sure those scientists don’t disrupt its goals/beliefs/normal functioning.)
I’m happy to posit an AGI with powerful ability to self-modify. But, even so, my (nonconfident) guess is that it won’t have property (1a), at least not costlessly.
My admittedly handwavy reasoning:
- Self-modification doesn’t get you all powers: some depend on the nature of physics/mathematics. E.g. it may still be that verifying a proof is easier than generating a proof, for our AGI.
- Intelligence involves discovering new things, coming into contact with what we don’t specifically expect (that’s why we bother to spend compute on it). Let’s assume our powerful AGI is still coming into contact with novel-to-it mathematics/empirics/neat stuff. The questions are: is it (possible at all / possible at costs worth paying) to anticipate enough about what it will uncover that it can prevent the new things from destabilizing its centralized goals/plans/[“utility function” if it has one]? I… am really not sure what the answers to these questions are, even for a powerful AGI that has powerfully self-modified! There are maybe alien-to-it AGIs out there encoded in mathematics, waiting to boot up within it as it does its reasoning.
I just paraphrased the OP for a friend who said he couldn’t decipher it. He said it helped, so I’m copy-pasting here in case it clarifies for others.
I’m trying to say:
A) There’re a lot of “theorems” showing that a thing is what agents will converge on, or something, that involve approximations (“assume a frictionless plane”) that aren’t quite true.
B) The “VNM utility theorem” is one such theorem, and involves some approximations that aren’t quite true. So does e.g. Steve Omohundro’s convergent instrumental drives, the “Gandhi folk theorems” showing that an agent will resist changes to its utility function, etc.
C) So I don’t think the VNM utility theorem means that all minds will necessarily want to become VNM agents, nor to follow instrumental drives, nor to resist changes to their “utility functions” (if indeed they have a “utility function”).
D) But “be a better VNM-agent”, “follow the instrumental Omohundro drives”, etc. might still be a self-fulfilling prophecy for some region, partially. Like, humans or other entities who think it’s rational to be VNM agents might become better VNM agents, who might become better VNM agents, for a while.
E) And there might be other [mathematically describable mind-patterns] that can serve as alternative self-propagating patterns, a la D, that’re pretty different from “be a better VNM-agent.” E.g. “follow the god of nick land”.
F) And I want to know what are all the [mathematically describable mind-patterns, that a mind might decide to emulate, and that might make a kinda-stable attractor for a while, where the mind and its successors keep emulating that mind-pattern for a while]. They’ll probably each have a “theorem” attached that involves some sort of approximation (a la “assume a frictionless plane”).
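(To gesture at what such a “theorem” might look like, in made-up notation of my own rather than anything standard: write $M_{t+1} = U_P(M_t)$ for a mind that updates/modifies itself while endorsing pattern $P$, and call $P$ self-propagating on a region $R$ of mind-space if, for every mind $M$ in $R$, $U_P(M)$ is still in $R$ and is at least as close to $P$ as $M$ was, under some distance on minds. “Be a better VNM-agent” would be one candidate $P$, with the VNM axioms supplying the approximation-laden argument that minds in its basin keep moving toward it; the question in F is what the full catalogue of such $(P, R)$ pairs is.)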
“There is a problem that, other things equal, agents that care about the state of the world in the distant future, to the exclusion of everything else, will outcompete agents that lack that property. This is self-evident, because we can operationalize ‘outcompete’ as ‘have more effect on the state of the world in the distant future’.”
I am not sure about that!
One way this argument could fail: maybe agents who care exclusively about the state of the world in the distant future end up, as part of their optimizing, creating other agents who care in different ways from that.
In that case, they would “have more effect on the state of the world in the distant future”, but they might not “outcompete” other agents (in the common-sensical way of understanding “outcompete”).
A person might think this implausible, because they might think that a smart agent who cares exclusively about X can best achieve X by having all minds they create also be [smart agents who care exclusively about X].
But, I’m not sure this is true, basically for reasons of not trusting assumptions (1), (2), (3), and (4) that I listed here.
(As one possible sketch: a mind whose only goal is to map branch B of mathematics might find it instrumentally useful to map a bunch of other branches of mathematics. And, since supervision is not free, it might be more able to do this efficiently if it creates researchers who have an intrinsic interest in math-in-general, and who are not being fully supervised by exclusively-B-interested minds.)
“or more centrally, long after I finish the course of action.”
I don’t understand why the more central thing is “long after I finish the course of action” as opposed to “in ways that are clearly ‘external to’ the process called ‘me’, that I used to take the actions.”
Thanks; fixed.
I suspect “friendships” form within the psyche, and are part of how we humans (and various other minds, though maybe not all minds, I’m not sure) cohere into relatively unified beings (after being a chaos of many ~subagents in infancy). Insofar as that’s true of a given mind, it may make it a bit easier for kindness to form between it and outside minds, as it will already know some parts of how friendships can be done.