Mathematical Logic grad student, doing AI Safety research for ethical reasons.
Working on conceptual alignment, decision theory, cooperative AI and cause prioritization.
My webpage.
Leave me anonymous feedback.
While the variety of answers is indeed surprising, I think many of them could be read as different accounts of a single central intuition, so that we’d end up with ~7 deep interpretations instead of 17 different answers.
For example, I think all of the following can be understood as different accounts of “there being something that it feels like to be me”, or “experiencing the Cartesian theatre”:
experience of distinctive affective states, proprioception, awakeness, and maybe mind-location
Guy who reinvents predictive processing through Minecraft
Thank you for engaging, Eliezer.
I completely agree with your point: an agent being updateless doesn’t mean it won’t learn new information. In fact, it might well decide to “make my future action A depend on future information X”, if the updateless prior finds that optimal. While in other situations, when the updateless prior deems that dependence net-negative (maybe due to other agents exploiting it), it won’t.
This point is already observed in the post (see e.g. footnote 4), although without going deep into it, since the post is meant for the layman (it is addressed more deeply, for example, in section 4.4 of my report). Also for illustrative purposes, in two places I have (maybe unfairly) caricatured an updateless agent as being “scared” of learning more information. What this really means (as hopefully clear from earlier parts of the post) is “the updateless prior assessed whether it seemed net-positive to let future actions depend on future information, and decided no (for almost all actions)”.
The problem I present is not “being scared of information”, but the trade-off between “letting your future action depend on future information X” and “not doing so” (and, in more detail, how exactly the action should depend on that information). More dependence allows you to correctly best-respond in some situations, but can also sometimes get you exploited. The problem is there’s no universal (belief-independent) rule for assessing when to allow dependence: different updateless priors will decide differently. And they need to decide in advance, before letting their deliberation depend on their interactions (since they don’t yet know whether that dependence is net-positive).
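To make this trade-off concrete, here is a minimal toy sketch (my own illustration, with made-up payoffs and probabilities, not taken from the post or the report) of an updateless prior computing the ex ante expected value of “let the future action depend on X” versus “commit to ignoring X”:

```python
# Toy model (hypothetical numbers): before interacting, the agent must decide
# whether its future action will depend on future information X.
# With probability p_exploit, the counterparty exploits agents whose behavior
# predictably responds to X; otherwise, responding to X lets the agent best-respond.

def ex_ante_ev(p_exploit):
    # Payoffs are made up purely for illustration.
    ev_responsive = p_exploit * (-5) + (1 - p_exploit) * 10  # let the action depend on X
    ev_committed = 3                                          # ignore X: flat payoff either way
    return ev_responsive, ev_committed

for p in (0.3, 0.8):
    responsive, committed = ex_ante_ev(p)
    print(f"p_exploit={p}: EV(depend on X)={responsive:.1f}, EV(ignore X)={committed:.1f}")
# p_exploit=0.3: dependence wins (5.5 > 3.0); p_exploit=0.8: commitment wins (-2.0 < 3.0).
```

The point is only that the verdict flips with the prior: there is no belief-independent answer to “should this action depend on X?”.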
Due to this prior-dependence, if different updateless agents have different beliefs, they might play very different policies, and miscoordinate. This is also analogous to different agents demanding different notions of fairness (more here). I have read no convincing arguments as to why most superintelligences will converge on beliefs (or notions of fairness) that successfully coordinate on Pareto optimality (especially in the face of the problem of trapped priors i.e. commitment races), and would be grateful if you could point me in their direction.
I interpret you as expressing a strong normative intuition in favor of ex ante optimization. I share this primitive intuition, and indeed it remains true that, if you have some prior and simply want to maximize its EV, updatelessness is exactly what you need. But I think we have discovered other pro tanto reasons against updatelessness, like updateless agents probably performing worse on average (in complex environments) due to trapped priors and increased miscoordination.
Hi Elizabeth! Thanks for reaching out. Excuse my delay in response, and the length of this reply. It felt important to communicate the nuances in my views (and the anecdotal experiences behind them, which might not have come through in our past exchanges).
you’d rather nutritional difficulties with veganism weren’t discussed, even when the discussion is truthful and focuses on mitigations within veganism
That’s not my position. To the extent the naive-transition accounts are representative of what’s going on in rat/EA spheres, some intervention that reduces the number of transitions that are naive (while holding the total number of transitions fixed) would be a Pareto-improvement. And an intervention that reduces the number of naive transitions while decreasing the total number of transitions far less would also be net-positive.
My worry, though, is that singling out veganism for this is not the most efficient way to achieve this. I hypothesize that:
1. Naive transitions are mostly correlated with social dynamics around insufficient self-care that are not exclusive (nor close to exclusive) to veganism in rat/EA spheres.
2. Independently of that, a message focused on veganism will turn out net-negative, because of the following aggregated collateral effects:
Decreasing the overall number of transitions too much. Or better said, incentivizing thought-patterns and dynamics upstream of that decrease, which have even worse consequences than the decrease itself.
Relatedly, incentivizing a community that’s more prone to ignoring important parts of the holistic picture when doing so works to the selfish benefit of individuals. (And that’s certainly something we don’t want happening around the people making important ethical decisions for the future.)
More on 2. below (On framing), but let me get into 1. first.
I was very surprised to hear those anecdotal stories of naive transitions, because in my anecdotal experience across many different vegan and animalist spaces, serious talk about nutrition and constant reminders to put health first have been an ever-present norm. And, at the same time, a recognition that turning vegan, even with all these nutrition subtleties, is not nearly as difficult as people imagine (certainly in part due to selection effects).[1]
I hypothesize that the distributional shift is due to properties of the social dynamics and individual mindspace that rat/EA circles inadvertently encourage, especially on wide-eyed newcomers. The same optimizing mindset leading to “burn-out / overwork / too much Huel / exotic unregulated diets / not taking care of your image / dangerous drug practices linked to work” around these spaces seems to me to be one of the central causes of these naive transitions. I think this is also psychologically linked to the rational justification I’ve heard from some x-riskers: their work is just too important to care about anything else. Obviously that backfires.
Now, even given the above, it is coherent to believe that, despite this common root, veganism in particular is such a prominent example, with so many negative consequences, that a straightforward intervention pushing the message that veganism presents tradeoffs, can be difficult, and is not for everyone, is net-positive. After all, this is kind of a quantitative question. I claim that’s not the case, and that it’s related to collective blind spots we shouldn’t ignore, which brings me back to 2.
On framing:
One thing that might be happening here is that we’re speaking at different simulacra levels. I’m not claiming you’re saying anything untrue, just that the consequences of pushing this line in this way are probably net-negative.
Now, I understand the benefits of general adoption of the policy “state transparently the true facts you know and that other people seem not to know”. Unfortunately, my impression is that this community is not yet in a position in which implementing this policy will be viable or generally beneficial for many topics. And indeed, on some priors it makes sense that a community suddenly receiving an influx of external attention will have to slowly work up to that, if it is at all possible.[2]
I believe one of those topics is veganism, because of the strong intuitive aversion individuals across the board feel towards changing their diet for ethical reasons (an aversion which, I claim, is irrational and should be counteracted and scrutinized accordingly). In my anecdotal experience (of many years discussing veganism with vegans and non-vegans), the vast majority of justifications for not transitioning to a vegan diet easily fall to “Is that your true rejection?” (even if the best route is not always to mention that explicitly, of course).
I expected to see that change in EA circles. Unfortunately, that has not been my anecdotal experience. The animalist part of EA, so to speak, is very strongly concentrated in a few individuals (or better said, sub-communities) who have taken this issue seriously, and who are often even directly working on animal welfare. But when you move slightly away from that space into neighboring sub-communities (say, a randomly sampled alignment researcher), defensive motivated reasoning on the topic seems to go up again.[3]
And there are instances in which I’ve obtained direct evidence that individuals in decisive positions are not reasoning correctly about the tradeoffs involved. As an example, consider that some offices / programs / retreats don’t offer (as the free lunch for their members) a completely vegetarian menu (let alone vegan).
From the outside, I don’t understand how this decision can make sense for any rat/EA space. Even if it were true that veganism requires some extra effort, that would be because of the complications related to health tracking or food planning. Those are not present in this case. The food is served, for free. Organizers can put in the extra effort to make sure the offer is nutritionally complete each day, or across the week (as also happens with omnivore menus). Whoever is not vegan need not worry much about all that, since they’ll be eating omnivore outside the office anyway. And whoever is vegan already has to worry about it, simply because they’re vegan. Having meat on this menu doesn’t directly improve anyone’s health.
And the few times I’ve seen this “from the inside”, that is, when I’ve heard organizers’ reasoning about this decision, they really didn’t seem to have meaningful arguments. They appealed to a general notion of individual freedom which I think is not a good ethical proxy here, and which, if translated to other domains, would lead to bad outcomes like “not taking side-effects seriously” or “not taking the dangers of social dynamics seriously”.
I have, of course, heard the obvious argument (although not from organizers) that x-risk research is so important that, if having a vegan menu might slightly turn off a single valuable researcher, it’s not worth it. This of course resonates a lot with the optimizy mindspace referenced above. Here I’ll just say that I don’t think this is the kind of desperate community we want to build. That this can just as easily turn off ethically conscious people, whom we do want in our community. And that this mindset is very correlated with the “unconstrained obsession with talent” that has led the community to being partly captured by the ML community, to weird epistemic areas incentivizing bad elitism and power dynamics, etc. In simpler words, I think this blows past some healthy and necessary deontological fences (more in the next section).
I also cannot help but feel suspicious that these practices are so comfortably presented as the default, and alarmed (in some precautionary sense) that grant money is being used to finance something as horrible, and vast, and openly debated, as animal exploitation (more in the next section).
Let me also note that I don’t agree that your posts (and the ensuing comments and conversations) were focused on mitigations within veganism, due to their framing. Even if you truthfully discussed these mitigations, the general tone skeptical of the viability and importance of veganism was very clear, and it is obvious which message most people will get out of the posts. I’d love to live in a world where I can trust your readers to Bayesian-update and ignore framings, but it’d be self-delusional to think this will be the case in this situation, given the obvious strong pulls everyone has towards motivated ignorance of veganism (and the evidence I’ve obtained that this also happens inside this community), and how the framing / headlines / first-order updates from your posts resonate with those repeated one-dimensional rationalizations.
A background ethical disagreement:
One thing that might also be happening is just that we disagree ethically. After all, if I didn’t care at all about veganism (or related individual ethical practices), I wouldn’t care about how many vegans are lost, as long as the number of naive vegans decreased (ignoring second-order effects on community epistemics, as discussed above). And indeed, it is popular amongst some rationalists to doubt the possibility that animals can suffer, something I strongly ethically disagree with.[4]
But I’m not sure that’s the main driver of our disagreement. If we disagree about how hard to push veganism, or how deeply to consider the negative consequences of having a less vegan community, it might be because of a disagreement about where the utilitarianism / deontology line should be in this topic. (After all, you could very strongly worry about animal suffering, and nonetheless bet absolutely all your efforts on x-risk research, because of being a naive Expected Value Maximizer.) Or equivalently, about how bad the consequences for community dynamics can be, and whether it’s better to resort to rule utilitarianism on this one.
As an extreme example, I very strongly feel like financing the worst moral disaster of current times so that “a few more x-risk researchers are not slightly put off from working in our office” is way past the deontological failsafes. As a less extreme example, I strongly feel like sending a message that will predictably be integrated by most people as “I can put even less mental weight on this one ethical issue that sometimes slightly annoyed me” also is. In both cases, especially because of what they signal and the kind of community they incentivize.
The right way to discuss these challenges:
As must be clear, I’d be very happy with treating the root causes, related to the internalized optimizy and obsessive mindset, instead of the single symptom of naive vegan transitions. This is an enormously complex issue, but I a priori think available health and wellbeing resources, and their continued establishment as something most people should use (as an easy route to having that part of life under control and not spiraling, similar to how “food on weekdays” is solved for us by our employers), would provide the individualization and nuance that these problems require. Something like “hey, from now on, those of you who follow any slightly unconventional diet, or have this other thing, or suffer that other thing, can go talk to these people, and they will help you do blah” would sound pretty good, and possibly be a welfare multiplier for some people. It certainly sounds better to me than just broadcasting the message “veganism is hard, consider the tradeoffs and either search for help or drop veganism”. Additionally, “hard” varies a lot from person to person, and that nuance (and your actual observations) will get lost.
Something like running small-group analytics on some naive vegans as an excuse for them to start thinking more seriously about their health? Yes, nice! That’s individualized, that’s immediately useful. But additionally extracting some low-confidence conclusions and using them to broadcast the above message (or a message that 75% of readers will first-order approximate to it) seems negative.
It feels weird for me to think about solutions to this community problem, since in my other spheres it hasn’t arisen. But thinking about which things that happened in those spheres could have contributed positively, the first things that come to mind are: talks / events / activities about sports and health (or even explicitly nutrition), memes about nutrition (post-ironic B12 slander, etc.), communal environments where this knowledge is likely to be shared (like literal cooking).
I also observe that the more individualized approach might work better for a more close-knit community, and that might be especially unattainable now. Maybe there’s some other way to bootstrap this habit. Relatedly, I’d feel safer about some more oversight with regards to some health practices in general (especially drugs, and especially newcomers). But I observe that anything looking like policing is complicated.
In any event, I’m no expert in community health, and my separate point stands that I think broadcasting that message is net-negative right now, because of the obvious bottom-line people would extract from it.
Thanks for reading all of that. The next few weeks are busy, so I might again take a while to reply. Nonetheless, I saw in your last post that you’re thinking about vegan epistemics. Just in case you’d find it valuable, I’d be willing to discuss those thoughts as well, or provide opinions on concrete topics, or just talk about my experience. But of course, no problem if you don’t.
And, to be fair, I have even observed this health consciousness in the few EAs I know who are very vocal about their veganism and animalism (they are few because inside EA I’ve been closer to AI safety than animal welfare).
I could also note that this situation is slightly different from a straightforward “these people believe this false fact”. More accurately, the truth value of this fact hasn’t been brought to their attention, because of a complex web of learned emotional and mental habits. I’m talking here about focusing on those habits, as opposed to the practice of veganism in particular.
It’s obviously hard to draw the boundary of motivated reasoning. I have observed pretty clear-cut cases, but let me just leave it at “these intelligent people are not thinking about / taking seriously / maintaining up to their epistemic standards this aspect of their life as much as they should according to some of their own stated preferences or revealed preferences (and as clearly they are able to)”.
Due to my views on consciousness and moral antirealism, I think deciding which physical systems count as suffering is an ethical choice (equivalent to saying “I care about this process not happening”), and not a purely descriptive one.
Hi Wei! Abram and I have been working on formalizing Logical Updatelessness for a few months. We’ve mostly been setting up a framework and foundations using Logical Inductors, and building the obvious UDT algorithms. But we’ve also stumbled upon some of the above problems (especially the pitfalls of EVM / commitment races, and logical conditionals vs counterfactuals / natural accounts of logically uncertain reasoning), and soon we’ll turn more thoroughly to the Game Theory enabled by this Learning Theory.
You’re welcome to join the PIBBSS Symposium on Friday 22nd 18:30 CEST, where I’ll be presenting some of our ideas (more info). We still have a lot of open avenues, so no in-depth write-up yet, but soon a First Report will exist.
Also, of course, feel free to hit me with a DM anytime.
Another coarse, on-priors consideration that I could have added to the “Other lenses” section:
Eliezer says something like “surely superintelligences will be intelligent enough to coordinate on Pareto-optimality (and not fall into something like commitment races), and easily enact logical or value handshakes”. But here’s why I think this outside-view consideration need not hold. It is a generally good heuristic to think superintelligences will be able to solve tasks that seem impossible to us. But I think this stops being the case for tasks whose difficulty / complexity grows with the size / computational power / intelligence level of the superintelligence. For a task like “beating a human at Go” or “turning the solar system into computronium”, the difficulty of the task is constant (relative to the size of the superintelligence you’re using to solve it). For a task like “beat a copy of yourself at Go”, that’s clearly not the case (well, unless Go has a winning strategy that a program within our universe can enact, which would put a ceiling on difficulty). I claim “ensuring Pareto-optimality” is more like the latter. When the intelligence or compute of all players grows, it is true they can find more clever and sure-fire ways to coordinate robustly, but it’s also true that they can individually find more clever ways of tricking the system and getting a bit more of the pie (and in some situations, they are individually incentivized to do this). Of course, one might still hold that the former will grow much faster than the latter, so that above a certain level of intelligence, agents of a similar intelligence level will easily coordinate. But that’s an additional assumption, relative to the “constant-difficulty” cases.
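As a toy illustration of the constant-difficulty vs growing-difficulty distinction (my own sketch, with an arbitrary made-up functional form, not a model of actual superintelligences):

```python
from math import exp

def p_beat_fixed_opponent(capability, opponent_strength=10.0):
    # Hypothetical logistic curve: the opponent does not scale with you,
    # so success probability tends to 1 as capability grows.
    return 1 / (1 + exp(opponent_strength - capability))

def p_beat_copy_of_yourself(capability):
    # The opponent scales exactly as fast as you do (ignoring first-mover
    # advantage), so extra capability buys no extra success probability.
    return 0.5

for c in (1, 10, 100):
    print(c, round(p_beat_fixed_opponent(c), 4), p_beat_copy_of_yourself(c))
# capability 1   -> 0.0001  0.5
# capability 10  -> 0.5     0.5
# capability 100 -> 1.0     0.5
```

The claim in the paragraph above is that “coordinate on Pareto-optimality” behaves more like the second function than the first, unless one adds the further assumption that coordination ability outgrows exploitation ability.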
Of course, if Eliezer believes this it is not really because of outside-view considerations like the above, but because of inside-views about decision theory. But I generally disagree with his takes there (for example here), and have never found convincing arguments (from him or anyone) for the easy coordination of superintelligences.