Thanks!
I would say experiments, introspection and consideration of cases in humans have pretty convincingly established the dissociation between the types of welfare (e.g. see my section on it, although I didn't go into a lot of detail), but they are highly interrelated and often or even typically build on each other, as you suggest.
I’d add that the fact that they sometimes dissociate seems morally important, because it makes it more ambiguous what’s best for someone if multiple types seem to matter, and there are possible beings with some types but not others.
The reason SGD doesn't overfit large neural networks is probably the various measures specifically intended to prevent overfitting, like weight penalties, dropout, early stopping, data augmentation and input noise, and learning rates large enough to prevent full convergence. Without those measures, running SGD to parameter convergence would probably cause overfitting. Furthermore, we test networks on validation datasets they weren't trained on, and we throw out the networks that don't generalize well to the validation set and start over (with new hyperparameters, architectures or parameter initializations). These measures bias us away from producing, and especially from deploying, overfit networks.
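To make this concrete, here's a minimal sketch of what those anti-overfitting measures look like in a standard PyTorch training loop. The dataset, architecture and hyperparameters are placeholders I've made up for illustration, not anything from a real setup.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Toy random data standing in for a real dataset.
torch.manual_seed(0)
X_train, y_train = torch.randn(512, 20), torch.randint(0, 2, (512,))
X_val, y_val = torch.randn(128, 20), torch.randint(0, 2, (128,))
train_loader = DataLoader(TensorDataset(X_train, y_train), batch_size=64, shuffle=True)

model = nn.Sequential(
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),  # dropout: randomly zero activations during training
    nn.Linear(64, 2),
)
# weight_decay adds an L2 weight penalty to the SGD update
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)
loss_fn = nn.CrossEntropyLoss()

best_val_loss, patience, bad_epochs = float("inf"), 5, 0
for epoch in range(100):
    model.train()
    for xb, yb in train_loader:
        # input noise: a simple stand-in for data augmentation
        xb = xb + 0.1 * torch.randn_like(xb)
        optimizer.zero_grad()
        loss_fn(model(xb), yb).backward()
        optimizer.step()

    # Early stopping: monitor loss on held-out validation data
    model.eval()
    with torch.no_grad():
        val_loss = loss_fn(model(X_val), y_val).item()
    if val_loss < best_val_loss - 1e-4:
        best_val_loss, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break  # stop before the model overfits to the training set
```

The "throw it out and start over" step would sit outside this loop: if the best validation loss is still poor, you discard the run and retry with different hyperparameters, architecture or initialization.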
Similarly, we might expect scheming in the absence of specific measures to prevent it. What could those measures look like? Catching scheming during training (or validation) and either heavily penalizing it, or fully throwing away the network and starting over? We could also validate out of the training distribution. Would networks whose caught scheming has been heavily penalized, or networks selected for not scheming during training (and validation), generalize to avoid all (or all x-risky) scheming? I don't know, but it seems more likely than counting arguments would suggest.