Links to Dan Murfet’s AXRP interview:
Frankfurt-style counterexamples for definitions of optimization
In “Bottle Caps Aren’t Optimizers”, I wrote about a type of definition of optimization that says system S is optimizing for goal G iff G has a higher value than it would if S didn’t exist or were randomly scrambled. I argued against these definitions by providing examples of systems that satisfy the criterion but are not optimizers. But today, I realized that I could repurpose Frankfurt cases to get examples of optimizers that don’t satisfy this criterion.
A Frankfurt case is a thought experiment designed to disprove the following intuitive principle: “a person is morally responsible for what she has done only if she could have done otherwise.” Here’s the basic idea: suppose Alice is considering whether or not to kill Bob. Upon consideration, she decides to do so, takes out her gun, and shoots Bob. But unbeknownst to her, a neuroscientist had implanted a chip in her brain that would have forced her to shoot Bob if she had decided not to. As it happened, the chip didn’t activate, because she did decide to shoot Bob. The idea is that she’s morally responsible, even though she couldn’t have done otherwise.
Anyway, let’s do this with optimizers. Suppose I’m playing Go, thinking about how to win—imagining what would happen if I played various moves, and playing moves that make me more likely to win. Further suppose I’m pretty good at it. You might want to say I’m optimizing my moves to win the game. But suppose that, unbeknownst to me, behind my shoulder is famed Go master Shin Jinseo. If I start playing really bad moves, or suddenly die or vanish, etc., he will play my moves for me, and do an even better job of winning. Now, if you remove me or randomly rearrange my parts, my side is actually more likely to win the game. But that doesn’t mean I’m optimizing to lose the game! So this is another way in which such definitions of optimization are wrong.
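To make the criterion a bit more explicit (my own notation, not anything from the original post), the definitions in question say something like:

$$S \text{ is optimizing for } G \iff \mathbb{E}[G \mid S \text{ present}] > \mathbb{E}[G \mid S \text{ absent or scrambled}].$$

In the Go case, $P(\text{my side wins} \mid \text{I play}) < P(\text{my side wins} \mid \text{Shin Jinseo plays instead})$, so the criterion denies that I’m optimizing for winning (and, read symmetrically, says I’m optimizing for losing), even though I’m clearly trying to win.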
That said, other definitions handle this counterexample well. E.g., I think the one given in “The ground of optimization” says that I’m optimizing to win the game (maybe only if I’m playing a weaker opponent).
Update: there’s now a YouTube link
I’ve added a link to listen on Apple Podcasts.
Sorry—YouTube’s taking an abnormally long time to process the video.
Is there going to be some sort of slack or discord for attendees?
What are the two other mechanisms of action?
In my post, I didn’t require the distribution over meanings of words to be uniform. It could be any distribution you wanted—it just resulted in the prior ratio of “which utterance is true” being 1:1.
Is this just the thing where evidence is theory-laden? Like, for example, how the evidentiary value of the WHO report on the question of COVID origins depends on how likely one thinks it is that people would effectively cover up a lab leak?
To be clear, this is an equivalent way of looking at normal prior-ful inference, and doesn’t actually solve any practical problem you might have. I mostly see it as a demonstration of how you can shove everything into stuff that gets expressed as likelihood functions.
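Concretely, this is just the odds form of Bayes’ theorem (standard bookkeeping, nothing specific to my construction):

$$\frac{P(H_1 \mid E)}{P(H_2 \mid E)} = \frac{P(H_1)}{P(H_2)} \cdot \frac{P(E \mid H_1)}{P(E \mid H_2)}.$$

When the prior ratio $P(H_1)/P(H_2)$ is 1:1, the posterior odds are just the likelihood ratio. And any non-uniform prior can be repackaged as the likelihood of a notional earlier observation $E_0$ with $P(E_0 \mid H_i) \propto P(H_i)$, which is the sense in which everything can be shoved into likelihood functions.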
Why wouldn’t this construction work over a continuous space?
Thanks for finding this! Will link it in the transcript.
oops, thanks for the reminder
Sorry, it will be a bit before the video uploads. I’ll hide the link until then.
Proposal: merge with the separate tag “AI Control”
How would you rate the Book of Mormon as a book? What’s your favourite part?
I recently heard of the book How to Leave the Mormon Church by Alyssa Grenfell, which might be good. Based on an interview with the author, it seemed like it was focussed on nuts-and-bolts stuff (e.g. “practically, how do you explore alcohol in a way that isn’t dangerous”) and on explicitly avoiding a permanent state of having an “ex-Mormon” identity, which strikes me as healthy (although I think some doubt is warranted on how good the advice is, given that the author’s social media presence is primarily focussed on being ex-Mormon). The book is associated with a website.
NB: I have a casual interest in high-demand religions, but have never been a part of one (with the arguable exception of the rationality/EA community).
My guess is this won’t work in all cases, because norm enforcement is usually yes/no, and needs to be judged by people with little information. They can’t handle “you can do any 2 of these 5 things, but no more” or “you can do this but only if you implement it really skillfully”. So either everyone is allowed to impose 80-hour weeks, or no one can work 80-hour weeks, and I don’t like either of those options.
I think this might be wrong—for example, my understanding is that there are some kinds of jobs where it’s considered normal for people to work 80-hour weeks, and other kinds where it isn’t. Maybe the issue is that the kinds of “kind of job” that norms can easily operate on let you pick out things like “finance” but not “jobs that have already made one costly vulnerability bid”?
Katja responds on Substack:
For the sake of argument, I’m calling people who know where they are (i.e. are not confused) “not in simulations”. But this shouldn’t matter, except for understanding each other.
It sounds like you are saying that ~100x more people live in confused simulations than base reality, but I’m questioning that. The resources to run a brain are about the same whether it’s a ‘simulation’ or a mind in touch with the real world. Why would future civilization spend radically more resources on simulations than on minds in the world? (Or if the non-confused simulations are also relevantly minds in the world, then there are a lot more of them than the confused simulations, so we are back to quite low probability of being mistaken.)
(I plan on continuing the conversation there, not here)
This is maybe a dumb question, but I would have imagined that successful implantation would be related to good health outcomes (based on some intuition that successful implantation represents an organ of your body functioning properly, and imagining that the higher success rates of younger people have to do with their health). Is that not true?