phd student in comp neuroscience @ mpi brain research frankfurt. https://twitter.com/janhkirchner and https://universalprior.substack.com/
Jan
Ohhh, that’s a very good point 🤔 I guess that makes the comparison a bit less direct. I’ll think about whether it can be fixed or if I’ll rewrite that part. Thank you for pointing it out!
The Unreasonable Feasibility Of Playing Chess Under The Influence
Belief-conditional things—things that only exist when you believe in them
On (Not) Reading Papers
Slightly advanced decision theory 102: Four reasons not to be a (naive) utility maximizer
Interesting point, I haven’t thought about it from that perspective yet! Do you happen to have a reference at hand? I’d love to read more about that. (No worries if not, then I’ll do some digging myself.)
+1 I like the idea :)
But is a greedy doctor still a greedy doctor when they love anything more than they love money? This, of course, is a question for the philosophers.
Thank you for the comment! :) Since this one is the most upvoted one I’ll respond here, although similar points were also brought up in other comments.
I totally agree, this is something that I should have included (or perhaps even focused on). I’ve done a lot of thinking about this prior to writing the post (and lots of people have suggested all kinds of fancy payment schemes to me, e.g. increasing payment rapidly for every year above life expectancy). I’ve converged on believing that all payment schemes that vary as a function of time can probably be goodharted in some way or other (e.g. through a medical coma, as you suggest, or by just making you believe you have great life quality). But I did not have a great idea for how to get a conceptual handle on that family of strategies, so I just subsumed them under “just pay the doctor, dammit”.
After thinking about it again, (assuming we can come up with something that cannot be goodharted) I have the intuition that all of the time-varying payment schemes are somehow related to assassination markets, since you basically get to pick the date of your own death by fixing the payment scheme (at some point the amount of effort the doctor has to put in will be higher than the payment you can offer, at which point the greedy doctor will just give up). So ideally you would want to construct the time-varying payment scheme in exactly the way that pushes the date of assassination as far into the future as possible. When you have a mental model of how the doctor makes decisions, this is just a “simple” optimization process.
But when you don’t have this (since the doctor is smarter), you’re kind of back to square one. And then (I think) it possibly again comes down to setting up multiple doctors to cooperate or compete to force them to be truthful through a time-invariant payment scheme. Not sure at all though.
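For the case where you do have a model of the doctor, the “simple” optimization can be sketched in a few lines. Everything here is a made-up toy and not from the post: I assume the doctor is myopic (they keep you alive in a given year only if that year’s payment covers that year’s effort cost), that you know those costs in advance, and that you have a fixed total budget. The names `best_schedule`, `years_survived`, `effort_costs` and `budget` are all hypothetical.

```python
def years_survived(payments, effort_costs):
    """Count the years before the (assumed myopic) doctor gives up.

    The doctor gives up in the first year where the payment is lower
    than the effort cost, or when the payments run out.
    """
    years = 0
    for pay, cost in zip(payments, effort_costs):
        if pay < cost:
            break
        years += 1
    return years


def best_schedule(effort_costs, budget):
    """Pay exactly the effort cost each year until the budget runs out.

    Under the toy assumptions above, every surviving year needs at least
    its cost paid and years cannot be skipped, so overpaying in any year
    only wastes budget.
    """
    payments = []
    remaining = budget
    for cost in effort_costs:
        if cost > remaining:
            break
        payments.append(cost)
        remaining -= cost
    return payments


# Hypothetical numbers: effort doubles every year, total budget of 10.
effort_costs = [1, 2, 4, 8, 16]
tailored = best_schedule(effort_costs, budget=10)
flat = [2, 2, 2, 2, 2]  # same total budget, spread evenly

print(tailored, years_survived(tailored, effort_costs))
print(flat, years_survived(flat, effort_costs))
```

With these numbers the tailored schedule lasts one year longer than the flat one on the same budget. Once the doctor is smarter than this model, the schedule is no longer guaranteed to do what it says, which is the square-one problem described above.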
The Greedy Doctor Problem
Uhhh, another thing for my reading list (LW is an amazing knowledge retrieval system). Thank you!
I remember encountering that argument/definition of suffering before. It certainly has a bit of explanatory power (you mention meditation) and it somehow feels right. But I don’t understand self-referentiality deeply enough to have a mechanistic model of how that should work in my mind. And I’m a bit wary that this perspective conveniently allows us to continue eating animals and (some form of) mass farming. That penalizes the argument for me a bit, motivated cognition etc.
Thanks for the reference! I was aware of some shortcomings of PANAS, but the advantages (very well-studied, and lots of freely available human baseline data) are also pretty good.
The cool thing about doing these tests with large language models is that it costs almost nothing to get insanely large sample sizes (by social science standards) and that it’s (by design) super replicable. When done in a smart way, this procedure might even produce insight into biases of the test design or it might verify shaky results from psychology (as GPT should capture a fair bit of human psychology). The flip side of that is of course that there will be a lot of different moving parts and interpreting the output is challenging.
Thank you for the input, super useful! I did not know the concept of transparency in this context, interesting. This does seem to capture some important qualitative differences between pain and suffering, although I’m hesitant to use the terms conscious/qualia. Will think about this more.
Drug addicts and deceptively aligned agents—a comparative analysis
Frankfurt Declaration on the Cambridge Declaration on Consciousness
Good idea! Did it!
Thank you! (:
Very interesting point, I didn’t know that. Do you know (/have a reference that explains) how those counterfactuals are evaluated then?
Thank you, glad you enjoyed reading it! (:
Also, cool that you mention Scott Page’s book! I have it on my shelf but haven’t gotten around to reading it yet. When I do I’ll write an update.
Ahh, thanks for letting me know! (: Yeah, they don’t work for me either… I guess the problem arises because footnotes have to be entered in Markdown mode (see this) but formatting the images only works in the WYSIWYG editor… Bummer. I’ll figure out a different solution for the next post.
Yes, that’s a good description! And a cool datapoint, I’ve never played (or even watched) Go, but the principle should of course translate.