The Shapley value is not that kind of Solution. Coherent agents can have notions of fairness outside of these constraints. You can only prove that, for a specific set of (mostly natural) constraints, the Shapley value is the only solution. But there’s no dutchbooking for notions of fairness.
I was talking more about “dumb” in the sense of violating the “common-sense” axioms that were established earlier (in this case including order invariance by assumption), not “dumb” in the dutchbookable sense. But I think elsewhere I use “dumb” as a stand-in for dutchbookable, so fair point.
Looks interesting, haven’t had a chance to dig into it yet though!
Something that I feel is missing from this review is the sheer number of intuitions about how minds and optimization work that are dumped on the reader. There are multiple levels at which much of what’s happening to the characters is entirely about AI. Fiction lets you communicate models; many readers successfully get an intuition for corrigibility before they read the corrigibility tag, or grok why optimizing for nice readable thoughts optimizes against interpretability.
I think an important part of planecrash isn’t in its lectures but in its story and the experiences of its characters. While Yudkowsky jokes about LeCun refusing to read it, it is arguably one of the most comprehensive ways to learn about decision theory, with many of the lessons taught through the experiences of characters rather than through lectures.
Yeah I think this is very true, and I agree it’s a good way to communicate your worldview.
I do think there are some ways in which the worlds Yudkowsky writes about are ones where his worldview wins. The Planecrash god setup, for example, is quite fine-tuned to make FDT and corrigibility important. This is almost tautological, since as a writer you can hardly do anything other than write the world as you think it works. But it still means that “works in this fictional world” doesn’t transfer as much to “works in the real world”, even when the fictional stuff is very coherent and well-argued.