In Eliezer’s mad investor chaos and the woman of asmodeus, the reader experiences (mild spoilers in the spoiler box, heavy spoilers if you click the text):

a character deriving and internalizing probability theory for the first time. They learned to introspectively experience belief updates, holding their brain’s processes against the Lawful theoretical ideal prescribed by Bayes’ Law.

I thought this part was beautiful. I spent four hours driving yesterday, and nearly all of that time re-listening to Rationality: AI->Zombies using this “probability sight” frame. I practiced translating each essay into the frame.
When I think about the future, I feel a directed graph showing the causality, with branched updated beliefs running alongside the future nodes, with my mind enforcing the updates on the beliefs at each time step. In this frame, if I hear the pattering of a four-legged animal outside my door and consider opening the door, then I can feel the future observation forking my future beliefs depending on how reality turns out. But if I imagine being blind and deaf, there is no way to fuel my brain with reality-distinguishment/evidence, and my beliefs can’t adapt according to different worlds.
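The forking can be made concrete with Bayes’ rule. A minimal sketch (the dog scenario and all numbers are illustrative, not from the original): an informative observation sends the posterior in different directions depending on which way reality turns out, while senseless observation leaves the prior untouched because the likelihoods are equal across worlds.

```python
def posterior(prior, p_obs_given_h, p_obs_given_not_h):
    """Bayes' rule: P(H | obs) from the prior P(H) and the two likelihoods."""
    p_obs = prior * p_obs_given_h + (1 - prior) * p_obs_given_not_h
    return prior * p_obs_given_h / p_obs

prior_dog = 0.5  # P(the animal outside is a dog)

# Open the door and look: sight sharply distinguishes dog-worlds from not-dog-worlds.
see_dog   = posterior(prior_dog, p_obs_given_h=0.95, p_obs_given_not_h=0.05)
see_other = posterior(prior_dog, p_obs_given_h=0.05, p_obs_given_not_h=0.95)

# Blind and deaf: every world produces the same null observation, so the
# likelihoods are equal and the belief cannot adapt to which world is real.
no_senses = posterior(prior_dog, p_obs_given_h=0.5, p_obs_given_not_h=0.5)

print(see_dog)    # 0.95 -- belief forked upward
print(see_other)  # 0.05 -- belief forked downward
print(no_senses)  # 0.5  -- no reality-distinguishment, no update
```

The two branches of `posterior` are the two prongs of the fork; the blind case collapses them into one.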
I can somehow feel how the qualitative nature of my update rule changes as my senses change, as the biases in my brain attenuate and weaken, as I gain expertise in different areas—thereby providing me with more exact reality-distinguishing capabilities: the ability to update harder on the same amount of information, making my brain more efficient at consuming new observations and turning them into belief-updates.
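“Updating harder on the same amount of information” has a standard quantitative reading: expertise makes the likelihoods more extreme, so one observation carries a larger likelihood ratio and shifts the log-odds further. A sketch with hypothetical numbers:

```python
import math

def log_odds_shift(p_obs_given_h, p_obs_given_not_h):
    """Bits of update carried by one observation: the log2 likelihood ratio."""
    return math.log2(p_obs_given_h / p_obs_given_not_h)

# A novice barely tells the two worlds apart; an expert distinguishes them sharply.
novice = log_odds_shift(0.6, 0.4)    # log2(1.5)  ~ 0.58 bits
expert = log_odds_shift(0.95, 0.05)  # log2(19)   ~ 4.25 bits

print(novice, expert)  # the same observation moves the expert's beliefs ~7x harder
```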
When I thought about questions of prediction and fact, I experienced unusual clarity and precision. E.g., R:AZ mentioned MIRI, and my thoughts wandered to “Suppose it’s 2019, and MIRI just announced their ‘secret by default’ policy. If MIRI doesn’t make much progress in the next few years, what should my update be on how hard they’re working?” (EDIT: I don’t have a particular bone to pick here; I think MIRI is working hard.)
Before, I’d have hand-waved something about how absence of evidence is evidence of absence, but concluded the update was probably small. Now, I quickly booted up the “they’re lazy” and the “working diligently” hypotheses, and saw that I was tossing out tons of information by reasoning so superficially, away from the formalism.
I realized that the form of a negative results-announcement could be very informative. MIRI could, in some worlds, explain the obstacles they hit, in a way which is strong evidence they worked hard, even while keeping most of their work secret. (It’s like if some sadistic CS prof in 1973 had assigned resolving P vs. NP over the summer, and his students came back with “but relativization”: you’d know they’d worked hard, because that’s very legible and crisp progress showing the problem is hard.)
Further, the way in which the announcement was written would matter: I could feel the likelihood ratio P(progress to date | lazy) / P(progress to date | diligent) shift around, reflecting what my hypotheses say about which realities induce which communications.
I also very quickly realized that the overall update towards “not much effort” is strongly controlled by my beliefs about how hard alignment is; if the problem had been “prove 1+1=2 in PA” and they came back empty-handed a year later, obviously that’s super strong evidence they were messing around. But if I think alignment is basically impossible, then P(little progress | lazy) > P(little progress | diligent) just barely holds, and the likelihood ratio is correspondingly close to 1.
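The dependence on problem difficulty drops out of Bayes’ rule directly. A sketch (all probabilities hypothetical) of how much “little progress” should raise P(lazy), for an easy problem versus a near-impossible one:

```python
def posterior_lazy(prior_lazy, p_none_given_lazy, p_none_given_diligent):
    """P(lazy | little progress) via Bayes' rule."""
    p_none = (prior_lazy * p_none_given_lazy
              + (1 - prior_lazy) * p_none_given_diligent)
    return prior_lazy * p_none_given_lazy / p_none

prior = 0.5

# Easy problem ("prove 1+1=2 in PA"): diligence almost guarantees progress,
# so coming back empty-handed is damning.
easy = posterior_lazy(prior, p_none_given_lazy=0.99, p_none_given_diligent=0.01)

# Near-impossible problem: even diligent work probably yields little visible
# progress, so the likelihood ratio is close to 1 and the update is tiny.
hard = posterior_lazy(prior, p_none_given_lazy=0.99, p_none_given_diligent=0.95)

print(easy)  # ~0.99: strong evidence of messing around
print(hard)  # ~0.51: P(little progress | lazy) > P(little progress | diligent) just barely
```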
And all of this seems inane when I type it out (like, duh), but the magic was seeing it and feeling it all in less than 20 seconds, deriving it as consequences of the hypotheses and the form updates (should) take, instead of going down a checklist of useful rules of thumb and considerations to keep in mind for situations like this one. And then there were several more thoughts I had which felt unusually insightful given how long I’d thought; it was all so clear to me.
And then there were times when even the soft-spoken Tathagatha listened to the words of his disciple, who had digested all of the things he had preached, had meditated long and fully upon them and now, as though he had found entrance to a secret sea, dipped with his steel-hard hand into places of hidden waters, and then sprinkled a thing of truth and beauty upon the heads of the hearers. Summer passed. There was no doubt now that there were two who had received enlightenment:
I really liked your concrete example. At first I had only read your opening paragraphs, and highlighted this as something interesting with potentially huge upsides, but I found it really hard to tell whether the thing you are describing was something I already do. After reading the rest I was able to just think about the question myself and notice that thinking about the explicit likelihood ratios is something I am used to doing. Though I did not go into quite as much detail as you did, which I blame partially on motivation and partially on “this skill has a higher ceiling than I would have previously thought”.