I think to some degree it makes sense to debate it.
John Wentworth is offering the example of the air conditioner as indicative of a broader problem with society. Of course the air conditioner can’t prove whether or not there is this broader problem, so it’s not evidence in itself, but we could take John Wentworth’s post to say that he has seen many things like the air conditioner, and that these many things tell him society may fail to fix major problems if they are hard to notice.
Per Aumann’s agreement theorem, this then becomes strong evidence that society has lots of cases where people fail to fix major problems if they are hard to notice. But hold on: Aumann’s agreement theorem assumes that people are rational and that they correctly interpret evidence. Is this assumption correct for John Wentworth?
Because he provides the example of the air conditioner, we can see how he might interpret the evidence about whether or not society may fail to notice major flaws. But this also means it matters a fair amount if the air conditioner example fails to hold.
Side note: I think that most people are clueless enough of the time that Aumann should mostly be ignored. This also holds for people updating off of what I think: I do not think most readers actually have enough bits of evidence about the reliability of my reasoning that they should Aumann-style update off of it. Instead, I try to make my own reasoning process as legible as possible in my writing, so that people can directly follow the gears and update based on the inside view, rather than just trust my judgement.
This is probably the best argument for why you should care about the surface-level analogy, but I still don’t find it compelling because you need quite different kinds of domain expertise when thinking about how air conditioners work compared to what you need for AI alignment work.
Many critics seem to be concerned about whether people working in alignment have their heads lost in clouds of abstraction or get proper contact with reality. Intuitively, one way to test this is to check whether the examples they provide are grounded in direct experience.
Even just a little bit of domain expertise is useful! I understand your point, and even agree to some extent, but I think it’s also great that others are discussing the object-level details of “the surface-level analogy”. Both the argument using the analogy, and the analogy itself, seem like potentially fruitful topics to discuss.