Jumping off this (and aware of what you said below), this post makes me uncomfortable in the number of people who are earnestly debating a surface analogy. I get it if people are just having fun and blowing off steam, but it’s pretty weird for me to see people acting as if (and explicitly stating they are!) a metaphor to bad products on Amazon somehow changes whether or not alignment is an issue.
I’m confused by something happening here. I refuse to fall into the “Amazon.com is a reliable marker for the seriousness of alignment” camp, but it seems most people here have. What gives?
I think it’s more relevant in the genre of “rationalists evaluating civilization’s adequacy” than “alignment metaphor.” It’s a big running question how correct these critiques are. (As alignment metaphor I feel it’s more like fable than evidence, though I think others may read more into that.)
There’s something compelling about picking on someone’s cherry-picked example of inadequacy. Weaknesses in it feel at least as compelling as weaknesses in “random piece of evidence that established their current view about inadequacy.”
My initial overly-detailed comments were largely caused by browsing alignment-forum.com while my wife took a nap on vacation (leaving me unusually likely to follow random impulses about what to do without regard for usefulness).
From there I think the conversation was in part sustained by the usual arguing-on-internet-energy.
As a datapoint: I am usually on the Wentworth side of the Paul-John spectrum but I found Paul’s internet arguing about ACs compelling and have updated very slightly to Paul’s side. :)
Some people may simply have been nerd-sniped, but the OP does seem to present the air conditioner thing as a real piece of evidence, not just a shallow illustrative analogy. When they get literal at the end, they say:
admittedly I did not actually learn everything I need to know about takeoff speeds just from air conditioner ratings on Amazon. It took a lot of examples in different industries.
Also, given that the example was presented with such high confidence, and took up a significant portion of a post that was otherwise only moderately detailed, I don’t think it’s unreasonable for people’s confidence in the poster and the post to drop if the example turns out to be built on a misunderstanding.
(I’m not suggesting the OP was right or wrong, I have no object-level knowledge here.)
I think to some degree it makes sense to debate it.
John Wentworth is offering the example of the air conditioner as indicative of a broader problem with society. Of course the air conditioner can’t prove whether or not there is this broader problem, so it’s not evidence in itself, but we could take John Wentworth’s post to say that he has seen many things like the air conditioner, and that these many things tell him society may fail to fix major problems if they are hard to notice.
Per Aumann’s agreement theorem, this then becomes strong evidence that society has lots of cases where people fail to fix major problems if they are hard to notice. But hold on: Aumann’s agreement theorem assumes that people are rational and that they correctly interpret evidence. Is this assumption correct for John Wentworth?
By providing the example of the air conditioner, he lets us see how he might interpret the evidence about whether or not society may fail to notice major flaws. But this also makes the air conditioner example fairly important if it fails to hold.
Side note: I think that most people are clueless enough of the time that Aumann should mostly be ignored. This also holds for people updating off of what I think: I do not think most readers actually have enough bits of evidence about the reliability of my reasoning that they should Aumann-style update off of it. Instead, I try to make my own reasoning process as legible as possible in my writing, so that people can directly follow the gears and update based on the inside view, rather than just trust my judgement.
This is probably the best argument for why you should care about the surface-level analogy, but I still don’t find it compelling because you need quite different kinds of domain expertise when thinking about how air conditioners work compared to what you need for AI alignment work.
Many critics seem to be concerned about whether people working in alignment have their heads lost in clouds of abstraction or get proper contact with reality. Intuitively, one test of this is whether the examples they provide lack direct experience.
Even just a little bit of domain expertise is useful! I understand your point, and even agree to some extent, but I think it’s also great that others are discussing the object-level details of “the surface-level analogy”. Both the argument using the analogy, and the analogy itself, seem like potentially fruitful topics to discuss.
Is this a rhetorical question? If not, it would help if you provided quotes and got the attention of the specific commenters you are referencing.