Obviously the point about air conditioners doesn’t matter
I’d like to remark that, at least for me, the facts of the matter about whether this particular air conditioner works by Goodharting consumer preferences actually affect my views on AI. The OP quite surprised my world model, which did not expect one of the most popular AC units on Amazon to work by deceiving consumers. If lots of the modern world works this way, then John’s intuition that advanced ML systems are almost certain to work by Goodharting our preferences seems much more likely. Before seeing the above comment and jbash’s comment, I was in the process of updating my views, not because I thought the OP was an enlightening allegory, but because it actually changed what I thought the world was like.
Conversely, the world model “sometimes the easiest way to achieve some objective is to actually do the intended thing instead of Goodharting” would predict that the air conditioner example was wrong somehow, a prediction which seems to have been right (if Paul’s and jbash’s comments are correct, that is). I was quite impressed by this, and am now more confident in the “Goodharting isn’t omnipresent” world model.
In any case, my main point is that I actually do care about what’s going on in this air conditioning example (and I encourage further discussion on whether the OP’s characterization of it is accurate or not).