To me, and I expect to a group of other readers as well given upvotes to my comments and those of @subconvergence, this is a direction we really think LessWrong should not go in.
I think Raemon is able to see some appeals of this post that are something like:
- Boggling at the world is good, and we don’t get as much of that as we’d like
- People reading this post may be able to form their own takeaway, which is something like “man, we really should be checking more comprehensively whether dogs can talk”, and ideas like that are worth sharing.
I think there are some significant problems with this, however:
1. This post doesn’t really advocate boggling at the world; instead, it makes specific, strongly worded claims that there is already decently meaningful evidence of something.
2. This post does not really advocate for that takeaway, and to the extent someone thinks it does, it’s certainly far from the most prominent message.
3. If LWers took this to heart, there are likely between 10,000 and 1,000,000 things of similar potential interest and evidence base that could be shared in this way, the vast majority of which are highly unlikely to be proven out.
Furthermore, beyond thinking that Raemon’s motivations (if I’ve guessed at them correctly) are based on flawed reasoning for the above 3 reasons, I think there are other factors that make this actively damaging:
1. This makes strongly worded claims that are either untestable (assumptions about the LW readership’s priors in the title) or nowhere near supported by the evidence.
2. It employs a number of techniques that, to some, are seemingly effective at actively misleading them (see bullets 2 and 3 of my previous comment; I also find a number more employed in the author’s comments).
3. This, to me, pattern matches with LWers reading much less critically than I expected, and less critically than a post of this type warrants, which I’ve observed in multiple prominent instances over the past ~3 months of newly close engagement. I think continued promotion of poorly calibrated articles, met with unwarranted enthusiasm rather than critical comments, will further degrade LW’s state of discourse. Even if this is the only article of this type, and I’m wrong about some of the others, I think even one instance of promoting something that is a better fit for Buzzfeed is at least somewhat damaging.
I do think there’s a post that fits my guess at Raemon’s appeal of this post (the two bullets above) and avoids all the numbered issues I outline afterward, but this post is a far departure from that hypothetical one.
I appreciate your response, and my apologies that for time-efficiency reasons I’m only going to respond briefly and to some parts of it.
> I don’t think it’s fair to say my dismissal of concerns is “cursory” if you include my comments under the post. Maybe the article itself didn’t go deep enough, partly I wanted it to scan well, partly I wanted to see good criticism so I could update/come up with good responses, because it’s not easy to preempt every criticism.
I’m somewhat sympathetic to this. I do feel that, given large claims (e.g. “revolutionary”) and the definite rather than hedged phrasing of the title, it was worth doing more than the cursory in the article itself. I haven’t read your comments nor looked at the timing of them, but I imagine some to most readers read the article without seeing these comments. I’m saddened that those readers likely had much too strong a takeaway and upvoted this post.
> As for cursory evidence, yes it’s mostly that, but cursory evidence can still be good Bayesian evidence. I think there’s enough to conclude there’s something interesting going on.

> This stuff is highly suggestive,
I agree with the first and not with the second. I think this is lightly suggestive and I strongly suspect LWers who accept this level of evidence as highly suggestive will have some pretty inaccurate models of the world. For example, I do think most mommy-blogger, or pyramid-scheme, etc. things we see all over social media present similar, if not typically higher, levels of evidence.
> What I had in the back of my mind is “if Eliezer gets to do it, then I get to do it too”.
I’m somewhat new to this community, so FWIW, while I certainly know who Eliezer is and have read some of his stuff, I don’t understand this reference.
> I think the community simply likes boldly stated (and especially contrarian) claims, as long as it doesn’t go too far off-balance.
I find this quite disappointing, and would have expected the LW community to be better.
> I can easily imagine structurally similar arguments from someone who thinks AI alignment or cryonics are weird “nerd woo”. If we’re to be good rationalist we have to recognize that most evidence isn’t neatly packaged for us in papers (or gwern articles) with hard numbers and rigorous analysis. We can’t just exclude the messy parts of the world and expect to arrive at a useful worldview. Sometimes interesting things happen on Instagram and Tiktok.
I don’t necessarily disagree with this, but I do think the arguments for AI alignment and cryonics have been much more thoughtfully presented, with approximately appropriate calibration.
> steelman: even if your pet can tell you what they actually want to do instead of your having to guess, that’s a revolution in communication).
For dogs at least, there’s a threshold this would have to clear, for me, before it starts to become true (same with the title; the behaviors shown don’t necessarily lead me to update my priors). I’ve had three dogs, each of which had clear indicators for wanting to go out (e.g. pawing at the outside door, showing excitement when I asked) and wanting food.
> I didn’t consciously go for any “maneuvers” to misrepresent things.
FWIW I absolutely believe this, and the rest of your points e.g. about the videos are well-taken. Thank you for your thoughtful response.
EDIT:
> Please watch this video even if you have time constraints (it works fine at 1.5x speed).
I’m not sure I understand why this was recommended; it didn’t seem notable to me and is more of a let’s-feel-good-about-this video than anything.
This is an interesting response; mine is of the opposite valence. To me, this doesn’t feel too dissimilar from something my cousin-who-is-into-pyramid-schemes would send me. I believe that this post has:
- Large claims that are not evidence supported
- Mirages of evidence that do not meaningfully constitute such
- Cursory dismissal of potential concerns
Claims that set off alarm bells to me in this post include:
- Your Dog is Even Smarter Than You Think
- Epistemic status: highly suggestive.
- There’s a revolution going on and you’re sleeping on it.
- her dog started to display capabilities for rudimentary syntax
- Once your dog gets the hang of it, you’re able to add more buttons faster, but it’s never quick. Dogs take a while to come up with a response (they’re bright, but they’re not humans), and you can’t force your dog to learn, so you have to work together and find motivation (for the dog and for yourself!). And not every pet has a strong desire to communicate.
- Bunny is creative with the limited button vocabulary available to her and tries to use words in novel ways to communicate: “stranger paw” for splinter in her paw, “sound settle” for shut up, “poop play” for fart, “paw” to refer to owner’s hand. … Bunny knows each of her doggy friends by name, thinks about them when they’re not there, asks where they are, requests to play with them. … Bunny understands times of day like today, morning, afternoon, night. … And can recall what time of day she went to the park. … Bunny is quite obsessed over her bowel movements (how Freudian) and about her owners’ poop cycle. … Bunny communicates emotional states like mad, happy, concerned. And “ugh”. … Bunny wants to know what and why is a “dog”. … And whether Mom used to be a dog. And she can recognize herself in the mirror.
To me, these failed to be supported by more than what I think is cursory evidence:
1. The first three bullets are not explicitly supported, but are presumably supported by the rest of the article. Besides the support I quote and address after this, key supporting evidence seems to be:
- Under Stella: Explanation of a language learning system for autistic youth, the qualifications of the woman who precipitated this exploration, video of a dog seemingly pressing the buttons “bed”, “all done”, “come”, “outside”, and video of the dog seemingly pressing the buttons “help”, “good”, “want”, “eat”.
-- I believe the videos are meant to be what constitutes evidence in this section.
There are some aspects of the videos that I think lend them some credibility: The dog uses the same paw to hit each button (seems more deliberate), approaches the pad slowly (seems more deliberate), may be looking at each button prior to pressing (ambiguous, but possibly lends credibility), and to my untrained eye it seems as if the video was indeed taken in one shot.
I also think there are aspects of the videos that aren’t compelling: If I were to create a board of what appears to be 40 general words, I’d imagine that I could assign meanings to many, perhaps most, random combinations. The meanings and word combinations portrayed here seem at least somewhat unlikely to be of high utility or to reflect continuous thought. Why would a dog want to tell its owner “bed” “all done”, and why would an owner want to know that? Dogs tend to wake up and fall asleep quickly and frequently throughout the day. They don’t tend to have a nap time and difficulty waking up the way a toddler does. There’s also no need for “come” to be paired with “outside”; “outside” is enough to request a walk. I tend to believe that simplicity would dominate here, and the complexity and interpretation necessary for these strains my credulity. The second word pairing, “help” “good” “want” “eat”, really doesn’t have an obvious meaning from my perspective. “Help” “want” “eat” is clearer, for example, or “help” “eat”. This, to me, feels more likely to be a case of reading meaning into a random combination (at least when considering the first two words separately from the second two) than one coherent thought or expression. When trying to present evidence of a ~talking dog, I would expect there to be many, many videos of more plausible expression; that these two are the leading evidence feels particularly questionable. I don’t have any reason to think that these videos are doctored, but FWIW, it would seemingly be easy to replace the audio (or control it remotely) to say whatever is desired.
I planned to go through this post point by point, but am finding myself wanting to move on for time efficiency reasons (I’ve also switched to an anonymous username given time constraints and associated limits to my presentation of this argument). I’ll quickly cover the remainder of the post:
- The Bunny section presents two types of evidence: videos and links to ongoing academic studies (without results). The videos are particularly uncompelling, much more so than Stella’s. These videos show large gaps in time between button presses, uses of different paws, not looking at the buttons, word combinations that do not obviously have meaning, seeming disinterest from the dog, and many instances of multiple shots such that you’re trusting that they comprise one, rather than multiple, timelines. I really struggle to find anything compelling about these. The links to ongoing studies particularly pattern match, to me, with attempts to feign credibility; an ongoing study without results is not an indicator of there being a positive result.
- The Koko section does not present any evidence, and the honorable mentions section presents more videos, which I didn’t review for time efficiency reasons.
---
I also find that there was only cursory dismissal of potential concerns, which is why I was quite surprised to see your opposite take:
> quite appreciated the epistemic status woven throughout the post (i.e. concerns about Clever Hans, the steps attempted at addressing it, an the current status of how the jury is still out on some studies)
The only mention of Clever Hans merely says that the researcher is “well aware” of it. I don’t see any discussion of attempts to address it. Regarding studies, the jury appears to still be out on ALL studies. I think this is partially recognized but also understated in the article; the only relevant quote is:
> She has partnered with researches from University of California, San Diego to have several cameras looking at the button pad running 24⁄7, for them to do more rigorous analysis. Presumably there’s an actual paper on the way.
On the whole, this being curated (and the size of upvotes) has been one significant contributor to my feeling that the state of critical readership and response on LW is much worse than my prior was before I began engaging more closely lately. This article in particular feels not too dissimilar from something I could imagine on e.g. Buzzfeed; it just says some big things with very little substantive evidence and some maneuvers that seem most commonly used to weakly mask the lack of credibility of the argument. I’d love to hear your response; I think there is some likelihood that I’ll update or at least better understand this forum’s readership.
anon_standards
FWIW, I do not think you over-reacted, nor do I think I agree with any of the criticisms of the comment above.