Unfortunately, some of the collected comments are not actual quotes but highly voted replies to quotes; the script can’t tell the difference.
It looks like all reply comments are contained (indirectly) in a div with a class name of “child”, and top-level comments are not. So your script could filter out replies by ignoring anything contained in a “child” div.
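For illustration, the filter suggested above could be sketched roughly like this in Python, using only the standard library. The original script is not shown in this thread, so most of this is assumed: the `md` class for comment bodies, the sample markup, and the names `TopLevelCommentParser` and `top_level_comments` are all hypothetical. Only the “child” class name comes from the comment above.

```python
from html.parser import HTMLParser


class TopLevelCommentParser(HTMLParser):
    """Collect the text of comment bodies that are not inside a "child" div."""

    def __init__(self, body_class="md"):  # "md" is an assumed class name
        super().__init__()
        self.body_class = body_class
        self.div_stack = []    # one entry per open div: "child", "body" or None
        self.child_depth = 0   # how many "child" divs currently enclose us
        self.in_body = False   # True while inside a top-level comment body
        self.comments = []
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        if "child" in classes:
            kind = "child"
            self.child_depth += 1
        elif (self.body_class in classes
              and self.child_depth == 0 and not self.in_body):
            # A comment body with no "child" ancestor: a top-level comment.
            kind = "body"
            self.in_body = True
            self._buffer = []
        else:
            kind = None
        self.div_stack.append(kind)

    def handle_endtag(self, tag):
        if tag != "div" or not self.div_stack:
            return
        kind = self.div_stack.pop()
        if kind == "child":
            self.child_depth -= 1
        elif kind == "body":
            self.in_body = False
            text = " ".join("".join(self._buffer).split())
            if text:
                self.comments.append(text)

    def handle_data(self, data):
        if self.in_body:
            self._buffer.append(data)


def top_level_comments(html):
    """Return the text of every comment that is not nested in a "child" div."""
    parser = TopLevelCommentParser()
    parser.feed(html)
    parser.close()
    return parser.comments


# Made-up markup that only mimics the nesting described above.
SAMPLE = """
<div class="comment">
  <div class="md"><p>A top-level quote.</p></div>
  <div class="child">
    <div class="comment">
      <div class="md"><p>A highly voted reply.</p></div>
    </div>
  </div>
</div>
<div class="comment">
  <div class="md"><p>Another quote.</p></div>
</div>
"""

print(top_level_comments(SAMPLE))
```

The parser tracks nesting depth rather than checking only the immediate parent, because the replies are contained indirectly by the “child” div. The trade-off is the one discussed in this thread: any good quote that happens to be a reply is dropped as well.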
I investigated this, but it turns out many good quotes are replies. I preferred false positives to false negatives.
I quickly implemented your suggestion, had a look, and decided that I prefer the cleaner version after all. Many, many good quotes are lost this way, but we don’t have to think of that while browsing the list. :) I rewrote the article accordingly.