Availability bias: please help me find a study

So, the classic example of availability bias is “are there more words with ‘r’ as the first letter or as the third letter?” Supposedly, most people say “first” when in fact the answer is “third.”

However, I remember reading somewhere recently (I believe on Less Wrong in the last month, probably in a comment?) that this is actually a pretty bad example, because people get it right for a sizable number of letters and/or the results don’t replicate reliably. Is that the case, and more importantly, where can I find studies that state that (or the opposite)?

The best I’ve been able to find is this study, which compares first-letter and second-letter frequencies and finds that people are good at estimating those. While it challenges some implications of the earlier finding, it doesn’t contradict it. Is anyone already familiar with the literature able to point me to something more relevant, or is that the best study out there now?
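
As an aside, the factual half of the classic example (whether ‘r’ really is more common in third position than in first) is easy to check against any word list. Here is a minimal sketch in Python; the function name `position_counts` and the sample words are my own illustrative choices, not taken from any study, and the toy list is far too small to count as evidence either way.

```python
def position_counts(words, letter):
    """Count words with `letter` in first vs. third position (case-insensitive)."""
    letter = letter.lower()
    first = third = 0
    for word in words:
        word = word.lower()
        if len(word) < 3:
            continue  # too short to have a third letter, so skip it
        if word[0] == letter:
            first += 1
        if word[2] == letter:
            third += 1
    return first, third


# Toy input for illustration only; swap in a real word list to get a meaningful count.
sample = ["road", "car", "word", "rare", "bird", "run"]
first, third = position_counts(sample, "r")
print(f"first position: {first}, third position: {third}")
```

Note that this counts distinct words in a list, which is not the same as counting how often those words appear in running text, so the choice of word list (and whether to weight by usage) could change the answer.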