Note the prompt I used doesn’t actually say anything about Lesswrong, but gpt-4-base assigned substantial probability only to Lesswrong commenters, which is not surprising since there are all sorts of giveaways from the content alone that a comment is on Lesswrong.
Filtering for people in the world who have publicly had detailed, canny things to say about language models and alignment, and who on top of that lack the regularities shared among most “LLM alignment researchers” or other distinctive groups like academia, narrows you down to probably just a few people, including Gwern.
The reason truesight works (better than one might naively expect) is probably mostly that there are mountains of evidence everywhere (more than one would naively expect). Models don’t need to be superhuman in anything except breadth of knowledge to be potentially qualitatively superhuman in effects downstream of truesight-esque capabilities, because humans are simply unable to integrate the plenum of correlations.
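As a toy illustration of that last point (not a claim about the model’s actual mechanism), here is a minimal sketch of how many individually weak, roughly independent cues compound in log-odds space; all of the numbers are made up:

```python
import math

def posterior_from_weak_cues(prior_prob, likelihood_ratios):
    """Bayesian update in log-odds space: prior odds plus the sum of per-cue log-likelihood ratios."""
    log_odds = math.log(prior_prob / (1 - prior_prob))
    log_odds += sum(math.log(lr) for lr in likelihood_ratios)
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical numbers: a 1-in-100,000 prior on any particular author, and
# 300 stylistic/content cues, each only 5% more likely under that author
# than under the alternatives.
prior = 1e-5
cues = [1.05] * 300

print(posterior_from_weak_cues(prior, cues))  # ~0.96, with no single strong cue
```

Nothing in the list needs to be individually diagnostic; what’s superhuman is keeping track of hundreds of such cues at once.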
Yes, long before LLMs existed, there were some “detective” sites that were scary good at inferring all sorts of things about reddit accounts, from demographics and ethnicity to financial status, based on which subreddits they posted in and (more importantly) what they posted.
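The core of what those sites did can be sketched in a few lines; this is a hedged reconstruction with invented rates, not their actual code: a naive-Bayes update over a coarse attribute, using the subreddits an account posts in as features.

```python
import math

# Hypothetical P(account posts in subreddit | country); real sites estimated
# rates like these from public comment dumps.
SUBREDDIT_RATES = {
    "r/ukpolitics":      {"UK": 0.30, "US": 0.01, "other": 0.02},
    "r/personalfinance": {"UK": 0.05, "US": 0.20, "other": 0.05},
    "r/soccer":          {"UK": 0.25, "US": 0.05, "other": 0.15},
}
PRIOR = {"UK": 0.1, "US": 0.5, "other": 0.4}

def infer_country(subreddits):
    """Naive-Bayes posterior over country given the subreddits an account posts in."""
    log_post = {c: math.log(p) for c, p in PRIOR.items()}
    for sub in subreddits:
        for c in log_post:
            log_post[c] += math.log(SUBREDDIT_RATES[sub][c])
    # Normalize in a numerically stable way.
    z = max(log_post.values())
    unnorm = {c: math.exp(v - z) for c, v in log_post.items()}
    total = sum(unnorm.values())
    return {c: v / total for c, v in unnorm.items()}

print(infer_country(["r/ukpolitics", "r/soccer"]))  # UK ~0.84 despite a 0.1 prior
```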
Humans are leaky