This is a cool idea; thanks for creating the bounty.
I probably won’t have time to attempt this myself, but my guess is someone who simply follows the LangChain example here: https://github.com/hwchase17/chat-langchain can get pretty far, if they choose the right set of documents!
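The core of that example is just "retrieve the most relevant documents for a question, then feed them to the model." Here's a toy, stdlib-only sketch of the retrieval step under stated assumptions: TF-IDF cosine similarity stands in for LangChain's embeddings + vector store, and the document snippets are placeholders, not real posts.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens; good enough for a toy retriever."""
    return re.findall(r"[a-z']+", text.lower())

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    df = Counter()  # document frequency of each term
    for counts in tokenized:
        df.update(counts.keys())
    return [{t: c * math.log((1 + n) / (1 + df[t]))
             for t, c in counts.items()}
            for counts in tokenized]

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query, docs, k=1):
    """Indices of the k documents most similar to the query."""
    vectors = tfidf_vectors(docs + [query])
    qvec = vectors[-1]
    ranked = sorted(range(len(docs)),
                    key=lambda i: cosine(vectors[i], qvec),
                    reverse=True)
    return ranked[:k]
```

In the real pipeline you'd swap `top_k` for a vector store lookup and pass the retrieved passages into the LLM prompt, but the "right set of documents" point is the same: garbage in the index means garbage in the answers.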
Note, I believe the sequences themselves can be easily (and permissibly) scraped from here: https://www.readthesequences.com/
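If you do save pages (by hand or otherwise), you still need to strip the HTML before indexing. A minimal stdlib sketch, with the caveat that the tags skipped below are my assumption about typical page chrome, not the actual markup of readthesequences.com:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping script/style and page chrome."""
    SKIP = {"script", "style", "nav", "header", "footer"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # >0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())

def extract_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.parts)
```

For real pages you'd probably reach for BeautifulSoup or trafilatura instead, but the point is just that "save the post" means "save the text," not the surrounding navigation.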
For non-sequence posts, there’s a small obstacle: LW’s terms specifically prohibit “Using a spider, scraper, or other automated technology to access the Website;”
(https://docs.google.com/viewer?url=https%3A%2F%2Fintelligence.org%2Ffiles%2FPrivacyandTerms-Lesswrong.com.pdf)
Not sure if Arbital, AF, etc. have similar restrictions, though it might suffice to just save the most important posts and papers by hand. (In fact, that might even produce better results—probably there are a lot of bad alignment takes on LW that should be excluded from the index anyway.)
It should be possible to ask content owners for permission and get pretty far with that.