Hmm. I’m having a hard time writing this clearly, but I wonder if you could get interesting results by:
Training on a wide range of notably excellent papers from “narrow-scoped” domains,
Training on a wide range of papers that explore “we found this worked in X field, and we’re now seeing if it also works in Y field” syntheses,
Then giving GPT-N prompts to synthesize narrow-scoped domains in which that hasn’t been done yet.
You’d get some nonsense, I imagine, but it would probably at least spit out plausible hypotheses for actual testing, eh?
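Concretely, the third step might look something like this minimal sketch, where `generate` is just a hypothetical stand-in for whatever GPT-N completion interface exists and the two field names are made-up placeholders rather than anything I've actually tried:

```python
# A minimal sketch of the prompting step, not a real pipeline: `generate` is a
# hypothetical wrapper around whatever GPT-N completion API you have, and the
# domain names below are placeholder examples of "narrow-scoped" fields.
def cross_domain_prompt(field_x: str, field_y: str) -> str:
    """Build a prompt asking the model to transplant methods from one narrow field into another."""
    return (
        f"The following is an abstract of a paper that applies the standard "
        f"methods of {field_x} to open problems in {field_y}, a combination "
        f"that has not been tried before. It states the hypothesis, the method "
        f"being borrowed, and a concrete experiment that would test it.\n\n"
        f"Abstract:"
    )

def generate(prompt: str) -> str:
    """Hypothetical stand-in for a GPT-N completion call."""
    raise NotImplementedError("swap in your model's completion API here")

if __name__ == "__main__":
    prompt = cross_domain_prompt("persistent homology", "single-cell transcriptomics")
    print(prompt)              # inspect the prompt first...
    # print(generate(prompt))  # ...then sample candidate hypotheses for human triage
```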
The practical problem with that is probably that you’d need to manually decide which papers go in which category. GPT needs such an enormous amount of data that any curation has to be automated. Metadata like authors, subject, date, and website of provenance is easy to obtain for each example, but a really high-level label like “this paper applies the methods of field X in field Y” is much harder to get automatically.
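To make the contrast concrete: the shallow metadata really can be scraped automatically. Below is a minimal sketch against the public arXiv Atom API (the endpoint and Atom field names are the real ones, but treat the whole thing as illustrative); note that there is simply no field for the high-level “applies X’s methods to Y” label, which is exactly the part that would need manual or model-based labeling:

```python
# A minimal sketch of the "easy" half of the curation problem, using the public
# arXiv Atom API as the source of shallow metadata. Authors, subject category,
# date, and provenance URL come for free; the high-level cross-field label does not.
import urllib.request
import urllib.parse
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def fetch_metadata(query: str, max_results: int = 5):
    """Return shallow metadata dicts for papers matching an arXiv search query."""
    url = ("http://export.arxiv.org/api/query?"
           + urllib.parse.urlencode({"search_query": query, "max_results": max_results}))
    with urllib.request.urlopen(url) as resp:
        root = ET.fromstring(resp.read())
    papers = []
    for entry in root.findall(ATOM + "entry"):
        papers.append({
            "title": entry.findtext(ATOM + "title", "").strip(),
            "authors": [a.findtext(ATOM + "name", "")
                        for a in entry.findall(ATOM + "author")],
            "date": entry.findtext(ATOM + "published", ""),
            "subjects": [c.get("term") for c in entry.findall(ATOM + "category")],
            "provenance": entry.findtext(ATOM + "id", ""),  # the arXiv abstract URL
            # "applies_methods_of_X_to_Y": ???  <- no such field exists; it has to be inferred
        })
    return papers

if __name__ == "__main__":
    for p in fetch_metadata("all:superconductivity"):
        print(p["date"][:10], p["subjects"], p["title"][:60])
```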