Does there have to be a reward? This is using brute force to create the underlying world model. It’s just adjusting weights, right?
I think there has to be some kind of reward or loss function, in the current paradigm anyway. That’s what gradient descent uses to know which weights to adjust on each update.
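To make that concrete, here’s a toy sketch (my own illustration, not anyone’s actual training code): gradient descent decides which way to move each weight by differentiating some scalar loss with respect to the weights, so without a loss there’s nothing to take the gradient of.

```python
# Toy sketch (illustrative only): a gradient descent step needs some scalar loss,
# because the gradient of that loss is what tells the update which weights to
# move and in which direction.
import numpy as np

def loss(w, x, y):
    # squared-error loss on a linear model; any differentiable scalar objective works
    return np.mean((x @ w - y) ** 2)

def grad(w, x, y):
    # analytic gradient of the loss above with respect to the weights
    return 2 * x.T @ (x @ w - y) / len(y)

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 4))
true_w = np.array([1.0, -2.0, 0.5, 3.0])
y = x @ true_w

w = np.zeros(4)
for _ in range(200):
    w -= 0.1 * grad(w, x, y)  # "which weights to adjust, and by how much"

print(loss(w, x, y))  # close to zero: the loss is what was driving the updates
```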
Like, what are you imagining as the input/output channel of this AI? Maybe discussing this a bit would help us clarify.
To steelman, I’d guess this idea applies in the hypothetical where GPT-N gains general intelligence and agency (such as via a mesa-optimizer) just by predicting the next token.
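And even in that steelman, "just predicting the next token" is still training against a loss function, namely cross-entropy between the model’s next-token distribution and the token that actually comes next. Rough sketch below (names and shapes are mine, just for illustration, not GPT’s real code):

```python
# Illustrative sketch: next-token prediction as a loss function.
# The training signal is the negative log-likelihood of the actual next token.
import numpy as np

def next_token_loss(logits, next_token_id):
    # logits: (vocab_size,) unnormalized scores over the vocabulary
    # numerically stable log-softmax, then NLL of the true next token
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[next_token_id]

logits = np.array([0.1, 2.0, -1.0, 0.3, 0.0, 0.5, -0.2, 1.2])
print(next_token_loss(logits, next_token_id=1))  # small if the model already favors token 1
```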