Designing a better version of itself will increase an AI’s reward function
An AI doesn’t have to have a reward function, or one that implies self-improvement. Reward functions often apply only at the training stage.
How would an AI be directed without using a reward function? Are there some examples I can read?
Current AIs are mostly not explicit expected-utility-maximizers. I think this is illustrated by RLHF (https://huggingface.co/blog/rlhf).
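To make the point concrete, here’s a toy sketch of the RLHF idea (this is nothing like the real pipeline, and all the names — REPLIES, train_step, generate — are made up for illustration): a reward model fitted from human preference labels shapes the policy during training, but once the policy is deployed it just generates, and no reward is computed anywhere.

```python
import random

# Toy illustration of the RLHF structure (not the real implementation):
# candidate replies the "policy" can produce.
REPLIES = ["rude answer", "helpful answer", "evasive answer"]

# Hypothetical human preference data: (preferred, rejected) pairs.
human_preferences = [("helpful answer", "rude answer"),
                     ("helpful answer", "evasive answer")]

# "Train" a toy reward model: score each reply by how often humans preferred it.
reward_model = {r: 0.0 for r in REPLIES}
for preferred, rejected in human_preferences:
    reward_model[preferred] += 1.0
    reward_model[rejected] -= 1.0

# The policy is just a probability over replies, updated with reward-model scores.
policy = {r: 1.0 / len(REPLIES) for r in REPLIES}

def train_step(lr=0.1):
    """Training stage: nudge the policy toward replies the reward model scores highly."""
    for r in REPLIES:
        policy[r] = max(policy[r] + lr * reward_model[r] * policy[r], 1e-6)
    total = sum(policy.values())
    for r in REPLIES:
        policy[r] /= total  # renormalise to a probability distribution

for _ in range(50):
    train_step()

def generate():
    """Deployment stage: sample from the frozen policy.
    No reward function or reward model is evaluated here; the reward
    only ever shaped the weights during training."""
    return random.choices(REPLIES, weights=[policy[r] for r in REPLIES])[0]

print(generate())
```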
But isn’t that also using a reward function? The AI is trying to maximise the reward it receives from the Reward Model, which was itself trained using Human Feedback.