How would an AI be directed without using a reward function? Are there some examples I can read?
Current AIs are mostly not explicit expected-utility-maximizers. I think this is illustrated by RLHF (https://huggingface.co/blog/rlhf).
But isn’t that also using a reward function? The AI is trying to maximise the reward it receives from the Reward Model, which was itself trained using human feedback.
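For concreteness, here is a minimal, hypothetical sketch of that two-stage structure: a reward model is fit to human pairwise preferences, and the policy is then updated to score highly under it. Everything below is toy (random vectors stand in for text, and a REINFORCE update stands in for the PPO fine-tuning that real RLHF pipelines use); it is only meant to show where the learned reward function sits in the loop.

```python
# Toy sketch of the RLHF structure: (1) train a reward model on human
# pairwise comparisons, (2) optimise a policy against that learned reward.
# All data and names here are hypothetical stand-ins, not a real pipeline.

import torch
import torch.nn as nn

torch.manual_seed(0)

DIM = 16        # toy "embedding" size standing in for a response representation
N_PREFS = 256   # number of human preference comparisons

# --- 1. Reward model trained on human comparisons (Bradley-Terry style loss) ---
reward_model = nn.Sequential(nn.Linear(DIM, 32), nn.Tanh(), nn.Linear(32, 1))
rm_opt = torch.optim.Adam(reward_model.parameters(), lr=1e-2)

# Fake preference data: "chosen" responses lean along a hidden direction,
# "rejected" responses lean the other way.
hidden_direction = torch.randn(DIM)
chosen = torch.randn(N_PREFS, DIM) + 0.5 * hidden_direction
rejected = torch.randn(N_PREFS, DIM) - 0.5 * hidden_direction

for _ in range(200):
    rm_opt.zero_grad()
    # Encourage r(chosen) > r(rejected), as in preference-based reward modelling.
    margin = reward_model(chosen) - reward_model(rejected)
    loss = -torch.nn.functional.logsigmoid(margin).mean()
    loss.backward()
    rm_opt.step()

# --- 2. Optimise a toy "policy" to maximise the learned reward ---
# The policy here is just the mean of a Gaussian over response representations,
# updated with REINFORCE; real RLHF uses PPO over token sequences instead.
policy_mean = torch.zeros(DIM, requires_grad=True)
pi_opt = torch.optim.Adam([policy_mean], lr=5e-2)

for step in range(300):
    pi_opt.zero_grad()
    dist = torch.distributions.Normal(policy_mean, 1.0)
    samples = dist.sample((64,))                       # "responses" from the policy
    rewards = reward_model(samples).squeeze(-1).detach()
    log_probs = dist.log_prob(samples).sum(-1)
    # REINFORCE: raise the log-prob of samples the reward model scores highly.
    pg_loss = -(log_probs * (rewards - rewards.mean())).mean()
    pg_loss.backward()
    pi_opt.step()

print("reward of policy mean:", reward_model(policy_mean.detach().unsqueeze(0)).item())
```

The point of the sketch is just that the quantity being maximised in step 2 is the output of a learned reward model, not a hand-written utility function, which is what the question above is getting at.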