None of the prompts tell it what to do; they aren’t even in English. (Or so I think? Correct me if I’m wrong!) Instead they are in propositional logic, using atoms that refer to objects, colors, relations, and players. They just give the reward function in disjunctive normal form (i.e. a big disjunction of conjunctions of atoms or their negations) and present it to the agent to observe.
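To make the "reward function in disjunctive normal form" idea concrete, here is a minimal sketch of how such a goal could be represented and evaluated. Everything in it is my own illustration, not the authors' actual encoding: the atom names, the `Goal` type, the `reward` function, and the assumption that reward is simply 1 when the formula is satisfied and 0 otherwise.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical encoding (an assumption, not taken from the source):
# a literal is (atom, wanted truth value), a clause is a conjunction
# of literals, and a goal is a disjunction of clauses, i.e. DNF.
Literal = Tuple[str, bool]
Clause = List[Literal]
Goal = List[Clause]


def reward(goal: Goal, true_atoms: FrozenSet[str]) -> int:
    """Return 1 if at least one clause is fully satisfied, else 0."""
    return int(
        any(
            all((atom in true_atoms) == wanted for atom, wanted in clause)
            for clause in goal
        )
    )


# Made-up example goal:
# (near(player, yellow_sphere) AND NOT see(player, black_cube))
#   OR hold(player, purple_pyramid)
goal: Goal = [
    [("near(player, yellow_sphere)", True), ("see(player, black_cube)", False)],
    [("hold(player, purple_pyramid)", True)],
]

# The world state is modeled as the set of atoms that currently hold.
state = frozenset({"near(player, yellow_sphere)"})
print(reward(goal, state))
```

The point is that the agent only ever sees a structure like `goal`; nothing in it says how to make any of the atoms true.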