Btw, this post also views Paul’s agenda through the lens of constructing imitations of humans.
Right, so I think I wasn't really making a new observation, but just clearing up a confusion on my own part: for a long time I didn't understand how the idea of approval-directed agency fits into IDA, because people switched from talking about approval-directed agency to imitation learning (or talked about them interchangeably) and I didn't catch the connection. So at this point I understand Paul's trajectory of views as follows:
goal-directed agent ⇒ approval-directed agent ⇒ use IDA to scale up approval-directed agent ⇒ approval-directed agency as a form of imitation learning / generalize to other forms of imitation learning ⇒ generalize IDA to safely scale up other (including more goal-directed / consequentialist) forms of ML (see An Unaligned Benchmark, which I think represents his current views)
(Someone please chime in if this still seems wrong or confused.)