Why would a program that’s been built with goal function A suddenly switch over to using goal function B just because it’s become smart enough to understand goal function B?
Why would a program have a goal function that’s completely separate from everything else? Our current most advanced AIs don’t. Even if one did, why would you want one implementation of human semantics in the goal function and another in the implementation function? Why duplicate the effort?
What are you taking to be the current most advanced AIs? If it’s something like GPT-3, then the goal function is just to maximize log(probability assigned to the actual next token). This is separate from the rest of the network, though information flows back and forth. (Forwards because the network chooses the probabilities, and backwards through back-propagation of gradients.) My point here is that GPT-N is not going to suddenly decide “hey, I’m going to use cos(product of all the network’s outputs together) as my new goal function”.
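To make that separation concrete, here is a minimal sketch in PyTorch style (a toy example of mine, not GPT-3’s actual training code, and the tiny model is a stand-in for a real transformer): the network produces the probabilities on the forward pass, and the goal function is a cross-entropy loss computed outside the network, whose gradients then flow back into the weights.

```python
import torch
import torch.nn.functional as F

vocab_size = 50257
model = torch.nn.Sequential(          # stand-in for the transformer stack
    torch.nn.Embedding(vocab_size, 64),
    torch.nn.Linear(64, vocab_size),
)

tokens = torch.randint(0, vocab_size, (1, 16))   # a toy batch of token ids
logits = model(tokens[:, :-1])                   # forwards: the network chooses the probabilities
targets = tokens[:, 1:]                          # the actual next tokens

# The goal function: maximize log P(actual next token),
# i.e. minimize cross-entropy. It lives outside the network itself.
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))

loss.backward()   # backwards: gradients of the goal flow into the network's weights
```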
What I mean by a goal function is something that, if changed without changing anything else, will cause a general-purpose AI to do something different. What I don’t mean is the vacuous sense in which a toaster has the goal of making toast. A toaster is not going to suddenly start boiling water, but that is because of its limitations, not because of a goal.
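As a toy illustration of that definition (a hypothetical example of mine, not anything from this thread): the goal function is the one component you can swap while holding everything else fixed, and the swap alone changes the behaviour.

```python
def goal_a(outcome: float) -> float:
    return outcome            # prefer larger outcomes

def goal_b(outcome: float) -> float:
    return -outcome           # prefer smaller outcomes

def act(options: list[float], goal) -> float:
    # Everything except the goal function is held fixed.
    return max(options, key=goal)

options = [1.0, 2.0, 3.0]
print(act(options, goal_a))   # 3.0
print(act(options, goal_b))   # 1.0 -- same machinery, different behaviour
```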
The idea isn’t that goal functions don’t set goals (where they really exist). The idea is that if you have a very specific goal function that’s written in plain English, it’s perverse to instantiate it using a poorer natural-language module than is otherwise available.