But having strong, precise impacts in open-ended environments is closely related to consequentialism.
But consequentialism only means achieving some kind of goal: it doesn’t have to be a goal you are motivated by. If you are motivated to fulfil goals that are given to you, you can still “use consequentialism”.
Sure, and this point is closely related to a setting I often think about for alignment: what if we had an ASI that modularly allowed us to specify any kind of goal we want? Can we come up with any nontrivial goals that it wouldn’t be a catastrophe to give it?
As a side-note, this is somewhat complicated by the fact that it matters massively how we define “goal”. Some notions of goals seem to near-provably lead to problems (e.g. an AIXI-type situation where the AI is maximizing reward and we have a box outside the AI which presses the reward button in some circumstances; this would almost certainly lead to wireheading no matter what we do), while other notions of goals seem to be trivial (e.g. we could express a goal as a function over the AI’s actions, but such a goal would have to contain almost all the intelligence of the AI in order to produce anything useful).
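To make that contrast concrete, here is a minimal sketch of the two notions of “goal” as toy utility functions. Everything in it is hypothetical illustration (the function names, the "button_pressed" event, the "precomputed_good_action" string), not a reference to any real system or library.

```python
# Toy sketch, assuming a goal is just a scoring function the agent maximizes.
from typing import List

def reward_button_utility(history: List[str]) -> float:
    """AIXI-style goal: maximize the signal from an external reward button.
    An agent optimizing this is best served by taking control of the button
    itself, which is exactly the wireheading failure mode."""
    return float(history.count("button_pressed"))

def action_function_utility(action: str) -> float:
    """Goal expressed as a function over the agent's own actions.
    For this score to track anything useful, the function must already know
    which actions are good, i.e. it has to contain most of the intelligence."""
    return 1.0 if action == "precomputed_good_action" else 0.0

if __name__ == "__main__":
    # The reward-button goal rates a wireheaded trajectory highest.
    print(reward_button_utility(["button_pressed"] * 10))      # 10.0
    # The action-function goal only helps if we already knew the answer.
    print(action_function_utility("precomputed_good_action"))  # 1.0
```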
We already have some systems with goals. They seem to mostly fail in the direction of wireheading, which is not catastrophic.
Yes, but I was talking about artificial superintelligences, not just any system with goals.
Superintelligences don’t necessarily have goals, and could arrive gradually. A jump to agentive, goal-driven ASI is the worst-case scenario, but it’s also conjunctive.
It’s not meant as a projection of what is likely to happen; it’s meant as a toy model that makes it easier to think about what sorts of goals we would like to give our AI.
Well, I already answered that question.
Maybe, but then I don’t see your answer.