The second one. The logic for increasing capabilities is “if I increase my capabilities, then I’ll better reach my goal”. But Gato does not implement the dynamic of “if I infer [if X then I’ll better reach my goal] then promote X to an instrumental goal”. Nor does it particularly pursue goals by any other means. Gato just acts the way its training examples acted in situations similar to the one it finds itself in.
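
To make the distinction concrete, here is a toy Python sketch of the two dynamics. It is only an illustration under loud assumptions: `ImitationPolicy` and `GoalPromotingAgent` are hypothetical names, and the nearest-neighbour lookup is a stand-in for "act like the training examples did in similar situations", not a description of how Gato works internally.

```python
from dataclasses import dataclass, field


@dataclass
class ImitationPolicy:
    """Toy stand-in for imitation: copy the action from the closest demo.

    Assumption: situations are small numeric tuples. This is NOT Gato's
    actual mechanism, just the simplest thing with the same flavour.
    """
    demos: list  # list of (situation, action) pairs

    def act(self, situation):
        def squared_distance(a, b):
            return sum((x - y) ** 2 for x, y in zip(a, b))

        # No goal is consulted anywhere: only similarity to past examples.
        _, action = min(self.demos, key=lambda d: squared_distance(d[0], situation))
        return action


@dataclass
class GoalPromotingAgent:
    """Hypothetical agent that does implement the promotion dynamic."""
    goal: str
    instrumental_goals: list = field(default_factory=list)

    def infer(self, x, helps_goal):
        # "If I infer [if X then I'll better reach my goal],
        #  then promote X to an instrumental goal."
        if helps_goal and x not in self.instrumental_goals:
            self.instrumental_goals.append(x)


policy = ImitationPolicy(demos=[((0.0, 0.0), "left"), ((1.0, 1.0), "right")])
agent = GoalPromotingAgent(goal="win the game")
agent.infer("increase capabilities", helps_goal=True)
```

The point of the contrast: `ImitationPolicy` has no slot where "increase capabilities" could ever get promoted to anything, whereas the second class has exactly that slot.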