Rohin Shah comments on Let’s talk about “Convergent Rationality”

Rohin Shah Jan 2, 2020, 7:28 AM
LW: 2 AF: 2
I think intelligence is just one salient feature of what makes a life-form or individual able to out-compete others.
Sure, but within AI, intelligence is the main feature that we’re trying very hard to increase in our systems that would plausibly let the systems we build outcompete us. We aren’t trying to make AI systems that replicate as fast as possible. So it seems like the main thing to be worried about is intelligence.
My main opposition to this is that it’s not actionable: sure, lots of things could outcompete us; this doesn’t change what I’ll do unless there’s a specific thing that could outcompete us that will plausibly exist in the future.
(It feels similar in spirit, though not in absurdity, to a claim like “it is possible that aliens left an ancient weapon buried beneath the surface of the Earth that will explode tomorrow, we should not make the mistake of ignoring that hypothesis”.)
an AI system that has a finite horizon of 1,000,000 years, but no other restrictions. There may be a sense in which this system is irrational (e.g. having time-inconsistent preferences), but it may still be extremely competently goal-directed.
Idk, if it’s superintelligent, that system sounds both rational and competently goal-directed to me.
- David Scott Krueger (formerly: capybaralet) Jan 3, 2020, 6:30 AM
  LW: 1 AF: 1
  Sure, but within AI, intelligence is the main feature that we’re trying very hard to increase in our systems that would plausibly let the systems we build outcompete us. We aren’t trying to make AI systems that replicate as fast as possible. So it seems like the main thing to be worried about is intelligence.
  Blaise Agüera y Arcas gave a keynote at this NeurIPS pushing ALife (motivated by specification problems, weirdly enough...): https://neurips.cc/Conferences/2019/Schedule?showEvent=15487
  The talk recording: https://slideslive.com/38921748/social-intelligence. I recommend it.
- David Scott Krueger (formerly: capybaralet) Jan 2, 2020, 4:44 PM
  LW: 1 AF: 1
  Sure, but within AI, intelligence is the main feature that we’re trying very hard to increase in our systems that would plausibly let the systems we build outcompete us. We aren’t trying to make AI systems that replicate as fast as possible. So it seems like the main thing to be worried about is intelligence.
  I think I was maybe trying to convey too much of my high-level views here. What’s maybe more relevant and persuasive here is this line of thought:
  - Intelligence is very multi-faceted
  - An AI that is super-intelligent in a large number (but small fraction) of the facets of intelligence could strategically outmaneuver humans
  - Returning to the original point: such an AI could also be significantly less “rational” than humans
  Also, nitpicking a bit: to a large extent, society is trying to make systems that are as competitive as possible at narrow, profitable tasks. There are incentives for excellence in many domains. FWIW, I’m somewhat concerned about replicators in practice, e.g. because I think open-ended AI systems operating in the real world might create replicators accidentally/indifferently, and we might not notice fast enough.
  My main opposition to this is that it’s not actionable
  I think the main take-away from these concerns is to realize that there are extra risk factors that are hard to anticipate and for which we might not have good detection mechanisms. This should increase pessimism/paranoia, especially (IMO) regarding “benign” systems.
  Idk, if it’s superintelligent, that system sounds both rational and competently goal-directed to me.
  (non-hypothetical Q): What about if it has a horizon of 10^-8s? Or 0?
  I’m leaning on “we’re confused about what rationality means” here, and specifically, I believe time-inconsistent preferences are something that many would say seem irrational (prima facie). But
  - Rohin Shah Jan 2, 2020, 5:51 PM
    LW: 2 AF: 2
    (non-hypothetical Q): What about if it has a horizon of 10^-8s? Or 0?
    With 0, the AI never does anything and so is basically a rock. With 10^-8, it still seems rational and competently goal-directed to me, just with weird-to-me preferences.
    I believe time-inconsistent preferences are something that many would say seem irrational
    Really? I feel like that at least depends on what the preference is. I could totally imagine people having preferences to, e.g., win at least one Olympic medal, with further medals mattering less (which is history-dependent); be the youngest person to achieve <some achievement> (which is finite-horizon); or eat ice cream in the next half hour (but not care much after that).
    You might object that all of these can be made state-dependent, but you can make your example state-dependent by including the current time in the state.
    I agree that we are probably not going to build superintelligent AIs that have a horizon of 10^-8s, just because our preferences don’t have horizons of 10^-8s, and we’ll try to build AIs that optimize our preferences.
    - David Scott Krueger (formerly: capybaralet) Jan 3, 2020, 6:19 AM
      LW: 1 AF: 1
      With 0, the AI never does anything and so is basically a rock
      I’m trying to point at “myopic RL”, which does, in fact, do things.
      You might object that all of these can be made state-dependent, but you can make your example state-dependent by including the current time in the state.
      I do object, and still object, since I don’t think we can realistically include the current time in the state. What we can include is: an impression of what the current time is, based on past and current observations. There’s an epistemic/indexical problem here you’re ignoring.
      I’m not an expert on AIXI, but my impression from talking to AIXI researchers and looking at their papers is: finite-horizon variants of AIXI have this “problem” of time-inconsistent preferences, despite conditioning on the entire history (which basically provides an encoding of time). So I think the problem I’m referring to exists regardless.
      - Rohin Shah Jan 3, 2020, 8:36 AM
        LW: 3 AF: 3
        I’m trying to point at “myopic RL”, which does, in fact, do things.
        Ah, an off-by-one miscommunication. Sure, it’s both rational and competently goal-directed.
        I do object, and still object, since I don’t think we can realistically include the current time in the state.
        I mean, if you want to go down that route, then “win at least one medal” is also not state-dependent, because you can’t realistically include “whether Alice has won a medal” in the state: you can only include an impression of whether Alice has won a medal, based on past and current observations. So I still have the same objection.
        finite-horizon variants of AIXI have this “problem” of time-inconsistent preferences
        Oh, I see. You probably mean AI systems that act as though they have goals that will only last for e.g. 5 seconds. Then, 2 seconds later, they act as though they have goals that will last for 5 more seconds, i.e. 7 seconds after the initial time. (I was thinking of agents that initially care about the next 5 seconds, and then after 2 seconds, they care about the next 3 seconds, and after 7 seconds, they don’t care about anything.)
        I agree that the preferences you were talking about are time-inconsistent, and such agents seem both less rational and less competently goal-directed to me.
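
The two readings of "finite horizon" distinguished in the last comment can be made concrete with a toy example. What follows is only a sketch under stated assumptions, not anything proposed in the thread: the integer-line world, the reward values, and the names `best_plan`, `run`, and `is_time_consistent` are all hypothetical and chosen for illustration. One agent always cares about the next `H` steps (a sliding window); the other cares only about steps before a fixed time `H` (a shrinking window). Both re-plan by brute force at every step, and the check asks whether each new plan agrees with the unexecuted tail of the previous one.

```python
from itertools import product

ACTIONS = (-1, 0, 1)  # step left, stay, step right on an integer line


def reward(pos):
    """Toy reward (an assumption), received on arriving at or staying at pos."""
    if pos == 1:
        return 1   # small prize close to the start
    if pos == 4:
        return 10  # large prize further away
    return 0


def best_plan(pos, horizon):
    """Brute-force the reward-maximizing action sequence of length `horizon`."""
    def total(plan):
        p, score = pos, 0
        for a in plan:
            p += a
            score += reward(p)
        return score
    # max() keeps the first maximizer, so tie-breaking is deterministic.
    return max(product(ACTIONS, repeat=horizon), key=total)


def run(start, steps, horizon_at):
    """Re-plan at every step, execute the first planned action, log each plan."""
    pos, plans = start, []
    for t in range(steps):
        plan = best_plan(pos, horizon_at(t))
        plans.append(plan)
        pos += plan[0] if plan else 0  # nothing left to care about: stay put
    return plans


def is_time_consistent(plans):
    """True if each re-plan agrees with the unexecuted tail of the previous plan."""
    for prev, new in zip(plans, plans[1:]):
        overlap = max(0, min(len(prev) - 1, len(new)))
        if prev[1:1 + overlap] != new[:overlap]:
            return False
    return True


H = 3
sliding = run(0, 3, lambda t: H)               # always cares about the next H steps
deadline = run(0, 3, lambda t: max(H - t, 0))  # cares only about steps before time H

print(sliding, is_time_consistent(sliding))
print(deadline, is_time_consistent(deadline))
```

In this toy world the sliding-window agent first plans to step right and then sit on the small prize, but one step later the large prize enters its window and it abandons that plan, so the check reports it as inconsistent. The fixed-deadline agent never sees the large prize and sticks to its original plan throughout. With a horizon of 0 the planner returns an empty plan and the agent just stays put, which is the "basically a rock" case; "myopic RL" in the sense discussed above corresponds to a sliding window of length 1.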