I must admit to a greater degree of ignorance than usual for my comment here, but I have a huge problem with the longtermists [at least from the longtermist paper I read]: their position reeks of begging the question. If we suppose that an immense number of people will live in the future, that the short term is not immensely easier to knowingly influence, that improving the short term does not improve the long term, that there is no medium term worth considering, that influences are percentage based, and that we care as much about immensely distant future people [including nonhuman ones] as we do about our loved ones, then and only then does their argument make sense. That said, I'm perfectly fine with pursuing long-term benefit, and I think one of their points deserves vigorous pursuit: research into how best to influence the future seems worthwhile.
It seems clearly constructed as a justification for the position they already wanted to take. There's nothing wrong with that, but I think their premises are unlikely. It seems obvious to me that the near term is much easier to influence. Assumptions I find highly questionable are that the short term doesn't have significant knock-on effects [I think it clearly does], that we shouldn't consider a distinct medium term, and that we shouldn't care more about those closer to us (in time, space, likeness, and just plain affection). The percentage-based framing is also highly questionable, considering that the requirements to improve technology seem to grow exponentially, and things that are naturally easier to quantify are probably much easier to improve [so things other than tech are hard to improve; AI safety is still in a philosophy stage]. They also fail to include a scenario where the number of future people doesn't explode. I also don't believe in a quick AI takeoff, if AGI ever happens, and so even if I were one of them, I wouldn't focus so much on AI safety. (I am aware this community was built by people very concerned about AI safety.)
In the linked post, I think they can't tell which intermediate goals to pursue for two reasons. First, they are looking too far into the future, and second, AI simply isn't advanced enough yet to build good hypotheses about how these systems will really end up working [this is one reason it is philosophy]. The possibility space is immense, and we have few clues where to look. (I do think current approaches are also subtly very wrong, so they are actually looking in the wrong general areas too.)
Also, focusing on AI governance is a bit of a strange way to influence AI safety, and so it is hard to know what effect you will have based on what you do. Influencing the people who influence the laws and norms of the society AI researchers operate in, when there are hundreds of countries and possibly thousands of cultures, is a highly difficult task. Historically, many such influences have turned out to be malignant regardless of whether the people behind them were beneficent or malevolent. There are other approaches to influence, but they are even less reliable. It seems like a genuinely very tricky problem that may be clarified later, once AI is really understood, but not until then. Focusing on understanding how and why AI will do things seems likely to be much more valuable than locking in governance before we understand.
Like Luke I’m going to take longtermism as an axiom for most purposes (I find it decently convincing given my values), though if you’re interested in debating it you could post on the EA Forum. (One note: my understanding of longtermism is “the primary determinant of whether an action is one of the best that you can take is its consequences on the far future”; you seem to be interpreting it as a stronger / more specific claim than that.)
Also, focusing on AI governance is a bit of a strange way to influence AI safety
You’re misunderstanding the point of AI governance. AI governance isn’t a subset of AI safety, unless you interpret the term “AI safety” very very broadly. Usually I think of AI safety as “how do we build AI systems that do what their designers intend”; AI governance is then “how do we organize society so that humanity uses this newfound power of AI for good, and in particular doesn’t use it to destroy ourselves” (e.g. how do we prevent humans from using AI in a way that makes wars existentially risky, that enforces robust totalitarianism, that persuades humans to change their values, etc). I guess part of governance is “how do we make sure no one builds unsafe AI”, which is somewhat related to AI safety, but that’s not the majority of AI governance.
A lot of these issues don't seem to become that much clearer even with a picture of how AGI will come about; e.g. I have such a picture in mind, and even if I condition on that picture being completely accurate (which it obviously won't be), many of the relevant questions still don't get resolved. This is because they're often primarily questions about human society rather than questions about how AI works.
I am unlikely to post on the EA Forum. (I only recently started posting much here, and I find most of EA rather unconvincing, aside from the one-sentence summary, which is obviously a good thing.) Considering my negativity toward longtermism, I'm glad you took the more productive route in your response. My response is a bit long; I didn't manage to get what I was trying to say down when it was shorter. Feel free to ignore it.
I will state that all of that is AI safety. Even the safety of the AI is determined by the overarching world in which it acts. A perfectly well-controlled AI is unsafe if the regulations followed by defense-bot-3000 state that all rebels must be ended, and everyone matches the definition of a rebel. The people who built defense-bot-3000 probably didn't intend to end humanity just because a human law said to. Likewise, they probably didn't mean for defense-bot-4000 to stand by and let it happen because the 4000 version requires a human in the loop, and defense-bot-3000 made sure to kill those in charge of defense-bot-4000 at the start for the instrumental value of doing so.
Should a police bot let criminals it can prove are guilty run free, because their actions are justified in this instance? Should a facial recognition system point out that it has determined that the new intern matches a spy for the government of that country? Should people be informed that a certain robot is malfunctioning and likely to destroy an important machine in a hospital [when that means the people will simply destroy the sapient robot, but if the machine is destroyed people might die]? These are moral and legal governance questions that are also clearly AI safety questions.
I'd like to compare it to computer science, where we know seemingly very disparate things are theoretically identical, such as iteration versus recursion, or hardware versus software. Regulation internal to the AI is the narrow construal of AI safety, while regulation external to it is governance. (Whether this regulation is on people or on the AI directly can be an important distinction, but either way it is still governance.)
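To make the iteration-versus-recursion comparison concrete, here's a minimal sketch (factorial is just an arbitrary example of my own choosing, not anything from the discussion): the same computation can be written recursively or iteratively, and the two definitions are interchangeable from the outside; only the internal organization differs.

```python
# Illustrative only: the same factorial computation, organized two different ways.

def factorial_recursive(n: int) -> int:
    """Factorial defined by self-reference (recursion)."""
    return 1 if n <= 1 else n * factorial_recursive(n - 1)

def factorial_iterative(n: int) -> int:
    """The same function defined by an explicit loop (iteration)."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result

# From the outside, the two are indistinguishable.
assert factorial_recursive(10) == factorial_iterative(10) == 3628800
```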
Governance is thus actually a subset of AI safety broadly construed. And it should be broadly construed, since there is no difference between an inbuilt part of the AI and a part of the environment it is being used in if they lead to the same actions.
That wasn't actually my point, though. Whether or not you call it AI safety isn't important. You want to make it safe to have AI in use in society through regulation and cultural action. If you don't understand AI, your regulation and cultural interventions will be wrong. You do not currently understand AI, especially what effects it will actually have dealing with people [since sufficiently capable AIs don't exist to get data from, and we don't understand why current approaches do what they do].
Human culture has been massively changed by computers, the internet, cellphones, and so on. If I were older, I'd have a much longer list. If [and this is a big if] AI turns out to be that big of a thing, you can't know what it will look like at this stage. That's why you have to wait to find out [while trying to figure out what it will actually do]. If AI turns out to mostly be good at tutoring people, you need completely different regulation than if it turns out to only be good at war, and both are very different than if it is good at a wide variety of things.
Questions of human society rest on two things. First, what people are truly like on the inside. We aren't good at figuring that out, but we have documented several thousand years of trying, and we're starting to learn. Second, what the particular culture is like. Actual human-level AI would massively change all of our cultures, to fit or banish the contours of the actual and perceived effects of the devices. (Also, what are the AIs like on the inside? What are their natures? What cultures would spring up amongst different AIs?)
I agree that regulation is harder to do before you know all the details of the technology, but it doesn’t seem obviously doomed, and it seems especially-not-doomed to productively think about what regulations would be good (which is the vast majority of current AI governance work by longtermists).
As a canonical example I'd point to the Asilomar conference, which I think happened well before the details of the technology were known. There are a few more examples, but overall not many. I think that's primarily because we're usually too caught up in current problems to try to foresee future ones, so I don't see that as a very strong update against thinking about governance in advance.
Perhaps I was unclear. I object to the idea that you should get attached to any ideas now, not that you shouldn’t think about them. People being people, they are much more prone to getting attached to their ideas than is wise. Understand before risking attachment.
The problem with AI governance is that AI is a mix of completely novel abilities and things humans have been doing for as long as there have been humans. The latter don't need special 'AI governance', and the former are not understood.
(It should be noted that I am absolutely certain that AI will not take off quickly, if it ever does take off beyond human limits.)
The Asilomar conference isn't something I'm particularly familiar with, but it sounds like people actually had significant hands-on experience with the technology and understood it already. They stopped the experiments because they needed the clarity, not because someone else made rules earlier. There are no details as to whether they did a good job, and the recommendations seem very generic. Of course, that's just from Wikipedia. We are not at this point with nontrivial AI. Needless to say, I don't think this counts against my point.