I’m interested to know what insights you gained from this analogy, aside from what Pushmeet talked about.
I’m not sure there are any particular insights I can point to; it was more an analogy that helped me see how we had tackled similar problems before. I don’t think it’s that useful for figuring out solutions; there’s a reason I haven’t done any projects at the intersection of PL and AI, despite my huge absolute advantage at it.
An example of the analogy comes from program synthesis, in which you give some specification of what a program should do, and the computer figures out a program that meets that specification. When you specify a program via input-output examples, you need some “capability control” to make sure it doesn’t come up with the program “if input == X: return Y elif …”. It also often “games” the specification; one of my favorite examples is the function for sorting an array: a new student will often write the specification as ∀i, j: i < j ⟹ A[i] < A[j], and the synthesized program is then “return []” or “return range(len(A))”, since the specification never requires the output to be a permutation of the input.
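To make the sorting example concrete, here’s a minimal Python sketch; the checker and the two degenerate “solutions” are my own illustrative names, not output from any particular synthesizer:

```python
# A minimal sketch of the gamed sorting specification above.
# The spec only requires the returned array to be increasing; it never
# says the result must be a permutation of the input.

def satisfies_spec(output):
    """The (flawed) spec: for all i < j, output[i] < output[j]."""
    return all(output[i] < output[j]
               for i in range(len(output))
               for j in range(i + 1, len(output)))

# Two degenerate "solutions" a synthesizer could return:
def sort_empty(A):
    return []                    # vacuously increasing

def sort_range(A):
    return list(range(len(A)))   # 0, 1, ..., n-1 is always increasing

A = [3, 1, 2]
assert satisfies_spec(sort_empty(A))    # passes the spec, sorts nothing
assert satisfies_spec(sort_range(A))    # passes the spec, ignores the contents

# A repaired spec would also tie the output to the input, e.g.:
def satisfies_full_spec(A, output):
    return output == sorted(A)   # sorted (non-strictly) and a permutation of A
```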
As context (not really disagreeing), afaik those meetings are between DeepMind’s AGI safety team and FHI. Pushmeet is not on that team and so probably doesn’t attend those meetings.

I guess I was imagining that people in the AGI safety team must know about the “AI for science” project that Pushmeet is heading up, and Pushmeet also heads up the ML safety team, which he says collaborates “very, very closely” with the AGI safety team, so they should have a lot of chances to talk. Perhaps they just talk about technical safety issues, and not about strategy.
I’m curious to learn about specification learning but didn’t see where in the transcript Pushmeet talked about it. Can you give a pointer to what Pushmeet said, or a blog post or paper?
Here’s one section; I think there were others that also alluded to it. Reading it again, it’s maybe more accurate to say this section is about the expressivity of our language of specifications, rather than about specification learning:
Robert Wiblin: Yeah. That leads into my next question, which was, is it going to be possible to formally verify safety performance on the ML systems that we want to use?
Pushmeet Kohli: I think a more pertinent question is, would it be possible to specify what we want out of the system, because at the end of the day you can only verify what you can specify. I think technically there is nothing, of course this is a very hard problem, but fundamentally we have solved hard search problems and challenging optimization problems and so on. So it is something that we can work towards, but a more critical problem is specifying what do we want to verify? What do we want to formally verify? At the moment we verify, is my function consistent with the input-output examples, that I gave the machine learning system and that’s very easy. You can take all the inputs in the training set, you can compute the outputs and then check whether the outputs are the same or not. That’s a very simple thing. No rocket science needed.
Pushmeet Kohli: Now, you can have a more sophisticated specification saying, well, if I perturb the input in some way or transform the input and I expect the output to not change or change in a specific way, is it true? That’s a harder question and would be showing that we can try to make progress. But what other types of specifications or what other type of behavior or what kind of rich questions might people want to ask in the future? That is a more challenging problem to think about.
Robert Wiblin: Interesting. So then relative to other people you think it’s going to be figuring out what we want to verify that’s harder rather than the verification process itself?
Pushmeet Kohli: Yeah, like how do you specify what is the task? Like a task is not a data set.
Robert Wiblin: How do you? Do you have any thoughts on that?
Pushmeet Kohli: Yes. I think this is something that … It goes into like how this whole idea of, it’s a very philosophical thing, how do we specify tasks? When we talk about tasks, we talk about in human language. I can describe a task to you and because we share some notion of certain concepts, I can tell you, well, we should try to detect whether a car passes by and what is a car, a car has something which has four wheels and something, and can drive itself and so on. And a child with a scooter, which also has four wheels goes past and you say, “Oh that’s a car.” You say, “No, that’s not a car.” The car is slightly different, bigger, basically people can sit inside it and so on. I’m describing the task of detecting what is a car in these human concepts that I believe that you and I share a common understanding of.
Pushmeet Kohli: That’s a key assumption that I’ve made. Will I be able to also communicate with the machine in those same concepts? Does the machine understand those concepts? This is a key question that we have to try to think about. At the moment we’re just saying, oh input, this is the output, input this output that. This is a very poor form of teaching. If you’re trying to teach an intelligent system, just showing it examples is a very poor form of teaching. There’s a much more richer, like when we are talking about solving a task, we are talking in human language and human concepts.
Robert Wiblin: It seems like you might think that it would be reliability enhancing to have better natural language processing, that that’s going to be disproportionately useful?
Pushmeet Kohli: Natural language processing would be useful, but the grounding problem of does a machine really understand-
Robert Wiblin: The concepts, or is it just pretending, or is it just aping it?
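To make the two kinds of specification Pushmeet mentions concrete (consistency with input-output examples, and invariance under small input perturbations), here is a toy Python sketch. The function names and the threshold “model” are my own illustration, not DeepMind’s tooling, and the perturbation check below only tests by sampling rather than formally verifying:

```python
import random

def consistent_with_examples(model, examples):
    """Spec 1: the model reproduces the given input-output examples."""
    return all(model(x) == y for x, y in examples)

def robust_to_perturbation(model, x, epsilon, trials=1000):
    """Spec 2 (tested, not verified): the output should not change when the
    input is perturbed by at most epsilon. Random sampling can only falsify
    this; formal verification would have to cover the whole perturbation set."""
    y = model(x)
    return all(model(x + random.uniform(-epsilon, epsilon)) == y
               for _ in range(trials))

# Toy classifier: thresholds a number at 0.5.
model = lambda x: int(x > 0.5)

print(consistent_with_examples(model, [(0.9, 1), (0.1, 0)]))  # True
print(robust_to_perturbation(model, 0.9, epsilon=0.05))       # True
print(robust_to_perturbation(model, 0.52, epsilon=0.05))      # almost surely False: near the decision boundary
```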
Specification learning is also explicitly called out in Towards Robust and Verified AI: Specification Testing, Robust Training, and Formal Verification (AN #52):

Learning specifications: Specifications that capture “correct” behavior in AI systems are often difficult to precisely state. Building systems that can use partial human specifications and learn further specifications from evaluative feedback would be required as we build increasingly intelligent agents capable of exhibiting complex behaviors and acting in unstructured environments.
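Purely as my own illustration of what “learn further specifications from evaluative feedback” could mean (this is not a method described in the blog post): one simple instantiation is to fit a scoring function from pairwise comparisons of behaviors and then treat the learned scorer as the specification.

```python
import random

def learn_spec(preferences, lr=0.1, epochs=200):
    """Fit weights w so that preferred behaviors score higher.
    `preferences` is a list of (better, worse) pairs of feature vectors."""
    dim = len(preferences[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        better, worse = random.choice(preferences)
        margin = sum(wi * (b - c) for wi, b, c in zip(w, better, worse))
        if margin < 1.0:  # update whenever the preferred side doesn't win by a margin of 1
            w = [wi + lr * (b - c) for wi, b, c in zip(w, better, worse)]
    return w

def score(w, behavior):
    """The learned 'specification': higher scores mean more acceptable behavior."""
    return sum(wi * bi for wi, bi in zip(w, behavior))

# Toy behaviors described by two features: (task progress, side effects).
prefs = [((1.0, 0.0), (0.0, 0.0)),   # making progress is preferred to doing nothing
         ((1.0, 0.0), (1.0, 1.0))]   # same progress with side effects is dispreferred
w = learn_spec(prefs)
print(score(w, (1.0, 0.0)) > score(w, (0.0, 1.0)))  # True: the learned spec prefers progress without side effects
```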
there’s a reason I haven’t done any projects at the intersection of PL and AI, despite my huge comparative advantage at it.
What’s PL? Programming languages?

Yes, sorry for the jargon.
Do you know if there are any further details about it somewhere, aside from just the bare idea of “maybe we can learn specifications from evaluative feedback”?

Not to my knowledge.