Thank you for posting about this here so that you can get feedback, and so that other people can know how much this sort of thing is being done (and by the same token it could be good for people who’ve already done this sort of thing to say so).
I have a bit of a sinking feeling reading your draft; I’ll try to say concrete things about it, but I don’t think I’ll capture all of what’s behind the feeling. I think part of the feeling is that this just won’t work.
Part of it is that the email seems to come from a mindset that doesn’t give weight to curiosity and serious investigation (which is what Tao does with his time).
“I know that to you it isn’t the most interesting problem to think about[1] but it really, actually is a very very important and urgent completely open problem. It isn’t simply a theoretical concern, if Demis’ predictions of 10 to 20 years to AGI are anywhere near correct, it will deeply affect you and your family (and everyone else).”
I think there’s a sort of violence or pushiness here that’s anti-helpful. It doesn’t acknowledge that Tao doesn’t have good reason to trust your judgements about what’s “very very important and urgent”, and people who go around telling other people what things are “very very important and urgent” in contexts without trust in judgement are often trying to coerce people into doing things they don’t want to do. It doesn’t acknowledge that people aren’t utility maximizing machines, but instead have momentum and joy and specialization and context. (Not to say that Tao doesn’t deserve to be informed about the consequences of events happening in the world and his possible effect on those consequences, and not to say that stating beliefs is bad, and not to say that Tao might just be curious and learn about AI risk if it’s shown to him in the right way.)
Another thing is the sources recommendation. The links should be to technical arguments about problems in AI alignment and technical descriptions of the overall problem, the sort of thing X-risk and AI-risk thinkers say to each other, not to material prepared with introductoriness in mind.
“It is not once but twice that I have heard leaders of AI research orgs say they want you to work on AI alignment.”
This is kind of weird and pushy. On the face of it, it looks like either you’re confused and think that big high status people to you are also big high status people to Tao and therefore should be able to give him orders about what to work on, and that Tao is even the sort of entity that takes orders; or at least, it looks like you yourself are trying to take orders from big high status people, propagating perceived urgency from them to whoever else, without regard to individual agents’ local/private information about what’s good for them to do. Like, it looks like you got scared, flailed and grasped for whatever the high status people said they think might be cool, and then wanted to push that. (I’m being blunt here, but to be clear, if something like this is happening, that’s very empathizable-with; I don’t think “you’re bad” or anything like that, and doing stuff that seems like it would have good consequences is generally good.)
“If you are ever interested you can start by reading…”
This is sort of absurd: 1. if Tao were interested, he could likely have lots of conversations with competent AI alignment thinkers, which would be a much better use of his time, and 2. frankly, it seems like you’re posturing as someone giving orders to Tao.
I agree with TekhneMakre...it comes across like an average looking unconfident person asking out a gorgeous celeb. Probably a friend approaching him is best, but an email can’t hurt. I would get a few people together to work on it...my approach would be to represent truly who we are as a motivated group of people that has the desire to write this email to him by saying something like, “There’s a great forum of AI interested and concerned folks that we are a part of, many of us on the younger side, and we fear for the future of humanity from misaligned AI and we look to people like you Dr. Tao as being the kind of gifted person we hope could become involved early in helping guide AI in the right directions that would keep us all safe. We are younger and up and coming, so we don’t know how to appeal to what interests you, so we’re just laying it out there so you can know there are thousands of us and we’re hoping to create a conversation with you and your high level peers to drive some energy in this direction and maybe your direct involvement. Thanks.”
I was trying to rely on Tao’s trust in Demis’s judgement, since Demis is himself an AI researcher. Mentioning Eliezer is mainly so that Tao has someone to contact if he wants to get hired.
I wanted his thinking to be “this competent entity has spent some of his computational resources verifying that it is important to solve this problem, and now that I’m reminded of that I should also throw mine at it”.
Is he truly mostly interested in what he considers to be mentally stimulating? Not in improving the world, or in social nonsense, or guaranteeing that his family is completely safe from all threats?
Then was including this link a bad idea? It gives examples of areas a mathematician might find interesting. And if not that, then what should I say? I’ve got nothing better. Do you know any technical introduction to alignment that he might like?
And about getting him to talk to other people: if anyone volunteers, just DM me your contact information so that I can include it in the email (or reply directly if you don’t care about it being public). I mean, what else could I do?
If you plan to rewrite that letter with a less pushy tone (I agree 100% with the comment from TekhneMakre), I think it might be useful to reframe the problem a bit. Imagine that a random guy is writing to you instead, and he is telling you to work on deflecting meteorites that might hit Earth. What sort of email would compel you to reply?
I’ll rewrite it, but I can’t just model other people after me. If I were writing it for someone like myself, it would be a concise explanation of the main argument to make me want to spend time thinking about it, followed by a more detailed explanation or links to further reading. As long as it isn’t mean, I don’t think I would care whether it’s giving me orders, begging for help, or giving me information without asking for anything at all. But he at least already knows that unaligned AIs are a problem; I can only remind him of that, link to reading material, or say that other people also think he should work on it.
But now the priority of that is lower; see the edit to the post. Do you think that the email to Demis Hassabis has similar problems, or should it stay as it is now?
Does the stuff about pushiness make sense to you? What do you think of it? I think as is, the letter, if Tao reads it, would be mildly harmful, for the reasons described by other commenters.
I think I get it, but even if I didn’t, now I know that’s how it sounds, and I think I know how to improve it. That will be for other mathematicians though (at least Maxim Kontsevich); see the edit to the post. Does the tone in the email to Demis seem like the right one to you?
In terms of tone it seems considerably less bad. I definitely like it more than the other one because it seems to make arguments rather than give social cues. It might be improved by adding links giving technical descriptions of the terms you use (e.g. inner alignment (Hubinger’s paper), IRL (maybe a Russell paper on CIRL)). I still don’t think it would work, simply because I would guess Hassabis gets a lot of email from randos who are confused, and the email doesn’t seem to distinguish you from that (this may be totally unfair to you, and I’m not saying it’s correct or not, it’s just what I expect to happen). I also feel nervous about talking about arms races like that, reinforcing a narrative where they’re not only real but the default (this is an awkward thing to think, because it sounds like I’m trying to manage Hassabis’s social environment deceptively, and usually I would think that worrying about “reinforcing narratives” isn’t a main thing to worry about and instead one should just say what one thinks, but still my instincts say to worry about that here, which might be incorrect).