My overall reactions:

I would cut more things from this list than I’d add.
Getting audience attention is hard, and requires planning.
The more effort you spend getting attention, the more effort you should also spend making sure that you’re not writing things that will backfire.
The most likely downside, I think, is that you write this in the “voice” of an AI authority but confuse or omit some technical details, causing friction with other people in AI, or even with the audience. I don’t know you, but if you’re not an AI authority, it’s okay to write as yourself, talking about what you personally find interesting / convincing.
Some specific edits I’d make, in order of where they’d go in your outline:
I’d move “what is agency?” from section 9 to section 3, or just spend more time on it in section 1.
Under forecasting, I’d put less emphasis on takeoff speed and more emphasis on teaching people that superhuman performance is very possible on nearly every task: AI is not just going to plateau at human level, and that’s not a plausible future.
I would actually not mention the words “inner” or “outer” alignment. But I would talk about generalization failure and deceptive alignment.
I would cut decision theory entirely.
I would merge the EY and capabilities externality bullet points into a more general strategy section. What would the world look like if we were on a trajectory to succeed? Which of our actions move us closer to, or further from, that trajectory?
Thanks for your answer!

> The most likely downside, I think, is that you write this in the “voice” of an AI authority but confuse or omit some technical details, causing friction with other people in AI, or even with the audience. I don’t know you, but if you’re not an AI authority, it’s okay to write as yourself, talking about what you personally find interesting / convincing.
I’m going to post each part on LW and collect feedback before I put it all together, to avoid this failure mode in particular.

> I’d move “what is agency?” from section 9 to section 3, or just spend more time on it in section 1.
I will think about it.

> Under forecasting, I’d put less emphasis on takeoff speed and more emphasis on teaching people that superhuman performance is very possible on nearly every task: AI is not just going to plateau at human level, and that’s not a plausible future.
I’m not sure it should be in the forecasting section, more like in the introduction (or, if it is harder than I think, in its own separate section).

> I would actually not mention the words “inner” or “outer” alignment.
Why not?

> I would cut decision theory entirely.
Hmmmm… maybe?

> I would merge the EY and capabilities externality bullet points into a more general strategy section. What would the world look like if we were on a trajectory to succeed? Which of our actions move us closer to, or further from, that trajectory?

Seems like a good proposal, thanks!