I think you are misremembering the point of that article. The article distinguished between smoke (evidence for fire) and a fire alarm (a socially accepted signal to everyone that it’s time to start reacting to the fire). GPT-3 is basically exactly what Yud was talking about: it’s smoke, but not a fire alarm. Yud claims there will never be a fire alarm, no matter how much smoke there is. I think he’s probably right; OTOH, people like Paul think that there will probably be various AI catastrophes in which some AI system is caught red-handed lying to its human handlers or something, and that this will make it socially acceptable to devote lots of effort towards safety.
If you want the most fun improv experience I’ve ever seen that’s also kind of a harbinger of the end of the world, try the dragon model of AI Dungeon here. It’s not as good as the pure prompt version, but it’s available to the public.
Maybe add something clarifying that the free version is not the dragon model? You need to pay to get access to the dragon model, and then you need to go into settings and select “dragon” as well.
I thought Zvi was pretty clear that his point is that, in a sense, ‘we’ (humanity) are not smart enough to respond to the ‘fire alarms’ that already do exist; not that they’re actually already a socially acceptable alarm.
If they are not a socially acceptable alarm, they aren’t fire alarms, according to the definition set out by Yudkowsky in the linked post. Zvi can use the word differently if he likes, but since he linked to the Yudkowsky piece I thought it worth mentioning that Yudkowsky means something different by it than Zvi does.
Yeah, the free version isn’t dragon; you need to pay $10/month. Will think about whether it’s worth a note up top.
On the fire alarm: it’s a metaphor, and agreed we’re using it slightly differently. If there’s a consensus that this is a problem, I can reword.
That seems entirely clear in what Zvi wrote tho:
What seems entirely clear?