I think overall this is a well-written blogpost. His previous blogpost already indicated that he took the arguments seriously, so this is not too much of a surprise. That previous blogpost was discussed and partially criticized on LessWrong. As for the current blogpost, I also find it noteworthy that active LW user David Scott Krueger is in the acknowledgements.
This blogpost might even be a good introduction to AI x-risk for some people.
I hope he engages further with the issues. For example, I feel like inner misalignment is still sort of missing from the arguments.
It seems like inner misalignment is a subset of “we don’t know how to make aligned AI”. Maybe he could’ve fit that in neatly, but adding more would be at odds with the post’s function as an introduction to AI risk.
Yoshua Bengio was on David Krueger’s PhD thesis committee, according to David’s CV.
David had many conversations with Bengio about alignment during his PhD, and deserves a lot of credit for Bengio taking AI risk seriously.
Bengio was his Master’s thesis advisor too.