Simple example: If YouTube can turn you into an ideological extremist, you’ll probably watch more YouTube videos. See these two recent papers by people interested in AI safety for more detail:
https://openreview.net/pdf?id=mMiKHj7Pobj
https://arxiv.org/abs/2204.11966
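To make the incentive concrete, here is a toy sketch (not from either paper; the numbers and engagement model are made up) of why an engagement-maximizing recommender is rewarded for shifting a user's preferences rather than leaving them alone:

```python
def total_engagement(shift_per_step, steps=50):
    """Toy model: a user's taste for extreme-but-engaging content drifts
    under the recommender's influence, and per-step engagement grows with
    that taste. All numbers here are illustrative assumptions."""
    preference = 0.1  # initial taste for extreme content, in [0, 1]
    engagement = 0.0
    for _ in range(steps):
        # The recommender nudges the user's preference each round.
        preference = min(1.0, preference + shift_per_step)
        # Assumed engagement model: shifted users watch more.
        engagement += 1.0 + 2.0 * preference
    return engagement

# A policy that shifts preferences collects more long-run engagement,
# so a recommender optimized on engagement is rewarded for the shift.
print(total_engagement(shift_per_step=0.0))   # 60.0
print(total_engagement(shift_per_step=0.02))  # ~110, noticeably higher
```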
Thanks! This is exactly what I was looking for. Seems like I should have googled “recommender systems” and “preference shifts”.
Edit: The openreview paper is so good. Do you know who the authors are?
Yeah, it’s really cool! It’s David Scott Krueger, who’s doing a lot of work bringing theories from the LW alignment community into mainstream ML. This preference shift argument seems similar to the concept of gradient hacking, though it doesn’t require the presence of a mesa optimizer. I’d love to write a post summarizing this recent work and discussing its relevance to long-term safety, if you’d be interested in working on it together.
I’m flattered you asked, but I expect I’ll be either very busy with my own projects or on a mental-health vacation until the end of the year. Still, unless you’re completely saturated with connections, I’d be happy to have a 1:1 conversation sometime after October 25th? Just for exploration purposes, not for working on a particular project.