I think that’s a fair characterization of my optimism.
I think the classic response to me is “Sure, you’re making progress on understanding vision models, but models with X are different and your approach won’t work!” Some common values of X are: not having visual features, recurrence, RL, planning, really large size, and being language-based. I think that this is a pretty reasonable concern (more so for some Xs than others). Certainly, one can imagine worlds where this line of work hits a wall and ends up not helping with more powerful systems. However, I would offer a small consideration in the other direction: In 2013 I think no one thought we’d make this much progress on understanding vision models, and in fact many people thought really understanding them was impossible. So I feel like there’s some risk of distorting our evaluation of tractability by moving the goalposts in these conversations.
I’m not surprised by other people feeling like they have less traction. I feel like the first three or so years I spent trying to understand the internals of neural networks involved a lot of false starts with approaches that ended up being dead ends (e.g. visualizing really small networks, or focusing on dimensionality reduction). DeepDream was very exciting, but in retrospect I feel like it took me another two or so years to really digest what it meant and how one could really use it as a scientific tool. And this is with the benefit of amazing collaborators and multiple very supportive environments.
One final thing I’d add is that, if I’m honest, I’m probably more motivated by aesthetics than optimism. I’ve spent almost seven years obsessed with the question of what goes on inside neural networks and I find the crazy partial answers we learn every year tantalizingly beautiful. I think this is pretty normal for early research directions; Kuhn talks about it a fair amount in The Structure of Scientific Revolutions.