I’d definitely agree the updates are towards the views of certain other people (roughly some mix of views that tend to be common in academia, and views I got from Paul Christiano, Redwood and other people in a similar cluster). Just based on that observation, it’s kind of hard to disentangle updating towards those views just because they have convincing arguments behind them, vs updating towards them purely based on exposure or because of a subconscious desire to fit in socially.
I definitely think there are good reasons for the updates I listed (e.g. specific arguments I think are good, new empirical data, or things I’ve personally observed working well or not working well for me when doing research). That said, it does seem likely there’s also some influence from just being exposed to some views more than others (and then trying to fit in with views I’m exposed to more, or just being more familiar with arguments for those views than alternative ones).
If I were really carefully building an all-things-considered best guess on some question, I’d probably try to take this into account somehow (though I don’t see a principled way of doing that). Most of the time I’m not trying to form the best possible all-things-considered view anyway (and focus more on understanding specific mechanisms instead etc.). In those cases it feels more important to e.g. be aware of other views and to not trust vague intuitions if I can’t explain where they’re coming from. I feel like I’m doing a reasonable job at those things, but naturally it’s hard to be sure from the inside.
ETA: I should also say that from my current perspective, some of my previous views seem like they were basically just me copying views from my “ingroup” and not questioning them enough. As one example, the “we all die vs utopia” dichotomy for possible outcomes felt to me like the commonly accepted wisdom and I don’t recall thinking about it particularly hard. I was very surprised when I first read a comment by Paul where he argued against the claim that unaligned AI would kill us all with overwhelming probability. More recently, I’ve definitely been more exposed to the view that there’s a spectrum of potential outcomes. So maybe if I talked a lot to people who think an unaligned AI would definitely kill us all, I’d update back towards that a bit. But overall, my current epistemic state where I’ve at least been exposed to both views and some arguments on both sides seems way better than the previous one where I’d just never really considered the alternative.
I’m not saying that it looks like you’re copying your views; I’m saying that the updates look like movements towards believing in a certain sort of world: the sort of world where it’s natural to be optimistically working together with other people on projects that are fulfilling because you believe they’ll work. (This is a super empathizable-with movement, and a very common movement to make. Also, of course this is just one hypothesis.) For example, moving away from theory and “big ideas”, as well as moving towards incremental / broadly-good-seeming progress, as well as believing more in a likely continuum of value of outcomes, all fit with trying to live in a world where it’s more immediately motivating to do stuff together. Instead of withholding motivation until something that might really work is found, the view here says: no, let’s work together on whatever, and maybe it’ll help a little, and that’s worthwhile because every little bit helps, and the withholding motivation thing wasn’t working anyway.
(There could be correct reasons to move toward believing and/or believing in such worlds; I just want to point out the pattern.)
Oh I see, I indeed misunderstood your point then.
For me personally, an important contributor to day-to-day motivation is just finding research intrinsically fun—impact on the future is more something I have to consciously consider when making high-level plans. I think moving towards more concrete and empirical work did have benefits on personal enjoyment just because making clear progress is fun to me independently of whether it’s going to be really important (though I think there’ve also been some downsides to enjoyment because I do quite like thinking about theory and “big ideas” compared to some of the schlep involved in experiments).
I don’t think my views overall make my work more enjoyable than at the start of my PhD. Part of this is the day-to-day motivation being sort of detached from that anyway, like I mentioned. But also, from what I recall now (and this matches the vibe of some things I privately wrote then), my attitude 1.5 years ago was closer to that expressed in “We choose to align AI” than to feeling really pessimistic.
(I feel like I might still not represent what you’re saying quite right, but hopefully this is getting closer.)
ETA: To be clear, I do think if I had significantly more doomy views than now or 1.5 years ago, at some point that would affect how rewarding my work feels. (And I think that’s a good thing to point out, though of course not a sufficient argument for such views in its own right.)