So, I feel like I am concerned for everyone, including myself, but also including people who do not think that it would affect them. A large part of what concerns me is that the effects could be invisible.
For example, I think that I am not very affected by this, but I recently noticed a connection between how difficult it is to get to work on writing a blog post that I think is good to write, and how much my system one expects some people to receive the post negatively. (This happened when writing the recent MtG post.) This is only anecdotal, but I think that posts that seem like bad PR caused akrasia, even when controlling for how good I think the post is on net. The scary part is that there was a long time before I noticed this. If I believed that there was a credible way to detect when there are thoughts you can’t have in the first place, I would be less worried.
I didn’t have many data points, and the above connection might have been a coincidence, but the point I am trying to make is that I don’t feel like I have good enough introspective access to rule out a large, invisible effect. Maybe others do have enough introspective access, but I do not think that just not seeing the outer incentives pulling on you is enough to conclude that they are not there.
For example, I think that I am not very affected by this, but I recently noticed a connection between how difficult it is to get to work on writing a blog post that I think is good to write, and how much my system one expects some people to receive the post negatively. (This happened when writing the recent MtG post.)
How accurate is your system one on this? Was it right about the MtG post? (I’m trying to figure out if the problem is your system one being too cautious, or if you think there’s an issue even if it’s not.)
This is only anecdotal, but I think that posts that seem like bad PR caused akrasia, even when controlling for how good I think the post is on net.
Maybe it’s actually rational to wait in such cases, and write some other posts to better prepare the readers to be receptive to the new idea? It seems like if you force yourself to write posts when you expect bad PR, that’s more likely to cause propagation back to idea generation. Plus, writing about an idea too early can create unnecessary opposition to it, which can be harder to remove than to not create in the first place.
(You could certainly go overboard with PR concerns and end up not writing about anything that could be remotely controversial, especially if you work at a place where bad PR could cause you to lose your livelihood, but that feels like a separate problem from what you’re talking about.)
I think it was wrong about the MtG post. I mostly think the negative effects of posting ideas (related to technical topics) that people think are bad are small enough to ignore, except insofar as they mess with my internal state. My system 2 thinks my system 1 is wrong about the external effects, but intends to cooperate with it anyway, because not cooperating with it could be internally bad.
As another example, months ago, you asked me to talk about how embedded agency fits in with the rest of AI safety, and I said something like that I didn’t want to force myself to make any public arguments for or against the usefulness of agent foundations. This is because I think research prioritization is especially prone to rationalization, so it is important to me that any thoughts about research prioritization are not pressured by downstream effects on what I am allowed to work on. (It still can change what I decide to work on, but only through channels that are entirely internal.)
I enjoyed the MtG post by the way. It was brief, and well illustrated. I haven’t seen other posts that talked about that many AI things on that level before. (On organizing approaches, as opposed to just focusing on one thing and all its details.)
I completely empathise with worries about social pressures when I’m putting something out there for people to see. I don’t think this would apply to me in the generation phase but you’re right that my introspection may be completely off the mark.
My own experience at work is that I get ideas for improvements even when such ideas aren’t encouraged but maybe I’d get more if they were. My gut says that the level of encouragement mainly determines how likely I am to share the ideas but there could be more going on that I’m unaware of.
Thanks, that makes sense.