Thanks for doing/sharing this, Vael. I was excited to see it!
I am currently bringing something of a behaviour change/marketing mindset to thinking about AI Safety movement building, and therefore feel that testing how well different messages and materials work for audiences is very important. I'm not sure it will actually be as useful as I currently think, though.
With that in mind, I’d like to know:
Was this as helpful for you/others as expected?
Are you planning to do related testing next?
Two ideas:
I wonder if it would be valuable to first test predictions among communicators about which materials will work best, before running the test itself. This could make the value of the new information more salient by showing if/where our intuitions are wrong.
I wonder about the value of trying to build an informal panel/mailing list of ML researchers whom we can contact/pay to do various things like surveys/interviews. They could also review AI Safety arguments/posts from a more skeptical perspective, so we can more reliably find any likely flaws in the logic or rhetoric.
Would welcome any thoughts or work on either if you have the time and inclination.
I think these results, and the rest of the results from the larger survey that this content is a part of, have been interesting and useful to people, including Collin and me. I’m not sure what I expected beforehand in terms of helpfulness, especially since there’s a question of “helpful with respect to /what/”, and I expect we may have different “what”s here.
Are you planning to do related testing next?
Good chance of it! There’s some question about funding, and what kind of new design would be worth funding, but we’re thinking it through.
I wonder if it would be valuable to first test predictions among communicators
Yeah, I think this is currently mostly done informally—when Collin and I were choosing materials, we had a big list, and were choosing based on shared intuitions that EAs / ML researchers / fieldbuilders have, in addition to applying constraints like “shortness”. Our full original plan was also much longer and included testing more readings—this was a pilot survey. Relatedly, I don’t think these results are very surprising to people (which I think you’re alluding to in this comment). They’re somewhat surprising, but we have a fair amount of information about researcher preferences already.
I do think that if we were optimizing for “value of new information to the EA community” this survey would have looked different.
I wonder about the value of trying to build an informal panel/mailing list of ML researchers
Instead of contacting a random subset of people who had papers accepted at ML conferences? I think it sort of depends on one’s goals here, but it could be good. A few thoughts: I think this may already exist informally; I think this becomes more important as there are more people doing surveys without coordinating with each other; and this doesn’t feel like a major need from my perspective / goals, but it might be more of a bottleneck for yours!
Thanks! Quick responses:
I think these results, and the rest of the results from the larger survey that this content is a part of, have been interesting and useful to people, including Collin and me. I’m not sure what I expected beforehand in terms of helpfulness, especially since there’s a question of “helpful with respect to /what/”, and I expect we may have different “what”s here.
Good to know. When discussing some recent ideas I had for surveys, several people told me that their survey results underperformed their expectations, so I was curious if you would say the same thing.
Yeah, I think this is currently mostly done informally—when Collin and I were choosing materials, we had a big list, and were choosing based on shared intuitions that EAs / ML researchers / fieldbuilders have, in addition to applying constraints like “shortness”. Our full original plan was also much longer and included testing more readings—this was a pilot survey. Relatedly, I don’t think these results are very surprising to people (which I think you’re alluding to in this comment). They’re somewhat surprising, but we have a fair amount of information about researcher preferences already.
Thanks for explaining. I realise that the point of that part of my comment was unclear, sorry. I think that using these sorts of surveys to test whether current practice diverges from best practice could make the findings clearer and spur improvement/innovation where needed.
For instance, doing something like this: “We curated the 10 most popular public communication papers from AI Safety organisations and collected predictions from X public AI Safety communicators about which of these materials would be most effective at persuading existing ML researchers to care about AI Safety. We tested these materials with a random sample of X ML researchers and [supported/challenged existing beliefs/practices]… etc.”
I am interested to hear what you think of the idea of using these sorts of surveys to test whether current practice diverges from best practice, but it's ok if you don't have time to explain! I imagine that it does add some extra complexity and challenge to the research process, so it may not be worth it.
I hope you can do the larger study eventually. If you do, I would also like to see how sharing readings compares against sharing podcasts or videos. Maybe some modes of communication perform better on average.
Instead of contacting a random subset of people who had papers accepted at ML conferences? I think it sort of depends on one’s goals here, but it could be good. A few thoughts: I think this may already exist informally; I think this becomes more important as there are more people doing surveys without coordinating with each other; and this doesn’t feel like a major need from my perspective / goals, but it might be more of a bottleneck for yours!
Thanks, that’s helpful. Yeah, I think that the panel idea is one for the future. My thinking is something like this: understanding why and how AI Safety-related materials (e.g., arguments, research agendas, recruitment-type messages, etc.) influence ML researchers is going to become increasingly important to a growing number of AI Safety community actors (e.g., researchers, organisations, recruiters and movement builders).
Whenever an audience becomes important to social/business actors (e.g., governments/academics/companies), this usually creates sufficient demand to justify setting up a panel/database to serve those actors. Assuming the same trend holds, it may be important/useful to create a panel of ML researchers that AI Safety actors can access.
Does that seem right?
I mention the above in part because I think that you are one of the people who might be best placed to set something like this up if it seemed like a good idea. Also, because I think that there is a reasonable chance that I would use a service like this within the next two years and end up referring several other people (e.g., those producing or choosing educational materials for relevant AI Safety courses) to use it.