Thanks for your comment, Koen. Two quick clarifications:
In the event that we receive a high number of submissions, the undergrads and grad students will screen submissions. Submissions above a certain cutoff will be sent to our (senior) panel of judges.
People who submit promising 500-word submissions will (often) be asked to submit longer responses. The 500-word abstract is meant to save people time (they get feedback on the 500-word idea before they spend a bunch of time formalizing things, running experiments, etc.)
Two questions for you:
What do you think are the strongest proposals for corrigibility? Would love to see links to the papers/proofs.
Can you email us at akash@alignmentawards.com and olivia@alignmentawards.com with some more information about you, your AIS background, and what kinds of submissions you’d be interested in judging? We’ll review this with our advisors and get back to you (and I appreciate you volunteering to judge!)
Hi Akash! Thanks for the quick clarifications; these make the contest look less weird and more useful than just a 500-word essay contest.
My feedback here is that I definitely got the 500-word essay contest vibe when I read the ‘how it works’ list on the contest home page, and this vibe only got reinforced when I clicked on the official rules link and skimmed the document there. I recommend that you edit the ‘how it works’ list on the home page to make it much more explicit that the essay submission is often only the first step of participating, a step that will lead to direct feedback, and to clarify that you expect most of the prize money to go to participants who have produced significant research beyond the initial essay. That is, if that is indeed how you want to run things.
On judging: OK I’ll e-mail you.
I have to think more about your question about posting a writeup on this site about what I think are the strongest proposals for corrigibility. My earlier overview writeup, which explored the different ways people define corrigibility, took me a lot of time to write, so there is an opportunity cost I am concerned about. I am more of an academic-paper-writing type of alignment researcher than a blogging-all-of-my-opinions-on-everything type of alignment researcher.
On the strongest policy proposal (as opposed to technical proposal) towards alignment and corrigibility: if I limit myself to the West (I have not looked deeply into China, for example), then I consider the EU AI Act initiative to be the strongest current policy proposal around. It is not the best proposal possible, and there are a lot of concerns about it, but if I have to estimate expected positive impact among different proposals and initiatives, this is the strongest one.