Even briefer summary of ai-plans.com
Currently, newcomers to the field of AI Alignment often struggle to understand the ongoing work and individuals involved, as well as the assumptions, strengths, and weaknesses of each plan.
We believe AI-plans.com will be an easy, centralized way to discover and learn more about the most promising alignment plans.
The site is currently in Stage 1, functioning purely as a compendium. We are in the process of adding up to 1000 plans and the criticisms made against them so far. Further plans and criticisms can be added by users.
Projected benefits of Stage 1:
- Easy discovery of proposed plans and better understanding of their prevalent challenges.
(This is already showing promise: one researcher let us know they found useful papers on the site, and multiple researchers are interested, including Jonathan Ng, who has been helping us.)
Next, in Stage 2, we will introduce a scoring system for criticisms and a ranking system for plans. Plans will be ranked based on the cumulative scores of their criticisms. Criticism votes will be weighted, giving more influence to users who have submitted higher-scoring criticisms. Alignment researchers will have the option to link their AI-Plans account to accounts on research-relevant platforms (such as arXiv, OpenReview, or the AI Alignment Forum) to start out with a slightly weighted vote (with mod approval).
Each new plan will start with 0 bounty, and lower-bounty plans will give the most points. That way, each new plan will offer plenty of opportunity and incentive for criticism. More details here.
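To make the mechanism concrete, here is a minimal sketch of how the Stage 2 scheme could work. All function names and formulas below are illustrative assumptions, not the site's actual implementation: we assume votes are +1/-1, voter weight grows with past criticism score, and a plan's low bounty boosts the points its criticisms award.

```python
# Hypothetical sketch of the Stage 2 scoring scheme.
# Formulas and constants are assumptions for illustration only.

def vote_weight(past_criticism_score: int) -> float:
    """Voters with higher-scoring past criticisms get more influence."""
    return 1.0 + 0.1 * past_criticism_score

def criticism_score(votes: list[int], voter_histories: list[int]) -> float:
    """Weighted sum of up/down votes (+1 / -1) on one criticism."""
    return sum(v * vote_weight(h) for v, h in zip(votes, voter_histories))

def points_awarded(plan_bounty: float, base_points: float = 10.0) -> float:
    """Lower-bounty (newer, less-criticised) plans award more points."""
    return base_points / (1.0 + plan_bounty)

def plan_rank_score(criticism_scores: list[float]) -> float:
    """A plan is ranked by the cumulative score of its criticisms."""
    return sum(criticism_scores)
```

Under these assumptions, a brand-new plan (bounty 0) awards the full `base_points` per accepted criticism, so early critics of a fresh plan earn the most, which is the incentive the paragraph above describes.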
Projected benefits of Stage 2:
- Incentivizes users to write high-quality criticisms.
- Facilitates identification of plans with significant weaknesses, supporting arguments against problematic plans.
- Allows newcomers to the field (including talented and untapped scientists and engineers) to see which companies have the least problematic plans.
After all, who would want to work for the lowest-ranked company on the leaderboard?
(I have spoken with the creator of aisafety.careers, who intends to integrate with our site.)
At Stage 3, in addition to everything from Stage 1 and 2, we plan to introduce monthly cash prizes for the highest ranking plan and for the users with the most criticism points that month.
Projected benefits of Stage 3:
- Supercharges the impact of Stage 2, attracting talented individuals who require a non-committal monetary incentive to engage with alignment research.
- Provides a heuristic argument for the difficulty of the problem: “There is money on the table if anyone can come up with a plan with fewer problems, yet no one has done so!”
What I’d like to ask LessWrong users:
What do you think could go wrong with this?
What would make you want to use the site/participate/contribute?