This is a cool idea and something I’d love to see succeed.
Challenges will be:
- Getting enough adoption from the right people to populate, criticize, and keep it updated; the trick would be to be successful enough that people are willing to return to the site (email subscriptions/reminders would likely help you a lot).
- While voting on criticisms might push better ones to the top, you have to start with a voting population whose judgment is good: a bit of a chicken-and-egg problem.
- I predict that many (most?) people whose plans you post will feel that many of the criticisms miss the point and are kind of annoying; nor will they obviously want to end up in some kind of ranking. It’s possible other people will post on their behalf, and it could get discussed, which might be okay.
I’ve thought about something similar in the context of LessWrong in the past.
My guess is that to make it work, I’d adopt a pretty Lean approach if you’re not already (e.g., read The Lean Startup). Your plan has a lot of pieces, and it’s likely impractical to build them all before you start getting users, so be very deliberate about what you build first, and target the most likely points of failure.
A more modest MVP that I have absolutely no self-interested bias in whatsoever is that someone competent collects plans/agendas in the LessWrong wiki[1] (perhaps with one overview doc of all the plans, plus individual pages for each plan that link out to posts and comments discussing them). The big advantage of that is that LessWrong already has many of the features you need (karma, spam detection, good design, etc.), plus a lot of people are frequently on the site for Alignment content and discussion. It doesn’t have duplication detection, but I think you’d need a lot of adoption before that couldn’t be dealt with manually.
All in all, good luck! Very cool if you can make it work.
[1] The LessWrong team would need to do a bit of work to make it possible to create wiki pages that are not also tag pages, but it seems fine to create pages as tags in the meantime.
A deeper challenge I think you’ll face is that many plans are a lot higher-context than they might seem. E.g., you only understand a plan Paul proposed decently well if you’ve spent many hours talking to him directly about it (based on a report from at least one person), and it’s not really practical or feasible to write up something that conveys the plan via text to other people (it’s unclear to me how much the ELK document even fully succeeded at conveying its underlying ideas, despite being 200 pages long).
This means it’s hard to be well-situated enough to criticize plans helpfully unless you’re really close to the work, i.e., working on it directly or in close proximity to those who are. That doesn’t mean there isn’t value here, but I think it’s a challenge.
In my vision for a similar feature, the authors of a plan/research agenda list the prerequisite knowledge required to engage well with their proposal, and I could imagine those listed prerequisites being extensive.
Also beware building around an ontology that might not be quite right. I don’t think “plan” and “criticism” will capture much of the discussion that needs to be had. Many people’s work is scoped more narrowly than a plan, more like an investigation into some topic that seems like it might be useful as part of a larger plan (e.g., exploring a concept or a particular interpretability approach), and a criticism might be more like “you assume X, but it’s not clear X is true”, in which case the right response is an exploration of whether X is true, not a revision of the plan to avoid the assumption. A good exercise might be trying to fit the MIRI Dialogues into your format and seeing how well that works.
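To make that concrete, here’s a rough sketch (in Python; all the type and field names are invented for illustration, not anyone’s actual schema) of a richer ontology that also folds in the prerequisites idea above:

```python
from dataclasses import dataclass, field

# Illustrative only: these types are invented for this comment,
# not taken from any existing site's data model.

@dataclass
class Assumption:
    claim: str                  # e.g. "you assume X"
    status: str = "unexamined"  # "unexamined" | "under investigation" | "supported" | "refuted"

@dataclass
class WorkItem:
    """A node that may be a full plan or a narrower investigation."""
    title: str
    kind: str  # "plan" | "investigation" | "concept-exploration"
    prerequisites: list[str] = field(default_factory=list)  # background needed to engage well
    assumptions: list[Assumption] = field(default_factory=list)
    part_of: "WorkItem | None" = None  # narrower work can hang off a larger plan
```

The point of `part_of` and `assumptions` is that a “criticism” of an investigation often resolves into a new investigation of an assumption, rather than a revision of a plan.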
Hi Ruby! Thanks for the great feedback, and sorry for the late reply; I’ve been working on the site!
So, we’re not doing just criticisms anymore: we’re now ranking plans by Total Strength score minus Total Vulnerabilities score. Quite a few researchers have been posting their plans on the site!
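Roughly, the ranking works like this (a minimal sketch in Python; the plans and numbers are made up, and the names aren’t our actual schema):

```python
# Minimal sketch of the ranking described above: a plan's score is its
# Total Strength score minus its Total Vulnerabilities score.

def plan_score(strengths: list[int], vulnerabilities: list[int]) -> int:
    return sum(strengths) - sum(vulnerabilities)

plans = {
    "Plan A": ([8, 6, 7], [4, 3]),  # (strength scores, vulnerability scores)
    "Plan B": ([9, 5], [6, 6, 2]),
}

# Rank plans from highest to lowest net score.
ranking = sorted(plans, key=lambda p: plan_score(*plans[p]), reverse=True)
print(ranking)  # ['Plan A', 'Plan B']
```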
We’re going to do a full rebuild soon, to make the site look nicer and even faster to work on.
We’re also holding regular critique-a-thons. The last one went very well!
We had 40+ submissions and produced what I think is really great work!
We also made a Broad List of Vulnerabilities in the first two days! https://docs.google.com/document/d/1tCMrvJEueePNgb2_nOEUMc_UGce7TxKdqI5rOJ1G7C0/edit?usp=sharing
On not getting all of a plan’s details without talking to the person a lot: I think this is a vulnerability in communication.
A serious plan, one that actually intends to solve the problem, should have enough effort put into it to make clear to a reader what it actually is, what problems it aims to solve, why it aims to solve them, and how it seeks to do so. Failing to do that is silly for any serious strategy.
The good thing is that if such a vulnerability is pointed out on AI-Plans.com, the poster can see it and iterate on the plan!