The “private knowledge space” model does seem a lot more practical than my original post for maintaining a community without placing the same burden on coordinators.
Some questions I think about when it comes to this kind of thing (not directed just at you, but also at myself and anyone else!):
How is access to the space managed, and what degree of due diligence is involved? (e.g. punt to “passed LTFF’s due diligence,” do its own vetting to avoid correlated failures, rely on personal connections and references, or “wrote some good LessWrong posts”?)
What are the concrete details of the agreements that need to be signed to join?
Where is the line drawn for classifying information as private?
If the space is associated with an actual lab, what constraints, if any, might the lab find itself working under? (For example, concerns about being legally “tainted” by exposure to private information could lead formal members of the lab to avoid participating or even reading.)
For members of organizations that can’t participate (because of concerns like the above), is there any way for them to still participate indirectly? For example, throwing information over the wall, or getting limited/scrubbed information out? (I suspect members of existing major capability-oriented organizations would probably not be able to submit information to the community that isn’t already cleared for general publication. Is there a way around this? Would such organizations be more inclined to share information once a solid track record of strong, legally enforced secrecy is established?)
What’s the upper bound on the dangerousness of information shared within the community? This would partly be a function of who has access, what the legal constraints are, and what measures are taken to prevent leaks. (I’m guessing the upper bound will still be fairly low; you wouldn’t want to mention One Weird Trick to End The World. But it should still cover most of the capabilities you’re likely to run across during alignment work.)
To be clear, I don’t view these as blockers, but rather as optimization problems. Within this kind of system, what parameters along each of these axes do you pick to get the best outcome? I’m pretty sure there’s a wide region of parameter space where this kind of thing has significant value.
I think this approach has some limitations, but there’s no reason why one system needs to cover every possible use case at once.