Jess: But they don’t share them because “danger,” so no one can check their work, and it looks like a lot of nothing from the outside.
X: It’s a shocking failure of rationality.
Jess: Yes.
There’s an awkward issue here, which is: how can there be people who are financially supported to do research on stuff that’s heavily entangled with ideas that are dangerous to spread? It’s true that there are dangerous incentive problems here, where basically people can unaccountably lie about their private insight into dangerous issues; on the other hand, it seems bad to share ideas that are more or less plausible precursors to a world-ending artifact. My understanding of Eliezer and MIRI is basically that Eliezer wrote a bunch of public stuff that demonstrated he has insight into the alignment problem, professed his intent to solve alignment, and then more or less got tenure from EA. Is that not what happened? Is that not what should have happened? That seems like the next best thing to directly sharing dangerous stuff.
I could imagine a lot of points of disagreement, like
1. that there’s such a thing as ideas that are plausible precursors to world-ending artifacts;
2. that some people should be funded to work on dangerous ideas that can’t be directly shared / evidenced;
3. that Eliezer’s public writing is enough to deserve “tenure”;
4. that the danger of sharing ideas that could catalyze a world-ending outcome outweighs the benefits of understanding the alignment problem better and generally coordinating by sharing more.
The question of people deciding to keep secrets is separate from how *other people* should treat these “sorcerers”. My guess is that it’d be much better if sorcerers could be granted tenure without people trusting their opinions or taking instructions from them, when those opinions and instructions are based on work that isn’t shared. (This doesn’t easily mesh with intuitions about status: if someone should be given sorcerer tenure, isn’t that the same thing as them being generally trusted? But no, it’s not; it should be perfectly reasonable to believe someone is a good bet to do well within their cabal, but not a good bet in a system that takes commands and deductions from hidden beliefs without sharing those beliefs.)
Some ways of giving third parties Bayesian evidence that you have some secret without revealing it:
1. Demos: show off the capability somehow
2. Have the idea evaluated by a third party who doesn’t share it with the public
3. Do public work that is impressive in the way you’re claiming the secret is (so it’s a closer analogy)
I’m not against “tenure” in this case. I don’t think it makes sense for people to make their plans around the idea that person X has secret Y unless they have particular reason to think secret Y is really important and likely to be possessed by person X (which is related to what you’re saying about trusting opinions and taking instructions). In particular, without some sort of evidence, outsiders should think there’s ~0 chance that a particular AI researcher’s secrets are important enough here to be likely to produce AGI. Lots of people in the AI field say they have these sorts of secrets, and many have somewhat impressive AI-related accomplishments; those accomplishments are just way less impressive than what would be needed for outsiders to assign, given base rates, a non-negligible chance that they possess enough secrets to make AGI.
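As a rough sketch of that base-rate point, with purely made-up numbers (my illustrative assumptions, not anything claimed in the discussion): suppose the prior odds that any given researcher claiming such secrets actually has AGI-sufficient ones are 1 : 10,000, and “somewhat impressive” public accomplishments are 10 times more likely among researchers who do have such secrets than among those who don’t.

```latex
% Bayes' rule in odds form: posterior odds = likelihood ratio x prior odds.
% S = "this researcher's secrets suffice to produce AGI" (hypothetical event)
% E = "this researcher has somewhat impressive public AI accomplishments"
\[
\frac{P(S \mid E)}{P(\neg S \mid E)}
  = \frac{P(E \mid S)}{P(E \mid \neg S)} \cdot \frac{P(S)}{P(\neg S)}
  \approx 10 \cdot \frac{1}{10{,}000}
  = \frac{1}{1{,}000}
  \quad (\approx 0.1\%)
\]
```

Even that fairly generous 10 : 1 update leaves the posterior around 0.1%; to get to even 10% you’d need public evidence with a likelihood ratio on the order of 1,000 : 1, i.e. accomplishments far rarer than “somewhat impressive.”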
Glad this is shared.