I have a dangerous idea which, unlike most of my dangerous ideas, is unencumbered by any laws or non-disclosure agreements, so I can dispose of it as I wish.
In order to demonstrate that I do in fact have a dangerous idea, I would obviously have to disclose it, but disclosure would not prompt the development of countermeasures beyond ‘wow, hope nobody else thinks of that’. Any implementation of the idea would probably result in its full disclosure to the world, meaning no entity could both implement the idea and keep enough of it secret to prevent others from using it.
I believe implementation of the idea by myself to be immoral, but am conflicted on the ethics of simple disclosure, which I believe would lead to its implementation by others with more flexible ethics.
Altruistic silence is probably my default position, but from a strictly rational standpoint, is there some way to get paid for my continued silence (other than with the joy of living in a world ignorant of this idea)?
Edit per request from the comments: I work in an industry related to one of the ones EA has identified as harmful. I view this particular idea as similarly harmful to ‘a proposal for a marketing campaign that would be substantially more effective than existing marketing campaigns at convincing children and teenagers to start using nicotine’ (I do not work in the nicotine industry, and the analogy is imperfect because I think that countermeasures could be developed to a marketing tactic). My subfield is arcane enough that not many people are likely to run across my idea, so if I keep quiet (which I intend to do), there is a good chance that it will not be discovered by someone else.
Unfortunately, my only motivation for staying quiet here is altruism. For others in similar circumstances, how could the world at large incentivize them to avoid disclosing unambiguously harmful ideas?
https://slatestarcodex.com/2013/06/14/the-virtue-of-silence/
This betrays a misunderstanding of what ‘rational’ means. Rational does not mean homo economicus; it means doing what a person would actually want to do on reflection if they had a good understanding of their options.
I doubt your idea is actually that dangerous, so I’m treating this as a hypothetical. But in general if your idea is dangerous and you want hush money to keep silent about it then this is really more like a blackmail threat than anything else. I think you should reflect on what life decisions you’ve made that posting what amounts to a “it’d be a real shame if...” threat on a public forum seems like a good idea.
And while you’re at it, delete this.
I am holding a lot of dangerous knowledge and am encumbered by a variety of laws and non-disclosure agreements. This is not actually uncommon. So arguably, I am already being paid to keep my mouth shut about a variety of things, but these are mostly not original thoughts. This specific idea is, in my best judgement, both dangerous, and unencumbered by those laws and NDAs.
The assertion that my default position is ‘altruistic silence’ means that this is not ‘posting a threat on a public forum’. It would be a real shame if a large variety of things that are currently not generally known were to become public. While I would indeed like to be paid not to make them public (and, as previously stated, in some cases already am), this should not be taken as an assertion that if the reader fails to provide me with some tangible benefit, I will do something harmful.
This is, in a broader sense, a question: ‘If there exists an idea which is simply harmful (for example, a phrase which, when spoken aloud, turns a human into a raging cannibal), such that there is no value whatsoever in increasing the number of people aware of the idea, how can people who generate such ideas be incentivized not to spread them?’
Maybe the best thing to do is to look for originators of new ideas perceived as dangerous, and encourage them to drink hemlock tea before they can hurt anyone else. https://en.m.wikipedia.org/wiki/Trial_of_Socrates
Perhaps your post would have been received differently if the title were “How can dangerous ideas be disposed of” or “How can society incentivize people not to unleash terrible ideas on the world” and the post proceeded accordingly. (The dangerous and empty* personal anecdote could be replaced with a more mundane musing: ‘[This technology] can obviously be used in bad ways**’, or ‘Given how nukes impacted history, and might in the future, how can things be altered, or what institutions or incentives can be created or implemented, so that problems like that don’t happen or are less likely in the future?’)
*People are probably unhappy about that.
**A well known example would do.
I recommend updating the post to make that slightly more clear.
Updated, I left the original wording as intact as possible. The ‘emptiness’ of the personal anecdote I think is important because it demonstrates the messaging challenge faced by someone in this position. If the torches and pitchforks are out in ‘this’ community, imagine how the general public would react.
“I have an idea that makes the world a worse place. I could potentially profit somewhat personally by bringing it to life, but this would be unethical. How badly do I need the money?” is, in my opinion, probably a fairly common thought in many fields. Ethics can sometimes be expensive, and the prevailing morality, at least in the USA, is ‘just make the money’. Fortunately, in my own case, I do not have visions of large sums of money or prestige on the other side of disclosure, so I am not being tempted very strongly.
Farmers are regularly paid not to grow certain crops, and this makes economic sense somehow. How could someone in my position be incentivized to avoid disclosure of harmful ideas, without requiring that disclosure?
Arguably, an alternative to dealing with the social opprobrium of making a pitch like mine would be to rationalize disclosure: argue that the idea is not harmful but is in some way helpful, say that people who say otherwise have flawed arguments, and attempt to maximize profit while minimizing the harms to myself and my own community.
Like an award-winning pornographer who makes a strenuous effort to keep his children and family away from his work.
There are plenty of ideas which can be used for good or ill. (Disrupting the messaging system of (viruses/bacteria) that they use to coordinate attacks on the host once they’ve built up a sufficiently large population sounds obviously good—until you ask ‘once the population gets high enough, won’t they manage to coordinate a larger scale attack even if you’re trying to disrupt their signals?’)
The sense in which something can be used only for ill is harder to pin down. (Using machinery for creating viruses/etc. to create a deadly virus, and then launching a bio-attack on people with said virus, qualifies as only evil.) Perhaps specificity is the key: is there a way the idea can be generalized to some good use (especially one which outweighs the risk)?
Unfortunately, you really nailed the issue. Out of an abundance of caution, I won’t use your violent analogy of a bio-weapon here, as that could be construed as furthering the ‘blackmail’ misinterpretation of my writing.
To use the analogy I added to the OP, there may in theory be good reasons to market things to vulnerable populations (like children), and there may in theory be good reasons to study nicotine marketing (market less harmful products to existing users), but someone with knowledge of both fields who realizes something like ‘by synthesizing existing work on nicotine marketing with existing work on marketing things to children, I have identified a magic formula that will double the number of smokers in the next generation’ has discovered a dangerous idea.
If, for example, this person is employed at a marketing agency that took work from a client who sells nicotine products, his manager will make a strong appeal to his selfishness (‘so what have you been working on?’).
As altruists, we would like that idea to remain unknown. How do we, as altruists, appeal to that person’s selfishness without demanding disclosure to some entity that promises not to actually do anything with the idea?
The Unabomber had a proposed solution to this problem—people he judged to be producing ideas that were harmful to whatever it was that he cared about received bombs in the mail, thus appealing to engineers’ desire to not get hurt in bombings. I understand that there is a country in the middle east which has historically taken the same approach.
Perhaps I should view the ‘delete this’ command in the most upvoted comment on this thread, and its suggestion that I was violating a social norm often punished by violent men (posting a threat in a public forum = bad decision, wut wut), as an endorsement of that ‘negative reinforcement’ strategy by this community?
Only socially I imagine—via Downvotes yes, bombs no.
I’d guess it’s mostly about the belief that blackmail was involved, but there’s only one way to test that.
I imagine people react differently to “my work has bad incentives in place; it’s a shame I’m not paid for not doing X” than to “I’m looking for a job which doesn’t encourage/involve doing bad things.” (Yes, people demand ‘altruism’ of others.)
The question is, can this be reversed? Can a formula for reducing the number of smokers be devised instead? Or is the thing you describe just the reverse of this (work on how to reduce harm turned into work on how to increase harm)?
To use the zombie-words example I raised in a previous comment:
Imagine a “human shellcode compiler”, which requires a large amount of processing power and can generate a phrase that a human who hears it will instantly obey, and no countermeasures are available other than ‘not hearing the phrase’. Theoretically, this could have good applications if very carefully controlled (“stop using heroin!”).
Imagine someone runs this to make a command like ‘devour all the living human flesh you can find’. The compiler is salvageable, this particular compiled command is not.
I believe my idea to be closer to the second example than the first, though not nearly to the same level of harm. Based on the qualia computing post linked elsewhere, my most ethical option is ‘be quiet about this one and hope I find a better idea to sell’.
I think this is a really unfair reading of this post. Maybe it has been edited since it was originally posted in ways that change its tone (my understanding is that some editing has happened), but my impression is that the author is asking about economic incentives that would keep them, or someone like them, quiet rather than about blackmail. If the author wanted to blackmail us, they could have made a very different kind of post.
Thank you for this, I believe you have described my intent accurately.
To clarify, everything before ‘per request from the comments...’ was the original post.
Here are some good thoughts on dealing with dangerous ideas:
https://qualiacomputing.com/2019/08/30/why-care-about-meme-hazards-and-thoughts-on-how-to-handle-them/
Thank you for this, it is the type of response I was looking for, and I now have a new blog to read regularly.
I don’t think we have a general solution to this problem, but it would be great if we could find one, because I believe it to be equivalent to solving the coordination problems we face in avoiding unsafe superintelligent AI (assuming you believe superintelligent AI is unsafe by default, as I do). You might find some insight into mechanisms people have considered by looking at posts here about AI policy related to avoiding and preventing the development of unsafe AI.
Thank you! I was hoping that someone was aware of some clever solution to this problem.
I believe that AI is at least as inherently unsafe as HI, ‘Human Intelligence’. I do think that our track record with managing the dangers of HI is pretty good, in that we are still here, which gives me hope for AI safety.
I wonder how humans would react to a superintelligent AI stating ‘I have developed an idea harmful to humans, and I am incentivized to publicize that idea. I don’t want to do harm to humans, can you please take a look at my incentives and tell me if my read of them is correct? I’ll stick with inaction until the analysis is complete.’
Is that a best-case scenario for a friendly superintelligence?
I think I have a general solution. It requires altruism, but at the social rather than the individual level.
This concept of a ‘reverse patent office’ is based on the idea that if a toxic meme emerges, it would have been more harmful had it emerged earlier, and that any delay is good.
The reverse patent office accepts encrypted submissions from idea generators, and observes new ideas in the world.
When the reverse patent office observes a harmful idea in existence in the world, and assesses it as worth ‘having been delayed’ based on pre-existing criteria, submitters who previously conceived of the toxic meme submit decryption instructions. Payout is based on the amount of time that has passed since initial receipt of the encrypted submission, using a function that compounds interest as time passes (thus not creating an incentive to ‘harvest’ a harmful idea by disclosing it after submission).
Does this solution work? Who would fund such an organization? What would be its criteria?
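The mechanism above can be sketched in code. This is a toy model, not a serious design: the office, the payout parameters, and the use of a plain SHA-256 commitment are all illustrative assumptions I've chosen to make the incentive structure concrete (a real scheme would need trusted timestamping and an escrowed funding source).

```python
import hashlib

class ReversePatentOffice:
    """Toy model of the 'reverse patent office': submitters commit to an
    idea via a hash, so the office never learns the idea itself. If the
    idea later surfaces independently in the world, a submitter reveals
    the plaintext to prove prior conception, and is paid an amount that
    compounds with how long the idea stayed secret after submission."""

    def __init__(self, base_payout=100.0, annual_rate=0.05):
        # base_payout and annual_rate are illustrative parameters
        self.base_payout = base_payout
        self.annual_rate = annual_rate
        self.commitments = {}  # digest -> timestamp of first submission

    def submit(self, idea_text, timestamp):
        # Store only the hash; keep the earliest timestamp for a digest
        digest = hashlib.sha256(idea_text.encode()).hexdigest()
        self.commitments.setdefault(digest, timestamp)
        return digest

    def payout(self, idea_text, observed_time):
        # Called once the idea is observed in the wild and a submitter
        # reveals the plaintext. Because the payout grows the longer the
        # idea stayed unpublished, early disclosure is never the
        # profit-maximizing move for the submitter.
        digest = hashlib.sha256(idea_text.encode()).hexdigest()
        if digest not in self.commitments:
            return None  # no prior commitment on record
        years = (observed_time - self.commitments[digest]) / (365 * 24 * 3600)
        return self.base_payout * (1 + self.annual_rate) ** years
```

Under these assumed parameters, 40 years of silence multiplies the base payout by 1.05^40, roughly a factor of seven, while disclosing immediately after submission pays out almost nothing above the base.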
I think a serious complication arises in this scenario:
I discover a dangerous idea at age 20, and get a reverse patent on it as described here.
At age 60 I learn I am terminally ill and don’t believe there is any mechanism by which my existence can be carried forward (e.g. cryonics). I am given 1 year to live.
I make the dangerous idea public and collect a large sum for having kept quiet for years, so I can enjoy my last year of life, even if the world doesn’t continue much beyond that because it’s destroyed by my dangerous idea.
Maybe some sort of prohibition on collecting if it can be proven that you chose to publicize it would be a good idea.
I argue that in your scenario, the reverse patent is unambiguously a good thing, as it bought us 40 years.
The same ugly incentives apply to pollution (‘I can get money now, or leave an unpolluted world to posterity’); you don’t need too many people to think that way to get some severe harm.