I enjoyed this and thank you for writing it. Ultimately, the only real reason to do this is for your own enjoyment or perhaps that of friends (and random people on the internet).
Non-signatories to the NPT (Israel, India, Pakistan) were able to and did develop nuclear weapons without being subject to military action. By contrast (and very much contrary to international law), Yudkowsky proposes that non-signatories to his treaty be subject to bombardment.
It is not a well-thought-out exception. If this proposal were meant to be taken seriously, it would make enforcement exponentially harder and set up an overhang situation in which AI capabilities would increase further in a limited domain and be less likely to be interpretable.
The use of violence in response to violations of the NPT has been fairly limited and highly questionable under international law. And, in fact, calls for such violence are very much frowned upon out of fear that they tend to lead to full-scale war.
No one has ever seriously suggested violence as a response to a potential violation of the various other nuclear arms control treaties.
No one has ever seriously suggested running a risk of nuclear exchange to prevent a potential treaty violation. So what Yudkowsky is suggesting is very different from how treaty violations are usually handled.
Given Yudkowsky’s view that the continued development of AI has an essentially 100% probability of killing all human beings, his view makes total sense—but he is explicitly advocating for violence up to and including acts of war. (His objections to individual violence mostly appear to relate to such violence being ineffective.)
I would think you could force the AI not to notice that the world was round by essentially inputting this as an overriding truth. And if that was actually and exactly what you cared about, you would be fine. But if what you cared about was any corollary or result of the world being round, or of the world being some sort of curved polygon, it wouldn't save you.
To take the Paul Tibbets analogy: you told him not to murder and he didn't murder; but what you wanted was for him not to kill, and in most systems, including the one he grew up in, killings of the enemy in war are not murder.
This may say more about the limits of the analogy than anything else, but in essence you might be able to tell the AI it can't deceive you, yet it will be bound exactly by the definition of deception you provide and will freely deceive you in any way you didn't think of.
-
Other planets have more mass, higher insolation, lower gravity, lower temperature, and/or rings and more (mass in) moons. I can think of reasons why any of those might be more or less desirable than the characteristics of Earth. It is also possible that the AI may determine it is better off not to be on a planet at all. In addition, in a non-foom scenario, for defensive or conflict-avoidance reasons the AI may wind up leaving Earth, and once it does so may choose not to return.
-
That depends a lot on how it views the probe. In particular, by doing this, is it setting up a more dangerous competitor than humanity or not? Does it regard the probe as self? Has it solved the alignment problem, and how good does it think its solution is?
-
No. Humans aren’t going to be the best solution. The question is whether they will be good enough that it would be a better use of resources to continue using the humans and focus on other issues.
-
It’s definitely possible that it will discover extra reasons to process Earth (or destroy the humans even if it doesn’t process Earth).
-
This is just wrong. Avoiding processing Earth doesn't require that the AI care for us. Other possibilities include:
(1) Earth is not worth it; the AI determines that getting off Earth fast is better;
(2) AI determines that it is unsure that it can process Earth without unacceptable risk to itself;
(3) AI determines that humans are actually useful to it one way or another;
(4) Other possibilities that a super-intelligent AI can think of but we can't.
There are, of necessity, a fair number of assumptions in the arguments he makes. Similarly, counter-arguments to his views also make a fair number of assumptions. Given that we are talking about something that has never happened and which could happen in a number of different ways, this is inevitable.
What makes monkeys intelligent in your view?
This is an interesting question on which I’ve gone back and forth. I think ultimately, the inability to recognize blatant inconsistencies or to reason at all means that LLMs so far are not intelligent. (Or at least not more intelligent than a parrot.)
Bing Chat is not intelligent. It doesn't really have a character. (And given the number of post-GPT changes, whether one calls it GPT-4 or not doesn't seem very meaningful.)
But to the extent that people think one or more of the above things are true, it will tend to increase skepticism of AI and support for taking more care in deploying it and for regulating it, all of which seem positive.
“An exception is made for jobs that fail to reach their employment due to some clearly identifiable non-software-related shock or change in trends, such as an economic crisis or a war. Such jobs will be removed from the list before computing the fraction.”
But macroeconomic or geopolitical events such as a major recession or war are likely to affect all job categories. So the correct way to deal with this is not to remove such jobs but to adjust the fraction by the change in overall employment.
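For concreteness, here is a minimal sketch of that adjustment in Python. The function name, the 50% threshold for counting a job category as displaced, and the toy data are all illustrative assumptions on my part, not definitions from the proposal.

```python
# Minimal sketch of the adjustment suggested above. The function name, the
# 50% "displaced" threshold, and the data format are illustrative
# assumptions, not definitions from the original proposal.

def adjusted_displaced_fraction(before: dict[str, float],
                                after: dict[str, float],
                                threshold: float = 0.5) -> float:
    """Fraction of listed job categories whose employment fell below
    `threshold` of its prior level, after normalizing by the change in
    overall employment, so that an economy-wide shock (recession, war)
    does not by itself push categories over the line."""
    overall_ratio = sum(after.values()) / sum(before.values())
    displaced = sum(
        1 for job, prior in before.items()
        # Compare each category's change against the overall trend instead
        # of removing shocked categories from the list.
        if (after.get(job, 0.0) / prior) / overall_ratio < threshold
    )
    return displaced / len(before)

# Example: a shock cuts most categories by roughly 20%, but only paralegal
# employment collapses relative to the overall trend, so the fraction is 1/3.
before = {"paralegal": 100.0, "truck driver": 100.0, "nurse": 100.0}
after = {"paralegal": 30.0, "truck driver": 80.0, "nurse": 82.0}
print(adjusted_displaced_fraction(before, after))  # 0.333...
```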
Shortening Timelines: There’s No Buffer Anymore
There already exist communication mechanisms more nuanced than signing a petition. You can call or write/email your legislator with more nuanced views. The barrier is not the effort to communicate (which under this proposal might be slightly lower) but the effort to evaluate the issue and come up with a nuanced position.
If the risk from AGI is significant (and whether you think p(doom) is 1% or 10% or 100%, it is unequivocally significant) and imminent (and whether your timelines are 3 years or 30 years, it is pretty imminent), the problem is that an institution as small as MIRI is a significant part of the efforts to mitigate this risk, not whether or not MIRI gave up.
(I recognize that some of the interest in MIRI is the result of having a relatively small community of people focused on the AGI x-risk problem and the early prominence in that community of a couple of individuals, but that really is just a restatement of the problem).
I appreciate that you have defined what you mean when you say AGI. One problem with a lot of timeline work, especially now, is that AGI is not always defined.
That isn’t very comforting. To extend the analogy: there was a period, when humans were relatively less powerful, during which they would trade with some other animals such as wolves/dogs. Later, when humans became more powerful, that stopped.
It is likely that the powers of AGI will increase relatively quickly, so even if you conclude there is a period when AGI will trade with humans that doesn’t help us that much.
I was thinking along similar lines. I note that someone with amnesia probably remains generally intelligent, so I am not sure continuous learning is really required.
I was aware of a couple of these, but most are new to me. Obviously, published papers (even if this is comprehensive) represent only a fraction of what is happening and, likely, are somewhat behind the curve.
And it’s still fairly surprising how much of this there is.
He specifically told me when I asked this question that his views were the same as Geoff Hinton’s and Scott Aaronson’s, and neither of them holds the view that smarter-than-human AI poses zero threat to humanity.