I have read what you wrote above carefully, but I won’t reply line-by-line because I think it will be clearer not to.
When it comes to finding a concise summary of my claims, I think we do indeed need to be careful with blanket terms like “superintelligent” or “superclever” or “superwise” … but we should only avoid them IF they are used with the implication that they have a precise (perhaps technically precise) meaning. I do not believe they have a precise meaning. But I do use the term “superintelligent” a lot anyway. I do that because I only use it as an overview word—it is just supposed to be a loose category that covers a bunch of more specific issues. What I really want to convey are those particular issues—the particular ways in which the intelligence of the AI might be less than adequate, for example.
That only matters if we find ourselves debating whether it might be clever, wise, or intelligent … I wouldn’t want to get dragged into that debate, because I only really care about specifics.
For example: does the AI make a habit of forming plans that massively violate all of its background knowledge about the goal that drove the plan? If it did, it would (1) take the baby out to the compost heap when what it intended to do was respond to the postal-chess game it is engaged in, or (2) cook the eggs by going out to the workshop and making a cross-cutting jig for the table saw, or (3) … and so on. If we decided that the AI was indeed prone to errors like that, I wouldn’t mind if someone diagnosed a lack of ‘intelligence’ or a lack of ‘wisdom’ or a lack of … whatever. I merely claim that in that circumstance we have evidence that the AI hasn’t got what it takes to impose its will on a paper bag, never mind exterminate humanity.
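To make that failure mode concrete, here is a minimal toy sketch in Python of the kind of check whose absence the scenario presupposes. Every name in it is invented for the illustration (Plan, Goal, violates_background, and so on), and it assumes, purely for the sake of the example, that plans carry predicted effects and goals carry background knowledge. It is a caricature, not a proposal for a real architecture:

```python
# Toy sketch (all names hypothetical): a planner that refuses any plan
# whose predicted effects contradict the background knowledge attached
# to the goal that produced it.
from dataclasses import dataclass, field

@dataclass
class Plan:
    description: str
    predicted_effects: set   # what the (assumed) world model says will happen

@dataclass
class Goal:
    name: str
    forbidden_effects: set = field(default_factory=set)  # background knowledge

def violates_background(plan: Plan, goal: Goal) -> bool:
    """True if the plan tramples what is already known about the goal,
    e.g. relocating the baby when the goal was a postal-chess reply."""
    return bool(plan.predicted_effects & goal.forbidden_effects)

def choose_plan(goal: Goal, candidates: list) -> Plan:
    sane = [p for p in candidates if not violates_background(p, goal)]
    if not sane:
        raise RuntimeError(f"no plan is consistent with goal {goal.name!r}")
    return sane[0]  # toy: any consistent candidate will do

if __name__ == "__main__":
    chess = Goal("respond to the postal-chess game",
                 forbidden_effects={"baby relocated", "table saw modified"})
    candidates = [
        Plan("take the baby out to the compost heap", {"baby relocated"}),
        Plan("write the move on a postcard and mail it", {"postcard mailed"}),
    ]
    print(choose_plan(chess, candidates).description)
    # -> write the move on a postcard and mail it
```

An AI that routinely fails even this crude filter is the AI I am calling ‘dumb’.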
Now, my attacks on the scenarios have to do with a bunch of implications for what the AI (the hypothetical AI) would actually do. And it is that ‘bunch’ that I think adds up to evidence for what I would summarize as ‘dumbness’.
And, in fact, I usually go further than that and say that if someone tried to get near to an AI design like that, the problems would arise early on, and the AI itself (inasmuch as it could do anything smart at all) would be involved in the efforts to suggest improvements. This is where we get the suggestions in your item 2, about the AI ‘recognizing’ misalignments.
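To be equally concrete about ‘recognizing’: reusing the hypothetical Plan and Goal types from the sketch above, a development-time version of the same check would simply report conflicts to the developers instead of acting on them, which is all the ‘recognition’ my argument needs. Again, a toy sketch under the same assumptions, not a design:

```python
# Toy sketch, continued (same hypothetical types as above): during
# development the system surfaces goal/knowledge conflicts rather than
# silently executing the offending plan.
def review_plan(plan: Plan, goal: Goal) -> str:
    conflicts = plan.predicted_effects & goal.forbidden_effects
    if conflicts:
        return (f"FLAG FOR DEVELOPERS: plan {plan.description!r} conflicts "
                f"with background knowledge about goal {goal.name!r}: "
                f"{sorted(conflicts)}")
    return f"plan {plan.description!r} passes review"
```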
I suspect that on this score a new paper is required, to carefully examine the whole issue in more depth. In fact, a book.
I have now decided that that has to happen.
So perhaps it is best to put the discussion on hold until a seriously detailed technical book comes out of me? At any rate, that is my plan.
That seems like a solid approach. I do suggest that you try to look deeply into whether or not it’s possible to partially solve the problem of understanding goals, as I put it above, and that you make your account of why that is or isn’t possible (or likely) long and detailed. As you point out, that probably requires book-length attention.