But, in this case, would “thinking like an AI programmer” really help you answer the question of what we mean by “Friendly”? Of course, once we do get an answer, we’d need to implement it, which is where thinking like an AI programmer (or actually being one) would come in handy. But I think that’s also an engineering challenge at that point.
If you don’t think like an AI programmer, you will be tempted to use concepts without understanding them well enough to program them. I don’t think that’s reduced to the level of ‘engineering challenge.’
Are you saying that it’s impossible to correctly answer the question “what does ‘friendly’ mean?” without understanding how to implement the answer by writing a computer program? If so, why do you think that?
Edit: added “correctly” in the sentence above, because it’s trivially possible to just answer “bananas!” or something :-)
I don’t think the division is so sharp as all that. Rather, what Vanvier is getting at, I think, is that one is capable of correctly and usefully answering the question “What does ‘Friendly’ mean?” in proportion to one’s ability to reason algorithmically about subproblems of Friendliness.
I see, so you’re saying that a philosopher who is not familiar with AI might come up with all kinds of philosophically valid definitions of friendliness, which would still be impossible to implement (using a reasonable amount of space and time) and thus completely useless in practice. That makes sense. And (presumably) if we assume that humans are kind of similar to AIs, then the AI-savvy philosopher’s ideas would have immediate applications, as well.
So, that makes sense, but I’m not aware of any philosophers who have actually followed this recipe. It seems like at least a few such philosophers should exist, though… do they?
[P]hilosophically valid definitions of friendliness, which would still be impossible to implement (using a reasonable amount of space and time) and thus completely useless in practice.
Yes, or more sneakily, impossible to implement due to a hidden reliance on human techniques for which there is as-yet no known algorithmic implementation.
Programmers like to say “You don’t truly understand how to perform a task until you can teach a computer to do it for you”. A computer, or any other sort of rigid mathematical mechanism, is unable to make the ‘common sense’ connections that a human mind can make. We humans are so good at that sort of thing that we often make many such leaps in quick succession without even noticing!
Implementing an idea on a computer forces us to slow down and understand every step, even the ones we make subconsciously. Otherwise the implementation simply won’t work. One doesn’t get as thorough a check when explaining things to another human.
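To make that concrete, here is a minimal, purely hypothetical Python sketch (none of the names below come from any real proposal): a “definition” of friendliness that looks complete on paper but quietly leans on an undefined notion of harm, which is exactly the kind of gap that only shows up when you try to execute it.

    # Hypothetical sketch only: it illustrates a hidden reliance on a human
    # technique, not any actual FAI design.
    def is_friendly(action, affected_people):
        """An action is 'friendly' if it harms nobody it affects."""
        return all(not harms(action, person) for person in affected_people)

    def harms(action, person):
        # The hidden step: 'harm' silently appeals to human common sense.
        # A human reader nods along here; a computer stops dead.
        raise NotImplementedError("no algorithmic account of 'harm' yet")

Reading is_friendly, one might feel the definition is finished; only the attempt to fill in harms reveals how much work intuition was doing.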
Philosophy in general is enriched by an understanding of math and computation, because it provides a good external view of the situation. This effect is of course only magnified when the philosopher is specifically thinking about how to represent human mental processes (such as volition) in a computational way.
I agree with most of what you said, except for this:
Yes, or more sneakily, impossible to implement due to a hidden reliance on human techniques for which there is as-yet no known algorithmic implementation.
Firstly, this is an argument for studying “human techniques”, and devising algorithmic implementations, and not an argument for abandoning these techniques. Assuming the techniques are demonstrated to work reliably, of course.
Secondly, if we assume that uploading is possible, this problem can be hacked around by incorporating an uploaded human into the solution.
Firstly, this is an argument for studying “human techniques”, and devising algorithmic implementations, and not an argument for abandoning these techniques.
Indeed, I should have been more specific; not all processes used in AI need to be analogous to humans, of course. All I meant was that it is very easy, when trying to provide a complete spec of a human process, to accidentally lean on other human mental processes that seem, at zeroth glance, to be “obvious”. It’s hard to spot those mistakes without an outside view.
Secondly, if we assume that uploading is possible, this problem can be hacked around by incorporating an uploaded human into the solution.
To a degree, though I suspect that even in an uploaded mind it would be tricky to isolate and copy-out individual techniques, since they’re all likely to be non-locally-cohesive and heavily interdependent.
It’s hard to spot those mistakes without an outside view.
Right, that makes sense.
To a degree, though I suspect that even in an uploaded mind it would be tricky to isolate and copy-out individual techniques...
True, but I wasn’t thinking of using an uploaded mind to extract and study those ideas, but simply to plug the mind into your overall architecture and treat it like a black box that gives you the right answers, somehow. It’s a poor solution, but it’s better than nothing—assuming that the Singularity is imminent and we’re all about to be nano-recycled into quantum computronium, unless we manage to turn the AI into an FAI in the next 72 hours.
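As a rough, purely illustrative sketch of what “plug the mind in as a black box” might look like (every name below is invented for this example, not taken from any real architecture):

    # Hypothetical sketch: the uploaded mind is used as an opaque oracle for
    # value judgements, rather than being reverse-engineered.
    class UploadedHumanOracle:
        def endorses(self, outcome_description):
            # Black box: internally this would run the emulated mind; the rest
            # of the system relies only on its input/output behaviour.
            raise NotImplementedError("stands in for the running upload")

    class Planner:
        def __init__(self, oracle):
            self.oracle = oracle

        def choose(self, candidate_outcomes):
            # Keep only the outcomes the black-box human signs off on.
            return [o for o in candidate_outcomes if self.oracle.endorses(o)]

The upload’s internals never need to be understood here, only consulted, which is exactly why it is a stopgap rather than a solution.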
Endorsed.