Without evidence that their approach is right, for me it’s like investing in alchemy to get gold.
If your goal is to get gold, and not just to do alchemy, then upon discovering that alchemy is stupid you turn to different angles of attack. You don’t need to know whether SIAI’s current approach is right; you only need to know whether there are capable people working on the problem there, who really want to solve it rather than just create the appearance of solving it, and who won’t get bogged down in the pursuit of lost causes. Ensuring the latter is of course a legitimate concern.
Vladimir is right, but also I didn’t necessarily mean giving to SIAI. If you think they’re irretrievably doing it wrong, start your own effort.
A quote explaining why I don’t do that either:
The three outstanding problems in physics, in a certain sense, were never worked on while I was at Bell Labs. By important I mean guaranteed a Nobel Prize and any sum of money you want to mention. We didn’t work on (1) time travel, (2) teleportation, and (3) antigravity. They are not important problems because we do not have an attack. It’s not the consequence that makes a problem important, it is that you have a reasonable attack.

-- Richard Hamming, “You and Your Research”
For now, a valid “attack” on Friendly AI is to actually research the question, given that it wasn’t seriously thought about before. For time travel or antigravity, we don’t merely lack an attack; we have a pretty good idea of why they won’t be possible to implement now or ever, and the world won’t end if we never develop them. For Friendly AI, there is no such clarity or security.
I want to ask “how much thought have you given it, to be confident that you don’t have an attack?”, but I’m guessing you’ll say that the outside view says you don’t and that’s that.
I didn’t mean to say no attack existed, only that I don’t have one ready. I can program okay and have spent enough time reading about AGI to see how the field is floundering.
I’ve grown out of seeing FAI as an AI problem, at least at the conceptual stage, where very important parts are still missing, like what exactly we are trying to do. If you see it as a math problem, the particular excuse that the AGI field is crackpot-ridden, the AI field stagnant, and machine learning shows no impending promise of crossing over into AGI ceases to apply, just as the failed, overconfident predictions of past AI researchers are not evidence that AI won’t be developed in two hundred years.
How is FAI a math problem? I never got that either.
In the same sense that AIXI is a mathematical formulation of a solution to the AGI problem, we don’t yet have a good idea of what FAI is supposed to be. As a working problem statement, I’m thinking of how to define “preference” for a given program (as a formal term), with this program representing an agent that imperfectly implements that preference; a human upload, for example, could be such a program. This “preference” needs to define criteria for decision-making about the unknown-physics real world from within a (temporary) computer environment with known semantics, in the same sense that a human could learn what could/should be done in the real world while remaining inside a computer simulation, with an I/O channel for interacting with the outside but no prior knowledge of the physical laws.
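To make the contrast concrete, here is roughly what “a mathematical formulation of a solution to the AGI problem” looks like in AIXI’s case. This is a sketch of Hutter’s definition, not something stated in this thread; the symbols U, q, ℓ and m come from his formulation, and details (discounting, the exact horizon treatment) are omitted:

a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m} \big[ r_k + \cdots + r_m \big] \sum_{q \,:\, U(q, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}

where U is a universal monotone Turing machine, q ranges over environment programs, ℓ(q) is the length of q, and m is the horizon. The point of the analogy is that nothing of comparable precision currently pins down “preference”, which is why FAI at this stage reads as a math problem rather than an engineering one.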
I’m gradually writing up the idea of this direction of research on my blog. It’s vague, but there is some hope that it can put people into a more constructive state of mind about how to approach FAI.
Thanks (and upvoted) for the link to your blog posts about preference. They are some of the best pieces of writing I’ve seen on the topic. Why not post them (or the rest of the sequence) on Less Wrong? I’m pretty sure you’ll get a bigger audience and more feedback that way.
Thanks. I’ll probably post a link when I finish the current sequence—by current plan, it’s 5-7 posts to go. As is, I think this material is off-topic for Less Wrong and shouldn’t be posted here directly/in detail. If we had a transhumanist/singularitarian subreddit, it would be more appropriate.
What you are saying in the last sentence is that you estimate that there is unlikely to be an attack for some time, which is a much stronger statement than “only that I don’t have one ready”, and is actually a probabilistic statement that no attack exists (“I didn’t mean to say no attack existed”). This statement feeds into the estimate that the marginal value of investing in the search for such an attack is very low at this time.
That seems to diminish the relevance of Hamming’s quote, since the problems he names are all ones where we have good reason to believe an attack doesn’t exist.
How long have you thought about it, to reach your confidence that you don’t have an attack?