I have a similar background (although probably less far along with it) and would certainly find working in AGI to be by far the most meaningful and motivating thing possible, but I’m somewhat more sceptical that I’m the very best candidate to do something so unimaginably important and sensitive. I mean, it’s a **ing FAI, not a silly little nuke or genome or other toy like that.
So work on FAI but don’t actually run one. Get people who you think are more qualified to check your work if you make useful progress.
Or make your job checking other people’s work for errors.
The problem appears to be that no one has a clue how to work on FAI and make sure that it actually is FAI. If someone made what they thought was FAI and it wasn’t actually FAI, how could you tell until it was too late?
How can you tell that a theorem is correct? By having a convincing proof, or perhaps other arguments to the effect of the theorem being correct, and being trained to accept correct proofs (arguments) and not incorrect ones.
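To make the theorem analogy concrete, here is a minimal machine-checked example (my own toy illustration, not anything from this thread): a proof assistant such as Lean only accepts a theorem when the proof actually checks, which is the mechanical version of being trained to accept correct proofs and reject incorrect ones.

```lean
-- Toy illustration: Lean accepts this theorem only because the proof term
-- `Nat.add_comm a b` really does establish `a + b = b + a`; a bogus proof
-- would be rejected outright.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```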
Even though we currently have no idea how to construct an FAI, or what an argument for a particular design being an FAI might look like, we can still refuse to believe that something is an FAI when we are not given sufficient grounds for believing it is one, provided the people who make such decisions are skilled enough to accept correct arguments but not incorrect ones.
The rule to follow is to not actually build or run things that you don’t know to be good. By default, they don’t even work, and if they do, they are not good, because value is complex and won’t be captured by chance. There need to be strong grounds for believing that a particular design will work, and here rationality training is essential, because by default people follow one of the many kinds of crazy reasoning about these things.
See! You’ve found a problem to work on already! :)
[The downvote on your comment isn’t mine btw.]
I read a science fiction story where they made a self-sustaining space station, placed on it a colony of the scientists and engineers needed to run it, and then sealed it off with no connection to the outside world. Then they modified all the computer files to make it appear as though the humans had evolved on the station and there was nothing else but the station. Then they woke up the AI and started stress-testing it by attacking it in non-harmful ways.
It was an interesting story, though I’m not sure how useful it would be in real life. The AI actually manages to figure out that there is stuff outside the station, and they are only saved because it creates its own moral code in which killing is wrong. That was a very convenient plot point, so I wouldn’t trust it in real life.
How do you know that a person is really friendly? You use methods that have worked in the past and look for the manipulative techniques that misleadingly friendly people use to make you think they are friendly. We know that someone is friendly via the same methodology by which we determine what it means to be friendly: subjective benefits (emotional support, etc.) and goal assistance (helping you move when they could simply refuse to) without malicious motives that ultimately work against you.
In the case of FAI we want more surety, and we can presumably get this via simulation and proofs of correctness. I would assume that even after we had a proof of correctness for a meta-ethical system, we would still want to run it through as many virtual scenarios as possible. The human brain is simply not capable of the chains of reasoning within that meta-ethics that the machine would be, so we would want to introduce it to scenarios that are as complex as possible in order to determine that its behaviour fits our intuition of friendliness.
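As a rough illustration of the shape of that kind of scenario testing (everything here is hypothetical and invented for this comment, not any actual FAI test harness), a minimal sketch: generate many random scenarios and record every one in which a candidate decision policy picks an action that violates a stated constraint.

```python
import random

def candidate_policy(scenario):
    """Hypothetical policy under test: greedily picks the highest-payoff option."""
    return max(scenario["options"], key=lambda opt: opt["payoff"])

def violates_constraint(action):
    """Toy stand-in for 'unfriendly' behaviour: any option flagged as harmful."""
    return action["harmful"]

def stress_test(policy, n_scenarios=10_000, seed=0):
    """Run the policy over many generated scenarios and collect every violation."""
    rng = random.Random(seed)
    failures = []
    for i in range(n_scenarios):
        scenario = {
            "options": [
                {"payoff": rng.random(), "harmful": rng.random() < 0.1}
                for _ in range(5)
            ]
        }
        if violates_constraint(policy(scenario)):
            failures.append(i)
    return failures

if __name__ == "__main__":
    failures = stress_test(candidate_policy)
    print(f"{len(failures)} of 10000 scenarios produced a constraint violation")
```

A real check would obviously need far more than random sampling and a one-line constraint; the point is only the loop: many scenarios, an explicit notion of violation, and a record of every failure.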
It seems to me that the bulk of the work is in identifying the most friendly meta-ethical architecture. The Lokhorst paper lukeprog posted a while ago clarified a few things for me, though I have no access to the cutting-edge work on FAI (save for what leaks out into the blog posts). Judging by what Will Newsome has said in the past (I cannot find the post), they have compiled a relatively large list of possibly relevant sub-problems that I would be very interested to see (even if many of them are likely to be time drains).
Hmm, it seems like other researchers could at least assess you. Besides, it isn’t as though there aren’t always sub-problems for those who feel less capable. You don’t have to be the head researcher to do useful things related to a large project; you can have someone who has established themselves direct you until you get a sense of what you are capable of.
Thanks, and yeah, that MIGHT work… But so far everything Eliezer has said indicates the opposite, and argues it very well.
I’m mostly hoping for some VERY far-removed sub-sub-problem, maybe.
Hmm. I’ve read most of the sequences and some of his monograph on FAI, but I don’t recall him explicitly arguing against dividing up the work into sub-problems. Intuitively, it seems that if you trust person X to cautiously do FAI, then you should trust them to pick out sub-problems and to determine when those have been solved satisfactorily.
Could you point me to the relevant links?
Also, I might be terribly mistaken here, but it seems like not every component of the AGI puzzle need be tied directly to FAI, at least in the development phase. Each one must be fully understood and integrated into an FAI, but I don’t see why, say, I need to be as careful when designing a concept model of an ontology engine that can combine and generalize concepts, at least until I want to integrate it into the FAI. At that point the FAI people could review the best attempts at the various components of AGI, figure out whether any of them are acceptable, and then work out how to integrate them into an FAI framework.
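For what I mean by an engine that can ‘combine and generalize concepts’, a throwaway sketch (purely my own toy model, not any real AGI component): treat a concept as a bag of attributes, so generalizing keeps what two concepts share and combining merges everything they have.

```python
# Toy concept model: a concept is just a set of attribute strings.

def generalize(concept_a: set, concept_b: set) -> set:
    """Keep only the attributes the two concepts share."""
    return concept_a & concept_b

def combine(concept_a: set, concept_b: set) -> set:
    """Blend two concepts by merging all of their attributes."""
    return concept_a | concept_b

if __name__ == "__main__":
    sparrow = {"animal", "lays_eggs", "flies", "small"}
    penguin = {"animal", "lays_eggs", "swims", "medium"}
    print(generalize(sparrow, penguin))  # {'animal', 'lays_eggs'}
    print(combine(sparrow, penguin))
```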
I guess so, but I distinctly remember some writing Eliezer did that gave the strong impression that if your IQ was below 9000 you shouldn’t try to do anything but give the SIAI money. I don’t remember from where, though, and it certainly sounds weird, so maybe my memory just messed up.
Going solely on what Eliezer has said about ‘exceeding your role models’, I would take that with a grain of salt. I’ve never met Eliezer, but although his writings and level of achievement (which are impressive) make him come off as extremely intelligent, he still does not come off to me as, say, Von Neumann intelligent.
Eliezer’s writings have clarified my thoughts a great deal and given me a stronger sense of purpose. He is a very intelligent researcher and a gifted explainer and evangelist, but I don’t take his word as Gospel; I take it as generally very good advice.
Wouldn’t dispute that.
I seem to recall you saying as much—at one time as not quite like a thousand-year-old vampire, and at another as not ‘glittery’. It only occurs to me now that that combination makes Jaynes a thousand-year-old Twilight vampire. Somehow that takes some of the impressiveness out of the metaphor, Luminosity revamp (cough) or not!
I imagine you are probably thinking of something like this.
Yes! In fact, I’m pretty sure it’s not just something like that, but that exact specific page.
I suspect but am not certain that you’re thinking of “So You Want To Be A Seed AI Programmer”. I also suspect that the document is at least partially out of date, however.
Yup, correct!
Sounds like he might have been talking about this.
Or possibly about this.
Umm, the 9000 thing was my interpretation. I don’t think the article I’m talking about even explicitly mentioned IQ.