A lot of it would depend on how hard it was to make safe AI.
I think the first thing I would do would be to try to look at intelligence to get a better idea of the problems involved and the likely time scales. Get the brightest, best, and most trustworthy people to discuss it in secret, to work out whether intelligence is neat or messy and to find a way of pursuing the research scientifically*. Once a path to creating intelligence has been found, you can make judgements on things like the size of the team and whether uploads are a better bet by comparison.
In terms of hiring, I would look for people who are not overly ambitious or patriotic, as well as having the requisite intelligence. You’d need a pretty good screening process as well.
For presenting it to the public, I would frame it as an exploratory process to lay the groundwork for the most important question in the history of humanity: how to deal with AI. I’d stress the dangers, that neuroscience is steadily improving the relevant knowledge, and that even the least dangerous possible AIs will turn human society on its head, so we need to answer the question expediently.
*Part of the problem with AI as a science is that you can’t tell whether you have created something useful or not. There are lots of ways to solve problems in machine learning that don’t seem to be all that useful for full-scale AGI, and chatbots can give the illusion of intelligence with not much behind it.
Aside: you managed to make me realise one of the ways my world view differs from the majority here. I wouldn’t build an AI to model and maximise something external; instead, the parts of the programming inside the AI that were better than other bits at getting a signal would tend to persist, making it a satisficer of that signal. Anyone interested in a satisficer vs. maximizer discussion post?
I think any sufficiently rigorous insight that allows you to build a satisficer of some property will also allow you (or someone else who knows your insight) to build a maximizer of the same property, so research into satisficers doesn’t seem to be safe from a FAI point of view.
any sufficiently rigorous insight that allows you to build a satisficer of some property will also allow you (or someone else who knows your insight) to build a maximizer of the same property
I’m sure this is wrong. There are cases where provably good heuristics are known while exact solutions are believed to be intractable; the traveling salesman problem comes to mind. It follows necessarily that there are tasks that can be done “well enough”, but not perfectly.
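To make the traveling-salesman point concrete, here is a minimal sketch (my own illustration, not something from the thread) of a provably good heuristic for the metric case: build a minimum spanning tree and walk it in preorder, which gives a tour at most twice the optimal length, even though computing the exact optimum is NP-hard. The function name and the use of Euclidean points are just assumptions for the example.

```python
import math

def mst_preorder_tour(points):
    """Metric-TSP heuristic: build a minimum spanning tree with Prim's
    algorithm, then visit the cities in a preorder walk of that tree.
    For metric distances the tour is provably at most twice optimal."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Prim's algorithm over the complete graph, rooted at city 0.
    in_tree = [False] * n
    parent = [0] * n
    best = [math.inf] * n
    best[0] = 0.0
    children = [[] for _ in range(n)]
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if u != 0:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v], parent[v] = dist[u][v], u

    # Preorder walk of the tree, shortcutting repeated vertices.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    length = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return tour, length

# Example: a "good enough" tour through eight points.
cities = [(0, 0), (1, 5), (2, 2), (5, 1), (6, 6), (3, 7), (7, 3), (4, 4)]
print(mst_preorder_tour(cities))
```

The point is exactly the one above: a tour within a factor of two of optimal is cheap to compute, while the perfect tour is believed to be out of reach for large instances.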
You’re right, of course. My comment was wrong and I should’ve used some other word (perhaps “optimizer”) in place of “maximizer”, because I actually wanted to make a slightly different point…
Imagine someone comes up with a rigorous way to write a program that, when run on any computer anywhere, inspects the surrounding universe and then manipulates it to somehow cause the production of 100 paperclips. This requires insight that we don’t have yet, but it seems to me that any such insight should be easy to weaponize (e.g. ask it to produce a trillion paperclips instead of 100) once it’s discovered. It seems weird to hope that 100 paperclips will be a tractable problem but a trillion would be intractable. That would require an amazing accidental correspondence between “tractable” and “safe”.
Ah, you meant satisficer in this sense of the word. I meant to use it in the sense of the type of system humans are. From the variety of goals we pursue, we are clearly not maximizers or satisficers of any external property of the universe. People regularly avoid reproducing, for example, and even when they do reproduce they don’t choose the actions that would maximise it (e.g. donating to sperm banks).
“The type of system humans are” has a big disadvantage compared to mathematically simpler systems like maximizers: it seems harder to reason about such “fuzzy” systems, e.g. prove their safety. How do you convince yourself that a “fuzzy” superintelligence is safe to run?
We have an existence proof of intelligences based upon “the type of system humans are”; we don’t have one for pure maximizers. It is no good trying to develop friendliness theory around a pure, easily-reasoned-about system if you can’t make an intelligence out of it.
So while it is harder, this may be the sort of system we have to deal with. These are the sorts of questions I wanted to try to answer with the group in my original post.
I’ll try to explain why I am sceptical of maximizer-based intelligences in a discussion post. It is not because they are inhuman.
In practice, maximizers are not things that actually find the maximum value. They are typically hill-climbers of one kind or another: they try to find better values, i.e. local maxima. Maximizers do not have to be perfect to warrant the name.
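A minimal sketch of what “hill-climber” means here (my own toy code, with invented names like hill_climb and neighbours): keep moving to a better neighbouring point until none exists, at which point you are sitting on a local maximum that may be nowhere near the global one.

```python
import math

def hill_climb(f, x, neighbours, max_steps=10_000):
    """Greedy hill-climbing: move to the best neighbour while one improves f.
    Stops at a local maximum, which need not be the global maximum."""
    fx = f(x)
    for _ in range(max_steps):
        best_fy, best_y = max((f(y), y) for y in neighbours(x))
        if best_fy <= fx:
            break                      # no neighbour improves: local maximum
        x, fx = best_y, best_fy
    return x, fx

# Toy usage: a 1-D function with several local maxima, searched on an integer grid.
f = lambda x: math.sin(x) + 0.1 * x
neighbours = lambda x: [x - 1, x + 1]
print(hill_climb(f, 0, neighbours))    # stops at a nearby local maximum
```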
I’d build satisficers for theoretical reasons, not safety ones. Maximizers, to me, have problems with modifying/improving the model of the world that they are maximizing over*. Satisficers don’t tend to use models of the world at the lowest level; instead they get proximate signals for the things they are supposed to be maximizing (e.g. dopamine standing in for evolutionary fitness in animal biology) and have to build models of the world that are good at getting those signals. But they really don’t want to maximise those signals, because the signals are not what they are actually supposed to maximise.
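A toy sketch of that proxy-signal point (entirely my own illustration; “sweetness” and “nutrition” are invented stand-ins for dopamine and fitness): when the observable signal is only loosely tied to what actually matters, an agent that pushes the signal as high as possible latches onto the supernormal items, while an agent that stops at “good enough” usually ends up with something that also scores well on the hidden measure.

```python
import random

random.seed(0)

# The agent can only observe a proxy signal ("sweetness", standing in for
# dopamine); the quantity that actually matters ("nutrition", standing in for
# fitness) is hidden. Ordinary items keep the two correlated, but a few
# supernormal items score very high on the proxy while being worthless.
ordinary = [{"sweetness": s, "nutrition": s}
            for s in (random.uniform(0.2, 0.8) for _ in range(50))]
supernormal = [{"sweetness": 0.99, "nutrition": 0.0} for _ in range(3)]
ITEMS = ordinary + supernormal

def maximizer_pick(items):
    """Push the proxy as high as possible: always take the sweetest item."""
    return max(items, key=lambda i: i["sweetness"])

def satisficer_pick(items, threshold=0.6):
    """Take the first item whose proxy is 'good enough', then stop looking."""
    for item in random.sample(items, len(items)):
        if item["sweetness"] >= threshold:
            return item
    return random.choice(items)

print("maximizer's nutrition: ", maximizer_pick(ITEMS)["nutrition"])   # always 0.0
print("satisficer's nutrition:", satisficer_pick(ITEMS)["nutrition"])  # usually well above 0.0
```

The maximizer reliably wireheads on the proxy; the satisficer mostly doesn’t, though it also never pursues the hidden quantity directly.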
Every time I try to say more than this I lapse into a big long post. I’ll see if I can marshal my thoughts somewhat.
*Things like AIXI don’t have this problem, because they don’t have to decide how best to modify their model: they keep all possible models in mind at once. Which is one reason I don’t think it is a good guide for AI.
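For concreteness, here is a toy version (my own, and far weaker than AIXI’s Solomonoff mixture over all computable environments) of “keeping all possible models in mind at once”: hold a weighted mixture over a finite hypothesis class, reweight every hypothesis by Bayes’ rule after each observation, and predict with the whole mixture rather than ever committing to, or hand-modifying, a single model.

```python
# Candidate models: the environment emits 1s with some fixed probability theta.
thetas = [i / 10 for i in range(11)]           # the whole hypothesis class
weights = [1 / len(thetas)] * len(thetas)      # uniform prior over models

def update(bit, thetas, weights):
    """Bayes' rule: reweight every model by how well it predicted this bit."""
    likelihoods = [t if bit == 1 else (1 - t) for t in thetas]
    posterior = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(posterior)
    return [p / total for p in posterior]

def predict_one(thetas, weights):
    """Mixture prediction: P(next bit = 1), averaged over all surviving models."""
    return sum(w * t for w, t in zip(thetas, weights))

for bit in [1, 1, 0, 1, 1, 1]:                 # a short observation stream
    weights = update(bit, thetas, weights)
print("P(next bit = 1) =", round(predict_one(thetas, weights), 3))
```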
Well, it’s also hard to prove the safety of maximizers. Proving the danger, on the other hand...
Failing to do research on safety grounds isn’t going to help very much either—that just means that another team will make the progress instead.