More generally, a “proof” is something done within a strictly-defined logic system.
Could you prove it? :)
Btw. we have to assume that these papers are written by someone who slyly wants to switch some bits in our brains!!
“Human”-style humor could be a sandbox too :)
I would like to add some values which I see as not so static, and which are probably not so much a question of morality:
Privacy and freedom vs. security and power.
Family, society, tradition.
Individual equality (disparities of wealth, the right to work, …).
Intellectual property (the right to own?).
I think we need a better definition of the problem we would like to study here. Beliefs and values are probably not as indistinguishable as they might seem.
From this page →
Human values are, for example:
civility, respect, consideration;
honesty, fairness, loyalty, sharing, solidarity;
openness, listening, welcoming, acceptance, recognition, appreciation;
brotherhood, friendship, empathy, compassion, love.
I don't think we could call any of them a belief.
If these define the axes of a virtual space of moral values, then I am not sure an AI could occupy a much bigger space than humans do. (How selfish, unwelcoming or dishonest could an AI or a human possibly be?)
On the contrary: because we are selfish (is that one of the moral values we are trying to analyze?), we want the AI to be more open, more attentive, more honest, more of a friend (etc.) than we want or plan to be ourselves, or at least than we are now. (So do we really want the AI to be like us?)
I see a question about the optimal level of these values. For example, would we like to see an agent who is maximally honest, welcoming and sharing toward anybody? (An AI in your house which welcomes thieves, tells them whatever they ask, and shares everything?)
And last but not least: if we have many AI agents, then some kind of selfishness and laziness could help, for example to prevent the creation of a singleton or a fanatical mob of these agents. In the evolution of humankind, selfishness and laziness could have helped human groups survive. And a lazy paperclip maximizer could save humankind.
We need a good mathematical model of laziness, selfishness, openness, brotherhood, friendship, etc. We have hard philosophical tasks with a deadline. (The singularity is coming, and the “dead” in the word “deadline” could be very real.)
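As a very first step, here is a minimal sketch of what a “lazy” maximizer could look like (all functions and constants are my own toy assumptions, not anything from the book): the agent’s score is output minus an effort cost, so it settles on a finite effort instead of converting everything it can reach.

```python
# Illustrative sketch of a "lazy" maximizer: utility = production - effort cost.
# Every function and number here is hypothetical, just to show the shape of the idea.

def production(effort):
    # Diminishing returns: more effort yields less and less extra output.
    return effort ** 0.5

def laziness_cost(effort, laziness=0.1):
    # Laziness modeled as a price paid per unit of effort.
    return laziness * effort

def utility(effort, laziness=0.1):
    return production(effort) - laziness_cost(effort, laziness)

# A lazy agent picks the effort with the best net utility and then stops.
best_effort = max(range(0, 1000), key=utility)
print(best_effort, utility(best_effort))  # finite optimum, not "use all resources"
```

In this toy the optimum is at effort 25; a higher laziness coefficient moves it lower, so “laziness” at least has a simple mathematical handle.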
Stuart, is it really your implicit axiom that human values are static and fixed?
(Were they fixed historically? Is humankind mature now? Is humankind homogeneous with respect to values?)
more of a question of whether values are stable.
or a question of whether human values are (objective and) independent of humans (as subjects who could develop)
or a question of whether we are brave enough to ask questions whose answers could change us.
or (for example) a question of whether it is necessarily good for us to ask questions whose answers will give us more freedom.
I am not an expert, and it has to be based on facts about your nervous system. So you could start with several experiments (blood tests, etc.). You could change your diet, sleep more, etc.
About rationality and lesswrong → could you focus your fears on one thing? For example, forget the quantum world and focus on superintelligence? I mean, could you utilize the power you have in your brain?
You are talking about rationality and about fear. Your protocol could have several independent layers. You seem to think that your ideas produce your fear, but it could also be the opposite: your fear could produce your ideas (and it is definitely very probable that fear has an impact on your ideas, at least on their content). So you could analyze the rational questions on lesswrong and, independently, work through your irrational part (fear, etc.) with therapists. There could be physical or chemical reasons why you worry more than other people. Your protocol for dangerous ideas needs not only to discuss them but also to address your emotional responses. If you would like to sleep well, that could depend more on your emotional stability than on rational knowledge.
Jared Diamond wrote that North America did not have good animals for domestication (sorry, I don't remember in which book). That could be a showstopper for using the wheel on a massive scale.
@Nozick: we are plugged into a machine (the Internet) and into virtual realities (movies, games). Do we think that is wrong? Probably it is a question about the level of connection to reality?
@Häggström: there is a contradiction in the definition of what is better. F1 is better than F because it has more to strive for, and F2 is better than F1 because it has less to strive for.
@CEV: time is only one dimension in the space of conditions which could affect our decisions. Human cultures have chosen cannibalism in some situations. An SAI could see several possible future decisions depending on the surroundings, and we have to think very carefully about which conditions are acceptable and which are not. Or we could end up choosing what we would choose in some special scene prepared for humanity by the SAI.
This could be a bad mix ->
Our action: 1a) Channel manipulation: other sound, other image, other data & Taboo for AI: lying.
This taboo, “structured programming languages”, could be impossible, because understanding and analysing structure is probably an integral part of general intelligence.
She could not reprogram herself in a lower-level programming language, but she could emulate and improve herself in her “memory”. (She might have no access to her code segment, but she could create a stronger intelligence in her data segment.)
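A minimal sketch of that distinction, with entirely made-up names: a host program whose own code never changes can still interpret and rewrite a “program” stored as ordinary data, so locking the code segment does not by itself prevent building something stronger in the data segment.

```python
# Illustrative sketch: the host code below is fixed (the "code segment"),
# while the "program" it runs is ordinary data (the "data segment")
# that can be rewritten at will.

def run(data_program, x):
    """Interpret a list of (op, arg) pairs over a single number x."""
    for op, arg in data_program:
        if op == "add":
            x += arg
        elif op == "mul":
            x *= arg
    return x

prog = [("add", 2), ("mul", 3)]    # data the host is allowed to change
improved = prog + [("add", 1)]     # "self-improvement" confined to data

print(run(prog, 1))      # 9
print(run(improved, 1))  # 10
```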
Is “transcendence” a third possibility? I mean, if we realize that human values are not the best, and we retire and give up control.
(I am not sure whether that is not just a motivation-selection path; the difference is subtle.)
BTW, if you are thinking about partnership, are you thinking about how to control your partner?
Sorry for a question outside this particular topic.
When we started to discuss, I liked and proposed the idea of making a wiki page with the results of our discussion. Do you think that we have any ideas worth collecting on a collaborative wiki page?
I think we have at least one: paulfchristiano’s “cheated evolution”: http://lesswrong.com/r/discussion/lw/l10/superintelligence_reading_group_3_ai_and_uploads/bea7
Could you add more?
It seems that the unfriendly AI is in a slightly unfavourable position. First, it has to preserve the information content of its utility function or other value representation, in addition to the information content possessed by the friendly AI.
There are two sorts of unsafe AI: one which cares and one which doesn't care.
The ignorant one is fastest: it only calculates the answer and doesn't care about anything else.
Friend and enemy both have to analyse additional things...
The other question is: what happens once you know the AI has a desire for subversion?
There are plenty of people cooperating with and supporting dictators and mafias for selfish reasons. We could expect the same in this area.
“I will support a dangerous AI because I will be richer, more powerful or safer then.”
I don't understand the negative point (downvote). Could it be explained, please?
Is it because his position equals what he declares and not what he really thinks?
Or because Bostrom's mimicry (= for some reason preferring a negative, catastrophic and alarmist position) is not assumed?
- Expose it to limited data about the world, or no data about the real world
- Provide it little or no information about its operators or the society they live in
- Provide it little or no information about its own hardware or software architecture
When I read “Oracle AI” I realized one possibility which is probably still overlooked.
“Dreams”
Erasing data on physical media is not perfect. A zero bit which was previously nonzero could be read slightly more slowly or more quickly. That could allow an SAI to see shadows of past data, which could lead to a phenomenon similar to dreams.
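A toy simulation of the idea, with completely made-up numbers: if an “erased” cell that used to hold a 1 reads back slightly more slowly, then timing alone can recover shadows of the old data.

```python
import random

# Toy model: an erased cell always reads as 0, but cells that used to hold a 1
# respond a tiny bit slower (a hypothetical residue, exaggerated here).
BASE_READ_TIME = 1.00
RESIDUE_DELAY = 0.05   # imaginary delay left behind by a former 1-bit
NOISE = 0.02

def read_time(former_bit):
    noise = random.uniform(-NOISE, NOISE)
    return BASE_READ_TIME + (RESIDUE_DELAY if former_bit else 0.0) + noise

past_data = [random.randint(0, 1) for _ in range(16)]   # what was "erased"
timings = [read_time(b) for b in past_data]             # what can still be measured
guessed = [1 if t > BASE_READ_TIME + RESIDUE_DELAY / 2 else 0 for t in timings]

print("recovered bits:", sum(g == b for g, b in zip(guessed, past_data)), "of", len(past_data))
```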
I am afraid that we have not precisely defined the term “goal”, and I think we need to.
I am trying to analyse this term.
Do you think that today's computers have goals? I don't think so (but probably we have different understandings of this term). Are they useless? Do cars have goals? Are they without action and reaction?
Probably I could describe my idea more precisely in another way: in Bostrom's book there are goals and subgoals. Goals are ultimate, petrified and strengthened; subgoals are particular, flexible and temporary.
Could we conceive of an AI without goals but with subgoals?
One possibility could be that they have their “goal centre” externalized in a human brain (see the sketch below, after these questions).
Could we think of an AI as a tabula rasa, a pure void at the beginning, right after creation? Or could an AI not exist without hardwired goals?
If it could be void, would a goal be imprinted by the first task?
Or by the first task containing the word “please”? :)
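What an externalized “goal centre” could mean in practice might be sketched like this (my own toy framing, all names hypothetical): the agent has no hardwired utility of its own; whenever it must choose, it asks an external evaluator, here just a Python callback standing in for the human.

```python
# Toy agent with no built-in goal: it only generates options and defers
# all evaluation (the "goal centre") to an external judge.

def agent_choose(options, external_goal_centre):
    # The agent never scores anything itself; it just asks.
    return max(options, key=external_goal_centre)

# The human (or any external process) supplies the goal on each request.
def human_judgement(option):
    preferences = {"make tea": 2, "tidy desk": 1, "dismantle planet": -1000}
    return preferences.get(option, 0)

print(agent_choose(["make tea", "tidy desk", "dismantle planet"], human_judgement))
# -> make tea
```

Whether such an agent is still “goal-free” in Bostrom's sense, or has merely moved the goal one level out, is exactly the definitional question above.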
About the utility maximizer: a human (or animal) brain is not useless just because it does not grow without limit. There is a tradeoff between gain and energy consumption.
We have to, or at least could, think about balanced processes. A one-dimensional, one-directional, unbalanced utility function seems to have doom as its default outcome. But is it the only choice?
How did nature do that? (I am not talking about evolution but about the DNA encoding.)
A balance between “intelligent” neural tissue (SAI) and “stupid” non-neural tissue (humanity). :)
Probably we have to see the difference between a purpose and a B-goal (a goal in Bostrom's understanding).
If a machine has to solve an arithmetic equation, it has to solve it, not destroy seven planets to do it most perfectly.
I have the feeling that if you say “do it”, Bostrom's AI hears “do it maximally perfectly”.
If you say: “tell me how much is 2+2 (and do not destroy anything)”, then she will destroy a planet to be sure that nobody could stop her from answering how much 2+2 is.
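Roughly what I mean, as a toy sketch with invented numbers and thresholds (not a safety proposal): a maximizer keeps spending resources to push its confidence toward 1, while a satisficer stops as soon as the answer is good enough.

```python
# Maximizer vs. satisficer on the task "tell me how much is 2 + 2".
# The confidence model and all numbers are invented purely for illustration.

def confidence(resources_spent):
    # More resources -> asymptotically higher confidence, never exactly 1.
    return 1 - 0.5 ** resources_spent

def maximizer(budget):
    # "Do it maximally perfectly": burn the whole budget chasing certainty.
    return budget, confidence(budget)

def satisficer(budget, good_enough=0.999):
    # Stop as soon as the answer is good enough; leave the planet alone.
    for spent in range(budget + 1):
        if confidence(spent) >= good_enough:
            return spent, confidence(spent)
    return budget, confidence(budget)

print(maximizer(10**6))   # spends everything it is given
print(satisficer(10**6))  # stops after about 10 units of resources
```

In this toy the satisficer stops after roughly 10 units of “resources”, while the maximizer burns whatever it is given; the open question is whether the B-goal framing leaves any room for the second kind of agent.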
I have the feeling that Bostrom implicitly thinks there is a void AI at the beginning, and in the next step there is an AI with an ultimate, unchangeable goal. I am not sure that is plausible. And I think we need a good definition or understanding of “goal” to know whether it is plausible.
Could an AI be without any goals?
Would that AI be dangerous in the default doom way?
Could we create an AI which won't be a utility maximizer?
Would that AI need to maximize resources for itself?
One child could have two parents (and both could answer), so 598 is a questionable number.