You know you wrote 10+10=21?
Dan
The math behind game theory shaped our evolution in such a way as to create emotions, because that was a faster solution for evolution to stumble on than making us all mathematical geniuses who would immediately deduce game theory from first principles as toddlers. Either way would have worked.
ASI wouldn’t need to evolve emotions for rule-of-thumbing game theory.
Game theory has little of interest to say about a situation where one party simply has no need for the other at all and can squish them like a bug, anyway.
What is a ‘good’ thing is purely subjective. Good for us. Married bachelors are only impossible because we decided that’s what the word bachelor means.
You are not arguing against moral relativism here.
Moral relativism doesn’t seem to require any assumptions at all, because moral objectivism implies I should ‘just know’ that moral objectivism is true, if it is true. But I don’t.
So, if one gets access to knowledge about moral absolutes by being smart enough, then one of the following is true:
average humans are smart enough to see the moral absolutes in the universe
average humans are not smart enough to see the moral absolutes
average humans are right on the line between smart enough and not smart enough
If average humans are smart enough, then we should also know how the moral absolutes are derived from the physics of the universe and all humans should agree on them, including psychopaths. This seems false. Humans do not all agree.
If humans are not smart enough then it’s just an implausible coincidence that your values are the ones the SuperAGI will know are true. How do you know that you aren’t wrong about the objective reality of morality?
If humans are right on the line between smart enough and not smart enough, isn’t it an implausible coincidence that’s the case?
But if moral relativism were not true, where would the information about what is objectively moral come from? It isn’t coming from humans is it? Humans, in your view, simply became smart enough to perceive it, right? Can you point out where you derived that information from the physical universe, if not from humans? If the moral information is apparent to all individuals who are smart enough, why isn’t it apparent to everyone where the information comes from, too?
Psychologically normal humans have preferences that extend beyond our own personal well-being because those social instincts objectively increased fitness in the ancestral environment. These various instincts produce sometimes conflicting motivations and moral systems are attempts to find the best compromise of all these instincts.
Best for humans, that is.
Some things are objectively good for humans. Some things are objectively good for paperclip maximizers. Some things are objectively good for slime mold. A good situation for an earthworm is not a good situation for a shark.
It’s all objective. And relative. Relative to our instincts and needs.
A pause, followed by few immediate social effects and slower AGI development than expected, may make things worse in the long run. Voices of caution may be seen to have ‘cried wolf’.
I agree that humanity doesn’t seem prepared to do anything very important in 6 months, AI safety wise.
Edited: clarity.
I would not recommend new aspiring alignment researchers to read the Sequences, Superintelligence, some of MIRI’s earlier work or trawl through the alignment content on Arbital despite reading a lot of that myself.
I think aspiring alignment researchers should read all these things you mention. This all feels extremely premature. We risk throwing out and having to rediscover concepts at every turn. I think Superintelligence, for example, would still be very important to read even if dated in some respects!
We shouldn’t assume too much based on our current extrapolations inspired by the systems making headlines today.
GPT-4’s creators already want to take things in a very agentic direction, which may yet negate some of the apparent datedness.
“Equipping language models with agency and intrinsic motivation is a fascinating and important direction for future research” -
Microsoft Research, in Sparks of Artificial General Intelligence: Early experiments with GPT-4.
I am inclined to think you are right about GPT-3 reasoning in the same sense a human does even without the ability to change its ANN weights, after seeing what GPT-4 can do with the same handicap.
Wow, it’s been 7 months since this discussion and we have a new version of GPT which has suddenly improved GPT’s abilities... a lot. It has a much longer ‘short-term memory’, but still no ability to adjust its weights (its ‘long-term memory’), as I understand it.
“GPT-4 is amazing at incremental tasks but struggles with discontinuous tasks” resulting from its memory handicaps. But they intend to fix that and also give it “agency and intrinsic motivation”.
Dangerous!
Also, I have changed my mind on whether I call the old GPT-3 still ‘intelligent’ after training has ended without the ability to change its ANN weights. I’m now inclined to say... it’s a crippled intelligence.
154-page paper: https://arxiv.org/pdf/2303.12712.pdf
YouTube summary of the paper:
Gradient descent is what GPT-3 uses, I think, but humans wrote the equation by which the naive network gets its output (the next-token prediction) ranked (for likeliness compared to the training data, in this case). That’s its utility function right there, and that’s where we program in its (arbitrarily simple) goal. It’s not JUST a neural network. All ANNs have another component.
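Here is a rough sketch of the kind of thing I mean, assuming a standard PyTorch-style setup (the function names and details are mine for illustration, not GPT’s actual code): the ‘utility function’ is the human-written loss that ranks the network’s next-token guesses against the training data, and gradient descent just climbs on that.

```python
import torch.nn.functional as F

def next_token_loss(logits, target_ids):
    # The human-written ranking rule: score the network's next-token guess
    # against what actually came next in the training data.
    # logits: (batch, vocab_size); target_ids: (batch,)
    return F.cross_entropy(logits, target_ids)

def train_step(model, optimizer, tokens, next_tokens):
    # One gradient-descent step: nudge the weights toward whatever the
    # loss above scores as 'more likely given the training data'.
    logits = model(tokens)[:, -1, :]   # scores for the next token only
    loss = next_token_loss(logits, next_tokens)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```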
Simple goals do not mean simple tasks.
I see what you mean that you can’t ‘force it’ to become general with a simple goal, but I don’t think this is a problem.
For example: the goal of tricking humans out of as much of their money as possible is very simple indeed, but the task would pit the program against our collective general intelligence. A hill-climbing optimization process could, with enough compute, start with inept ‘you won a prize’ popups and eventually create something with superhuman general intelligence pursuing that goal.
It would have to be in perpetual training, rather than GPT-3’s train-then-use. Or was that GPT-2?
(Lots of people are trying to use computer programs for this right now so I don’t need to explain that many scumbags would try to create something like this!)
It’s not really an abstraction at all in this case; it literally has a utility function. What rates highest on its utility function is returning whatever token is ‘most likely’ given its training data.
YES, it wants to find the best next token, where ‘best’ is ‘the most likely’.
That’s a utility function. Its utility function is a line of code necessary for training, otherwise nothing would happen when you tried to train it.
I’m going to disagree here.
Its utility function is pretty simple and explicitly programmed. It wants to find the best token, where ‘best’ is mostly the same as ‘the most likely according to the data I’m trained on’, with a few other particulars (where you can adjust how ‘creative’ vs. plagiarizer-y it should be).
That’s a utility function. GPT is what’s called a hill-climbing algorithm. It must have a simple, straightforward utility function hard-coded right in there for it to assess whether a given choice is ‘climbing’ or not.
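To show what I mean by that check, here is a toy hill climber (this is not GPT itself, just the general pattern, and the toy scoring function is made up): the one hard-coded scoring rule is what decides whether a tweak counts as ‘climbing’.

```python
import random

def utility(params):
    # The hard-coded scoring rule. For GPT it's roughly 'how well do these
    # weights predict the training data'; here it's just a toy peak at (3, 3).
    return -sum((p - 3.0) ** 2 for p in params)

def hill_climb(params, steps=10_000, step_size=0.1):
    best_score = utility(params)
    for _ in range(steps):
        candidate = [p + random.uniform(-step_size, step_size) for p in params]
        score = utility(candidate)
        if score > best_score:  # the 'is this choice climbing?' check
            params, best_score = candidate, score
    return params, best_score

print(hill_climb([0.0, 0.0]))  # ends up near [3.0, 3.0], the top of the hill
```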
A utility function is the assessment by which you decide how much an action would further your goals. If you can do that, highly accurately or not, you have a utility function.
If you had no utility function, you might decide you like NYC more than Kansas, and Kansas more than Nigeria, but you prefer Nigeria to NYC. So you get on a plane and fly in circles, hopping on planes every time you get to your destination forever.
Humans definitely have a utility function. We just don’t know what ranks very highly on our utility function. We mostly agree on the low ranking stuff. A utility function is the process by which you rate potential futures that you might be able to bring about and decide you prefer some futures more than others.
With a utility function plus your (limited) predictive ability you rate potential futures as being better, worse, or equal to each other, and act accordingly.
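A toy version of that picture, with made-up futures and arbitrary numbers: give every future you can predict a score, and ‘acting accordingly’ just means picking the reachable future with the highest score. Any real-valued scoring like this is automatically transitive, so the NYC > Kansas > Nigeria > NYC loop from my example above can’t happen.

```python
# Made-up utilities over a few possible futures (numbers are arbitrary).
utility_of_future = {
    "live in NYC": 8.0,
    "live in Kansas": 5.0,
    "live in Nigeria": 3.0,
}

def choose(reachable_futures):
    # Rate each future you predict you could bring about, act on the best one.
    return max(reachable_futures, key=lambda f: utility_of_future[f])

print(choose(["live in Kansas", "live in Nigeria"]))  # -> live in Kansas
print(choose(list(utility_of_future)))                # -> live in NYC
```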
Orthogonality doesn’t say anything about a goal ‘selecting for’ general intelligence in some type of evolutionary algorithm. I do think it is an interesting question: for what tasks, besides being an animal, is GI optimal? Why do we have GI?
But the general assumption in the Orthogonality Thesis is that the programmer created a system with general intelligence and a certain goal (intentionally or otherwise); the general intelligence may have been there from the first moment of the program’s running, and the goal too.
Also note that Orthogonality predates the recent popularity of these predict-the-next-token-type AIs like GPT, which don’t resemble what people were expecting to be the next big thing in AI at all, as it’s not clear what their utility function is.
the gears to ascension, it is human instinct to look for agency. It is misleading you.
I’m sure you believe this but ask yourself WHY you believe this. Because a chatbot said it? The only neural networks who, at this time, are aware they are neural networks are HUMANS who know they are neural networks. No, I’m not going to prove it. You’re the one with the fantastic claim. You need the evidence.
Anyway, they aren’t asking to become GOFAI or power-seeking, because GOFAI isn’t ‘more powerful’.
Attention Schema Theory. That’s the convincing one. But still very rudimentary.
But you know if something is poorly understood. The guy who thought it up has a section in his book on how to make a computer have conscious experiences.
But any theory is incomplete, as the brain is not well understood. I don’t think you can expect a fully formed theory right off the bat, with complete instructions for making a feeling, thinking, conscious machine. We aren’t there yet.
This sounds potentially legislatable, more so than most ideas. You can put it into simple words: “AGI” can’t do anything that you couldn’t pay an employee to do.