A philosophical argument against “the AI-fear”.
Beyond Hyperanthropomorphism
There’s a bad argument against AGI risk that goes kinda like this:
1. Transformers will not scale to AGI.
2. Ergo, worrying about AGI risk is silly.
3. Hey, while you’re here, let me tell you about this other R&D path which will TOTALLY lead to AGI … …
4. Thanks for listening! (applause)
My read is that this blog post has that basic structure. Venkatesh goes through an elaborate argument and eventually winds up in Section 10, where he argues that a language model trained on internet data won’t be a powerful agent that gets things done in the world, but, if we train an embodied AI with a robot body, then it could be a powerful agent that gets things done in the world.
And my response is: “OK fine, whatever”. Let’s consider the hypothesis “we need to train an embodied AI with a robot body in order to get a powerful agent that gets things done in the world”. If that’s true, well, people are perfectly capable of training AIs with robot bodies! And if that’s really the only possible way to build a powerful AGI that gets things done in the world, then I have complete confidence that sooner or later people will do that!!
We can argue about whether the hypothesis is correct, but it’s fundamentally not a crazy hypothesis, and it seems to me that if the hypothesis is true then it changes essentially nothing about the core arguments for AGI risk. Just because the AI was trained using a robot body doesn’t mean it can’t crush humanity, and also doesn’t mean that it won’t want to.
In Venkatesh’s post, the scenario where “people build an embodied AI with a robot body” is kinda thrown in at the bottom, as if it were somehow a reductio ad absurdum?? I’m not crystal clear on whether Venkatesh thinks that such an AI (A) won’t get created in the first place, or (B) won’t be able to crush humanity, or (C) won’t want to crush humanity. I guess probably (B)? There’s kinda a throwaway reference to (C) but not an argument. A lot of the post could be taken as an argument against (B), in which case I strongly disagree for the usual reasons, see for example §3.2 here (going through well-defined things that an AGI could absolutely do sooner or later, like run the same algorithms as John von Neumann’s brain but 100× faster and with the ability to instantly spin off clone copies etc.), or §1.6 here (for why radically superhuman capabilities seem unnecessary for crushing humanity anyway).
(Having a robot body does not prevent self-replication: the AGI could presumably copy its mind into an AI with a similar virtual robot body in a VR environment, and then it’s no longer limited by robot bodies; all it would need is compute.)
(I kinda skimmed, sorry to everyone if I’m misreading / mischaracterizing!)
I would ask him: why can an “AI” be better at chess than any human, but not better than any human at worldly games like politics, war, or conquest?
I bounced off this a short ways through because it seemed like it was focused on consciousness and something-it-is-like-to-be-ness, which just has very little to do with AI fears as commonly described on LessWrong. I tried skipping to the end to see if it would tie the gestalt of the argument together, in case I had missed something.
Can you give a brief high level overview of who you’re arguing against and what you think their position is? Or, what the most important takeaways of your position are regardless of whether they’re arguing against anything in particular?
You’re saying “you”, but the blog post was written by Venkatesh Rao, who AFAIK does not have a LessWrong account.
I think that Rao thinks that he is arguing against AI fears as commonly described on LessWrong. I think he thinks that something-it-is-like-to-be-ness is a prerequisite to being an effective agent in the world, and that’s why he brought it up. Low confidence on that though.
Ah, whoops. Well, then I guess given the circumstances I’ll reframe the question as “PointlessOne, what are you hoping we get out of this?”
Also, lol I just went to try and comment on the OP and it said “only paid subscribers can comment.”
If someone here has an existing subscription I’d love for them to use it to copy Steven Byrnes’s top-level comment. Otherwise I’m gonna pay to do so, reluctantly, in the next couple of hours.
Umm, different audiences have different shared assumptions etc., and in particular, if I were writing directly to Venkatesh, rather than on LessWrong, I would have written a different comment.
Maybe if I had commenting privileges at Venkatesh’s blog I would write the following:
I’m OK with someone cross-posting the above, and please DM me if he replies. :)
Replied, we’ll see.
I shared it as I thought it might be an interesting alternative view on a topic often discussed here. It was somewhat new to me, at least.
Sharing is not endorsement, if you’re asking that. But it might be a discussion starter.
In the same way smart Christians have a limited amount of time to become atheists before they irrecoverably twist their minds into an Escher painting justifying their theological beliefs, I propose people have a limited amount of time to see the danger behind bash+GPT-100 before they become progressively more likely to make up some pseudophilosophical argument about AI alignment being an ill-posed question, and thus conclude they’re not gonna get eaten by nanobots.
Some random quotes (not necessarily my personal views, and not pretending to be a good summary):
Then:
And:
My personal summary of the point in the linked post, likely grossly inadequate, is that Hyperanthropomorphism is a wild, incoherent and unjustified extrapolation from assuming that “there is something it is like to be a person/monkey/salamander” to “there is something it is like to be superintelligent”, which then becomes a scary monster that will end us all unless it’s tamed.
The author also constructs a steelmanned version of this argument, called Well-Posed God AIs:
...
...
[skipped a lot]
...
How it is different from AI:
and
On reconciling the two:
Again, this is just an incomplete summary from a cursory reading.
Take a list of tasks such as:
- Winning a chess game
- Building a Mars rover
- Writing a good novel
- ....
I think it is possible to make a machine that does well at all these tasks, not because it has a separate hardcoded subsection for each task, but because there are simple general principles, like Occam’s razor and updating probability distributions, that can be applied to all of them.
The existence of the human brain, which does pretty well at a wide variety of tasks, despite those tasks not being hard-coded in by evolution, provides some evidence for this.
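As a minimal sketch of what I mean by those general principles (the hypotheses, numbers, and names below are made up purely for illustration), you can combine a simplicity-weighted prior with Bayesian updating and run the same loop on any prediction task:

```python
# Three toy hypotheses, each predicting P(next bit = 1), with a made-up "description length".
predict    = {"h_simple": 0.9, "h_medium": 0.5, "h_complex": 0.1}
complexity = {"h_simple": 2,   "h_medium": 5,   "h_complex": 9}   # in bits

# Occam's razor as a prior: weight each hypothesis by 2^(-description length), then normalise.
prior = {h: 2.0 ** -c for h, c in complexity.items()}
total = sum(prior.values())
prior = {h: w / total for h, w in prior.items()}

def update(belief, obs):
    """One step of Bayes' rule: posterior is proportional to prior x likelihood of the observation."""
    likelihood = {h: predict[h] if obs == 1 else 1.0 - predict[h] for h in belief}
    unnorm = {h: belief[h] * likelihood[h] for h in belief}
    z = sum(unnorm.values())
    return {h: w / z for h, w in unnorm.items()}

belief = prior
for obs in [1, 1, 1, 0, 1]:   # a toy observation stream
    belief = update(belief, obs)

print(belief)  # mass shifts toward hypotheses that are both simple and predictive
```

The prior-plus-update loop doesn’t care whether the observations come from chess positions, rover telemetry, or text, which is the sense in which the principle is task-general.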
AIXI is a theoretical AI that brute-force simulates everything. It should do extremely well on all of these tasks. Do you agree that, if we had infinite compute, AIXI would be very good at all tasks, including hacking its reward channel?
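For concreteness, AIXI’s action choice can be written (roughly, in Hutter’s formulation) as a search over all programs $q$ consistent with the interaction history, weighted by an Occam prior:

$$a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m} \left[ r_k + \cdots + r_m \right] \sum_{q \,:\, U(q, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}$$

Here $U$ is a universal Turing machine and $\ell(q)$ is the length of program $q$: the $2^{-\ell(q)}$ weighting is the same simplicity prior as above, and the nested max/sum is just expected-reward maximisation over every environment consistent with what has been observed so far, which is why unlimited compute would make it good at any of the tasks on the list.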
Do you agree that there is nothing magical about human brains?
Like many philosophical arguments against superintelligence, it doesn’t make clear where the intelligence stops. Can a single piece of software be rated at least 2500 Elo at chess, and be able to drive a car for a million miles without accident? Can it do that and also prove the Riemann hypothesis and build a billion-dollar startup? Looking at compute or parameters, you might be able to say that no AI could achieve all that with less than X flops. I have no idea how you would find X. But at least those are clear, unambiguous technical predictions.
Perhaps this is too much commentary (on Rao’s post), but given (I believe) he’s pretty widely followed/respected in the tech commentariat, and has posted/tweeted on AI alignment before, I’ve tried to respond to his specific points in a separate LW post. Have tried to incorporate comments below, but please suggest anything I’ve missed. Also if anyone thinks this isn’t an awful idea, I’m happy to see if a pub like Noema (who have run a few relevant things e.g. Gary Marcus, Yann LeCun, etc.) would be interested in putting out an (appropriately edited) response—to try to set out the position on why alignment is an issue, in publishing venues where policymakers/opinion makers might pick it up (who might be reading Rao’s blog but are perhaps not looking at LW/AF). Apologies for any conceptual or factual errors, my first LW post :-)
Maybe a disclaimer first:
I have no formal philosophical education. Nor do I have much exposure to the topic as an amateur.
Neither do I have any formal logic education, but I have some exposure to the concepts in my professional endeavours.
These are pretty much unedited notes taken as I read the OP. At the moment I don’t have much left in me to actually make a coherent argument out of them, so you can treat them as random thoughts.
This is a weird statement. The AIs we currently build are basically stateless functions. That is because we don’t let AIs change themselves. Once we invent an NN that can train itself, and let it do so, we’re f… doomed in any case where we messed up the error function (or where the NN can change it). STIILTBness implies continuity of experience. Saying that current AI doesn’t have it, while probably factually correct at the moment, doesn’t mean it can’t ever have it.
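To make the “stateless function” point concrete, here’s a minimal sketch (the tiny model is made up purely for illustration): a deployed network is a pure function of its input and its frozen weights, and nothing it encounters at inference time persists or rewrites it.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))  # frozen weights: nothing at inference time updates them

def model(x):
    """Stand-in for a deployed NN: a pure (stateless) function of input and fixed weights."""
    return np.tanh(W @ x)

x = rng.normal(size=4)
print(np.allclose(model(x), model(x)))  # True: same input, same output, no memory
# An NN that could "train itself" would update W as a consequence of its own outputs,
# breaking this statelessness - at which point continuity of experience becomes a real question.
```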
In my experience, “intelligence” as a word might be used to refer to the entity, but the actual pseudo-trait is a bit different. I’d call it “problem solving”, followed, probably, by “intentionality” or “goal fixation”. E.g. a paperclip maximiser has the “intention” to keep making paperclips and possesses the capacity to solve any general problem, such as coming up with novel approaches to obtain raw material and neutralising interference from meatbags trying to avoid being turned into paperclips.
Does it though? Road construction is planned for next Monday and a diversion is introduced. Suddenly, the schedule is very wrong but cannot correct its predictions. The author mentioned Gettier earlier, but here they seemingly forgot about that concept.
However, that is an experience. And it is world-like, but it cannot be experienced the way you’re used to. You can’t touch or smell it, but you can experience it. Likewise, you can manipulate it, just not the way you’re used to. You cannot push anything a meter to the left, but you can write a post on Substack.
You can argue this implies the absence of AI STIILTBness, but it can also point to non-transferability of STIILTBness. Or rather to the inadequacy of the term as defined. You as a human can experience what it’s like to be a salamander because there’s sufficient overlap in the ability to experience the same things in a similar way, but you cannot experience what it’s like to be an NN on the Internet because the experience is too alien.
This also means that an NN cannot experience what it’s like to be a human (or a salamander), but it doesn’t mean it cannot model humans in a sufficiently accurate way to be able to interact with them. And an NN on the Internet can probably gather enough information through its experience of said Internet to build such a model.
The Internet is not completely separated from the physical world (which we assume is the base reality), though. The Internet is full of webcams, sensors, IoT devices, space satellites, and such. In a sense, an NN on the Internet can experience the real world better than any human: it can see the tiniest things through an electron microscope and the biggest, furthest things through telescopes; it can see in virtually all of the EM spectrum; it can listen through thousands of microphones all over the planet at the same time, in ranges wider than any ear can hear; it can register every vibration a seismograph detects (including those on Mars); it can feel atmospheric pressure all over the world; it can know how much water is in most rivers on the planet; it knows the atmospheric composition everywhere at once.
It can likewise manipulate the world, in some ways more than any particular human can. Thousands of robotic arms all over the world can assemble all sorts of machines. Traffic can be diverted by millions of traffic lights, rail switches, and instructions to navigation systems on planes and boats. And how much economic influence could an NN on the Internet have by only manipulating data on the Internet itself?
Incompleteness of experience does not mean absence or deficiency of STIILTBness. Humans until very recently had no idea that there was a whole zoo of subatomic particles. Did that deny or inhibit human STIILTBness? Doesn’t seem like it. Now we’re looking at quantum physics and still coming to grips with the idea that actual physical reality can be completely different to what we seem to experience. That, however, didn’t make a single philosopher even blink.
Doesn’t this echo the core concern of “the AI-fear” view? AGI might end up not human-like but still capable of influencing the world. Its “fantastical model of reality” can be just close enough that we end up “attacked with a brick”.
This is a weird strategy to propose in a piece entitled “Beyond Hyperanthropomorphism”. It basically suggests recreating an artificial human intelligence by putting it through the typical human experience and then, somehow, achieving hyper-STIILTBness by “cranking up dials”. I don’t believe anyone on the “AI-fear” side of the argument is actually worried about this specific scenario. After all, if the AI is human-like, there’s not much of an alignment problem: the AI would already understand what humans want. We might still need to negotiate with it to convince it that it doesn’t need to kill us all. Well, so we have half of an alignment problem.
The other half is for the case that I believe is more likely: an NN on the Internet. In this case we would actually need to let it know what it is that we actually want, which is arguably harder because, to my knowledge, no one has ever actually fully stated it to any degree of accuracy. The OP dismisses this case on the basis of STIILTBness non-transferability.
Overall, I feel like this is not a good argument. I have vague reservations about the validity of the approach. I don’t see justification for some of the claims, though the author openly admits that for some claims no justification is provided:
I’m also not convinced that there’s a solid logical progression from one claim to the next at every step, but I’m a little too tired to investigate it further. Maybe it’s just my lack of education rearing its ugly head.
In the end, I feel like the author doesn’t engage fully in good faith. There are a lot of mentions of the central concept of STIILTBness, and even an OK introduction of the concept in the first two parts, but the core of the argument seems to be left out.
And the author fully agrees that people don’t understand why it matters, while also not actually trying to explain why it does.
For me it’s an interesting new way of looking at AI, but I fail to see how it actually addresses “the AI-fear”.
Lol
🤣