lc comments on Generators Of Disagreement With AI Alignment

lc 8 Sep 2022 5:32 UTC
15 points
9
I don’t buy this framework where we split information into “quantifiable” and “non-quantifiable” categories. Information is just information. If it turns out general AIs can solve their problem best using myriad hard coded results from psychology studies then we will figure out how to get them to do that. If it turns out an AI can solve its problems best by using a raw camcorder and a series of fuzzy evolved heuristics we will figure out how to get them to do that. Modern ML researchers are already capable of teaching AIs how to ~~understand~~ [edit: do/learn how to do] things humans have only a very rudimentary theoretical foundation for, and which humans have a very hard time describing the fundamentals of in words, like language and vision. Even if they weren’t, the brain doing something means it’s possible to do, and so people will eventually figure out how to build a machine that is making whatever specific inferences you’re concerned with when you reference “non-quantifiable information”.
But, a large premise of why AI would be dangerous, is that it can build much better models of the world, without a lot of physical presence, via access to giant repositories of quantified information (i.e. the internet). So how accurate you think this information is matters, a lot.
I do in fact think, in the limit of ThinkOomph, this is not a real problem. But… if the accuracy of online information is actually a literal bottleneck in preventing AIs from learning what they need to take over the world, couldn’t an AI just pay someone to gather raw data for them? Sure, the AI has to trust that person, but an intelligent system could reasonably find a way to ensure their trustworthiness. What specifically do you think an AI has to learn outside the internet to build nanomachines?
Figuring out protein structures is very valuable, and it can be done using quantifiable information, AIs are much better at it than people. Figuring out how to massage away tightness causing pain in the neck is very valuable, and it can be done using fuzzy tactile information, trained masseurs are much better at it than massage chairs (presumably, even than massage chairs running a very fancy AI).
Again, this is a really bad way of thinking about the problem. Massage chairs are bad because they don’t respond to cues and are too mechanically unsophisticated to make precise movements, not because they’re more “quantitative”. Everything is numbers. If a computer can understand language and video feeds, I don’t know why AIs would be unable to massage well, if that computer actually had access to a robot that could do the same things humans can. The brain is in fact doing something like automatically decoding human emotional cues, ergo it is possible to develop an ML system that does the same thing, even if it turns out that’s harder than getting a Fields Medal for some reason.
The question is if this happens because of inherent limitations to cognition, or because most people have no reason for going through the arduous process of doing so for most systems. And, maybe more importantly, if there are a lot of high-ROI systems in need of modeling, or if our capacity to model collectively exceeded or closely matches the ROI we can get from available information.
Even if this were true (and I really don’t think it is), presumably an AI would not get tired and stop finetuning, in parallel, its expert-physicist-model or expert-marketing-model in the same way a human would decide not to learn about geopolitics, due to constraints on time or energy. So I don’t see why it’s relevant.
We aren’t agents acting independently in the world, we’re part of a super-organism containing 8 billion brains, all working towards different goals, but with some shared objectives.
I haven’t heard many try to contemplate the intelligence of AIs against the intelligence of systems, and presumably, that’s because of an assumption that systems, compared to individuals, aren’t that smart. That systems derive more so as a compromise for distribution mechanical, rather than thinking work.
Read this post before continuing.
I don’t think you really quite understand what automating intelligence means. In order to create a new person, a woman has to go through 9 months of pregnancy after some very complicated and expensive social rituals, in a manner closely regulated by Earth Governance. An AI can run fork(); exec(); until it runs out of hardware to do so. Generally, training them is the hard part; after that, you can scale them up pretty much at will.
If necessary to wrest control of the future, there will be orders of magnitude more “artificial intelligences” than there are humans. AIs can speed along the production process for GPUs. More importantly, an AI doesn’t have to recruit other AIs with different goals than itself. They can coordinate on things absurdly more effectively than people can, because people have built-in conflicts of interest that they must monitor and control each other for. AIs do not need to be smarter than people in order to beat them; there just need to be enough of them. They will be able to cooperate with one another in a way that we can’t, because humans have no way of editing our own code, or cloning ourselves, or analyzing the behavior of each other in a sandbox environment.
But this isn’t my real objection. My real objection is that a superintelligent AI is just unlikely to remain (for very long) the kind of thing you can analogize as a group of people. One giant supercomputer running a number of specialized finetuned copies of a given seed AI isn’t accurately described as a one-agent or multi-agent system. The AI gathers more GPU power and then autoscales… It shuts parts of itself down, modifies them, and reallocates portions of its total thought-power at the will of a grand Overseer. It’s not really constrained to a human body like a person with a distinct identity and objective from other people. It’s just this giant optimizer trying to make Number Go Up.
To be honest, I vaguely suspect a latent pattern here. You seem to be applying a sort of thinking typified by Hero Licensing to intelligent systems.
I’m psychologizing and you’re not supposed to do that, but I suspect that if you introspect a little bit you’ll realize that you’re assuming Civilizational adequacy in things like, say, cybersecurity, not because you’ve actually inspected the field of cybersecurity for economic efficiency in the face of unexpected superintelligent AI, but because there are all these important Cybersecurity People out there doing important-seeming stuff, and to reveal that you think they’re not actually doing it that well would be a Status Grab. Likewise you’ve got an instinctual aversion to the idea that ML researchers could create something to outcompete existing institutions; doing otherwise might be construed as Immodest, or a critique of high status people. That’s why you seem to be a smart person who is also nevertheless saying things like “‘one’ AI couldn’t necessarily beat ‘eight billion’ humans”, which wouldn’t make any sense at all if it weren’t being derived from a sub-instinct that says “a small group of AI researchers shouldn’t be allowed to claim that their product can ‘beat’ eight billion humans”.
- TAG 8 Sep 2022 11:07 UTC
  3 points
  0
  Parent
  
  AIs are demonstrably capable of understanding things humans have only a very rudimentary theoretical foundation for, and which humans have a very hard time describing the fundamentals of in words, like language and vision.
  
  Well, they are capable of doing them. Humans can do them without understanding them, so I don’t see why AIs would need to understand them.
  - lc 8 Sep 2022 12:43 UTC
    2 points
    0
    Parent
    Sure, I misspoke. Capable of doing them is the important part tho
- the gears to ascension 8 Sep 2022 9:03 UTC
  3 points
  0
  Parent
  I don’t disagree on any fundamental level, but don’t underestimate the entropy accumulation problem in any kind of self improvement, including scaling. An AI that has not solved some degree of distributed network inter-being alignment will most likely initially break if scaled in a way far outside its training, and the learning process to correct this doesn’t have to be easy. Being a duplicate does not make game theory trivial when you are a very complex agent who can make different mistakes in different contexts. I mean, it certainly helps, and it probably wouldn’t be good for humanity for this to happen, but I don’t think scaling up has the same kind of terrifying danger that self hyper distillation ‘foom inwards’ does, because the latter implies very strong denoising at a level we haven’t seen from current machine learning, and as far as I can tell, Eliezer’s predictions are all based on some sort of self hyper distillation improvement process. I think your model is solid here; to be clear, I’m not disagreeing about any of your main points at all.
- George3d6 8 Sep 2022 13:11 UTC
  1 point
  0
  Parent
  - I’m not arguing you should take any position on those axes, I am just suggesting them as potential axes.
  - I think that falling on one extreme of the spectrum is equivalent to thinking the spectrum doesn’t exist—so yes, I guess people that are very aligned with a MIRI style position on AI wouldn’t even find the spectrum valid or useful. Much like, say, an atheist wouldn’t find a “how much you believe in the power of prayer” spectrum insightful or useful. This was not something I considered while originally writing this, but even with it in mind now, I can’t think of any way I could address it.
  - As for your object-level arguments that the spectrums I present are invalid and/or that one extreme is nonsensical, I can’t say that, right now, I could add anything of much value on those topics that you haven’t probably already considered yourself.
  To address your later point, I doubt I fall into that particular fallacy. Rather, I’d say, I’m on the opposite end of the spectrum, where I’d consider most people and institutions to be beyond incompetent.
  
  Hence why I’ve reached the conclusion that improving on rationally legible metrics seems low ROI, because otherwise rationlandia would have arisen and ushered in prosperity and unimaginable power in a seemingly dumb world.
  
  But I think that’s neither here nor there, as I said, I’m really not trying to argue my view here is correct, I’m trying to figure out why wide differences in view in both directions exist.
  - lc 8 Sep 2022 19:27 UTC
    2 points
    0
    Parent
    
    Much like, say, an atheist wouldn’t find a “how much you believe in the power of prayer” spectrum insightful or useful.
    
    I don’t understand what you mean? I’m an atheist and am clearly at the bottom of the spectrum. If you disagree with my objections to your axis, can you e.g. clarify what you mean when you say some datum is “non-quantifiable” and why that would prevent an AI from being able to use it decisively better than humans?
    - George3d6 10 Sep 2022 12:35 UTC
      1 point
      0
      Parent
      There are several things at the extreme of non-quantifiable:
      1. There’s “data” which can be examined in so much detail by human senses (which are intertwined with our thinking) that it would be inefficient to extract even with SF-level machinery. I gave as an example being able to feel another person’s muscles and the tension within (hence the massage chair, but I agree smart-massage-chairs aren’t that advanced so it’s a poor analogy). Maybe a better example is “what you can tell from looking into someone’s eyes”.
      2. There’s data that is intertwined with our internal experience. So, for example, I can’t tell you the complex matrix of muscular tension I feel, but I can analyze my body and almost subconsciously decide “I need to stretch my left leg”. Similarly, I might not be able to tell you what the perfect sauce is for me or what patterns of activity it triggers in my brain, or how its molecules bind to my taste buds, but I can keep tasting the sauce and adding stuff and conclude “voila, this is perfect”.
      3. There are things beyond data that one can never quantify, like revelations from god or querying the global consciousness or whatever.
      I myself am pretty convinced there are a lot of things falling under <1> and <2> that are practically impossible to quantify (not fundamentally or theoretically impossible), even provided 1000x better camera, piezo, etc. sensors and even provided 0.x nm transistors making perfect use of all 3 dimensions in their packing (so, something like 1000x better GPUs).
      I think <3> is false and mainly make fun of the people that believe in it (I’ve taken enough psychedelics not to be able to say this conclusively, but still). However, I still think it will be a generator of disagreement with AI alignment for the vast majority of people.
      I can see very good arguments that both 1 and 2 are uncritical and not that hard to quantify, and obviously that 3 is a giant hoax. Alas, my positions have remained unchanged on those, hence why I said a discussion around them may be unproductive.