Sounds right to me. LLMs love to roleplay, and LLM-roleplaying-as-AI being mistaken for LLM-talking-about-itself is a classic. (Here’s a post I wrote back in Feb 2023 on the topic.)
Have you ever played piano?
Yes, literally longer than I can remember, since I learned around age 5 or so.
The kind of fluency that we see in the video is something that a normal person cannot acquire in just a few days, period.
The video was recorded in 2016, 10 years after his 2006 injury. It’s showing the result of 10 years of practice.
You plain don’t become a pianist in one month, especially without a teacher, even if you spend all the time on the piano.
I don’t think he was as skilled after one month as he is now after 10 years.
I would guess though that you can improve a remarkable amount in one month if you play all day every day. I expect that a typical beginner would play about an hour a day at most. If he’s playing multiple hours a day, he’ll improve faster than a typical beginner.
Keep in mind also that he was not new to music, since he had played guitar previously. That makes a huge difference, since he’ll already be familiar with scales, chords, etc. and is mostly just learning motor skills.
Having watched the video about the piano player, I think the simplest explanation is that the brain injury caused a change in personality that resulted in him being intensely interested in playing the piano. If somebody were to suddenly start practicing the piano intently for some large portion of every day, they’d become very skilled very fast, much faster than most learners (who would be unlikely to put in that much time).
The only part that doesn’t fit with that explanation is the claim that he played skillfully the first time he sat down at the piano, but since there’s no recording of it, I chalk that up to the inaccuracy of memory. It would have been surprising enough for him to play at all that it could have seemed impressive even without much technical ability.
Otherwise, I just don’t see where the motor skills could have come from. There’s a certain amount of arbitrariness to how a piano keyboard is laid out (such as which keys are white and which are black), and you’re going to need more than zero practice to get used to that.
In encryption, hasn’t the balance changed to favor the defender? It used to be that it was possible to break encryption. (A famous example is the Enigma machine.) Today, it is not possible. If you want to read someone’s messages, you’ll need to work around the encryption somehow (such as by social engineering). Quantum computers will eventually change this for the public-key encryption in common use today, but, as far as I know, post-quantum cryptography is farther along than quantum computers themselves, so the defender-wins status quo looks likely to persist.
I suspect that this phenomenon in encryption technology (as the technology improves, equal technology levels come to favor the defender) is a general pattern in information technology. If that’s true, then AI, being an information technology, should be expected to also increasingly favor the defender over time, provided that the technology is sufficiently widely distributed.
I found this to be an interesting discussion, though I find it hard to understand what Yudkowsky is trying to say. It’s obvious that diamond is tougher than flesh, right? There’s no need to talk about bonds. But the ability to cut flesh is also present in biology (e.g. claws). So it’s not the case that biology was unable to solve that particular problem.
Maybe it’s true that there’s no biologically-created material that diamond cannot cut (I have no idea). But that seems to have zero relevance to humans anyway, since clearly we’re not trying to compete on the robustness of our bodies (unlike, say, turtles).
The most general possible point, that there are materials that can be constructed artificially with properties not seen in biology, is obviously true, and again doesn’t seem to require the discussion of bonds.
Consider someone asking the open source de-censored equivalent of GPT-6 how to create a humanity-ending pandemic. I expect it would read virology papers, figure out what sort of engineered pathogen might be appropriate, walk you through all the steps in duping multiple biology-as-a-service organizations into creating it for you, and give you advice on how to release it for maximum harm.
This commits a common error in these scenarios: implicitly assuming that the only person in the entire world that has access to the LLM is a terrorist, and everyone else is basically on 2023 technology. Stated explicitly, it’s absurd, right? (We’ll call the open source de-censored equivalent of GPT-6 Llama-5, for brevity.)
If the terrorist has Llama-5, so do the biology-as-a-service orgs, so do law-enforcement agencies, etc. If the biology-as-a-service orgs are following your suggestion to screen for pathogens (which is sensible), their Llama-5 is going to say, ah, this is exactly what a terrorist would ask for if they were trying to trick us into making a pathogen. Notably, the defenders need a version that can describe the threat scenario, i.e. an uncensored version of the model!
In general, beyond just bioattack scenarios, any argument purporting to demonstrate dangers of open source LLMs must assume that the defenders also have access. Everyone having access is part of the point of open source, after all.
Edit: I might as well state my own intuition here that:
In the long run, equally increasing the intelligence of attacker and defender favors the defender.
In the short run, new attacks can be made faster than defense can be hardened against them.
If that’s the case, it argues for an approach similar to delayed disclosure policies in computer security: if a new model enables attacks against some existing services, give them early access and time to fix it, then proceed with wide release.
The OP and the linked PDF, to me, seem to express a view of natural selection that is oddly common yet strikes me as dualistic. The idea is that natural selection produces bad outcomes, so we’re doomed. But we’re already the product of natural selection—if natural selection produces exclusively bad outcomes, then we’re living in one!
Sometimes people attempt to salvage their pessimistic view of natural selection by saying, well, we’re not doing what we’re supposed to do according to natural selection, and that’s why the world isn’t dystopic. But that doesn’t work either: the point of natural selection is that we’re operating according to strategies that are successful under conditions of natural selection (because the other ones died out).
So then the next attempt is to say, ah, but our environment is much different now: our behavior is outdated, dating back to a time when being non-evil worked, and being evil is optimal now. This at least is getting closer to plausibility (since indeed our behavior is outdated in many ways, with eating habits as an obvious example), but it’s still strange in quite a few ways:
If what’s good about the world is due to a leftover natural human tendency to goodness, then how come the world is so much less violent now than it was during our evolutionary history?
If the modern world makes evil optimal, how come evil kept notching up Ls in the 20th century (in WW2 and the Cold War, as the biggest examples)?
If our outdated behavior is really that far off optimal, how come it has kept our population booming for thousands of years, in conditions all quite different from our evolutionary history? Even now, fertility crisis notwithstanding, the human population is still growing, and we’re among the most successful species ever to exist on Earth.
But despite these factors that make me doubt that we humans have suboptimally inherited an innate tendency to goodness, it’s conceivable. What often comes next, though, is a disturbing policy suggestion: encode “human values” in some superintelligent AI that is installed as supreme eternal dictator of the universe. Leaving aside the issue of whether “human values” even makes sense as a concept (since it seems to me that various nasty youknowwhos of history, being undoubtedly homo sapiens, have as much a claim to the title as you or I), totalitarianism is bad.
It’s not just that totalitarianism is bad to live in, though that’s invariably true in the real world. It also seems to be ineffective. It lost in WW2, then in the Cold War. It’s been performing badly in North Korea for decades. And it’s increasingly dragging down modern China. Totalitarianism is evidently unfavored by natural selection. Granted, if there are no alternatives to compete against, it can persist (as seen in North Korea), so maybe a human-originated singular totalitarianism can persist for a billion years until it gets steamrolled by aliens running a more effective system of social organization.
One final thought: it may be that natural selection actually favors AI that cares more about humans than humans care about each other. Sound preposterous? Consider that there are species (such as Tasmanian devils) that present-day humans care about conserving but where the members of the species don’t show much friendliness to each other.
the 300x multiplier for compute will not be all lumped into increasing parameters / inference cost
Thanks, that’s an excellent and important point that I overlooked: the growth rate of inference cost is about half that of training cost.
If human-level AI is reached quickly mainly by spending more money on compute (which I understood to be Kokotajlo’s viewpoint; sorry if I misunderstood), it’d also be quite expensive to do inference with, no? I’ll try to estimate how it compares to humans.
Let’s use Cotra’s “tens of billions” for training compared to GPT-4’s $100m+, for roughly a 300x multiplier. Let’s say that inference costs are multiplied by the same 300x, so instead of GPT-4’s $0.06 per 1000 output tokens, you’d be paying GPT-N $18 per 1000 output tokens. I think of GPT output as analogous to human stream of consciousness, so let’s compare to human talking speed, which is roughly 130 wpm. Assuming 3/4 words per token, that converts to a human hourly wage of 18/1000/(3/4)*130*60 = $187/hr.
So, under these assumptions (which admittedly bias high), operating this hypothetical human-level GPT-N would cost the equivalent of paying a human about $200/hr. That’s expensive but cheaper than some high-end jobs, such as CEO or elite professional. To convert to a salary, assume 2000 hours per year, for a $400k salary. For example, that’s less than OpenAI software engineers reportedly earn.
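In case anyone wants to adjust the assumptions, here’s a minimal sketch of the same arithmetic in Python (every input below is one of the assumed numbers above, not a measured value):

```python
# Back-of-the-envelope inference cost for a hypothetical GPT-N, per the assumptions above.
gpt4_price_per_1k_output_tokens = 0.06   # $ per 1000 output tokens (GPT-4 price cited above)
compute_multiplier = 300                 # assumed GPT-4 -> GPT-N scale-up
words_per_token = 3 / 4                  # assumed words-per-token ratio
talking_speed_wpm = 130                  # rough human speaking speed, words per minute
hours_per_year = 2000                    # rough full-time working year

price_per_token = gpt4_price_per_1k_output_tokens * compute_multiplier / 1000
tokens_per_hour = talking_speed_wpm * 60 / words_per_token
hourly_cost = price_per_token * tokens_per_hour

print(f"${hourly_cost:.0f}/hr")                    # ~$187/hr, rounded to ~$200/hr above
print(f"${hourly_cost * hours_per_year:,.0f}/yr")  # ~$374,400/yr, rounded to ~$400k above
```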
This is counter-intuitive, because traditionally automation-by-computer has had low variable costs. Based on the above back-of-the-envelope calculation, I think it’s worth considering when discussing human-level-AI-soon scenarios.
It’s true that if the transition to the AGI era involves some sort of 1917-Russian-revolution-esque teardown of existing forms of social organization to impose a utopian ideology, pre-existing property isn’t going to help much.
Unless you’re all-in on such a scenario, though, it’s still worth preparing for other scenarios too. And I don’t think it makes sense to be all-in on a scenario that many people (including me) would consider to be a bad outcome.
aligned superintelligent AI will be able to better allocate resources than any human-designed system.
Sure, but allocate to what end? Somebody gets to decide the goal, and you get more say if you have money than if you don’t. Same as in all of history, really.
As a concrete example, if you want to do something with the GPT-4 API, it costs money. When someday there’s an AGI API, it’ll cost money too.
if AGI goes well, economics won’t matter much.
My best guess as to what you mean by “economics won’t matter much” is that (absent catastrophe) AGI will usher in an age of abundance. But abundance can’t be unlimited, and even if you’re satisfied with limited abundance, that era won’t last forever.
It’s critical to enter the post-AGI era with either wealth or wealthy connections, because labor will no longer be available as an opportunity to bootstrap your personal net worth.
As someone who is pro-open-source, I do think that “AI isn’t useful for making bioweapons” is ultimately a losing argument, because AI is increasingly helpful at doing many different things, and I see no particular reason that the making of bioweapons would be an exception. However, that’s also true of many other technologies: good luck making your bioweapon without electric lighting, paper, computers, etc. It wouldn’t be reasonable to ban paper just because it’s handy for the lab notebooks in a bioweapons lab.
What would be more persuasive is some evidence that AI is relatively more useful for making bioweapons than it is for doing things in general. It’s a bit hard for me to imagine that being the case, so if it turned out to be true, I’d need to reconsider my viewpoint.
The explanation of PvP games as power fantasies seems quite off to me, since they are worse at giving a feeling of power than PvE games are. The vast majority of players of a PvP game will never reach the highest levels and will therefore lose half the time. That makes playing the game a humbling experience, which is why some players blame bad luck, useless teammates, etc. rather than admit that they’re bad. I don’t see how “I lose a lot at middling ranks because I’m never lucky” is much of a power fantasy.
A PvE game is much better at delivering a power fantasy because every player can have the experience of defeating hordes of enemies, conquering the world, or leveling up to godlike power.
I found this very interesting, and I appreciated the way you approached this in a spirit of curiosity, given the way the topic has become polarized. I firmly believe that, if you want any hope of predicting the future, you must at minimum do your best to understand the present and past.
It was particularly interesting to learn that the idea has been attempted experimentally.
One puzzling point I’ve seen made (though I forget where) about self-replicating nanobots: if it’s possible to make nano-sized self-replicating machines, wouldn’t it be easier to create larger-sized self-replicating machines first? Is there a reason that making them smaller would make the design problem easier instead of harder?
The intelligence explosion starts before human-level AI.
Are there any recommended readings for this point in particular? I tried searching for Shulman’s writing on the topic but came up empty. (Sorry if I missed some!)
This seems to me a key point that most discourse on AI/AGI overlooks. For example, LeCun argues that, at current rates of progress, human-level AI is 30+ years away (if I remember him correctly). He could be right about the technological distance yet wrong about the temporal distance if AI R&D is dramatically sped up by an intelligence explosion ahead of the HLAI milestone.
It also seems like a non-obvious point. For example, when I. J. Good coined the term “intelligence explosion”, it was conceived as the result of designing an ultraintelligent machine. So an explosion that precedes superintelligence flips the original concept on its head.
I’ve only listened to part 1 so far, and I found the discussion of intelligence explosion to be especially fresh. (That’s hard to do given the flood of AI takes!) In particular (from memory, so I apologize for errors):
The analogy to chip compute scaling as a function of researcher population makes super-exponential growth seem possible if AI compute increase is substituted for researcher population increase. A particularly interesting aspect of this is that the answer could have come out the other way if the numbers had worked out differently as Moore’s law progressed. (It’s always nice to give reality a chance to prove you wrong.)
The intelligence explosion starts before human-level AI. But I was left wanting to know more: if so, how do we know when we’ve crossed the inflection point into the intelligence explosion? Is it possible that we’re already in an intelligence explosion, since AlexNet, or Google’s founding, or the creation of the internet, or even the invention of digital computers? And I thought Patel’s point about the difficulty of automating a “portfolio of tasks” was great and not entirely addressed.
The view of intelligence explosion as consisting concretely of increases in AI researcher productivity, though I’ve seen it observed elsewhere, was good to hear again. It helps connect the abstract concept of intelligence explosion to how it could play out in the real world.
It now seems clear that AIs will also descend more directly from a common ancestor than you might have naively expected in the CAIS model, since almost every AI will be a modified version of one of only a few base foundation models. That has important safety implications, since problems in the base model might carry over to problems in the downstream models, which will be spread throughout the economy. That said, the fact that foundation model development will be highly centralized, and thus controllable, is perhaps a safety bonus that loosely cancels out this consideration.
The first point here (that problems in a widely-used base model will propagate widely) concerns me as well. From distributed systems we know that:
1. Individual components will fail.
2. To withstand failures of components, use redundancy and reduce the correlation of failures.
By point 1, we should expect alignment failures. (It’s not so different from bugs and design flaws in software systems, which are inevitable.) By point 2, we can withstand them using redundancy, but only if the failures are sufficiently uncorrelated. Unfortunately, the tendency towards monopolies in base models is increasing the correlation of failures.
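To make the correlation point concrete, here’s a toy simulation (the failure probability and replica count are made-up illustrative numbers, not claims about real systems): three replicas with independent failures almost never fail together, while three fine-tunes sharing a single base-model flaw fail as a unit.

```python
import random

# Toy model: a "system" backed by 3 redundant AI components.
# Independent case: each component fails on its own with probability P_FAIL.
# Correlated case: all components share one base-model flaw, so they fail together.
P_FAIL = 0.05
N_REPLICAS = 3
TRIALS = 200_000

def system_fails(correlated: bool) -> bool:
    if correlated:
        # One shared flaw: every replica fails at once or not at all.
        return random.random() < P_FAIL
    # Decorrelated replicas: the system fails only if all replicas fail simultaneously.
    return all(random.random() < P_FAIL for _ in range(N_REPLICAS))

for correlated in (False, True):
    failures = sum(system_fails(correlated) for _ in range(TRIALS))
    label = "shared base model" if correlated else "independent replicas"
    print(f"{label}: ~{failures / TRIALS:.4%} system failure rate")

# Independent replicas fail together about 0.05**3 = 0.0125% of the time;
# fully correlated replicas fail about 5% of the time, so redundancy buys nothing.
```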
As a concrete example, consider AI controlling a military. (As AI improves, there are increasingly strong incentives to do so.) If such a system were to have a bug causing it to enact a military coup, it would (if successful) have seized control of the government from humans. We know from history that successful military coups have happened many times, so this does not require any special properties of AI.
Such a scenario could be prevented by populating the military with multiple AI systems with decorrelated failures. But to do that, we’d need such systems to actually be available.
It seems to me the main problem is the natural tendency to monopoly in technology. The preferable alternative is robust competition of several proprietary and open source options, and that might need government support. (Unfortunately, it seems that many safety-concerned people believe that competition and open source are bad, which I view as misguided for the above reasons.)
I believe you’re underrating the difficulty of Loebner-silver. See my post on the topic. The other criteria are relatively easy, although it would be amusing if a text-based system failed on the technicality of not playing Montezuma’s Revenge.
Something new and relevant: Claude 3’s system prompt doesn’t use the word “AI” or similar, only “assistant”. I view this as a good move.
As an aside, my views have evolved somewhat on how chatbots should best identify themselves. It still doesn’t make sense for ChatGPT to call itself “an AI language model”, for the same reason that it doesn’t make sense for a human to call themselves “a biological brain”. It’s somehow a category error. But using a fictional identification is not ideal for productivity contexts, either.