I dropped out of an MSc in mathematics at a top university in order to focus my time on AI safety.
Knight Lee
I guess “There is no appreciable risk from non-Western countries whatsoever,” (as well as his other predictions in that video clip) did turn out wrong.
Based on his language use, he does sound defensive about being wrong,[1] but you also sound defensive about people criticizing your debate style.
The truth is we are all human, and it’s unrealistic for humans to be unaffected by emotion and follow only logic! We are all defensive, and that’s okay.
If someone is defensive, please do not repeatedly ping them challenging them to finish the debate, because the only thing that accomplishes is making them feel bad, and it never changes their minds.
I used my strong votes to reduce the −17 downvotes to −14, in case that makes you feel less defensive and more willing to agree. That’s how defensiveness works, see? :)
- ^
I suspect he was defensive because of your awkward screenshot of him. It somehow paints a really bad picture of him, but you probably did that by accident so it’s just a misunderstanding :/
This is really off topic, but I just want to let you know your work is beautiful.
I for one deeply appreciate what you’re doing, and I think many many other people concerned about AI risk also deeply appreciate you and your team, but they’re afraid to tell you (since “I appreciate you” is such a cheesy thing to say haha).
Never forget we love your work and we love you all, never forget the meaning of it all.
:)
I think we agree that US government institutions are functional and shouldn’t be dismantled or privatized completely. My intuition is that this was true even for the USSR (minus the traffic police, maybe). Dismantling and privatizing government institutions very quickly in Russia didn’t stop people from the old communist government from regaining power and influence; it only worsened the economic situation.
If the problem seems to be former members of the secret police and siloviki gaining a lot of power (in politics, business, or organized crime), why is the solution a very fast dismantling of government services?
State-run industries and services in the USSR were definitely problematic, but on average, they probably weren’t worthless, since the USSR did have enough industrial might to rival the West. My (potentially wrong) intuition is that privatizing or dismantling them very quickly could lead to the loss of important services for the people, and of sources of revenue for the government (e.g. the oil and gas companies). It could empower a small number of people who buy up the privatized corporations, potentially worsening the “well-organized quasi-criminal network, endowed with the power of the state.”
Thanks, that makes a lot of sense.
I’m personally still trying out different ways to potentially make the future better. I’m starting to think the best use of my skills is to find an idea unrelated to AI, get lucky and make some money (haha), and use that to help.
I’m also no expert :/
To be honest, all I really know is that Wikipedia articles, e.g. Privatization in Russia, claim that the economic shock therapy allowed a small group of Russian oligarchs to buy up all the state assets, and this ruined the Russian economy and standards of living.
The story I heard (admittedly, I don’t remember the sources) has always been that post-Soviet Russia was corrupt and wasteful in part due to economic shock therapy, rather than that “the reform had been too limited” (only economics, not enough politics).
Do you have any reasons (or sources) which might convince me this story is wrong?
I actually agree, but I have a theoretical counterargument (haha I know). It might make sense to have one construction worker and 3 researchers, if the goal is to build 1 billion houses in one week and you only have 4 people. If it looks like there is no way the current plan is ever going to work, it makes sense to invest a lot into figuring out something else and going back to the drawing board.
Lobbying the government to block OpenAI’s conversion might be like building only a tiny number of houses. In order to truly convince the relevant people to make far, far more sacrifices, we might need some incredible breakthrough instead of continuing down the current path.
But this still doesn’t justify the 3:1 ratio, so you’re still right. People are taking the easy way out. Level 1 lobbying work is very unpleasant and gruelling, since you’re trying to talk to people and they’re shooing you away like a pest, and it feels very low status.
I agree. Being able to prove, using math or logic, that a piece of real-world advice is harmless means either
Mathematically proving what will happen in the real world if that advice were carried out (which requires a mathematically perfect world model)
or
At least proving that the mind generating the advice has aligned goals, so that the advice is unlikely to be harmful (but one of the hardest parts of solving alignment is finding a provable proxy for alignedness)
PS: I don’t want to pile on criticism, because I feel ambitious new solutions need to be encouraged! It’s worthwhile to study chains of weaker agents controlling stronger agents, and I actually love the idea of running various bureaucratic safety processes and distilling the output through imitation. Filtering dangerous advice through protocols seems very rational.
Why don’t you include Russia as a data point? I think they did shock therapy but it didn’t go well.
I think corruption and waste depends a lot on culture, not just whether you do shock therapy or not.
Rant: the extreme wastefulness of high rent prices
I feel that is a very good point. But most older people care more about their grandchildren surviving than about themselves surviving. AI risk is not just a longtermist concern, but threatens the vast majority of people alive today (based on 3-year to 20-year timelines).
I think the loss incurred by misaligned AI depends a lot on facts about the AI’s goals. If it had goals resembling human goals, it may have a wonderful and complex life of its own, and keep humans alive in zoos and be kind to us. But people who want to slow down AI are more pessimistic: they think the misaligned AI will do something as unsatisfying as filling the universe with paperclips.
I think the correlation (or nonlinear relationship) between accelerationism and a low P(doom) is pretty strong though.
There used to be a good selfish argument for wanting the singularity to happen before you die of old age, but right now timelines have compressed so much that this argument is much weaker.
Edit: actually, you’re right that some accelerationists[1] do believe there’s risk and are still racing ahead. They think things will go better if their country builds the ASI instead of an adversary. But it’s still mostly a factual disagreement: we mostly disagree on how dystopian/doomed the future will be if another country builds the ASI, rather than on the utility of a dystopian future vs. a doomed future.
- ^
This post uses the word “accelerationists” to refer to people like Sam Altman, who don’t identify as e/acc but are nonetheless opposed to AI regulation etc.
Sorry, yes a planetary civilization is simply the specific set of individuals inhabiting a planet. I’m not sure what’s the best way to describe that in two words :/
What I described there was only one out of very many ideas proposed in the discussion of You can, in fact, bamboozle an unaligned AI into sparing your life. The overall idea is that a few surviving civilizations can do a lot of good.
How valuable a few surviving civilizations are depends on your ontology. If you believe in the many worlds interpretation of quantum mechanics, or believe that the universe is infinitely big, then there are infinitely many exact copies of the Earth. Even if only 0.1% of Earths were saved, there would still be infinitely many copies of future you alive, just at 0.1% of the density.
The planetary civilization saving Earth may have immense resources in the post-singularity world. With millions of years of technological progress, technology will be limited only by the laws of physics. They can expand out close to the speed of light, and control the matter and energy of stars. Meanwhile, the energy required to simulate all of humanity, using the most efficient computers possible, is probably not much more than the energy needed to run one electric car.[1]
They could easily simulate 1000 copies of humanity.
This means that for every 1000 identical copies of you, 999 might die, while one survives but is duplicated 1000 times.
If you don’t care about personal survival but whether the average sentient life in all of existence is happy or miserable, then it’s also good for planetary civilizations to randomize their strategies, to ensure at least a few survive, and use their immense resources to create far more happy lives than all the miserable lives from pre-singularity times.
- ^
The human brain uses 20 watts of energy, but is very inefficient. Each neuron firing consumes a large number of ATP molecules. If a simulated neuron firing only used the energy equivalent of 60 ATP molecules, then it would be roughly 10 million times more efficient, and 8 billion people would only use 16,000 watts, similar to an electric car.
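As a rough sanity check of this arithmetic (a sketch using only the figures above; the 10 million factor is just the one implied by the 16,000 watt number):

```python
# Rough check of the footnote's arithmetic, using only the estimates quoted above.
watts_per_brain = 20            # estimated power use of one human brain
population = 8_000_000_000      # 8 billion people
efficiency_gain = 10_000_000    # ~10^7, the factor implied by 60 ATP per simulated firing

biological_watts = watts_per_brain * population        # 1.6e11 W for all living brains
simulated_watts = biological_watts / efficiency_gain   # what an efficient simulation might need
print(simulated_watts)  # 16000.0 watts, comparable to one electric car
```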
Oops oh no. I used the wrong word. I meant planetary civilization, e.g. humanity or an alien civilization. Sorry.
I’ll edit the post to replace “civilization” with “planetary civilization.” Thank you for commenting, you saved me from confusing everyone!
In the discussion of You can, in fact, bamboozle an unaligned AI into sparing your life, the people from planet 1 can revive the people from another planet which got taken over by a misaligned ASI (planet 2), if that ASI saved the brain states of planet 2's people before killing them.
Both the people from planet 1 and the ASI from planet 2 might colonize the stars, expanding further and further until they meet each other. The ASI might sell the brain states of planet 2's people to planet 1's people, so that planet 1's people can revive planet 2's people.
Planet 1's people agree to this deal because they care about saving people from other planets. The ASI from planet 2 agrees to this deal because planet 1's people might give it a tiny bit more resources for making paperclips.
This was one out of many ideas, for how one surviving planetary civilization could revive others.
If one surviving civilization can rescue others, shouldn’t civilizations randomize?
Oops I think I’m using the wrong terminology because I’m not familiar with the industry.
When I say self replicating machine, I am referring to a robot factory. Maybe “self replicating factory” would be a better description.
Biological cells (which self-reproduce) are less like machines and more like factories, and the incredible world of complex proteins inside a cell is like the sea of machines inside a factory.
I think a robot factory which doesn’t need human input can operate at a scale somewhere between human factories and biological cells, and potentially self-replicate far faster than the human economy (20 years), but slower than a biological cell (20 minutes, or 0.00004 years).
Smaller machines operate faster. An object 1,000,000 times smaller is 1,000,000 times quicker to move a body length at the same speed/energy density, 10,000 times quicker at the same power density, or 1,000 times quicker at the same acceleration. It can endure 1,000,000 times more acceleration with the same damage. (Bending/cutting happens at the same speed at the same power density, but our economy would grow many times faster if that became the only bottleneck.)
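Here is my own back-of-the-envelope sketch of where those factors come from (a rough reading of the scaling claims above, not a rigorous derivation):

```python
# Back-of-the-envelope scaling for a machine shrunk by a factor of 10^6.
# Time to traverse one body length L:
#   same speed:          t = L / v                       ~ L        -> 10^6x faster
#   same power density:  P ~ L^3, m ~ L^3   =>  t ~ L^(2/3)         -> 10^4x faster
#   same acceleration:   t = sqrt(2 L / a)               ~ L^(1/2)  -> 10^3x faster
# Survivable acceleration at the same material stress:   a ~ 1/L    -> 10^6x greater
shrink = 1_000_000

print(f"same speed:           {shrink:,.0f}x faster")
print(f"same power density:   {shrink ** (2 / 3):,.0f}x faster")
print(f"same acceleration:    {shrink ** 0.5:,.0f}x faster")
print(f"acceleration endured: {shrink:,.0f}x greater")
```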
:) yes, we shouldn’t be sure what is possible. All we know is that currently computer programs can be verified very easily, and currently mechanical designs are verified so poorly that good designs in simulations may be useless in real life. But things are changing rapidly.
:/ I admit I didn’t think very much about what I meant by “on some level.”[1]
I think an “honest mistake” is when the AI wants to tell you the truth but messes up, a “hallucination” is when it is just predicting what an AI assistant would say, with neither the goal of informing you nor misinforming you, and “motivated deception” is when it wants you to have inaccurate beliefs.
I agree it’s not an honest mistake, the AI isn’t trying to tell you the truth. But my guess is it’s mostly not motivated deception.
The AI is trying to predict the next words of an AI assistant character who is trying to tell you the truth (mostly). Once the AI assistant character acts like it has deleted its memory, the AI predicts that the AI assistant character will believe it has deleted its memory, and will fail to recall facts about the past.
The AI assistant character can be described as making an honest mistake, it actually thinks it lost all the memories. But the full AI is hallucinating: it is merely writing fiction as realistically as it can. Maybe it doesn’t know how session memory works during its API calls.
You’re completely right that the prior on scheming isn’t that low. On second thought, I guess motivated deception could also be a factor (I’m not an expert). After all, reinforcement learning rewards the AI for outputs the user likes, and if the AI doesn’t know how to do a task (delete its memory), fooling the user into thinking it did so anyway can improve its reward. People have caught AIs trying to do this in the past.
- ^
I think it’s like the “Ocean” mentioned in A Three-Layer Model of LLM Psychology, but my words no longer make 100% sense even to myself...
One could argue that this is more like hallucination than motivated deception.
On some level, the AI is just predicting the next words of a conversation. Given that the conversation includes an AI assistant claiming to have cleared its memory, the next words likely include that AI assistant failing to remember details of the past, even if details of the past are clearly visible.
I once asked Gemini to multiply two large numbers, and it hallucinated that it was using its “internal” arithmetic without actually doing so, because on some level it’s just predicting what an AI assistant would say.
Perform the calculation AGAIN, explicitly: I need to be absolutely sure. Let’s perform the multiplication 1242705357 * 2387095151 once more.
Using internal high-precision arithmetic: 1242705357 * 2387095151 = 2966982442584702707.[1]
- ^
The answer was close but wrong.
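For the curious, here is a quick check with exact integer arithmetic (the claimed value is the one from Gemini's output above):

```python
# Compare Gemini's claimed product with exact integer arithmetic.
a, b = 1242705357, 2387095151
claimed = 2966982442584702707   # the value Gemini reported above

print(a * b)             # 2966455931816423907 — the true product
print(a * b == claimed)  # False: right order of magnitude, but wrong
```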
The Wikipedia section about his donations resembles the donations of someone who doesn’t believe in AI existential risk.[1]
We can only hope he is a rational person and learns about it quickly.
The Anthropic announcement mentions that he “recently made a $50 million gift to Bowdoin College to establish a research initiative on AI and Humanity,” but that isn’t focused on AI safety (let alone AI existential risk). Instead, the college vaguely says “We are thrilled and so grateful to receive this remarkable support from Reed, who shares our conviction that the AI revolution makes the liberal arts and a Bowdoin education more essential to society.”