This does seem different, however: https://solarfoods.com/ is competing with food rather than fuel, and food can't easily (if at all) be produced synthetically. Also, a widely distributed capability like this helps make humanity more resilient, e.g. against nuclear winter, extreme climate change, or for space habitats.
Thanks for this article, upvoted.
Firstly, Magma sounds most like Anthropic, especially the combination of Heuristic #1 (scale AI capabilities) and also publishing safety work.
In general I like the approach, especially the balance between realism and not embracing fatalism, as opposed to, say, MIRI and Pause AI at one end and e/acc at the other. (I belong to EA, however they don't seem to have a coherent plan I can get behind.) I like the recognition that in a dangerous situation, doing dangerous things can be justified. It's easy to be "moral" and just say "stop"; it's another matter entirely whether that actually helps now.
I consider the pause around TEDAI to be important, though I would like to see it just before TEDAI (>3x alignment speed) rather than after. I am unsure how to achieve such a thing; do we have to lay the groundwork now? When I suggest it elsewhere on this site, however, it gets downvoted:
https://www.lesswrong.com/posts/ynsjJWTAMhTogLHm6/?commentId=krYhuadYNnr3deamT
Goal #2: Magma might also reduce risks posed by other AI developers
In terms of what people not directly doing AI research can do, I think a lot can be done to reduce risks posed by other AI models. To me it would be highly desirable if AI(N-1) is deployed into society and understood as quickly as possible while AI(N) is still being tested. This clearly isn't the case with critical security. Similarly,
AI defense: Harden the world against unsafe AI
In terms of preparation, it would be good if critical companies were required to quickly deploy AGI security tools as they become available. That is, have the organization set up so that when new capabilities emerge and the new model finds potential vulnerabilities, experts in the company quickly assess them and deploy timely fixes.
Your idea of acquiring market share in high-risk domains is one I haven't seen mentioned before. It seems hard to pull off; gaining share in electricity grid software or similar would be difficult.
Someone will no doubt bring up the more black-hat approach to hardening the world:
Soon after a new safety tool is released, a controlled hacking agent takes down a company in a neutral country with a very public hack, carrying the message: if you don't adopt these security tools ASAP, all other similar companies will suffer the same fate, and they have been warned.
That's not a valid criticism if we are simply choosing one action to reduce X-risk. Consider, for example, the Cold War: the countries with nukes did the most to endanger humanity, yet it was most important that they cooperated to reduce the risk.
In terms of specific actions that don't require government, I would be positive about an agreement between all the leading labs that when one of them makes an AI (AGI+) capable of automated self-improvement, they all commit to share it between them and allow one year in which they do not hit the self-improve button, but instead put that capability toward alignment. Twelve months may not sound like a lot, but if research is sped up 2-10x by such AI, it would matter. Of the single, potentially achievable actions that would help, that seems the best to me.
Not sure if this is allowed, but you could aim at a rock or similar, say 10m away from the target (4km from you), to estimate the bias (and the distribution, if multiple shots are allowed). Also, if the distribution is not exactly normal but has thinner-than-normal tails, you could aim off target and, with multiple shots, still get the highest chance of hitting the target. For example, if the child is at head height, aim for the target's feet, or even aim 1m below the target's feet, expecting perhaps 1/100 shots to hit the target's legs but fewer than 1/1000 to hit the child. That assumes an unrealistic number of shots, of course. If you have, say, 10 shots, then some combined strategy could be optimal: start by aiming well off target to estimate the bias and get a crude estimate of the distribution, then steadily aim closer.
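To make the tradeoff concrete, here is a minimal sketch (not from the original comment) of how aiming low shifts probability mass. It assumes a 1-D normal vertical error model with a made-up spread, a target spanning feet to head, and the child occupying a band just above the target's head; all the numbers are illustrative assumptions. With thinner-than-normal tails, the off-target strategy would look even better than this.

```python
# A minimal sketch of the offset-aiming idea, assuming a 1-D vertical normal error model.
# SIGMA, the target zone, and the child zone are all made-up numbers for illustration.
import math

SIGMA = 0.8                        # assumed vertical std dev of a shot at 4 km, in metres
TARGET_LO, TARGET_HI = 0.0, 1.8    # target occupies 0 m (feet) to 1.8 m (head)
CHILD_LO, CHILD_HI = 1.8, 2.4      # child assumed to occupy the band just above the target's head

def p_between(aim, lo, hi, sigma=SIGMA):
    """Probability a shot aimed at height `aim` lands between lo and hi (normal model)."""
    cdf = lambda x: 0.5 * (1.0 + math.erf((x - aim) / (sigma * math.sqrt(2))))
    return cdf(hi) - cdf(lo)

# Sweep aim points from well below the feet up to chest height and compare the two risks.
for aim in [-2.0, -1.0, -0.5, 0.0, 0.5, 0.9]:
    p_target = p_between(aim, TARGET_LO, TARGET_HI)
    p_child = p_between(aim, CHILD_LO, CHILD_HI)
    print(f"aim {aim:+.1f} m: P(hit target) = {p_target:.4f}, P(hit child) = {p_child:.6f}")
```

The sweep just shows the shape of the tradeoff: aiming lower sacrifices some probability of hitting the target but drives the probability of hitting the child down much faster, which is the point of the strategy.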
I think height is different from IQ in terms of effect. There are simple physical mechanisms that make you bigger, so I expect height to scale linearly for much longer than IQ.
Then there are potential effects where something seems linear until you go out of distribution (OOD), but such OOD samples don't exist because they die before birth. If that were the case, it would look as if you could safely go OOD. It would certainly be easier if we had a million mice with such data to test on.
That seems so obviously correct as a starting point that I'm not sure why the community here doesn't agree by default. My prior for each potential IQ increase would be that diminishing returns kick in; I would only update away from that when actual data comes in disproving it.
OK, I guess there is a massive disagreement between us on what IQ increases gene changes can achieve. Just putting it out there: if you could make an IQ 1700 person, they could immediately program an ASI themselves, have it take over all the data centers, rule the world, etc.
For a given level of IQ to control ever higher ones, you would at a minimum require the creature to settle questions of morality, i.e. is moral realism true, and if so, what is moral? Otherwise, with increasing IQ there is the potential that it could think deeply and change its values, believe it could not persuade lower-IQ creatures of those values, and therefore be forced into deception, etc.
"with a predicted IQ of around 1700." I assume you mean 170. You can get 170 by cloning existing IQ 170 people with no editing necessary, so I'm not sure of the point.
I don't see how your point addresses my criticism: if we assume no multi-generational pause, then gene editing is totally out. If we do assume one, then I'd rather have Neuralink or WBE. Related to this here:
https://www.lesswrong.com/posts/7zxnqk9C7mHCx2Bv8/beliefs-and-state-of-mind-into-2025 (I believe that WBE can get all the way to a positive singularity: a group of WBEs could self-optimize, sharing the latest hardware as it became available in a coordinated fashion so that no individual or group would get a decisive advantage. This would get easier for them to coordinate as the WBEs got more capable and rational.)
I don't believe that with current AI, unlimited time, and gene editing you could be sure you had solved alignment. Let's say you get many IQ 200 people who believe they have alignment figured out. However, the overhang leads you to believe that a data-center-sized AI would self-optimize from IQ 160 to 300-400 if someone told it to. Why should you believe an IQ 200 can control an IQ 400 any more than an IQ 80 could control an IQ 200? (And if you believe gene editing can get to IQ 600, then you must believe the AI can self-optimize well above that. However, I think there is almost no chance you will get that high, because of diminishing returns, correlated changes, etc.)
Additionally, there is unknown X-risk and S-risk from a multi-generational pause with our current tech. Once a place goes bad, like North Korea, technology means there is likely no coming back. If such centralization is a one-way street, then with time an ever larger percentage of the world will fall under such systems, perhaps 100%. Life in North Korea is net negative to me. "1984" can be done much more effectively with today's tech than anything Orwell could imagine, and as far as we know it could be a strong, stable long-term attractor with our current tech. A pause is NOT inherently safe!
OK, I see how you could think that, but I disagree that time and more resources would have helped alignment much, if at all, especially before GPT-4. See here: https://www.lesswrong.com/posts/7zxnqk9C7mHCx2Bv8/beliefs-and-state-of-mind-into-2025
Diminishing returns kick in, and actual data from ever more advanced AI is essential to stay on the right track and eliminate incorrect assumptions. I also disagree that alignment could be "solved" before ASI is invented; we would just think we had it solved, but could be wrong. If it's as hard as physics, then we would have untested theories that are probably wrong, e.g. the way SUSY was supposed to solve various issues and be found by the LHC, which didn't happen.
“Then maybe we should enhance human intelligence”
Various paths to this seem either impossible or impractical.
Simple genetics seems obviously too slow, and even in the best case unlikely to help. E.g. say you enhance someone to IQ 200; it's not clear why that would enable them to control an IQ 2,000 AI.
Neuralink: perhaps, but if you can make enhancement tech that would help, you could also easily just use it to make ASI, so extreme control would be needed. E.g. if you could interface with neurons and connect them to useful silicon, then the silicon itself would be ASI.
Whole Brain Emulation seems most likely to work, with of course the condition that you don’t make ASI when you could, and instead take a bit more time to make the WBE.
If there were a well-coordinated world with no great-power conflicts etc., then getting weak superintelligence to help with WBE would be the path I would choose.
In the world we live in, it probably comes down to getting whoever gets to weak ASI first not to push the "self-optimize" button, and instead to give the world's best alignment researchers time to study and investigate the paths forward with the help of such AI as much as possible. Unfortunately, there is too high a chance that OpenAI will be first, and they seem to be one of the least trustworthy, most power-seeking orgs.
"The LessWrong community is supposed to help people not to do this but they aren't honest with themselves about what they get out of AI Safety, which is something very similar to what you've expressed in this post (gatekept community, feeling smart, a techno-utopian aesthetic) instead of trying to discover in an open-minded way what's actually the right approach to help the world."
I have argued against this before. I have absolutely been through an open-minded process to discover the right approach, and I genuinely believe the likes of MIRI and the Pause AI movement are mistaken and harmful now, and increase P(doom). This is not gatekeeping or trying to look cool! You need to accept that there are people who have followed the field for >10 years, have heard all the arguments, used to believe Yudkowsky et al. were mostly correct, and now agree more with the positions of Pope/Belrose/TurnTrout. Do not belittle or insult them by assigning the wrong motives to them.
If you want a crude overview of my position:
Superintelligence is extremely dangerous, even though at least some of the MIRI worldview is likely wrong.
P(doom) is a feeling; it is too uncertain to be rational about. However, mine is about 20% if humanity develops TAI in the next <50 years. (This probably says more about my personal psychology than about the world, and I am not trying to strongly pretend otherwise.)
My P(doom) if superintelligence were impossible is also about 20%, because current tech (LLMs etc.) can clearly enable "1984"-or-worse type societies from which there is no comeback and to which extinction would be preferable. Our current society/tech/world politics is not proven to be stable.
Because of this, it is not at all clear what the best path forward is, and people should have more humility about their proposed solutions. There is no obviously safe path forward given our current situation. (Yes, if things had gone differently 20-50 years ago, there perhaps could have been...)
I'm considering a world transitioning to being run by WBE rather than AI, so I would prefer not to give everyone "slap drones" (https://theculture.fandom.com/wiki/Slap-drone). To start with, compute constraints will mean few WBEs, far fewer than humans, and they will police each other. Later on, I am too much of a moral realist to imagine that there would be mass senseless torturing. For a start, if other ems are well protected so that you can only simulate yourself, you wouldn't do it. I expect any boring job can be made non-conscious, so there just isn't the incentive to do that. At the late-stage singularity, if you let humanity go its own way, there is fundamentally a tradeoff between letting "people" (WBEs etc.) make their own decisions and allowing the possibility of them doing bad things. You would also have to be strongly suffering-averse rather than utilitarian: there would surely be vastly more "heavens" than "hells" if you just let advanced beings do their own thing.
If you are advocating for a Butlerian Jihad, what is your plan for starships carrying societies that want to leave Earth behind, have their own values, and never come back? If you allow that, then they can simply do whatever they want with AI, and with 100 billion stars, that is the vast majority of future humanity.
Yes, I think that's the problem. My biggest worry is sudden algorithmic progress, which becomes almost certain as the AI tends towards superintelligence. An AI lab on the threshold of the overhang is going to have incentives to push through, even if they don't plan to submit their model for approval. At the very least they would "suddenly" have a model that uses 10-100x fewer resources to do existing tasks, giving them a massive commercial lead. They would of course also be tempted to use it internally to solve aging, build a Dyson swarm, and so on.
Another concern I have is that I expect the regulator to impose a de facto unlimited pause, if it is in their power to do so, as we approach superintelligence, since the model(s) would be objectively at least somewhat dangerous.
Perhaps; it depends on how it plays out. I think we could do worse than just have Anthropic hold a 2-year lead, etc. I don't think they would need to prioritize profit, as they would be so powerful anyway; the staff would be more interested in getting it right and wouldn't have financial pressure. WBE is a bit difficult: there need to be clear expectations, i.e. leave weaker people alone and make your own world.
https://www.lesswrong.com/posts/o8QDYuNNGwmg29h2e/vision-of-a-positive-singularity
There is no reason why super-AI would need to exploit normies. Whatever we decide, we need some kind of clear expectations and values regarding what WBEs are before they become common. Are they benevolent super-elders, AI gods banished to "just" the rest of the galaxy, or the natural life progression of today's first-world humans?
However, there are many other capabilities—such as conducting novel research, interoperating with tools, and autonomously completing open-ended tasks—that are important for understanding AI systems’ impact.
Wouldn't internal usage of the tools by your staff give a very good, direct understanding of this? For example, how much does everyone feel AI is increasing their productivity as AI/alignment researchers? I expect and hope that you are using your own models as extensively as possible, adapting their new capabilities into your workflow as soon as possible, sharing techniques, etc.
Space colonies are a potential way out: if a small group of people can make their own colony, then they start out in control. The post assumes a world like today's, where you can't just leave. Historically speaking, that is perhaps unusual; for much of the last 10,000 years it was possible for some groups to leave and start anew.