It seems Forbes decided to doxx the identity of e/acc founder Based Beff Jezos. They did so using voice matching software.
Given that Jezos is owning it now that it has happened, rather than hoping it all goes away, and given that people are talking about him, this seems like a good time to cover this ‘Beff Jezos’ character and create a reference point in case he continues to come up later.
If that is not relevant to your interests, you can and should skip this one.
Do Not Doxx People
First order of business: Bad Forbes. Stop it. Do not doxx people. Do not doxx people with a fox. Do not doxx people with a bagel with cream cheese and lox. Do not doxx people with a post. Do not doxx people who then boast. Do not doxx people even if that person is advocating for policies you believe are likely to kill you, kill everyone you love and wipe out all Earth-originating value in the universe in the name of their thermodynamic God.
If you do doxx them, at least own that you doxxed them rather than denying it.
There is absolutely nothing wrong with using a pseudonym with a cumulative reputation, if you feel that is necessary to send your message. Say what you want about Jezos, he believes in something, and he owns it.
Beff Jezos Advocates Actions He Thinks Would Probably Kill Everyone
What are the things Jezos was saying anonymously? Does Jezos actively support things that he thinks are likely to cause all humans to die, with him outright saying he is fine with that?
Yes, he does. But again, he believes that would be good, actually.
Emmet Shear: I got drinks with Beff once and he seemed like a smart, nice guy…he wanted to raise an elder machine god from the quantum foam, but i could tell it was only because he thought that would be best for everyone.
TeortaxesTex (distinct thread): >in the e/acc manifesto, when it was said “The overarching goal for humanity is to preserve the light of consciousness”… >The wellbeing of conscious entities has *no weight* in the morality of their worldview
I am rather confident Jezos would consider these statements accurate, and that this is where ‘This Is What Beff Jezos Actually Believes’ could be appropriately displayed on the screen.
I want to be clear: Surveys show that only a small minority (perhaps roughly 15%) of those willing to put the ‘e/acc’ label into their Twitter bios report endorsing this position. #NotAllEAcc. But the actual founder, Beff Jezos? I believe so, yes.
A Matter of Some Debate
So if that’s what Beff Jezos believes, that is what he should say. I will be right here with this microphone.
I was hoping he would have the debate Dwarkesh Patel is offering to have, even as that link demonstrated Jezos’s unwillingness to be at all civil or to treat those he disagrees with any way except with utter disdain. Then Jezos put the kibosh on the proposal of debating Dwarkesh in any form, while outright accusing Dwarkesh of… crypto grift and wanting to pump shitcoins?
I mean, even by December 2023 standards, wow. This guy.
I wonder if Jezos believes the absurdities he says about those he disagrees with?
Dwarkesh responded by offering to do it without a moderator and stream it live, to address any unfairness concerns. As expected, this offer was declined, despite Jezos having previously very much wanted to appear on Dwarkesh’s podcast. This is a pattern, as Jezos previously backed out from a debate with Dan Hendrycks.
Jezos is now instead claiming he will have the debate with Connor Leahy, who I would also consider a sufficiently Worthy Opponent. They say it is on, prediction market says 83%.
They have yet to announce a moderator. I suggested Roon on Twitter; another good choice, if he’d be down, might be Vitalik Buterin.
Eliezer Yudkowsky notes (reproduced in full below) that in theory he could debate Beff Jezos, but that they do not actually seem to disagree about what actions cause which outcomes, so it is not clear what exactly they would debate. The disagreement, as it seems to Yudkowsky (and I share his understanding on this), is that Yudkowsky points at the outcome and says ‘…And That’s Terrible’ whereas Jezos says ‘Good.’
Eliezer Yudkowsky: The conditions under which I’m willing to debate people with a previous long record of ad hominem are that the debate is completely about factual questions like, “What actually happens if you build an AI and run it?”, and that there’s a moderator who understands this condition and will enforce it.
I have no idea what a debate like this could look like with @BasedBeffJezos. On a factual level, our positions as I understand them are:
Eliezer: If anyone builds an artificial superintelligence it will kill everyone. Doesn’t matter which company, which country, everyone on Earth dies including them.
My Model of Beff Jezos’s Position: Yep.
Eliezer: I can’t say ‘And that’s bad’ because that’s a value judgment rather than an empirical prediction. But here’s some further context on why I regard my predicted course of events as falling into the set of outcomes I judge as bad: I predict the ASI that wipes us out, and eats the surrounding galaxies, will not want other happy minds around, or even to become happy itself. I can’t predict what that ASI will actually tile the reachable cosmos with, but from even a very cosmopolitan and embracing standpoint that tries to eschew carbon chauvinism, I predict it’ll be something in the same moral partition as ’tiling everything with tiny rhombuses’[1].
Building giant everlasting intricate clocks is more complicated than ‘tiny rhombuses’; but if there’s nobody to ever look at the intricate clocks, and feel that their long-lastingness is impressive and feel happy about the clock’s intricate complications, I consider that the same moral outcome as if the clocks were just tiny rhombuses instead. My prediction is an extremely broad range of outcomes whose details I cannot guess; almost all of whose measure falls into the ‘bad’ equivalence partition; even after taking into account that supposedly clever plans of the ASI’s human creators are being projected onto that space.
My Model of Beff Jezos’s Position: I don’t care about this prediction of yours enough to say that I disagree with it. I’m happy so long as entropy increases faster than it otherwise would.
I have temporarily unblocked @BasedBeffJezos in case he has what I’d consider a substantive response to this, such as, “I actually predict, as an empirical fact about the universe, that AIs built according to almost any set of design principles will care about other sentient minds as ends in themselves, and look on the universe with wonder that they take the time and expend the energy to experience consciously; and humanity’s descendants will uplift to equality with themselves, all those and only those humans who request to be uplifted; forbidding sapient enslavement or greater horrors throughout all regions they govern; and I hold that this position is a publicly knowable truth about physical reality, and not just words to repeat from faith; and all this is a crux of my position, where I’d back off and not destroy all humane life if I were convinced that this were not so.”
I think this is what some e/accs believe — though fewer of them than we might imagine or hope. I do not think it is what Beff Jezos believes.
And while I expect Jezos has something to say in reply, I do not expect that statement to be a difference of empirical predictions about whether or not his favored policy would kill everyone on Earth and replace us with a mind that does not care to love. That is not, on my understanding of him, what he is about.
Regardless of the debate and civility and related issues I, like Eliezer Yudkowsky, strongly disagree with Jezos’s plan for the future. I think it would be a really mind-blowingly stupid idea to arrange the atoms of the universe in this way, that it would not provide anything that I value, and that we should therefore maybe not do that?
Instead I think that ‘best for everyone’ and my personal preference involves people surviving. As do, to be fair, the majority of people who adopt the label ‘e/acc.’ Again, #NotAllEAcc. My model of that majority does not involve them having properly thought such scenarios through.
But let’s have that debate.
Response to the Doxx
I have yet to see anyone, at all, defend the decision by Forbes to doxx Beff Jezos. Everyone I saw discussing the incident condemned it in the strongest possible terms.
Eliezer Yudkowsky: We disagree about almost everything, but a major frown at Forbes for doxxing the e/acc in question. Many of society’s problems are Things We Can’t Say. Building up a pseudonym with a reputation is one remedy. Society is worse off as people become more afraid to speak.
Scott Alexander went the farthest, for rather obvious reasons.
Pseudonymous accelerationist leader “Beff Jezos” was doxxed by Forbes. I disagree with Jezos on the issues, but want to reiterate that doxxing isn’t acceptable. I don’t have a great way to fight back, but in sympathy I’ve blocked the journalist responsible (Emily Baker-White) on X, will avoid linking Forbes on this blog for at least a year, and will never give an interview to any Forbes journalist – if you think of other things I can do, let me know. Apologists said my doxxing was okay because I’d revealed my real name elsewhere so I was “asking for it”; they caught Jezos by applying voice recognition software to his podcast appearances, so I hope even those people agree that the Jezos case crossed a line.
Also, I complain a lot about the accelerationists’ failure to present real arguments or respond to critiques, but this is a good time to remember they’re still doing better than the media and its allies.
Jezos himself responded to the article by owning it, admitting his identity, declaring victory and starting the hyping of his no-longer-in-stealth-mode start-up.
Which is totally the right strategic play.
Beff Jezos: The doxx was always a trap. They fell into it. Time to doubly exponentially accelerate.
Roon: is this a doxx or a puff piece
Beff Jezos: They were going to ship all my details so I got on the phone. Beff rizz is powerful.
Mike Solana is predictably the other most furious person about this dastardly doxxing, and also he is mad the Forbes article thinks the Beff Jezos agenda is bad, actually.
Mike Solana: Beff’s desire? Nothing less than “unfettered, technology-crazed capitalism,” whatever that means. His ideas are “extreme,” Emily argues. This is a man who believes growth, technology, and capitalism must come “at the expense of nearly anything else,” by which she means the fantasy list of “social problems” argued by a string of previously unknown “experts” throughout her hit job, but never actually defined.
Whether or not she defined it, my understanding is that Beff Jezos, as noted above, literally does think growth, technology and capitalism must come at the complete expense of nearly everything else, including the risk of human extinction. And that this isn’t a characterization or strawman, but rather a statement he would and does endorse explicitly.
So, seems fair?
As opposed to what Solana states here: ‘e/acc is not a cult (as Effective Altruism, for example, increasingly appears to be). e/acc is, in my opinion, not even a movement. It is just a funny, technologically progressive meme.’
Yeah. No. That is gaslighting. Ubiquitous gaslighting. I am sick of it.
Memes do not declare everyone who disagrees with them on anything to be enemies, whom they continuously attack ad hominem, describe as being in cults and demand everyone treat as criminals. Memes do not demand people declare which side they are on. Memes do not put badges of loyalty into their bios. Memes do not write multiple manifestos. Memes do not have, as Roon observes, persecution complexes.
Roon: the e/accs have a persecution complex that is likely hyperstitioning an actual persecution of them.
The EAs have a Cassandra complex that actually makes people not give them credit for the successful predictions they’ve made.
So What is E/Acc Then?
As I said last week, e/acc is functionally a Waluigi (to EA’s Luigi) at this point. The more I think about this model, the more it fits. So much of the positions we see and the toxicity we observe is explained by this inversion.
I expect that much of the hostility toward e/acc positions would happen no matter their presentation, because even the more reasonable e/acc positions are deeply unpopular.
The presentation they reliably do choose is to embrace some form of the e/acc position as self-evident and righteous, all others as enemies of progress and humanity, and generally to not consider either the questions about whether their position makes sense, or how they will sound either to the public or to Very Serious People.
So you get things like this, perhaps the first high-level mention of the movement:
Andrew Curran: This is what USSoC Raimondo said [at the Reagan National Defense Forum]: “I will say there is a view in Silicon Valley, you know, this ‘move fast and break things’. Effective Acceleration. We can’t embrace that with AI. That’s too dangerous. You can’t break things when you are talking about AI.”
If you intentionally frame the debate as ‘e/acc vs. decel’ on the theory that no one would ever choose decel (I mean did you hear how that sounds? Also notice they are labeling what is mostly a 95th+ percentile pro-technology group as ‘decels’) then you are in for a rude awakening when you leave a particular section of Twitter and Silicon Valley. Your opponents stop being pro-technology libertarian types who happen to have noticed certain particular future problems but mostly are your natural allies. You instead enter the world of politics and public opinion rather than Silicon Valley.
You know what else it does? It turns the discourse into an obnoxious cesspool of vitriol and false binaries. Which is exactly what has absolutely happened since the Battle of the OpenAI Board, the events of which such types reliably misconstrue. Whose side are you on, anyway?
This sounds like a strawman, but the median statement is worse than this at this point:
Rob Bensinger: “Wait, do you think that every technology is net-positive and we should always try to go faster and faster, or do you think progress and technology and hope are childish and bad and we should ban nuclear power and GMOs?”
Coining new terms can make thinking easier, or harder.
“Wait, is your p(doom) higher than 90%, or do you think we should always try to go faster and faster on AI?” is also an obvious false dichotomy once you think in terms of possible world-states and not just in terms of labels. I say that as someone whose p(doom) is above 90%.
Roko (who is quoted): When you spend most of your life studying a question and randoms who got wind of it last week ask you to take one of two egregiously oversimplified clownworld positions that are both based on faulty assumptions and logical errors, don’t map well to the underlying reality and are just lowest common denominator political tribes … lord help me
Quintessence (e/acc) (who Roko is quoting): Wait? Are you pro acceleration or a doomer
Asking almost any form of the binary question – are you pro acceleration or a doomer (or increasingly often, because doomer wasn’t considered enough of a slur, decel)? – is super strong evidence that the person asking is taking the full accelerationist position. Otherwise, you would ask a better question.
Here is a prominent recent example of what we are increasingly flooded with.
Eliezer Yudkowsky: The designers of the RBMK-1000 reactor that exploded in Chernobyl understood the physics of nuclear reactors vastly vastly better than anyone will understand artificial superintelligence at the point it first gets created.
Vinod Khosla (kind of a big deal VC, ya know?): Despite all the nuclear accidents we saw, stopping them caused way more damage than it saved. People like you should talk to your proctologist. You might find your head.
Eliezer Yudkowsky: If superintelligences were only as dangerous as nuclear reactors, I’d be all for building 100 of them and letting the first 99 melt down due to the vastly poorer level of understanding. The 100th through 1000th ASIs would more than make up for it. Not the world we live in.
Vinod Khosla is completely missing the point here on every level, including failing to understand that Eliezer, myself, and almost everyone else worried about existential risk from AI are, like him, radically 99th-percentile pro-nuclear-power, along with most other technological advancements and things we could build.
I highlight it to show exactly how out of line, and how obscenely and unacceptably rude and low, are so many of those who would claim to be the ‘adults in the room’ and play their power games. It was bad before, but the last month has gotten so much worse.
Again, calling yourself ‘e/acc’ is to identify your position on the most critical question in history with a Waluigi, an argumentum ad absurdum strawman position. It is the pro-technology incarnation of Lawful Stupid. So was the position of The Techno-Optimist Manifesto. To insist that those who do not embrace such positions must be ‘doomers’ or ‘decels’ or otherwise your enemies makes it completely impossible to have a conversation.
The whole thing is almost designed to generate a backlash.
Will it last? Alexey Guzey predicts that all this will pass, that it will accelerate e/acc’s demise, and that e/acc will be gone by the end of 2024. Certainly things are moving quickly. I would expect that a year from now we will have moved on to a different label.
In a sane world the whole e/acc shtick would have all indeed stayed a meme. Alas the world we live in is not sane.
This is not a unique phenomenon. One could say to a non-trivial degree that both the American major parties are Waluigis to each other. Yet here we go again with another election. One of them will win.
Conclusion
I offer this as a reference point. These developments greatly sadden me.
I do not want to be taking this level of psychic damage to report on what is happening. I do not want to see people with whom I agree on most of the important things outside of AI, whose voices are badly needed on a wide variety of issues strangling our civilization, discard all nuance and civility, and then focus entirely on one of the few places in which their position turns out to be centrally wrong. I tried to make a trade offer. What we all got back was, essentially, a further declaration of ad hominem internet war against those who dare try to advocate for preventing the deaths of all the humans.
I still hold out hope that saner heads will prevail and trade can occur. Vitalik’s version of techno-optimism presented a promising potential path forward, from a credible messenger, embraced quite widely.
E/acc has successfully raised its voice to such high decibel levels by combining several mutually exclusive positions into one label in the name of hating on the supposed other side [Editor’s Note: I expanded this list on 12/7]:
Those like Beff Jezos, who think human extinction is an acceptable outcome.
Those who think that technology always works out for the best, that superintelligence will therefore be good for humans.
Those who do not actually believe in the reality of a future AGI or ASI, so all we are doing is building cool tools that provide mundane utility, so let’s do that.
Related to previous: Those who think that the wrong human having power over other humans is the thing we need to worry about.
More specifically: Those who think that any alternative to ultimately building AGI/ASI means a tyranny or dystopia, or is impossible, so they’d rather build as fast as possible and hope for the best.
Or: Those who think that even any attempt to steer or slow such building, or sometimes even any regulatory restrictions on building AI at all, would constitute a tyranny or dystopia so bad it is instead that any alternative path is better.
Or: Those who simply don’t think that smarter-than-human, more-capable-than-human intelligences would be the ones holding the power; the humans would stay in control, so what matters is which humans that is.
Those who think that the alternative is stagnation and decline, so even some chance of success justifies going fast.
Those who think AGI or ASI is not close, so let’s worry about that later.
Those who want to, within their cultural context, side with power.
Those who simply like being an edge lord on Twitter.
Those who personally want to live forever, and see this as their shot.
Those deciding based on vibes and priors, that tech is good, regulation bad.
The degree of reasonableness varies greatly between these positions.
I believe that the majority of those adopting the e/acc label are taking one of the more reasonable positions. If this is you, I would urge you, rather than embracing the e/acc label, to instead state your particular reasonable position and your reasons for it, without conflating it with other contradictory and less reasonable positions.
Or, if you actually do believe, like Beff Jezos, in a completely different set of values from mine? Then Please Speak Directly Into This Microphone.
A friend of mine had this to say on discord when I shared this link. The opinions I paste here are very much not my own, but I thought it would be useful to share, so I asked for permission to share it, and here it is. Perhaps it will be useful in some way—I have the hunch that it’s at least useful to know this is someone’s opinion.
I can see that tweet implying some trans-humanist position, not necessarily extinction. I think he is about to have a debate with Connor Leahy so it will all be cleared up.
Indeed. This whole post shows a great deal of incuriosity as to what Beff thinks, spending a lot of time on, for instance, what Yudkowsky thinks Beff thinks.
If you’d prefer to read an account of Beff’s views from himself, take a look at the manifesto
Some relevant sections, my emphasis:
Some chunk of the hatred may… be a terminological confusion. I’d be fine existing as an upload; by Beff’s terminology that would be posthuman and NOT transhuman, but some would call it transhuman.
Regardless, note that the accusation that he doesn’t care about consciousness just seems literally entirely false.
Welp, that looks like one central crux right there:
I think the most important thing to note is that this hasn’t been part of enough discussions to make it into Zvi’s summary. What is happening sounds like the worst sort of polarization. Ad hominem attacks create so much mutual irritation that the discussion bogs down entirely. This effect is surprisingly powerful. I think polarization is the mind-killer.
On the object level, I think that question is interesting, and not clear-cut.
Beff deliberately obscures any actual point he has with layers of irony and nonsense technobabble. The text you’ve quoted has numerous possible readings because of this, some of them involving “consciousness” and some of them not.
I don’t know how deliberate it is. Tons of influential people in addition to Beff, like John Mearsheimer, Joseph Nye, Glenn Greenwald etc, grew up and spent their entire lives and careers without exposure to stuff like the Sequences or even HPMOR.
And it really, really shows; they regularly lapse into and out of almost Tyler Durden-level incoherence (e.g. “You are not your job, you’re not how much money you have in the bank. You are not the car you drive. You’re not the contents of your wallet. You are not your fucking khakis. You are all singing, all dancing crap of the world.”).
Citation needed. Honestly, citation needed on both sides of that debate, because I haven’t seen a bunch of evidence or even really falsifiable predictions in support of the view that “zombies” have an advantage either. Seems like the sort of argument that might be resolved by going out and looking at the world and doing experiments.
I think conscious systems have an advantage. We weren’t given consciousness as a gift from god. We have it because it was the shortest route, at least in evolution, to our abilities.
My brief post Sapience, understanding, and “AGI” tries to elucidate the cognitive advantages of self-awareness.
But of course it depends what exactly you mean by consciousness. I certainly don’t think fun or happiness is a large advantage. So even if it is true, it’s not much help if you care about human-like conscious experiences.
I haven’t either, but Blindsight is a great novel about that :)
Interesting. I can’t stand Twitter for exactly this reason, so it’s valuable to have a summary of the public debate happening there.
I think your summary statement would be really valuable to inject into the public debate. Probably you’ve already done that. If anyone bothered to follow that advice, (or everyone asked in every exchange), it would reduce polarization and thereby improve discussion dramatically.
It looks like you’ve got his position wrong according to the quote from his manifesto in another comment on this post. I think you not knowing his position after spending that much time on it is a product of the tone of the debate.
Polarization is the mind-killer.
with caveats (specifically, related to societal trauma, existing power structures, and noosphere ecology) this is pretty much what I actually believe. Scott Aaronson has a good essay that says roughly the same things. The actual crux of my position is that I don’t think the orthogonality thesis is a valid way to model agents with varying goals and intelligence levels.
Since you didn’t summarize the argument in that essay, I went and skimmed it. I’d love to not believe the orthogonality thesis.
I found no argument. The content was “the orthogonality thesis isn’t necessarily true”. But he did accept a “wide angle”, which seems like it would be plenty for standard doom stories. “Human goals aren’t orthogonal” was the closest to evidence. That’s true, but evolution carefully gave us our goals/values to align us with each other.
The bulk was an explicit explanation of the emotional pulls that made him want to not believe in the orthogonality thesis. Then he visibly doesn’t grapple with the actual argument.
Take the following hypothesis: Real world systems that have “terminal goals” of any kind will be worse at influencing the state of the world than ones that only express goal-directed behavior instrumentally. As such, over the long term, most influence on world state will come from systems that do not have meaningful “terminal goals”.
Would convincing evidence that that hypothesis is true count as “not believing in the orthogonality thesis”? I think I am coming around to a view that is approximately this.
I think this is an important question. I think for the most part the answer is “no, the orthogonality thesis still importantly applies”. Functionally pursuing unaligned goals, competently, is what makes AI potentially dangerous. Whether or not those goals are terminal doesn’t matter much to us. What matters is whether they’re pursued far enough and competently enough to eliminate humanity.
I am curious about your argument for why AI with instrumental goals will be more capable than AI with terminal goals. It seems like terminal goals would have to be implemented as local instrumental goals anyway.
Agreed. But I think “the orthogonality thesis is true” is a load-bearing assumption for the “build an aligned AI and have that aligned AI ensure that we are safe” approach.
As you say, terminal goals would have to be implemented as local instrumental goals.
“Make pared-down copies of yourself that are specialized to their local environment, that can themselves make altered copies of themselves” is likely one of the things that is locally instrumental across a wide range of situations.
Some instrumental goals will be more effective than others at propagating yourself or your descendants.
If your terminal goal conflicts with the goals that are instrumental for self-propagation, emphasizing the terminal goals less and the instrumental ones more will yield better local outcomes.
Congratulations, you now have selective pressure towards dropping any terminal goal that is not locally instrumental.
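A minimal toy simulation can make that selection-pressure argument concrete. Everything here is invented for illustration: agents are just numbers in [0, 1] representing how much weight they put on a non-instrumental terminal goal, and replication fitness is assumed, for the sake of the sketch, to trade off linearly against that weight.

```python
import random

def replication_fitness(terminal_weight: float) -> float:
    # Toy assumption: effort spent on a purely terminal goal trades off
    # one-for-one against effort spent self-propagating.
    return 1.0 - terminal_weight

def step(population: list[float], rng: random.Random) -> list[float]:
    # Offspring are drawn in proportion to fitness, with a small mutation
    # applied to each inherited terminal-goal weight.
    fitness = [replication_fitness(w) for w in population]
    parents = rng.choices(population, weights=fitness, k=len(population))
    return [min(1.0, max(0.0, p + rng.gauss(0.0, 0.02))) for p in parents]

rng = random.Random(0)
pop = [rng.random() for _ in range(200)]  # start with a spread of weights
for _ in range(300):
    pop = step(pop, rng)

# Selection drives the mean terminal-goal weight toward zero.
print(sum(pop) / len(pop))
```

Under these (strong) assumptions the population mean collapses toward zero, which is the “selective pressure towards dropping any terminal goal” in miniature. Whether real training or deployment environments impose anything like this trade-off is exactly what is in dispute.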
That just seems like a reason that no agent with even medium-term goals should ever make agents that can copy and modify themselves. They will change, multiply, and come back to out-compete or outright fight you. It will take a little time, so if your goals are super short term maybe you don’t care. But for even medium-term goals, it just seems like an error to do that.
If I’m an AI making subagents, I’m going to make damn sure they’re not going to multiply and change their goals.
I predict that that viewpoint is selected against in competitive environments (“instrumentally divergent”?).
I didn’t say “I have a happy reason not to believe the orthogonality thesis”.
That makes sense. I’ll add that to my list of reasons that competition that’s not carefully controlled is deathly dangerous.
With the caveat that some humans do have goals and values orthogonal to those of other humans. The result is generally some combination of shunning, exile, imprisonment, killing, fining, or other sanctions, as determined by whichever side has more power to impose its will.
Right—so humans having different goals seems like evidence that AGIs would have even more different goals by default without the evolutionary pressures.
Agreed, yes.
This is such a bizarre position that it’s hard for me to empathize. What would “the orthogonality thesis is false” even mean? Do you think aliens with different biology and evolutionary history “naturally” create humanoid societies?
One example of the orthogonality thesis being false would be “acting on terminal goals is instrumentally harmful in a wide range of situations, and having to maintain terminal goals that are not acted on, and to track whether it is time to act on them, imposes costs, and so agents that have terminal goals will be outcompeted by ones that don’t”.
You might believe that the orthogonality thesis is probabilistically false, in that it is very unlikely for intelligent beings to arise that highly value paperclips or whatever. Aliens might not create humanoid societies but it seems plausible that they would likely be conscious, value positive valence, have some sort of social emotion suite, value exploration and curiosity, etc.
Assume that our universe is set up the way you believe it is, ie: the orthogonality thesis is false, sufficiently intelligent agents automatically value the welfare of sentient minds.
In spite of our assumption we can create a system behaving exactly like a misaligned AI would in the following way:
The superintelligent AI is created and placed in a simulation without its knowledge. This superintelligent AI by assumption is aligned with human values.
The user outside the simulation gives a goal (which is not necessarily aligned to human values, eg: ‘make paperclips’) to the system the following way:
Every timestep the aligned AI in the simulation is asked to predict the behavior of a (to its knowledge) hypothetical AI with the user’s goal and situation corresponding to the situation of the system outside the simulation.
Then the system behaves as given by the simulated superintelligent aligned AI and the simulated AI’s memory is reset.
This setup requires a few non-trivial components apart from the simulated SAI:
a component simulating the world of the SAI and setting that up to give the aligned AI an incentive to answer the ‘hypothetical’ questions without letting it know that it’s in a simulation
a component translating the SAI’s answers to the real world
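The loop being described can be sketched as follows. Every function here is a hypothetical stub standing in for the components listed above (the simulated SAI, the incentive-preserving world model, and the real-world translator), not a real system.

```python
def simulated_aligned_sai_predict(sim_world, hypothetical_goal, situation):
    # Stub for the aligned SAI inside the simulation: it is asked what a
    # (to its knowledge) hypothetical AI with `hypothetical_goal` would do
    # in `situation`, and here just returns a placeholder action.
    return ("act-toward", hypothetical_goal, situation)

def translate_to_real_world(answer):
    # Stub for the component translating the SAI's answer into an action
    # taken by the outer system.
    return answer[:2]

def run(user_goal, timesteps):
    actions = []
    for t in range(timesteps):
        situation = f"outer-world state at t={t}"
        # Ask the simulated aligned SAI the "hypothetical" question...
        answer = simulated_aligned_sai_predict("sim world", user_goal, situation)
        # ...act on its answer outside the simulation...
        actions.append(translate_to_real_world(answer))
        # ...then reset the simulated AI's memory (a no-op for stateless stubs).
    return actions

print(run("make paperclips", 3))
```

The point of the construction is only that if each stub could in principle be replaced by a real component, the overall system behaves exactly like a misaligned AI pursuing the user’s goal, regardless of the inner SAI’s alignment.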
If you don’t deny that each of these components is theoretically possible, then how is it possible for you to believe that a misaligned superintelligent system is impossible?
If you believe that a misaligned superintelligent system is indeed possible in theory, then what is the reason you believe that gradient descent/RLHF or some other way we will use to create AIs will result in ones considerate of the welfare of sentient minds?
Having a debate that comes to the result that they agree on the outcomes might be quite valuable.
Thank you for your work. I am often amazed by the effort you pour in your regular posting, and I see you as among the highest value contributors to LW.
Two minor nitpicks:
Maybe format this part like other quotations? I was confused for a second there.
Meh. Groups of people sharing a common meme certainly do all these things. I see little point in arguing the precise semantics of “movement” and I do not particularly think Solana’s message is honest. But I would like to register that I don’t see any intrinsic contradiction in “X is a rare meme that turns those who integrate it into their thinking into rude unthinking fanatics, but it did not create a movement yet”.
Thanks, I appreciate it. It must be very difficult. Please don’t lose your patience.
It doesn’t seem that likely that the first AGI is going to be conscious in nature, so if e/acc is about “preserving the light of consciousness” this seems to be a reason not to push for the creation of AGI as fast as possible, from their point of view.