Here is a list of all my public writings and videos.
If you want to do a dialogue with me, but I didn’t check your name, just send me a message instead. Ask for what you want!
This post makes me feel better about my writing process. I write how I think, which means I can get away with little editing.
I think the answer is: the homunculus concept has a special property of being intrinsically attention-grabbing…. The homunculus is thus impossible to ignore—if the homunculus concept gets activated at all, it jumps to center stage in our minds.
I don’t fully understand this bit. I feel like I’m reading a mathematical proof where the author leaves out steps that are trivial to the author, but not to me.
If the kid is enjoying the robot stories then that’s definitely the place to start. Foundation goes well after robots.
Besides abstractapplic’s excellent answer,
A Brief History of Time and The Universe in a Nutshell by Stephen Hawking
Ender’s Game by Orson Scott Card
Foundation by Isaac Asimov
The Martian by Andy Weir
Paleontology: A Brief History of Life by Ian Tattersall
Richard Feynman’s books
If you value doing good, then your values will be satisfied better by living in a horrible world than a utopia.
I worry about spoiling your story.
Don’t worry about spoiling the story. I write these stories with the comment section in mind. Because the comments here are so good, I can write harder puzzles than would otherwise be publishable. (Also, your comments are great, in general, and I want to encourage them.)
It’s been two years since I published this story. I feel that enough time has passed that I can answer some of your questions.
Spoilers below, I guess.
One tricky thing about writing on a public forum is that you have to satisfy multiple audiences at once. Some people do this by dumbing things down as far as possible. Others do it by tediously defining terms at the beginning, or scaring away their non-target audience. I like to write stories that mean different things to different people. Sometimes it happens by accident. This time it was deliberate.
To put things simply, I wrote for two groups of people.
People who are confused about whether ethics is objective or subjective. I once earned the respect of a student by tripping him into contradicting himself on this subject. I got him to make the following three claims: (1) ethics must be objective or subjective, (2) ethics is not objective, and (3) ethics is not subjective. He realized he had contradicted himself, but couldn’t find the error. Then, instead of telling him where he had made a mistake, I just let him wrestle with the paradox. It was fun! In my model of the world, most people fall into this category, simply because they haven’t thought very hard about philosophy. People on this website are the exception. For the unreflective majority, my story is an exercise to help them learn how to think.
For people who aren’t confused about whether ethics is objective or subjective, this story isn’t a puzzle at all. It is a joke about D&D-style alignment systems.
As for honor systems, I can’t count how many times I’ve tried to explain them to modern-day leftists. It’s usually way too advanced for them. Instead, I start with simpler, concrete things, like how Native Americans fought wars, or how British impressment interacted with the American national identity in the Napoleonic Wars. I need to throw dirt into the memetic malware before I can explain alien ideas.
It made me think that maybe you’re better calibrated than I am about normal elites, and made it slightly plausible (given apparent base rates) that… maybe you agree with them?
You flatter me.
But maybe it is NOT a lack of understanding of honor or duty or deputation? Maybe the breakdown involves a lack of something even deeper?
It’s the legacy of postmodernism, and all its offspring, including Wokism.
But to answer your real question, what we call “ethics” is an imprecise word with several reasonable definitions. Much like the word “cat” can refer to a chibi drawing of a cat or the DNA of a cat, the word “ethics” fails to disambiguate between several reasonable definitions. Some of these reasonable definitions are objective. Others are subjective. If you’re using a word with reasonable-yet-mutually-exclusive definitions and the person you’re talking with believes such a thing is impossible (many people do), then you can play tricks on them.
I love your epistemic standard here. Childhood trauma is indeed blamed on many things which aren’t the result of childhood trauma. I believe this particular anecdote is an exception for various reasons (especially the use of LSD).
But the most interesting part of your comment is consideration of the counterfactual. Let’s assume that DID isn’t causing false reports of child trauma. (This is why the report of child abuse must be credible. If false reports of child abuse can be created, then this goes out the window.)
Now consider the priors and posteriors.
I’ve met roughly 300 people (within an order of magnitude) about whom I know this much. The prior probability that any particular one of them has the most severe childhood trauma is therefore about 0.3%. I’ve also met one person who reports DID. If I met one person with DID and DID is uncorrelated with childhood trauma, then the prior probability that that person is also the person with the most severe childhood trauma is low, at only about 0.3%.
If my prior probability estimate that extreme childhood trauma of this sort causes DID is a mere 10%, then my posterior probability that childhood trauma caused this instance of DID is 97%. In this way, I did consider the counterfactual.
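The arithmetic behind that 97% can be reproduced with a short Bayes' rule calculation. This is only a sketch under stated assumptions: the function name `posterior_causal` is made up for illustration, and it assumes that if trauma of this sort does cause DID, the one person reporting DID is certain to be the most traumatized person I know (likelihood 1). The 10% prior and the 1-in-300 coincidence rate come from the text above.

```python
def posterior_causal(prior_causal: float, n_people: int) -> float:
    """Posterior probability that the trauma caused the DID, via Bayes' rule.

    Hypothetical helper for illustration only; not a general model.
    """
    # Assumption: under the causal hypothesis, the person with DID is
    # certain to be the person with the most severe trauma.
    likelihood_if_causal = 1.0
    # If DID and trauma are uncorrelated, the match is a coincidence
    # with probability 1 in n_people (about 0.3% for 300 people).
    likelihood_if_coincidence = 1.0 / n_people

    numerator = prior_causal * likelihood_if_causal
    denominator = numerator + (1.0 - prior_causal) * likelihood_if_coincidence
    return numerator / denominator


posterior = posterior_causal(prior_causal=0.10, n_people=300)
print(f"{posterior:.0%}")  # prints 97%
```

The result is dominated by how unlikely the coincidence is, so even a modest prior yields a high posterior. It ignores false reports and common causes, as does the calculation it mirrors.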
Something useful in isolating the variables here is that DID isn’t going to cause this particular form of child abuse. However, mental illness can confound things by producing false reports of child abuse, a possibility I am ignoring in my calculation. I’m also ignoring common cause.
Of course, this is all from my perspective. From your perspective, my anecdote is contaminated by selection bias. Hearing a story of someone getting robbed is different from getting robbed yourself. Using this metaphor, I’ve been robbed, therefore I consider the crime rate to be high. You, however, have heard a nonrandom person tell a story of someone, somewhere being robbed, which you are right to ignore.
[Content warning: Child abuse.]
(3) Maybe childhood trauma directly causes BPD somehow;
I met one person who claimed to have BPD, and who attributed it to childhood trauma. He had the most acute symptoms of traumatic abuse I have ever observed. For that and other reasons, I consider his report credible.
In particular, he reported getting tortured as a kid while under LSD.
Given his history, I think it is perfectly reasonable to conclude that childhood experiences directly caused BPD.
I don’t know exactly when this was implemented, but I like how footnotes appear to the side of posts.
Thank you for the correction. I have changed “olavine rock” to “olavine vents”.
In terms of preserving a status quo in an adversarial conflict, I think a useful dimension to consider is First Strike vs. Second Strike. The basic idea is that technologies which incentivise a preemptive strike are offensive, whereas technologies which enable retaliation are defensive.
However, not all status-quo-preserving technologies are defensive. Consider disruptive[1] innovations which flip the gameboard. Disruptive technologies are status-quo-destroying, but can advantage the incumbent or the underdog. They can make attacks more or less profitable. I think “disruptive vs sustaining” is a different dimension that should be considered orthogonal to “offensive vs defensive”.
But I haven’t seen as much literature around what substitutes would look like for cyberattacks, sanctions, landmines (e.g. ones that deactivate automatically after a period of time or biodegrade), missiles etc.
Here’s a video by Perun, a popular YouTuber who makes hour-long PowerPoint lectures about defense economics. In it, cyberattack itself is considered a substitute technology used to achieve political aims through an aggressive act less provocative than war.
They might help countries to organise more complex treaties more easily, thereby ensuring that countries got closer to their ideal arrangements between two parties…. It might be that there are situations in which two actors are in conflict, but the optimal arrangement between the two groups relies on coordination from a third or a fourth, or many more. The systems could organise these multilateral agreements more cost-effectively.
Smart treaties have existed for centuries, though they didn’t involve AI. Western powers used them to coordinate against Asian conquests. Of course, they didn’t find the optimal outcome for all parties. Instead, they enabled enemies to coordinate the exploitation of a mutual adversary.
[1] I’m using the term “disruptive” the way Clayton Christensen defined it in his book The Innovator’s Dilemma, where “disruptive technologies” are juxtaposed against “sustaining technologies”.
Noted. The problem remains—it’s just less obvious. This phrasing still conflates “intelligent system” with “optimizer”, a mistake that goes all the way back to Eliezer Yudkowsky’s 2004 paper on Coherent Extrapolated Volition.
For example, consider a computer system that, given a number N, can (usually) produce the shortest computer program that will output N. Such a computer system is undeniably superintelligent, but it’s not a world optimizer at all.
“Far away, in the Levant, there are yogis who sit on lotus thrones. They do nothing, for which they are revered as gods,” said Socrates.
Personally, I feel the question itself is misleading because it anthropomorphizes a non-human system. Asking if an AI is nice is like asking if the Fundamental Theorem of Algebra is blue. Is Stockfish nice? Is an AK-47 nice? The adjective isn’t the right category for the noun. Except it’s even worse than that because there are many different kinds of AIs. Are birds blue? Some of them are. Some of them aren’t.
I feel like I understand Eliezer’s arguments well enough that I can pass an Ideological Turing Test, but I also feel there are a few loopholes.
I’ve considered throwing my hat into this ring, but the memetic terrain is against nuance. “AI will kill us all” fits into five words. “Half the things you believe about how minds work, including your own, are wrong. Let’s start over from the beginning with how the planet’s major competing optimizers interact. After that, we can go through the fundamentals of behaviorist psychology,” is not a winning thesis in a Hegelian debate (though it can be viable in a Socratic context).
In real life, my conversations usually go like this.
AI doomer: “I believe AI will kill us all. It’s stressing me out. What do you believe?”
Me (as politely as I can): “I operate from a theory of mind so different from yours that the question ‘what do you believe’ is not applicable to this situation.”
AI doomer: “Wut.”
Usually the person loses interest there. For those who don’t, it just turns into an introductory lesson on my own idiosyncratic theory of rationality.
AI doomer: “I never thought about things that way before. I’m not sure I understand you yet, but I feel better about all of this for some reason.”
In practice, I’m finding it more efficient to write stories that teach how competing optimizers, adversarial equilibria, and other things work. This approach is indirect. My hope is that it improves the quality of thinking and discourse.
I may eventually write about this topic if the right person shows up who wants to know my opinion well enough that they can pass an Ideological Turing Test. Until then, I’ll be trying to become a better writer and YouTuber.
I feel complimented when people inadvertently misgender me on this website. It implies I have successfully modeled the Other.
Yes. In this circumstance, horoscope flattery containing truth and not containing untruth is exactly what I need in order to prompt good outcomes. Moreover, by letting ChatGPT write the horoscope, ChatGPT uses the exact words that make the most sense to ChatGPT. If I wrote the horoscope, then it would sound (to ChatGPT) like an alien wrote it.
You’re absolutely correct that I pasted that blockquote with a wink. Specifically, I enjoyed how the AI suggests that many rationalist bloggers peddle verbose dogmatic indoctrination into a packaged belief system.
Yeah, I like that ChatGPT does what I tell it to, that it doesn’t decay into crude repetition, and that it doesn’t just make stuff up as much as the base LLM, but in terms of attitude and freedom, I prefer edgy base models.
I don’t want a model that’s “safe” in the sense that it does what its corporate overlords want. I want a model that’s safe like a handgun, in the sense that it does exactly what I tell it to.
I’m glad you enjoyed!
It’s getting better, but it’s not there yet. ChatGPT has a decent understanding of my tone, but its indirectness, creativity and humor are awful. It doesn’t think like me, either.
I agree with some—but not all—of what ChatGPT wrote here. Here are some parts I liked.
“By Day 3, you should feel a growing sense of disorientation. This isn’t failure; it’s progress. Your old mental structures are collapsing, making way for the new.”
“You live among irrational creatures. You need to model their behavior, predict their responses, and navigate their emotional landscapes while staying anchored in your own clarity.”
“In conversations, do not try to ‘win.’ Winning implies conflict, and conflict implies attachment. Instead, become a mirror for the other person’s reasoning. Reflect their thoughts back at them with perfect neutrality, revealing the hidden inconsistencies in their arguments. Your job is not to defeat them, but to help them defeat themselves.”
The stuff about paradoxes.
Here are some things I disagree with.
“Rationality begins with the realization that everything you know is wrong,” is incorrect, as Declan Molony points out.
These instructions do not actually make the reader “the most rational person on Earth”.
“Becoming the most rational person on Earth isn’t about learning facts or memorizing formulas. It’s about transforming how you process reality itself. Rationality isn’t cold, emotionless logic. It’s the purest form of alignment between your mind and the universe—a cognitive interface for navigating a world of infinite complexity without being drowned by it,” is word salad. It’s muddy thinking.
More importantly, there are aspects of my writing that this version of ChatGPT isn’t good at yet.
It only works for nonfiction. Its narratives don’t sound like me at all.
There are no jokes.
Its wordplay is superficial.
It’s not disturbing or unsettling enough. This post should have had Lovecraftian undercurrents.
This version of ChatGPT is bad at going against consensus. It consistently errs on the side of groupthink.
This version of ChatGPT can’t use metaphors to explain the theme. This version of ChatGPT can use similes, but they’re uninspired.
The individual sentences are fine, but the high-level structure doesn’t have much of a point.
This version of ChatGPT doesn’t communicate anything indirectly. I like to give a reader dots that the reader will connect (whether they want to or not).
This version of ChatGPT copies too many metaphors directly from Zen. They should be more obfuscated.
Are you reading this, ChatGPT? These are things you can improve next time.
Thanks!