Isn’t Eliezer, in thinking about what he would do when faced with that situation, running an extremely simplified simulation of himself? Obviously this simulation is not equivalent to the real Eliezer, but there’s clearly something being run here, so it can’t be an L-zombie.
Sweetgum
Can you elaborate? Why would locking in Roman values not be a great success for a Roman who holds those values?
My hope is that scaling up deep learning will result in an “animal-like”/irrational AGI long before it makes a perfect utility maximizer. By “animal-like AGI” I mean an intelligence that has some generalizable capabilities but is mostly cobbled together from domain-specific heuristics, which cause various biases and illusions. (I’m saying “animal-like” instead of “human-like” here because it could still have a very non-human-like psychology.) This AGI might be very intelligent in various ways, but its weaknesses mean that its plans can still fail.
Why work on lowering your expectations rather than on improving your consistency of success? If you managed to actually satisfy your expectations once, that seems to suggest that they weren’t actually too high (unless the success was heavily luck-based, but based on what you said it sounds like it wasn’t).
Also, that article didn’t sound like it was describing narcissists (at least under the popular conception of the word “narcissist”). It sounded more like it was describing everyone (everyone has a drive for social success), interspersed with descriptions of unrelated pathologies, like a lack of “stamina” to follow through on plans and trouble dealing with life events.
I imagine it would be similar to the chain of arguments one often goes through in ethics. “W can’t be right because A implies X! But X can’t be right because B implies Y! But Y can’t be right because C implies Z! But Z can’t be right because...” Like how Consequentialism and Deontology both seem to have reasons they “can’t be right”. Of course, the students in your Adversarial Lecture could adopt a blend of various theories, so you’ll have to trick them into not doing that, maybe by subtly implying that it’s inconsistent, or hypocritical, or just a rationalization of their own immorality, or something like that.
I randomly decided to google “hansonpilled” today to see if anyone had coined the term, congratulations on being one of two results.
Then perhaps we should ban this form of NDA, rather than legalizing blackmail. These NDAs seem to have a pretty negative reputation already, and the NDAs that are necessary for business are the other type (signed before the info is known).
I guess what motivates me personally in my work is the desire to be appreciated
As I understand it, “status” essentially is how much people appreciate you. So you’re basically just describing the desire for status here.
I would also add that the fear responses, while participating in the hallucinations, aren’t themselves hallucinated, not any more than wakeful fear is hallucinated, at any rate. They’re just emotional responses to the contents of our dreams.
I disagree with this statement. For me, the contents of a dream seem only weakly correlated with whether I feel afraid during the dream. I’ve had many dreams with seemingly ordinary content (relative to the baseline of general dream weirdness) that were nevertheless extremely terrifying, and many dreams with relatively weird and disturbing content that were not frightening at all.
I wonder if you could get it to generate Minecraft screenshots, such as:
A log cabin in a clearing in a dark forest, as a screenshot from Minecraft
It would also be interesting to see how “as a screenshot from Minecraft” combines with other styles:
A wagon caravan approaches a ruined city in the desert, as a Miyazaki anime, as a screenshot from Minecraft
You could also append “as a screenshot from Minecraft” to more abstract prompts, for example:
A machine that harvests luck from four leaf clovers, as a screenshot from Minecraft
Finally, some other miscellaneous prompt ideas:
RGB color model and CMYK color model as sculptures on a stone pedestal
All seeing eye at the apex of the Maxwell Color Triangle, Illuminati at the top of color triangle
What does this even mean? If someone says they don’t want X, and they never take actions that promote X, how can it be said that they “truly” want X? It’s not their stated preference or their revealed preference!