Meta-note related to the question: asking this question here, now, means your answers will be filtered for people who stuck around with capital-R Rationality and the current LessWrong denizens, not the historical ones who have left the community. But I think that most of the interesting answers you’d get are from people who aren’t here at all, or who rarely engage with the site due to the cultural changes over the last decade.
OK, but we’ve been in that world where people have cried wolf too early at least since The Hacker Learns to Trust, where Connor doesn’t release his GPT-2 sized model after talking to Buck.
There’s already been a culture of advocating for high recall with no regard for precision for quite some time. We are already at the “no really guys, this time there’s a wolf!” stage.
Right now, I wouldn’t recommend trying either Replika or character.ai: they’re both currently undergoing major censorship scandals. character.ai has censored their service hard, to the point where people are abandoning ship because the developers have implemented terrible filters in an attempt to clamp down on NSFW conversations, which has negatively affected SFW chats too. And Replika is currently being investigated by the Italian authorities, though we’ll see what happens over the next week.
In addition to ChatGPT, both Replika and character.ai are driving people towards running their own AIs locally; AI non-proliferation is probably not in the cards now. /g/ has mostly coalesced around pygmalion-ai, but the best model they have is a 6B. As you allude to in a footnote, I am deliberately not looking at this tech until it’s feasible to run locally because I don’t want my waifu to disappear.
(More resources: current /g/ thread, current /mlp/ thread)
Didn’t read the spoiler and didn’t guess until halfway through “Nothing here is ground truth”.
I suppose I didn’t notice because I had already pattern-matched to “this is how academics and philosophers write”. It felt slightly less obscurantist than a Nick Land essay, though the topic and tone aren’t a match for Land. Was that style deliberate on your part, or was it the machine?
Like things, simulacra are probabilistically generated by the laws of physics (the simulator), but have properties that are arbitrary with respect to it, contingent on the initial prompt and random sampling (splitting of the timeline).
What do the smarter simulacra think about the physics in which they find themselves? If one was very smart, could they look at the probabilities of the next token and wonder why some tokens get picked over others? Would they then wonder about how the “waveform collapse” happens and what it means?
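For concreteness, here’s a minimal sketch of the mechanism such a simulacrum would be puzzling over (my own illustration with made-up vocabulary and logit values, not anything from the post): the simulator assigns a probability to every candidate next token, and random sampling “collapses” that distribution into the single token that actually appears.

```python
import numpy as np

# Toy illustration of next-token "waveform collapse": the simulator scores
# every candidate token, softmax turns scores into probabilities, and a
# random draw picks the token that becomes part of the timeline.
# The vocabulary and logit values below are invented for the example.
rng = np.random.default_rng(0)

vocab = ["wolf", "sheep", "shepherd", "village"]
logits = np.array([2.1, 0.3, 1.2, -0.5])   # hypothetical simulator outputs

temperature = 0.8
probs = np.exp(logits / temperature)
probs /= probs.sum()

next_token = rng.choice(vocab, p=probs)
print(dict(zip(vocab, probs.round(3))), "->", next_token)
```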
While it’s nice to have empirical testbeds for alignment research, I worry that companies using alignment to help train extremely conservative and inoffensive systems could lead to backlash against the idea of AI alignment itself.
On the margin, this is already happening.
Stability.ai delayed the release of Stable Diffusion 2.0 to retrain the entire system on a dataset filtered without any NSFW content. There was a pretty strong backlash against this, and it seems to have caused a lot of people to move towards the idea that they have to train their own models. (SD2.0 appeared to have worse performance on humans, presumably because they pruned out a large chunk of pictures with humans in them since they didn’t understand the range of the LAION punsafe classifier; the evidence for this is in the SD2.1 model card, where they fine-tuned 2.0 with a radically different punsafe value.) I know of at least one 4x A100 machine that someone purchased for fine-tuning because of just that incident, and have heard rumors of a second. We should expect censored and deliberately biased models to lead to more proliferation of differently trained models, compute capacity, and the expertise to fine-tune and train models.
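As a sketch of the kind of dataset pruning being described here (the field name, rows, and threshold values are my own illustrative assumptions, not Stability’s actual configuration), the whole dispute comes down to where one scalar cutoff is set:

```python
# Illustrative sketch of threshold-based pruning with a "punsafe"
# (probability-unsafe) score attached to each LAION-style row. All rows,
# scores, and thresholds below are invented for illustration.
dataset = [
    {"url": "img_001.jpg", "caption": "a portrait of a woman", "punsafe": 0.35},
    {"url": "img_002.jpg", "caption": "a mountain landscape",  "punsafe": 0.01},
    {"url": "img_003.jpg", "caption": "people at a beach",     "punsafe": 0.55},
]

def filter_by_punsafe(rows, threshold):
    """Keep only rows the classifier scores below the unsafe threshold."""
    return [r for r in rows if r["punsafe"] < threshold]

# An aggressive threshold prunes many benign photos of humans along with the
# NSFW content it was aimed at; a permissive one keeps most of them.
print(len(filter_by_punsafe(dataset, 0.1)))   # aggressive cutoff: 1 row survives
print(len(filter_by_punsafe(dataset, 0.98)))  # permissive cutoff: all 3 survive
```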
Zack’s series of posts in late 2020/early 2021 were really important to me. They were a sort of return to form for LessWrong, focusing on the valuable parts.
What are the parts of The Sequences which are still valuable? Mainly, the parts that build on top of Korzybski’s General Semantics and focus hard on map-territory distinctions. That part is timeless and a large part of the value you could get by (re)reading The Sequences today. Yudkowsky’s credulity about results from the social sciences and his mind-projection-fallacying of his own mental quirks certainly hurt the work as a whole, though, which is why I don’t recommend people read the majority of it.
The post is long, but it kind of has to be. For reasons not directly related to the literal content of this essay, people seem to have collectively rejected the sort of map-territory thinking that we should bring from The Sequences into our own lives. This post has to be thorough because there are a number of common rejoinders that have to be addressed. This is why I think this post is better for inclusion than something like Communication Requires Common Interests or Differential Signal Costs, which are much shorter but only address a subset of the problem.
Since the review instructions ask how this affected my thinking, well...
Zack writes generally, but he writes because he believes people are not reasoning correctly about a currently politically contentious topic. But that topic is sort of irrelevant: the value comes from pointing out that high-status members of the rationalist community are completely flubbing lawful thinking. That made it thinkable that, actually, they might be failing in other contexts too.
Would I have been receptive to Christiano’s point that MIRI doesn’t actually have a good prediction track record had Zack not written his sequence on this? That’s a hard counterfactual, especially since I had already lost a ton of respect for Yudkowsky by this point, in part because of the quality of thought in his other social media posting. But I think it’s probable enough, and this series of posts certainly made the thought more available.
The funny thing is that I had assumed the button was going to be buggy, though I was wrong about how. The map header has improperly swallowed mouse scroll wheel events whenever it’s shown; I had wondered whether the button would swallow them too, since it was positioned in the same way, so I spent most of the day carefully dragging the scrollbar.
There must be some method to do something, legitimately and in good-faith, for people’s own good.
“Must”? There “must” be? What physical law of the universe implies that there “must” be...?
Let’s take the local Anglosphere cultural problem off the table. Let’s ignore that in the United States, over the last 2.5 years, or ~10 years, or 21 years, or ~60 years (depending on where you want to place the inflection point), social trust has been shredded, that policies justified under the banner of “the common good” have primarily been extractive, and that trust is now an exhausted resource. Let’s ignore that the OP is specifically about trying not to make one aspect of this problem worse. Let’s ignore that high-status individuals in the LessWrong and alignment community have made statements about whose values are actually worthwhile, in a public abandonment of the neutrality of CEV which might have made some sort of deal thinkable. Let’s ignore all that, because it would be focusing on one local culture in a large multipolar world, and at the global scale the questions are even harder:
How do you intend to convince the United States Government to surrender control to the Chinese Communist Party, or vice versa, and form a global hegemon necessary to actually prevent research into AI? If you don’t have one control the other, why should either trust that the other isn’t secretly doing whatever banned AI research required the authoritarian scheme in the first place, when immediately defecting and continuing to develop AI has a risky but high payoff? If you do have one control the other, how does the subjugated government maintain the legitimacy with its people necessary to continue being their government?
How do you convince all nuclear sovereign states to sign on to this pact? What do you do with nations which refuse? They’re nuclear sovereign states. The lesson of Gaddafi and the lesson of Ukraine is that you do not give up your deterrent, no matter what, because your treaty counterparties won’t uphold their end of a deal when it’s inconvenient for them. A nuclear-tipped Ukraine wouldn’t have been invaded by Russia. There is a reason that North Korea continues to exist. (Also, what do you do when North Korea refuses to sign on?)
This seems mostly wrong? A large portion of the population seems to have freedom/resistance to being controlled as a core value, which makes sense because the outside view on being controlled is that it’s almost always value pumping. “It’s for your own good” is almost never true, and people feel that in their bones and expect any attempt to value pump them to come wrapped in a complicated verbal justification.
The entire space of paternalistic ideas is just not viable, even if limited just to US society. And once you get to anarchistic international relations...
I agree that paternalism without buy-in is a problem, but I would note that LessWrong has historically been in favor of it: Bostrom has weakly advocated for a totalitarian surveillance state for safety reasons, and Yudkowsky is still pointing towards a Pivotal Act which takes full control of the future of the light cone. Which, I think, is why Yudkowsky dances around saying what the Pivotal Act would actually be: it’s the ultimate paternalism without buy-in, and would (rationally!) cause everyone to ally against it.
What changed with the transformer? To some extent, the transformer is really a “smarter” or “better” architecture than the older RNNs. If you do a head-to-head comparison with the same training data, the RNNs do worse.
But also, it’s feasible to scale transformers much bigger than we could scale the RNNs. You don’t see RNNs as big as GPT-2 or GPT-3 simply because it would take too much compute to train them.
You might be interested in looking at the progress being made on the RWKV-LM architecture, if you aren’t following it. It’s an attempt to train an RNN like a transformer. Initial numbers look pretty good.
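To make the scaling point above concrete, here’s a toy sketch (mine, with arbitrary shapes and random weights, not drawn from any of these codebases): the RNN’s hidden state at step t depends on step t-1, so the computation over a training sequence is inherently serial, while self-attention handles every position in a few batched matrix multiplications, which is what makes the very large training runs feasible on parallel hardware.

```python
import numpy as np

# Toy contrast between recurrence and self-attention. Shapes and weights
# are arbitrary; this only illustrates the dependency structure.
seq_len, d = 8, 16
x = np.random.randn(seq_len, d)

# RNN: each step waits on the previous hidden state, so the loop is serial.
W_h, W_x = np.random.randn(d, d), np.random.randn(d, d)
h = np.zeros(d)
for t in range(seq_len):
    h = np.tanh(h @ W_h + x[t] @ W_x)    # step t cannot start before step t-1

# Self-attention: every position attends to every other in one batched pass.
W_q, W_k, W_v = (np.random.randn(d, d) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v
scores = Q @ K.T / np.sqrt(d)            # (seq_len, seq_len), computed at once
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
out = weights @ V                        # (seq_len, d), all positions in parallel
```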
I think the how-to-behave themes of the LessWrong Sequences are at best “often wrong but sometimes motivationally helpful because of how they inspire people to think as individuals and try to help the world”, and at worst “inspiring of toxic relationships and civilizational disintegration.”
I broadly agree with this. I stopped referring people to the Sequences because of it.
One other possible lens for filtering a better Sequences: is the piece relying on Yudkowsky citing the psychology research of his day? He was way too credulous, when the correct amount to update on most social science research of that era was: lol.
Concretely to your project above, though: I think you should remove the whole Why We Fight series. Something to Protect is Yudkowsky typical-minding about where your motivation comes from (and is wrong; lots of people are selfishly motivated, as if Tomorrow is The Gift I Give Myself), and I’ve seen A Sense That More is Possible invoked as Deep Wisdom to justify anything that isn’t the current status quo. Likewise, I think Politics is the Mind Killer should also be removed for similar reasons: whatever its actual content, the phrase has taken on a life of its own, and that interpretation is not helpful.
I want to summarize what’s happened from the point of view of a long-time MIRI donor and supporter:
My primary takeaway from the original post was that MIRI/CFAR had cultish social dynamics, that this led to the spread of short term AI timelines in excess of the evidence, and that voices such as Vassar’s were marginalized (because listening to other arguments would cause them to “downvote Eliezer in his head”). The actual important parts of this whole story are a) the rationalistic health of these organizations, and b) the (possibly improper) memetic spread of the short timelines narrative.
It has been months since the OP, but my recollection is that Jessica posted this memoir and got a ton of upvotes; then you posted your comment claiming that being around Vassar induced psychosis, the karma on Jessica’s post dropped by half, and your comment that Vassar had magical psychosis-inducing powers is currently sitting at almost five and a half times the karma of the OP. At this point, things became mostly derailed into psychodrama about Vassar, drugs, whether transgender people have higher rates of psychosis, et cetera, instead of discussion about the health of these organizations and how short AI timelines came to be the dominant assumption in this community.
I do not actually care about the Vassar matter per se. I think you should try to make amends with him and Jessica, and I trust that you will attempt to do so. But all the personal drama is inconsequential next to the question of whether MIRI and CFAR have good epistemics and how the short timelines meme became widely believed. I would ask that any amends you try to make also address the fact that your comment derailed these very vital discussions.
That sort of thinking is why we’re where we are right now.
Be the change you wish to see in the world.
I have no idea how that cashes out game theoretically. There is a difference between moving from the mutual cooperation square to one of the exploitation squares, and moving from an exploitation square to mutual defection. The first defection is worse because it breaks the equilibrium, while the defection in response is a defensive play.
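To put numbers on that asymmetry, here’s the textbook Prisoner’s Dilemma payoff matrix (standard illustrative values, not anything from the thread):

```python
# Standard Prisoner's Dilemma payoffs as (row player, column player); the
# numbers are the usual textbook values, used only to show the asymmetry
# between a first defection and a defensive one.
payoffs = {
    ("C", "C"): (3, 3),  # mutual cooperation
    ("C", "D"): (0, 5),  # row player exploited
    ("D", "C"): (5, 0),  # row player exploits
    ("D", "D"): (1, 1),  # mutual defection
}

# First defection: leaving mutual cooperation to exploit a cooperator.
# The defector gains 2 while the cooperator loses 3, and the equilibrium breaks.
print(payoffs[("D", "C")][0] - payoffs[("C", "C")][0])  # +2 for the defector
print(payoffs[("D", "C")][1] - payoffs[("C", "C")][1])  # -3 for the cooperator

# Defensive defection: the exploited player switching to defect only moves the
# game from (0, 5) to (1, 1), recovering from exploitation rather than
# betraying a cooperator.
print(payoffs[("D", "D")][0] - payoffs[("C", "D")][0])  # +1 for the defender
```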
swarriner’s post, including the tone, is True and Necessary.
It’s just plain wrong that we have to live in an adversarial communicative environment where we can’t just take claims at face value without considering political-tribe-maneuvering implications.
Oh? Why is it wrong and what prevents you from ending up in this equilibrium in the presence of defectors?
More generally, I have ended up thinking people play zero-sum status games because they enjoy playing zero-sum status games; evolution would make us enjoy that. This would imply that coordination beats epistemics, and historically that’s been true.
[The comment this was a response to has disappeared and left this orphaned? Leaving my reply up.]
But there’s no reason to believe that it would work out like this. He presents no argument for the above, just pure moral platitudes. It seems like a pure fantasy.
As I pointed out in the essay, if I were running one of the organizations accepting those donations and offering those prizes, I would selectively list only those targets who I am genuinely satisfied are guilty of the violation of the “non-aggression principle.” But as a practical matter, there is no way that I could stop a DIFFERENT organization from being set up and operating under DIFFERENT moral and ethical principles, especially if it operated anonymously, as I anticipate the “Assassination Politics”-type systems will be. Thus, I’m forced to accept the reality that I can’t dictate a “strongly limited” system that would “guarantee” no “unjustified” deaths: I can merely control my little piece of the earth and not assist in the abuse of others. I genuinely believe, however, that the operation of this system would be a vast improvement over the status quo.
Bell’s organization acts as (a), where it can dictate who is and is not a valid moral target. If we are talking about purely anonymous, uncontrolled markets (and I assume we both are, since I separated them from (a) and you’re referring to anonymous markets on Ethereum), then we should instead expect them to be used to usher in hell.
Mu.
The unpopular answer is that Dath Ilan is a fantasy setting. It treats economics as central, when economics is really downstream of power. Your first question implies you understand that whatever “econoliteracy” is, it isn’t a stable equilibrium. Your second question notices that governments are powerful enough to stop these experiments which are a threat to their power.
My background assumption is that any attempt at building prediction markets would either:
a) …have little effect, because it becomes another mechanism for actual power to manipulate procedural outcomes, most likely through selective subsidies, manipulation of the monetary supply, or education or social pressure resulting in all right-minded people voting the way power centers want (i.e., how things work today).
b) …be used as a coordination point for a Point Deer, Call Horse-style coup (see also: how publicly betting on cockfights can be more about signaling alliances than making predictions).
c) …devolve into Jim Bell’s Assassination Markets, because there actually isn’t a way for power elites to prevent some markets from being made (and we should expect any general way to prevent some markets from being made to reduce back to (a)).
you just need to find the experts they’re anchoring on.
I believe we are in the place we are in because Musk is listening to and considering the arguments of experts. Contra Yudkowsky, there is no Correct Contrarian Cluster: while Yudkowsky and Bostrom make a bunch of good and convincing arguments about the dangers of AI, the alignment problem, and even shorter timelines, I’ve always found any discussion of human values, or psychology, or even how coordination works, to be one giant missing mood.
(Here’s a tangential but recent example: Yudkowsky wrote his Death with Dignity post. As far as I can tell, the real motivating point was “Please don’t do idiotic things like blowing up an Intel fab because you think it’s the consequentialist thing to do, when you aren’t thinking about the second order consequences which will completely overwhelm any ‘good’ you might have achieved.” Instead, he used the Death with Dignity frame, which didn’t actually land with people. Hell, my first-read reaction was “this is all bullshit, you defeatist idiot, I am going down swinging” before I did a second read and tried to work a defensible point out of the text.)
My model of what happened was that Musk read Superintelligence, thought: this is true, this is true, this is true, this point is questionable, this point is total bullshit...how do I integrate all this together?
This response is enraging.
Here is someone who has attempted to grapple with the intellectual content of your ideas, and your response is “This is kinda long.”? I shouldn’t be that surprised because, IIRC, you said something similar in response to Zack Davis’ essays on the Map and Territory distinction, but that was ancillary material, while AI is core to your memeplex.
I have heard repeated claims that people don’t engage with the alignment communities’ ideas (recent example from yesterday). But here is someone who did the work. Please explain why your response here does not cause people to believe there’s no reason to engage with your ideas because you will brush them off. Yes, nutpicking e/accs on Twitter is much easier and probably more hedonic, but they’re not convincible and Quinton here is.