LessWrong team member / moderator. I’ve been a LessWrong organizer since 2011, with roughly equal focus on the cultural, practical and intellectual aspects of the community. My first project was creating the Secular Solstice and helping groups across the world run their own version of it. More recently I’ve been interested in improving my own epistemic standards and helping others to do so as well.
Raemon (Raymond Arnold)
This was also my read (and, while I don’t have links on hand and might be misremembering, I think he has other Twitter threads that basically state this explicitly)
What has come of it? Besides perhaps some fun prediction websites.
fwiw I actually think the state of prediction markets has, at this point, meaningfully improved my decision-making. (e.g. during the Ukraine nuclear scare, having explicit prediction markets tracking the likelihood of a nuclear strike made it a lot easier to set triggers for evacuation plans. And seeing the prediction markets on various AI-related capabilities has been helpful for orienting to the world)
I’m at ~40% on “4 years from now, I’ll think it was clearly the right call for alignment folk to just stop working at OpenAI, completely.”
But, I think it’s much more likely that I’ll continue endorsing something like “Treat OpenAI as a manipulative adversary by default; do not work there or deal with them unless you have a concrete plan for how you are being net positive. And because there’s a lot of optimization power in their company, be pretty skeptical that any plans you make will work. Do not give them free resources (like inviting them to EA Global or job fairs).”
I think it’s nonetheless good to have some kind of “stated terms” for what actions OpenAI / Sam etc could take that might make it more worthwhile to work with them in the future (or, to reduce active opposition to them). Ultimately, I think OpenAI is on track to destroy the world, and I think actually stopping them will somehow require their cooperation at some point. So I don’t think I’d want to totally burn bridges.
But I also don’t think there’s anything obvious Sam or OpenAI can do to “regain trust.” I think the demonstrated actions with the NDAs, and Sam’s deceptive non-apology, mean they’ve lost the ability to credibly signal good faith.
...
Some background:
Last year, when I was writing “Carefully Bootstrapped Alignment” is organizationally hard, I chatted with people at various AI labs.
I came away with the impression that Anthropic kinda has a culture/leadership that (might, possibly) be worth investing in (though I’d still need to see more proactive positive steps to really trust them), and that DeepMind was in a weird state where its culture wasn’t very unified, but the leadership seemed at least vaguely in the right place.
I still had a lot of doubts about those companies, but when I talked to people I knew there, I got at least some sense that there was an internal desire to be safety-conscious.
When I talked to people at OpenAI, the impression I came away with was “there’s really no hope of changing the culture there. Do not bother trying.”
(I think the people I talked to at all orgs were generally not optimistic about changing culture, and instead more focused on developing standards that could eventually turn into regulations, which would make it harder for the orgs to back out of agreements)
That was last year, before the seriousness of the nondisparagement clauses and the pressure put on people became more clear-cut. And before reading Zach’s post about how AI companies aren’t really using external evaluators.
I’m reading this as you saying something like “I’m trying to build a practical org that successfully onramps people into doing useful work. I can’t actually do that for arbitrary domains that people aren’t providing funding for. I’m trying to solve one particular part of the problem and that’s hard enough as it is.”
Is that roughly right?
Fwiw I appreciate your Manifund regrantor Request for Proposals announcement.
I’ll probably have more thoughts later.
Thanks for the thoughts (no need to be nervous about arguing against a post – that’s kinda the whole point of the site)
For an example of what I mean, here’s another post on a pretty similar subject, by someone with experience seeing how it played out at different large companies (Dan Luu)
One thing it took me quite a while to understand is how few bits of information it’s possible to reliably convey to a large number of people. When I was at MS, I remember initially being surprised at how unnuanced their communication was, but it really makes sense in hindsight.
For example, when I joined Azure, I asked people what the biggest risk to Azure was and the dominant answer was that if we had more global outages, major customers would lose trust in us and we’d lose them forever, permanently crippling the business.
Meanwhile, the only message VPs communicated was the need for high velocity. When I asked why there was no communication about the thing considered the highest risk to the business, the answer was if they sent out a mixed message that included reliability, nothing would get done.
The fear was that if they said that they needed to ship fast and improve reliability, reliability would be used as an excuse to not ship quickly and needing to ship quickly would be used as an excuse for poor reliability and they’d achieve none of their goals.
When I first heard this, I thought it was odd, but having since paid attention to what happens when VPs and directors attempt to communicate information downwards, I have to concede that it seems like the MS VPs were right and nuanced communication usually doesn’t work at scale.
I’ve seen quite a few people in upper management attempt to convey a mixed/nuanced message since my time at MS and I have yet to observe a case of this working in a major org at a large company (I have seen this work at a startup, but that’s a very different environment).
I’ve noticed this problem with my blog as well. E.g., I have some posts saying BigCo $ is better than startup $ for p50 and maybe even p90 outcomes and that you should work at startups for reasons other than pay. People often read those posts as “you shouldn’t work at startups”.
I see this for every post, e.g., when I talked about how latency hadn’t improved, one of the most common responses I got was about how I don’t understand the good reasons for complexity. I literally said there are good reasons for complexity in the post!
As noted previously, most internet commenters can’t follow constructions as simple as an AND, and I don’t want to be in the business of trying to convey what I’d like to convey to people who won’t bother to understand an AND, since I’d rather convey nuance.
But that’s because, if I write a blog post and 5% of HN readers get it and 95% miss the point, I view that as a good outcome, since it was useful for 5% of people. And if you want to convey nuanced information to everyone, I think that’s impossible, and I don’t want to lose the nuance.
If people won’t read a simple AND, there’s no way to simplify a nuanced position (which will be much more complex) enough that people in general will follow it, so it’s a choice between conveying nuance to people who will read and avoiding nuance since most people don’t read.
But it’s different if you run a large org. If you send out a nuanced message and 5% of people get it and 95% of people do contradictory things because they understood different parts of the message, that’s a disaster. I see this all the time when VPs try to convey nuance.
BTW, this is why, despite being widely mocked, “move fast & break things” can be a good value. It conveys which side of the trade-off people should choose. A number of companies I know of have put velocity & reliability/safety/etc. into their values and it’s failed every time.
MS leadership eventually changed the message from velocity to reliability. First one message, then the next. Not both at once. When I checked a while ago, measured by a 3rd party, Azure reliability was above GCP and close enough to AWS that it stopped being an existential threat.
Azure has, of course, also lapped Google on enterprise features & sales and is a solid #2 in cloud despite starting with infrastructure that was a decade behind Google’s, technically. I can’t say that I enjoyed working for Azure, but I respect the leadership and learned a lot.
One motivating example at the time was seeing how EA community organizers/leaders had lots of trouble communicating nuanced ideas. For example, “EA is talent constrained” was how one blogpost summarized “EA needs more extremely talented people in particular domains, more than it needs marginal money, right now.” But people heard it as “EA needs people who are talented… I’m talented!” and then felt frustrated when they tried to apply for jobs, when what the post originally meant was specific talent gaps.
<3
Yeah. This prompts me to make a brief version of a post I’d had on my TODO list for a while:
“In the 21st century, being quick and competent at ‘orienting’ is one of the most important skills.”
(in the OODA Loop sense, i.e. observe → orient → decide → act)
We don’t know exactly what’s coming with AI or other technologies. We can make plans informed by our best guesses, but we should be on the lookout for things that should prompt some kind of strategic orientation. @jacobjacob has helped prioritize noticing things like “LLMs are pretty soon going to affect the strategic landscape; we should be ready to take advantage of the technology and/or respond to a world where other people are doing that.”
I like Robert’s comment here because it feels skillful at noticing a subtle thing that is happening, and promoting it to strategic attention. The object-level observation seems important and I hope people in the AI landscape get good at this sort of noticing.
It also feels kinda related to the original context of OODA-looping, which was about fighter pilots dogfighting. One of the skills was “get inside the enemy’s OODA loop and disrupt their ability to orient.” If this were intentional on OpenAI’s part (or part of a subconscious strategy), it’d be a kinda clever attempt to disrupt our observation step.
I agree with this overall point, although I think “trade secrets” in the domain of AI can be relevant for people having surprising timelines views that they can’t talk about.
Ah yeah, that actually seems like maybe a good format, given that the event-in-question I’m preparing for is “a blogging festival”. There is some tension with one of my goals being “make something that makes for an interesting in-person event” (we sorta made our jobs hard by framing an in-person event around blogging), although I think something like “get two attendees to do this sort of debate framework beforehand, and then have an interviewer/facilitator run a ‘takeaways discussion panel’” might be good.
Copying the text here for convenience:
Here’s a debate protocol that I’d like to try. Both participants independently write statements of up to 10K words and send them to each other at the same time. (This can be done through an intermediary, to make sure both statements are sent before either is received.) Then they take a day to revise their statements, fixing the uncovered weak points and preemptively attacking the other’s weak points, and send them to each other again. This continues for multiple rounds, until both participants feel they have expressed their position well and don’t need to revise more, reaching a kind of Nash equilibrium. Then the final revisions of both statements are released to the public, side by side.
Note that in this kind of debate the participants don’t try to change each other’s mind. They just try to write something that will eventually sway the public. But they know that if they write wrong stuff that the other side can easily disprove, they won’t sway the public. So only the best arguments remain, within the size limit.
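If you squint, the protocol is basically a fixed-point loop. Here’s a rough sketch of the mechanics (the revise_a / revise_b callbacks are hypothetical stand-ins for whatever the human participants actually do each round; none of this is from the original comment):

```python
from typing import Callable

# Given your current draft and the opponent's latest statement, return a
# revised draft (or the same draft unchanged if you're done revising).
Reviser = Callable[[str, str], str]

def run_debate(a: str, b: str, revise_a: Reviser, revise_b: Reviser,
               max_rounds: int = 20) -> tuple[str, str]:
    """Simultaneous exchange + revision, repeated until neither statement changes."""
    for _ in range(max_rounds):
        # Both revisions are computed from the previous round's statements,
        # mimicking the simultaneous exchange through an intermediary.
        new_a, new_b = revise_a(a, b), revise_b(b, a)
        if new_a == a and new_b == b:
            break  # the "Nash equilibrium": neither side wants to revise further
        a, b = new_a, new_b
    return a, b  # final statements, released to the public side by side
```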
I think maybe things like this should just actually be “private tags” that are designed such that they don’t carry the weight of the site’s voice (which people have asked about over the years, for various reasons)
New concept for my “qualia-first calibration” app idea that I just crystallized. The following are all the same “type”:
1. “this feels 10% likely”
2. “this feels 90% likely”
3. “this feels exciting!”
4. “this feels confusing :(”
5. “this is coding related”
6. “this is gaming related”
All of them are a thing you can track: “when I observe this, my predictions turn out to come true N% of the time”.
Numerical probabilities are merely a special case (tho they still get additional tooling, since it’s easier to visualize graphs and calculate Brier scores for them)
And then a major goal of the app is to come up with good UI to help you visualize and compare results for the “non-numeric-qualia”.
Depending on circumstances, “this feels confusing” might be a way more important prior to track than “this feels 90% likely”. (I’m guessing there is some actual conceptual/mathy work that would need doing to build the mature version of this)
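A rough sketch of what the underlying data model might look like (the names are made up, not from any existing app; the point is just that every tag, numeric or not, gets the same “how often did predictions with this tag come true?” treatment, with Brier scores as extra tooling for the numeric special case):

```python
from dataclasses import dataclass

@dataclass
class Prediction:
    """One prediction, tagged with whatever qualia were present when it was made."""
    tags: list[str]                    # e.g. ["feels 90% likely", "exciting!", "coding related"]
    numeric_prob: float | None = None  # set only when one of the tags is an explicit probability
    outcome: bool | None = None        # filled in once the prediction resolves

def hit_rate_by_tag(predictions: list[Prediction]) -> dict[str, float]:
    """For each tag: of the resolved predictions carrying that tag, what fraction came true."""
    outcomes: dict[str, list[bool]] = {}
    for p in predictions:
        if p.outcome is None:
            continue
        for tag in p.tags:
            outcomes.setdefault(tag, []).append(p.outcome)
    return {tag: sum(results) / len(results) for tag, results in outcomes.items()}

def brier_score(predictions: list[Prediction]) -> float:
    """Extra tooling for the numeric special case: mean squared error of stated probability vs outcome."""
    resolved = [p for p in predictions if p.numeric_prob is not None and p.outcome is not None]
    if not resolved:
        return float("nan")
    return sum((p.numeric_prob - float(p.outcome)) ** 2 for p in resolved) / len(resolved)

# usage: hit_rate_by_tag(history)["this feels confusing :("] -> e.g. 0.35
```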
“Can we build a better Public Doublecrux?”
Something I’d like to try at LessOnline is to somehow iterate on the “Public Doublecrux” format.
Public Doublecrux is a more truthseeking-oriented version of Public Debate. (The goal of a debate is to change your opponent’s mind or the public’s mind. The goal of a doublecrux is more like “work with your partner to figure out if you should change your mind, and vice versa”)
Reasons to want to do _public_ doublecrux include:
– it helps showcase subtle mental moves that are hard to write down explicitly (i.e. tacit knowledge transfer)
– there’s still something good and exciting about seeing high-profile smart people talk about ideas. Having some variant of that format seems good for LessOnline. And having at least 1-2 “doublecruxes” rather than “debates” or “panels” or “interviews” seems good for culture setting.
Historically I think public doublecruxes have had some problems:
– two people actually trying to change *their own* minds tend to get into idiosyncratic frames that are hard for observers to understand. You’re chasing *your* cruxes, rather than presenting “generally compelling arguments.” This tends to get into the weeds and go down rabbit holes.
– having the audience there makes it a bit more awkward and performative.
...
...
With that in mind, here are some ideas:
– Maybe have the doublecruxers in a private room, with video cameras. The talk is broadcast live to other conference-goers, but the actual chat is in a nice cozy room.
– Have _two_ (or three?) dedicated facilitators. One is in the room with the doublecruxers, focused on helping them steer towards useful questions (this has been tried before and seems to go well if the facilitator prepares). The SECOND (and maybe third) facilitator hangs out with the audience outside, and is focused on tracking “what is the audience confused about?”. The audience participates in a live google doc where they’re organizing the conversational threads and asking questions.
(the first facilitator is periodically surreptitiously checking the google doc or chat, and sometimes asking the doublecruxers questions about it)
– It’s possibly worth investing in developing a doublecrux process that’s explicitly optimized for public consumption. This might be as simple as having the facilitator periodically ask participants to recap the open threads, what the goal of the current rabbit hole is, etc. But, like, brainstorming and doing “user tests” of it might be worthwhile.
...
Anyway those are some thoughts for now. Curious if anyone’s got takes.
So there’s “being honest” and “trying to convince people of things you think are true”, and I think those are at least somewhat different projects. I feel like the first is more obviously good than the second.
I would first ask “what’s my goal?” (and double-check why it’s your goal, and whether you’re being honest with yourself). Like, “I want to be able to say my true thoughts out loud and have an honest, open relationship with my relatives” is different from “I don’t want my relatives to believe false things” (the win-condition for the former is about you, the latter is about them). The latter is subtly different from “I want to have presented my best case to them, one that they’ll actually listen to, but then let them make up their own mind.”
I’d also note there are additional soft skills you can gain like:
– feeling safe/nonjudgmental to talk to
– making it feel safe for people to give up ideology (via living-through-example as someone who is happy without being religious)
– helping people grieve/orient
Young people (metaphorically or literally) are welcome!
Are the disagree reacts disagreeing with “small icons are good for this reason (enough to override other concerns)”, or with “I didn’t update previously”?
I aspire to a kind of honesty that’s similar to what’s described here. I thought maybe this post was going overboard, but then it kept including caveats that feel similar to the caveats and specifics I go for.
One thing I might add or rephrase:
I think doing a good job with honesty, and having it be actually helpful, includes having a bunch of related soft social skills.
Sometimes the truth hurts people (which might in turn hurt you). One attitude here is “whelp, then either I must not care as much about truth as I thought (because I’m not willing to inflict or take on that hurt), or I’m just going to deal with a bunch of random costs for sticking with the truth”. But another attitude is “learn the goddamn communication skills to present important truths in a way that hurts less.”
(while you’re still gaining those skills, one solution is various flavors of meta-honesty, which you touch on here, i.e. being clear to people “hey, I won’t directly lie, and I will try to tell you useful, unbiased info, but I won’t always go out of my way to do so”. Another is to be like “nope, I’mma be deeply honest all the time, even when I’m too clumsy to do it without causing harm”, which comes with upsides and downsides)
There are soft skills in “communicating to others without hurting them” (i.e. “tact”), and there are also soft skills for absorbing information that might otherwise have hurt you, without getting hurt (i.e. “thick skin”). Both seem worth investing in, if you want a world with more honesty in it.
I’ve recently updated on how useful it’d be to have small icons representing users. Previously some people were like “it’ll help me scan the comment section for people!” and I was like “...yeah that seems true, but I’m scared of this site feeling like Facebook, or worse, LinkedIn.”
I’m not sure whether that was the right tradeoff, but I was recently sold after realizing how space-efficient it is for showing lots of commenters. Like, in Slack or Facebook, you’ll see things like [screenshot of small user avatars next to messages].

This’d be really helpful, especially in the Quick Takes and Popular Comments sections, where you can see which people you know/like have commented on a thing.
How old are your kids? (Also, how much experience do you have, how many times has this happened?)
I don’t have advice-born-of-experience, but I have some guesses that depend a bit on the context.
I maybe want to clarify: there will still be presentations at LessOnline, we’re just trying to design the event such that they’re clearly more of a secondary thing.
Man I just want to say I appreciate you following up on each subthread and noting where you agree/disagree, it feels earnestly truthseeky to me.