This is Dr. Andrew Critch’s professional LessWrong account. Andrew is the CEO of Encultured AI, and works for ~1 day/week as a Research Scientist at the Center for Human-Compatible AI (CHAI) at UC Berkeley. He also spends around a ½ day per week volunteering for other projects like the Berkeley Existential Risk Initiative and the Survival and Flourishing Fund. Andrew earned his Ph.D. in mathematics at UC Berkeley studying applications of algebraic geometry to machine learning models. During that time, he cofounded the Center for Applied Rationality and SPARC. Dr. Critch has been offered university faculty and research positions in mathematics, mathematical biosciences, and philosophy, has worked as an algorithmic stock trader at Jane Street Capital’s New York City office, and has served as a Research Fellow at the Machine Intelligence Research Institute. His current research interests include logical uncertainty, open source game theory, and mitigating race dynamics between companies and nations in AI development.
Katja, many thanks for writing this, and Oliver, thanks for this comment pointing out that everyday people are in fact worried about AI x-risk. Since around 2017, when I left MIRI to rejoin academia, I have been continually trying to point out that everyday people can easily understand the case for AI x-risk, and that it’s incorrect to assume the existence of AI x-risk can only be understood by a very small and select group of people. My arguments have often rested on the same kind of evidence as yours here: in my case, informal conversations with Uber drivers, random academics, and people at random public social events. Plus, the argument is very simple: if things are smarter than us, they can outsmart us and cause us trouble. It has always seemed strange to me to say there’s an “inferential gap” of substance here.
However, for some reason, the idea that people outside the LessWrong community might recognize the existence of AI x-risk, and might therefore be worth coordinating with on the issue, has not only been poorly received on LessWrong, but has also felt fraught to even suggest. For instance, I tried to point it out in this previous post:
“Pivotal Act” Intentions: Negative Consequences and Fallacious Arguments
I wrote the following, targeting multiple LessWrong-adjacent groups in the EA/rationality communities who thought “pivotal acts” with AGI were the only sensible way to reduce AI x-risk:
That particular statement was very poorly received, with a 139-karma retort from John Wentworth arguing:
I’m not sure what’s going on here, but it seems to me that the idea of coordinating with “outsiders”, or placing trust or hope in the judgement of “outsiders”, has been a bit of a taboo here, and that arguments that outsiders are dumb or wrong or can’t be trusted will reliably get a lot of cheering in the form of Karma.
Thankfully, it now also seems to me that perhaps the core LessWrong team has started to think that ideas from outsiders matter more to the LessWrong community’s epistemics and/or ability to get things done than previously represented. For instance, for the first time, the 2021 LessWrong Review, posted just a few weeks ago, includes material written outside LessWrong:
https://www.lesswrong.com/posts/qCc7tm29Guhz6mtf7/the-lesswrong-2021-review-intellectual-circle-expansion
I consider this a move in a positive direction, but I am wondering if I can draw the LessWrong team’s attention to a more serious trend here. @Oliver, @Raemon, @Ruby, @Ben Pace, and others engaged in curating and fostering intellectual progress on LessWrong:
Could it be that the LessWrong community, or the EA community, or the rationality community, has systematically discounted the opinions and/or agency of people outside that community, in a way that has led the community to plan more drastic actions in the world than would otherwise be reasonable if outsiders of that community could also be counted upon to take reasonable actions?
This is a leading question, and my gut and my deliberate reasoning have both been screaming “yes” at me for about 5 or 6 years straight, but I am trying to get you guys to take a fresh look at this hypothesis now, in question form. Thanks in any case for considering it.