AI Governance

Last edit: Feb 8, 2025, 12:32 AM by lesswrong-internal

AI Governance asks how we can ensure that society at large benefits from increasingly powerful AI systems. While solving technical AI alignment is a necessary step towards this goal, it is by no means sufficient.

Governance draws on policy, economics, sociology, law, and many other fields.

AI policy ideas: Reading list

Zach Stein-Perlman · Apr 17, 2023, 7:00 PM
24 points
7 comments · 4 min read · LW link

What an actually pessimistic containment strategy looks like

lc · Apr 5, 2022, 12:19 AM
679 points
138 comments · 6 min read · LW link · 2 reviews

Speaking to Congressional staffers about AI risk

Dec 4, 2023, 11:08 PM
307 points
25 comments · 15 min read · LW link · 1 review

Ways I Expect AI Regulation To Increase Extinction Risk

1a3orn · Jul 4, 2023, 5:32 PM
225 points
32 comments · 7 min read · LW link

On MAIM and Superintelligence Strategy

Zvi · Mar 14, 2025, 12:30 PM
51 points
2 comments · 13 min read · LW link
(thezvi.wordpress.com)

What would a compute monitoring plan look like? [Linkpost]

Orpheus16 · Mar 26, 2023, 7:33 PM
158 points
10 comments · 4 min read · LW link
(arxiv.org)

Should we postpone AGI until we reach safety?

otto.barten · Nov 18, 2020, 3:43 PM
27 points
36 comments · 3 min read · LW link

Helen Toner on China, CSET, and AI

Rob Bensinger · Apr 21, 2019, 4:10 AM
68 points
4 comments · 7 min read · LW link
(rationallyspeakingpodcast.org)

Reactions to the Executive Order

Zvi · Nov 1, 2023, 8:40 PM
77 points
4 comments · 29 min read · LW link
(thezvi.wordpress.com)

RTFB: On the New Proposed CAIP AI Bill

Zvi · Apr 10, 2024, 6:30 PM
119 points
14 comments · 34 min read · LW link
(thezvi.wordpress.com)

News: Biden-Harris Administration Secures Voluntary Commitments from Leading Artificial Intelligence Companies to Manage the Risks Posed by AI

Jonathan Claybrough · Jul 21, 2023, 6:00 PM
65 points
10 comments · 2 min read · LW link
(www.whitehouse.gov)

Compute Thresholds: proposed rules to mitigate risk of a “lab leak” accident during AI training runs

davidad · Jul 22, 2023, 6:09 PM
80 points
2 comments · 2 min read · LW link

AI labs’ statements on governance

Zach Stein-Perlman · Jul 4, 2023, 4:30 PM
30 points
0 comments · 36 min read · LW link

We’re Not Ready: thoughts on “pausing” and responsible scaling policies

HoldenKarnofsky · Oct 27, 2023, 3:19 PM
200 points
33 comments · 8 min read · LW link

Response to Aschenbrenner’s “Situational Awareness”

Rob Bensinger · Jun 6, 2024, 10:57 PM
194 points
27 comments · 3 min read · LW link

Where are the red lines for AI?

Karl von Wendt · Aug 5, 2022, 9:34 AM
26 points
10 comments · 6 min read · LW link

President Biden Issues Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence

Tristan Williams · Oct 30, 2023, 11:15 AM
171 points
39 comments · 1 min read · LW link
(www.whitehouse.gov)

Actionable-guidance and roadmap recommendations for the NIST AI Risk Management Framework

May 17, 2022, 3:26 PM
26 points
0 comments · 3 min read · LW link

List of requests for an AI slowdown/halt.

Cleo Nardo · Apr 14, 2023, 11:55 PM
46 points
6 comments · 1 min read · LW link

An upcoming US Supreme Court case may impede AI governance efforts

NickGabs · Jul 16, 2023, 11:51 PM
57 points
17 comments · 2 min read · LW link

If-Then Commitments for AI Risk Reduction [by Holden Karnofsky]

habryka · Sep 13, 2024, 7:38 PM
28 points
0 comments · 20 min read · LW link
(carnegieendowment.org)

Soft takeoff can still lead to decisive strategic advantage

Daniel Kokotajlo · Aug 23, 2019, 4:39 PM
122 points
47 comments · 8 min read · LW link · 4 reviews

[Question] Where are people thinking and talking about global coordination for AI safety?

Wei Dai · May 22, 2019, 6:24 AM
112 points
22 comments · 1 min read · LW link

[Question] Would it be good or bad for the US military to get involved in AI risk?

Grant Demaree · Jan 1, 2023, 7:02 PM
50 points
12 comments · 1 min read · LW link

The Regulatory Option: A response to near 0% survival odds

Matthew Lowenstein · Apr 11, 2022, 10:00 PM
46 points
21 comments · 6 min read · LW link

Cruxes on US lead for some domestic AI regulation

Zach Stein-Perlman · Sep 10, 2023, 6:00 PM
26 points
3 comments · 2 min read · LW link

China-AI forecasts

NathanBarnard · Feb 25, 2024, 4:49 PM
39 points
29 comments · 6 min read · LW link

New voluntary commitments (AI Seoul Summit)

Zach Stein-Perlman · May 21, 2024, 11:00 AM
81 points
17 comments · 7 min read · LW link
(www.gov.uk)

Mitigating extreme AI risks amid rapid progress [Linkpost]

Orpheus16 · May 21, 2024, 7:59 PM
21 points
7 comments · 4 min read · LW link

The Tech Industry is the Biggest Blocker to Meaningful AI Safety Regulations

garrison · Aug 16, 2024, 7:37 PM
22 points
1 comment · 1 min read · LW link
(garrisonlovely.substack.com)

The Sugar Alignment Problem

Adam Zerner · Dec 24, 2023, 1:35 AM
5 points
3 comments · 7 min read · LW link

The Defence production act and AI policy

NathanBarnard · Mar 1, 2024, 2:26 PM
37 points
0 comments · 2 min read · LW link

AGI will be made of heterogeneous components, Transformer and Selective SSM blocks will be among them

Roman Leventov · Dec 27, 2023, 2:51 PM
33 points
9 comments · 4 min read · LW link

OpenAI’s Preparedness Framework: Praise & Recommendations

Orpheus16 · Jan 2, 2024, 4:20 PM
66 points
1 comment · 7 min read · LW link

The Schumer Report on AI (RTFB)

Zvi · May 24, 2024, 3:10 PM
34 points
3 comments · 36 min read · LW link
(thezvi.wordpress.com)

Guide to SB 1047

Zvi · Aug 20, 2024, 1:10 PM
71 points
18 comments · 53 min read · LW link
(thezvi.wordpress.com)

(4 min read) An intuitive explanation of the AI influence situation

trevor · Jan 13, 2024, 5:34 PM
12 points
26 comments · 4 min read · LW link

Talking to Congress: Can constituents contacting their legislator influence policy?

Tristan Williams · Mar 7, 2024, 9:24 AM
14 points
0 comments · 1 min read · LW link

[Question] What does it look like for AI to significantly improve human coordination, before superintelligence?

Bird Concept · Jan 15, 2024, 7:22 PM
22 points
2 comments · 1 min read · LW link

Invitation to lead a project at AI Safety Camp (Virtual Edition, 2025)

Aug 23, 2024, 2:18 PM
17 points
2 comments · 4 min read · LW link

Pausing AI is Positive Expected Value

Liron · Mar 10, 2024, 5:10 PM
9 points
2 comments · 3 min read · LW link
(twitter.com)

OpenAI: Fallout

Zvi · May 28, 2024, 1:20 PM
204 points
25 comments · 36 min read · LW link
(thezvi.wordpress.com)

My guess at Conjecture’s vision: triggering a narrative bifurcation

Alexandre Variengien · Feb 6, 2024, 7:10 PM
75 points
12 comments · 16 min read · LW link

Thoughts on SB-1047

ryan_greenblatt · May 29, 2024, 11:26 PM
60 points
1 comment · 11 min read · LW link

Many arguments for AI x-risk are wrong

TurnTrout · Mar 5, 2024, 2:31 AM
159 points
87 comments · 12 min read · LW link

My (current) model of what an AI governance researcher does

Johan de Kock · Aug 26, 2024, 5:58 PM
1 point
2 comments · 5 min read · LW link

Transformative trustbuilding via advancements in decentralized lie detection

trevor · Mar 16, 2024, 5:56 AM
20 points
10 comments · 38 min read · LW link
(www.ncbi.nlm.nih.gov)

SB 1047: Final Takes and Also AB 3211

Zvi · Aug 27, 2024, 10:10 PM
92 points
11 comments · 21 min read · LW link
(thezvi.wordpress.com)

Explaining the Joke: Pausing is The Way

WillPetillo · Apr 4, 2025, 9:04 AM
24 points
2 comments · 10 min read · LW link

Verification methods for international AI agreements

Orpheus16 · Aug 31, 2024, 2:58 PM
14 points
1 comment · 4 min read · LW link
(arxiv.org)

Paul Christiano named as US AI Safety Institute Head of AI Safety

Joel Burget · Apr 16, 2024, 4:22 PM
256 points
58 comments · 1 min read · LW link
(www.commerce.gov)

AXRP Episode 28 - Suing Labs for AI Risk with Gabriel Weil

DanielFilan · Apr 17, 2024, 9:42 PM
12 points
0 comments · 65 min read · LW link

The Dissolution of AI Safety

Roko · Dec 12, 2024, 10:34 AM
8 points
44 comments · 1 min read · LW link
(www.transhumanaxiology.com)

Q&A on Proposed SB 1047

Zvi · May 2, 2024, 3:10 PM
74 points
8 comments · 44 min read · LW link
(thezvi.wordpress.com)

[Question] Have any parties in the current European Parliamentary Election made public statements on AI?

MondSemmel · May 10, 2024, 10:22 AM
9 points
0 comments · 1 min read · LW link

Advice for Activists from the History of Environmentalism

Jeffrey Heninger · May 16, 2024, 6:40 PM
100 points
8 comments · 6 min read · LW link
(blog.aiimpacts.org)

What is SB 1047 *for*?

Raemon · Sep 5, 2024, 5:39 PM
61 points
8 comments · 3 min read · LW link

Pay Risk Evaluators in Cash, Not Equity

Adam Scholl · Sep 7, 2024, 2:37 AM
212 points
19 comments · 1 min read · LW link

The Sorry State of AI X-Risk Advocacy, and Thoughts on Doing Better

Thane Ruthenis · Feb 21, 2025, 8:15 PM
148 points
51 comments · 6 min read · LW link

AI governance needs a theory of victory

Jun 21, 2024, 4:15 PM
45 points
8 comments · 1 min read · LW link
(www.convergenceanalysis.org)

My takes on SB-1047

leogao · Sep 9, 2024, 6:38 PM
151 points
8 comments · 4 min read · LW link

Schelling points in the AGI policy space

mesaoptimizer · Jun 26, 2024, 1:19 PM
52 points
2 comments · 6 min read · LW link

Introduction to French AI Policy

Lucie Philippon · Jul 4, 2024, 3:39 AM
110 points
12 comments · 6 min read · LW link

Alignment can be the ‘clean energy’ of AI

Feb 22, 2025, 12:08 AM
67 points
8 comments · 8 min read · LW link

Advice to junior AI governance researchers

Orpheus16 · Jul 8, 2024, 7:19 PM
66 points
1 comment · 5 min read · LW link

How much to update on recent AI governance moves?

Nov 16, 2023, 11:46 PM
112 points
5 comments · 29 min read · LW link

New page: Integrity

Zach Stein-Perlman · Jul 10, 2024, 3:00 PM
91 points
3 comments · 1 min read · LW link

An AI Race With China Can Be Better Than Not Racing

niplav · Jul 2, 2024, 5:57 PM
69 points
33 comments · 11 min read · LW link

Consider Joining the UK Foundation Model Taskforce

Zvi · Jul 10, 2023, 1:50 PM
105 points
12 comments · 1 min read · LW link
(thezvi.wordpress.com)

[Research log] The board of Alphabet would stop DeepMind to save the world

Lucie Philippon · Jul 16, 2024, 4:59 AM
6 points
0 comments · 4 min read · LW link

Reflections on the state of the race to superintelligence, February 2025

Mitchell_Porter · Feb 23, 2025, 1:58 PM
21 points
7 comments · 4 min read · LW link

Determining the power of investors over Frontier AI Labs is strategically important to reduce x-risk

Lucie Philippon · Jul 25, 2024, 1:12 AM
18 points
7 comments · 2 min read · LW link

Re: Anthropic’s suggested SB-1047 amendments

RobertM · Jul 27, 2024, 10:32 PM
87 points
13 comments · 9 min read · LW link
(www.documentcloud.org)

Twitter thread on politics of AI safety

Richard_Ngo · Jul 31, 2024, 12:00 AM
35 points
2 comments · 1 min read · LW link
(x.com)

Refining MAIM: Identifying Changes Required to Meet Conditions for Deterrence

David Abecassis · Apr 11, 2025, 12:49 AM
17 points
0 comments · 11 min read · LW link
(intelligence.org)

🇫🇷 Announcing CeSIA: The French Center for AI Safety

Charbel-Raphaël · Dec 20, 2024, 2:17 PM
88 points
2 comments · 8 min read · LW link

GPT-4o System Card

Zach Stein-Perlman · Aug 8, 2024, 8:30 PM
68 points
11 comments · 2 min read · LW link
(openai.com)

Californians, tell your reps to vote yes on SB 1047!

Holly_Elmore · Aug 12, 2024, 7:50 PM
40 points
24 comments · 1 min read · LW link

[Question] What prevents SB-1047 from triggering on deep fake porn/voice cloning fraud?

ChristianKl · Sep 26, 2024, 9:17 AM
27 points
21 comments · 1 min read · LW link

A Narrow Path: a plan to deal with AI extinction risk

Oct 7, 2024, 1:02 PM
73 points
12 comments · 2 min read · LW link
(www.narrowpath.co)

A path to human autonomy

Nathan Helm-Burger · Oct 29, 2024, 3:02 AM
53 points
16 comments · 20 min read · LW link

Linkpost: Memorandum on Advancing the United States’ Leadership in Artificial Intelligence

Nisan · Oct 25, 2024, 4:37 AM
60 points
2 comments · 1 min read · LW link
(www.whitehouse.gov)

Lab governance reading list

Zach Stein-Perlman · Oct 25, 2024, 6:00 PM
20 points
3 comments · 1 min read · LW link

Finishing The SB-1047 Documentary In 6 Weeks

Michaël Trazzi · Oct 28, 2024, 8:17 PM
94 points
7 comments · 4 min read · LW link
(manifund.org)

UK AISI: Early lessons from evaluating frontier AI systems

Zach Stein-Perlman · Oct 25, 2024, 7:00 PM
26 points
0 comments · 2 min read · LW link
(www.aisi.gov.uk)

AI #88: Thanks for the Memos

Zvi · Oct 31, 2024, 3:00 PM
46 points
5 comments · 77 min read · LW link
(thezvi.wordpress.com)

Making a conservative case for alignment

Nov 15, 2024, 6:55 PM
208 points
67 comments · 7 min read · LW link

AXRP Episode 38.1 - Alan Chan on Agent Infrastructure

DanielFilan · Nov 16, 2024, 11:30 PM
12 points
0 comments · 14 min read · LW link

Should there be just one western AGI project?

Dec 3, 2024, 10:11 AM
78 points
72 comments · 15 min read · LW link
(www.forethought.org)

Analysis of Global AI Governance Strategies

Dec 4, 2024, 10:45 AM
49 points
10 comments · 36 min read · LW link

The Jackpot Jinx (or why “Superintelligence Strategy” is wrong)

E.G. Blee-Goldman · Mar 10, 2025, 7:18 PM
13 points
0 comments · 5 min read · LW link

Rolling Thresholds for AGI Scaling Regulation

Larks · Jan 12, 2025, 1:30 AM
40 points
6 comments · 1 min read · LW link

Some cruxes on impactful alternatives to AI policy work

Richard_Ngo · Oct 10, 2018, 1:35 PM
165 points
13 comments · 12 min read · LW link

AI pause/governance advocacy might be net-negative, especially without a focus on explaining x-risk

Mikhail Samin · Aug 27, 2023, 11:05 PM
72 points
9 comments · 6 min read · LW link

Intelsat as a Model for International AGI Governance

Mar 13, 2025, 12:58 PM
45 points
0 comments · 1 min read · LW link
(www.forethought.org)

Review of Soft Takeoff Can Still Lead to DSA

Daniel Kokotajlo · Jan 10, 2021, 6:10 PM
85 points
16 comments · 6 min read · LW link

Whether governments will control AGI is important and neglected

Seth Herd · Mar 14, 2025, 9:48 AM
24 points
2 comments · 9 min read · LW link

AI companies are unlikely to make high-assurance safety cases if timelines are short

ryan_greenblatt · Jan 23, 2025, 6:41 PM
145 points
5 comments · 13 min read · LW link

Dario Amodei leaves OpenAI

Daniel Kokotajlo · Dec 29, 2020, 7:31 PM
69 points
13 comments · 1 min read · LW link

Retroactive If-Then Commitments

MichaelDickens · Feb 1, 2025, 10:22 PM
6 points
0 comments · 1 min read · LW link

On the Rationality of Deterring ASI

Dan H · Mar 5, 2025, 4:11 PM
166 points
34 comments · 4 min read · LW link
(nationalsecurity.ai)

More on Various AI Action Plans

Zvi · Mar 24, 2025, 1:10 PM
32 points
0 comments · 11 min read · LW link
(thezvi.wordpress.com)

On the Meta and DeepMind Safety Frameworks

Zvi · Feb 7, 2025, 1:10 PM
45 points
1 comment · 17 min read · LW link
(thezvi.wordpress.com)

New Bill AB 501 to Prevent OpenAI’s Non-profit Conversion

Peter Windberger · Mar 25, 2025, 12:41 AM
18 points
1 comment · 1 min read · LW link

Convergence 2024 Impact Review

David_Kristoffersson · Mar 24, 2025, 8:28 PM
13 points
0 comments · 1 min read · LW link

The Paris AI Anti-Safety Summit

Zvi · Feb 12, 2025, 2:00 PM
129 points
21 comments · 21 min read · LW link
(thezvi.wordpress.com)

The National Defense Authorization Act Contains AI Provisions

ryan_b · Jan 5, 2021, 3:51 PM
30 points
24 comments · 1 min read · LW link

Governing High-Impact AI Systems: Understanding Canada’s Proposed AI Bill. April 15, Carleton University, Ottawa

Liav Koren · Mar 28, 2023, 5:48 PM
11 points
1 comment · 1 min read · LW link
(forum.effectivealtruism.org)

How is AI governed and regulated, around the world?

Mitchell_Porter · Mar 30, 2023, 3:36 PM
15 points
6 comments · 2 min read · LW link

ChatGPT banned in Italy over privacy concerns

Ollie J · Mar 31, 2023, 5:33 PM
18 points
4 comments · 1 min read · LW link
(www.bbc.co.uk)

[Question] What Are Your Preferences Regarding The FLI Letter?

JenniferRM · Apr 1, 2023, 4:52 AM
−4 points
122 comments · 16 min read · LW link

Policy discussions follow strong contextualizing norms

Richard_Ngo · Apr 1, 2023, 11:51 PM
230 points
61 comments · 3 min read · LW link

AI Summer Harvest

Cleo Nardo · Apr 4, 2023, 3:35 AM
130 points
10 comments · 1 min read · LW link

Excessive AI growth-rate yields little socio-economic benefit.

Cleo Nardo · Apr 4, 2023, 7:13 PM
27 points
22 comments · 4 min read · LW link

I asked my senator to slow AI

Omid · Apr 6, 2023, 6:18 PM
21 points
5 comments · 2 min read · LW link

An ‘AGI Emergency Eject Criteria’ consensus could be really useful.

tcelferact · Apr 7, 2023, 4:21 PM
5 points
0 comments · 1 min read · LW link

All images from the WaitButWhy sequence on AI

trevor · Apr 8, 2023, 7:36 AM
73 points
5 comments · 2 min read · LW link

Current UK government levers on AI development

rosehadshar · Apr 10, 2023, 1:16 PM
16 points
0 comments · 1 min read · LW link

Request to AGI organizations: Share your views on pausing AI progress

Apr 11, 2023, 5:30 PM
141 points
11 comments · 1 min read · LW link

FLI And Eliezer Should Reach Consensus

JenniferRM · Apr 11, 2023, 4:07 AM
21 points
6 comments · 23 min read · LW link

Cyberspace Administration of China: Draft of “Regulation for Generative Artificial Intelligence Services” is open for comments

sanxiyn · Apr 11, 2023, 9:32 AM
7 points
2 comments · 1 min read · LW link
(archive.is)

NTIA—AI Accountability Announcement

samshap · Apr 11, 2023, 3:03 PM
7 points
0 comments · 1 min read · LW link
(www.ntia.doc.gov)

National Telecommunications and Information Administration: AI Accountability Policy Request for Comment

sanxiyn · Apr 11, 2023, 10:59 PM
9 points
0 comments · 1 min read · LW link
(ntia.gov)

Navigating the Open-Source AI Landscape: Data, Funding, and Safety

Apr 13, 2023, 3:29 PM
32 points
7 comments · 11 min read · LW link
(forum.effectivealtruism.org)

FLI report: Policymaking in the Pause

Zach Stein-Perlman · Apr 15, 2023, 5:01 PM
15 points
3 comments · 1 min read · LW link
(futureoflife.org)

Slowing AI: Foundations

Zach Stein-Perlman · Apr 17, 2023, 2:30 PM
45 points
11 comments · 17 min read · LW link

Responsible Deployment in 20XX

Carson · Apr 20, 2023, 12:24 AM
4 points
0 comments · 4 min read · LW link

OpenAI could help X-risk by wagering itself

VojtaKovarik · Apr 20, 2023, 2:51 PM
31 points
16 comments · 1 min read · LW link

My Assessment of the Chinese AI Safety Community

Lao Mein · Apr 25, 2023, 4:21 AM
250 points
94 comments · 3 min read · LW link

Notes on Potential Future AI Tax Policy

Zvi · Apr 25, 2023, 1:30 PM
33 points
6 comments · 9 min read · LW link
(thezvi.wordpress.com)

Reframing the burden of proof: Companies should prove that models are safe (rather than expecting auditors to prove that models are dangerous)

Orpheus16 · Apr 25, 2023, 6:49 PM
27 points
11 comments · 3 min read · LW link
(childrenoficarus.substack.com)

AI Safety is Dropping the Ball on Clown Attacks

trevor · Oct 22, 2023, 8:09 PM
74 points
82 comments · 34 min read · LW link

Anthropic, Google, Microsoft & OpenAI announce Executive Director of the Frontier Model Forum & over $10 million for a new AI Safety Fund

Zach Stein-Perlman · Oct 25, 2023, 3:20 PM
31 points
8 comments · 4 min read · LW link
(www.frontiermodelforum.org)

Thoughts on responsible scaling policies and regulation

paulfchristiano · Oct 24, 2023, 10:21 PM
221 points
33 comments · 6 min read · LW link

AI #35: Responsible Scaling Policies

Zvi · Oct 26, 2023, 1:30 PM
66 points
10 comments · 55 min read · LW link
(thezvi.wordpress.com)

5 Reasons Why Governments/Militaries Already Want AI for Information Warfare

trevor · Oct 30, 2023, 4:30 PM
32 points
0 comments · 10 min read · LW link

[Linkpost] Biden-Harris Executive Order on AI

beren · Oct 30, 2023, 3:20 PM
3 points
0 comments · 1 min read · LW link

Urging an International AI Treaty: An Open Letter

Olli Järviniemi · Oct 31, 2023, 11:26 AM
48 points
2 comments · 1 min read · LW link
(aitreaty.org)

On the Executive Order

Zvi · Nov 1, 2023, 2:20 PM
100 points
4 comments · 30 min read · LW link
(thezvi.wordpress.com)

[Question] Snapshot of narratives and frames against regulating AI

Jan_Kulveit · Nov 1, 2023, 4:30 PM
36 points
19 comments · 3 min read · LW link

Dario Amodei’s prepared remarks from the UK AI Safety Summit, on Anthropic’s Responsible Scaling Policy

Zac Hatfield-Dodds · Nov 1, 2023, 6:10 PM
85 points
1 comment · 4 min read · LW link
(www.anthropic.com)

We are already in a persuasion-transformed world and must take precautions

trevor · Nov 4, 2023, 3:53 PM
37 points
14 comments · 6 min read · LW link

The 6D effect: When companies take risks, one email can be very powerful.

scasper · Nov 4, 2023, 8:08 PM
278 points
42 comments · 3 min read · LW link

On the UK Summit

Zvi · Nov 7, 2023, 1:10 PM
74 points
6 comments · 30 min read · LW link
(thezvi.wordpress.com)

Survey on the acceleration risks of our new RFPs to study LLM capabilities

Ajeya Cotra · Nov 10, 2023, 11:59 PM
27 points
1 comment · 1 min read · LW link

AXRP Episode 26 - AI Governance with Elizabeth Seger

DanielFilan · Nov 26, 2023, 11:00 PM
14 points
0 comments · 66 min read · LW link

Safety standards: a framework for AI regulation

joshc · May 1, 2023, 12:56 AM
19 points
0 comments · 8 min read · LW link

Stopping dangerous AI: Ideal lab behavior

Zach Stein-Perlman · May 9, 2023, 9:00 PM
8 points
0 comments · 2 min read · LW link

Stopping dangerous AI: Ideal US behavior

Zach Stein-Perlman · May 9, 2023, 9:00 PM
17 points
0 comments · 3 min read · LW link

GovAI: Towards best practices in AGI safety and governance: A survey of expert opinion

Zach Stein-Perlman · May 15, 2023, 1:42 AM
28 points
11 comments · 1 min read · LW link
(arxiv.org)

Eisenhower’s Atoms for Peace Speech

Orpheus16 · May 17, 2023, 4:10 PM
18 points
3 comments · 11 min read · LW link
(www.iaea.org)

[Linkpost] “Governance of superintelligence” by OpenAI

Daniel_Eth · May 22, 2023, 8:15 PM
67 points
20 comments · 1 min read · LW link

AI #12: The Quest for Sane Regulations

Zvi · May 18, 2023, 1:20 PM
77 points
12 comments · 64 min read · LW link
(thezvi.wordpress.com)

Statement on AI Extinction—Signed by AGI Labs, Top Academics, and Many Other Notable Figures

Dan H · May 30, 2023, 9:05 AM
382 points
78 comments · 1 min read · LW link · 1 review
(www.safe.ai)

[Question] Who is liable for AI?

jmh · May 30, 2023, 1:54 PM
14 points
4 comments · 1 min read · LW link

The case for removing alignment and ML research from the training dataset

beren · May 30, 2023, 8:54 PM
48 points
8 comments · 5 min read · LW link

Upcoming AI regulations are likely to make for an unsafer world

Shmi · Jun 3, 2023, 1:07 AM
18 points
14 comments · 1 min read · LW link

The AGI Race Between the US and China Doesn’t Exist.

Eva_B · Jun 3, 2023, 12:22 AM
33 points
15 comments · 7 min read · LW link
(evabehrens.substack.com)

Rishi to outline his vision for Britain to take the world lead in policing AI threats when he meets Joe Biden

Mati_Roy · Jun 6, 2023, 4:47 AM
25 points
1 comment · 1 min read · LW link
(www.dailymail.co.uk)

A summary of current work in AI governance

constructive · Jun 17, 2023, 6:41 PM
44 points
1 comment · 11 min read · LW link
(forum.effectivealtruism.org)

Democratic AI Constitution: Round-Robin Debate and Synthesis

scottviteri · Jun 24, 2023, 7:31 PM
10 points
4 comments · 5 min read · LW link
(scottviteri.com)

“Safety Culture for AI” is important, but isn’t going to be easy

Davidmanheim · Jun 26, 2023, 12:52 PM
47 points
2 comments · 2 min read · LW link
(forum.effectivealtruism.org)

Little attention seems to be on discouraging hardware progress

RussellThor · Jun 30, 2023, 10:14 AM
5 points
3 comments · 1 min read · LW link

Foom Liability

PeterMcCluskey · Jun 30, 2023, 3:55 AM
22 points
10 comments · 6 min read · LW link
(bayesianinvestor.com)

Apparently, of the 195 Million the DoD allocated in University Research Funding Awards in 2022, more than half of them concerned AI or compute hardware research

mako yass · Jul 7, 2023, 1:20 AM
41 points
5 comments · 2 min read · LW link
(www.defense.gov)

My favorite AI governance research this year so far

Zach Stein-Perlman · Jul 23, 2023, 4:30 PM
26 points
1 comment · 7 min read · LW link
(blog.aiimpacts.org)

Podcast (+transcript): Nathan Barnard on how US financial regulation can inform AI governance

Aaron Bergman · Aug 8, 2023, 9:46 PM
8 points
0 comments · 1 min read · LW link
(www.aaronbergman.net)

One example of how LLM propaganda attacks can hack the brain

trevor · Aug 16, 2023, 9:41 PM
24 points
8 comments · 4 min read · LW link

Assessment of intelligence agency functionality is difficult yet important

trevor · Aug 24, 2023, 1:42 AM
48 points
5 comments · 9 min read · LW link

Information warfare historically revolved around human conduits

trevor · Aug 28, 2023, 6:54 PM
37 points
7 comments · 3 min read · LW link

Report on Frontier Model Training

YafahEdelman · Aug 30, 2023, 8:02 PM
122 points
21 comments · 21 min read · LW link
(docs.google.com)

What I Would Do If I Were Working On AI Governance

johnswentworth · Dec 8, 2023, 6:43 AM
110 points
32 comments · 10 min read · LW link

ARC Evals: Responsible Scaling Policies

Zach Stein-Perlman · Sep 28, 2023, 4:30 AM
40 points
10 comments · 2 min read · LW link · 1 review
(evals.alignment.org)

Anthropic’s Responsible Scaling Policy & Long-Term Benefit Trust

Zac Hatfield-Dodds · Sep 19, 2023, 3:09 PM
83 points
26 comments · 3 min read · LW link · 1 review
(www.anthropic.com)

Google’s Ethical AI team and AI Safety

magfrump · Feb 20, 2021, 9:42 AM
12 points
16 comments · 7 min read · LW link

Ngo and Yudkowsky on AI capability gains

Nov 18, 2021, 10:19 PM
130 points
61 comments · 39 min read · LW link · 1 review

Comments on Allan Dafoe on AI Governance

Alex Flint · Nov 29, 2021, 4:16 PM
13 points
0 comments · 7 min read · LW link

The case for Doing Something Else (if Alignment is doomed)

Rafael Harth · Apr 5, 2022, 5:52 PM
94 points
14 comments · 2 min read · LW link

Strategic Considerations Regarding Autistic/Literal AI

Chris_Leong · Apr 6, 2022, 2:57 PM
−1 points
2 comments · 2 min read · LW link

Why I Am Skeptical of AI Regulation as an X-Risk Mitigation Strategy

A Ray · Aug 6, 2022, 5:46 AM
31 points
14 comments · 2 min read · LW link

Jack Clark on the realities of AI policy

Kaj_Sotala · Aug 7, 2022, 8:44 AM
68 points
3 comments · 3 min read · LW link
(threadreaderapp.com)

[Question] What if we solve AI Safety but no one cares

142857 · Aug 22, 2022, 5:38 AM
18 points
5 comments · 1 min read · LW link

Replacement for PONR concept

Daniel Kokotajlo · Sep 2, 2022, 12:09 AM
58 points
6 comments · 2 min read · LW link

Shahar Avin On How To Regulate Advanced AI Systems

Michaël Trazzi · Sep 23, 2022, 3:46 PM
31 points
0 comments · 4 min read · LW link
(theinsideview.ai)

Under what circumstances have governments cancelled AI-type systems?

David Gross · Sep 23, 2022, 9:11 PM
7 points
1 comment · 1 min read · LW link
(www.carnegieuktrust.org.uk)

Analysis: US restricts GPU sales to China

aog · Oct 7, 2022, 6:38 PM
102 points
58 comments · 5 min read · LW link

[Question] Should we push for requiring AI training data to be licensed?

ChristianKl · Oct 19, 2022, 5:49 PM
37 points
32 comments · 1 min read · LW link

Learning societal values from law as part of an AGI alignment strategy

John Nay · Oct 21, 2022, 2:03 AM
5 points
18 comments · 54 min read · LW link

What does it take to defend the world against out-of-control AGIs?

Steven Byrnes · Oct 25, 2022, 2:47 PM
208 points
49 comments · 30 min read · LW link · 1 review

Massive Scaling Should be Frowned Upon

harsimony · Nov 17, 2022, 8:43 AM
4 points
6 comments · 5 min read · LW link

[Question] How promising are legal avenues to restrict AI training data?

thehalliard · Dec 10, 2022, 4:31 PM
9 points
2 comments · 1 min read · LW link

Practical AI risk I: Watching large compute

Gustavo Ramires · Dec 24, 2022, 1:25 PM
3 points
0 comments · 1 min read · LW link

List #2: Why coordinating to align as humans to not develop AGI is a lot easier than, well… coordinating as humans with AGI coordinating to be aligned with humans

Remmelt · Dec 24, 2022, 9:53 AM
1 point
0 comments · 3 min read · LW link

My thoughts on OpenAI’s alignment plan

Orpheus16 · Dec 30, 2022, 7:33 PM
55 points
3 comments · 20 min read · LW link

Wentworth and Larsen on buying time

Jan 9, 2023, 9:31 PM
74 points
6 comments · 12 min read · LW link

Thoughts on hardware / compute requirements for AGI

Steven Byrnes · Jan 24, 2023, 2:03 PM
62 points
31 comments · 24 min read · LW link

[Question] AI safety milestones?

Zach Stein-Perlman · Jan 23, 2023, 9:00 PM
7 points
5 comments · 1 min read · LW link

AI Risk Management Framework | NIST

DragonGod · Jan 26, 2023, 3:27 PM
36 points
4 comments · 2 min read · LW link
(www.nist.gov)

What is the ground reality of countries taking steps to recalibrate AI development towards Alignment first?

Nebuch · Jan 29, 2023, 1:26 PM
8 points
6 comments · 3 min read · LW link

Product safety is a poor model for AI governance

Richard Korzekwa · Feb 1, 2023, 10:40 PM
36 points
0 comments · 5 min read · LW link
(aiimpacts.org)

Many AI governance proposals have a tradeoff between usefulness and feasibility

Feb 3, 2023, 6:49 PM
22 points
2 comments · 2 min read · LW link

4 ways to think about democratizing AI [GovAI Linkpost]

Orpheus16 · Feb 13, 2023, 6:06 PM
24 points
4 comments · 1 min read · LW link
(www.governance.ai)

How should AI systems behave, and who should decide? [OpenAI blog]

ShardPhoenix · Feb 17, 2023, 1:05 AM
22 points
2 comments · 1 min read · LW link
(openai.com)

Cyborg Periods: There will be multiple AI transitions

Feb 22, 2023, 4:09 PM
108 points
9 comments · 6 min read · LW link

AI Governance & Strategy: Priorities, talent gaps, & opportunities

Orpheus16 · Mar 3, 2023, 6:09 PM
56 points
2 comments · 4 min read · LW link

[Linkpost] Scott Alexander reacts to OpenAI’s latest post

Orpheus16 · Mar 11, 2023, 10:24 PM
27 points
0 comments · 5 min read · LW link
(astralcodexten.substack.com)

The AI Adoption Gap: Preparing the US Government for Advanced AI

Lizka · Apr 2, 2025, 11:46 PM
14 points
2 comments · 17 min read · LW link
(www.forethought.org)

The Wizard of Oz Problem: How incentives and narratives can skew our perception of AI developments

Orpheus16 · Mar 20, 2023, 8:44 PM
16 points
3 comments · 6 min read · LW link

AI Tracker: monitoring current and near-future risks from superscale models

Nov 23, 2021, 7:16 PM
67 points
13 comments · 3 min read · LW link
(aitracker.org)

AI Alignment Meme Viruses

RationalDino · Jan 15, 2025, 3:55 PM
4 points
0 comments · 2 min read · LW link

2024 Summer AI Safety Intro Fellowship and Socials in Boston

KevinWei · May 29, 2024, 6:27 PM
8 points
0 comments · 1 min read · LW link

What Failure Looks Like is not an existential risk (and alignment is not the solution)

otto.barten · Feb 2, 2024, 6:59 PM
13 points
12 comments · 9 min read · LW link

2019 AI Alignment Literature Review and Charity Comparison

Larks · Dec 19, 2019, 3:00 AM
130 points
18 comments · 62 min read · LW link

“Long” timelines to advanced AI have gotten crazy short

Matrice Jacobine · Apr 3, 2025, 10:46 PM
21 points
0 comments · 1 min read · LW link
(helentoner.substack.com)

Should AI systems have to identify themselves?

Darren McKee · Dec 31, 2022, 2:57 AM
2 points
2 comments · 1 min read · LW link

Overview of introductory resources in AI Governance

Lucie Philippon · May 27, 2024, 4:21 PM
19 points
0 comments · 6 min read · LW link

OpenAI Credit Account (2510$)

Emirhan BULUT · Jan 21, 2024, 2:32 AM
1 point
0 comments · 1 min read · LW link

Ngo’s view on alignment difficulty

Dec 14, 2021, 9:34 PM
63 points
7 comments · 17 min read · LW link

Maybe Anthropic’s Long-Term Benefit Trust is powerless

Zach Stein-Perlman · May 27, 2024, 1:00 PM
201 points
21 comments · 2 min read · LW link

AI Governance Fundamentals—Curriculum and Application

Mau · Nov 30, 2021, 2:19 AM
17 points
0 comments · 1 min read · LW link

HIRING: In­form and shape a new pro­ject on AI safety at Part­ner­ship on AI

madhu_likaDec 7, 2021, 7:37 PM
1 point
0 comments1 min readLW link

De­mand­ing and De­sign­ing Aligned Cog­ni­tive Architectures

Koen.HoltmanDec 21, 2021, 5:32 PM
8 points
5 comments5 min readLW link

An­nounc­ing Con­ver­gence Anal­y­sis: An In­sti­tute for AI Sce­nario & Gover­nance Research

Mar 7, 2024, 9:37 PM
23 points
1 comment4 min readLW link

AI: How We Got Here—A Neu­ro­science Perspective

Mordechai RorvigJan 19, 2025, 11:51 PM
5 points
0 comments2 min readLW link
(www.kickstarter.com)

Democratizing AI Governance: Balancing Expertise and Public Participation

Lucile Ter-Minassian · Jan 21, 2025, 6:29 PM
1 point
0 comments · 15 min read · LW link

Will AI Resilience protect Developing Nations?

ejk64 · Jan 21, 2025, 3:31 PM
4 points
0 comments · 8 min read · LW link

On DeepSeek’s r1

Zvi · Jan 22, 2025, 7:50 PM
55 points
2 comments · 35 min read · LW link
(thezvi.wordpress.com)

The Human Alignment Problem for AIs

rife · Jan 22, 2025, 4:06 AM
10 points
5 comments · 3 min read · LW link

What is an alignment tax?

Mar 20, 2025, 1:06 PM
5 points
0 comments · 1 min read · LW link
(aisafety.info)

Introducing the Coalition for a Baruch Plan for AI: A Call for a Radical Treaty-Making process for the Global Governance of AI

rguerreschi · Jan 30, 2025, 3:26 PM
11 points
0 comments · 2 min read · LW link

Thoughts about Policy Ecosystems: The Missing Links in AI Governance

Echo Huang · Feb 1, 2025, 1:54 AM
1 point
0 comments · 5 min read · LW link

Question 4: Implementing the control proposals

Cameron Berg · Feb 13, 2022, 5:12 PM
6 points
2 comments · 5 min read · LW link

A Pluralistic Framework for Rogue AI Containment

TheThinkingArborist · Mar 22, 2025, 12:54 PM
1 point
0 comments · 7 min read · LW link

How harmful are improvements in AI? + Poll

Feb 15, 2022, 6:16 PM
15 points
4 comments · 8 min read · LW link

EU policymakers reach an agreement on the AI Act

tlevin · Dec 15, 2023, 6:02 AM
78 points
7 comments · 7 min read · LW link

Apply to the Cambridge ERA:AI Fellowship 2025

Harrison G · Mar 25, 2025, 1:50 PM
16 points
0 comments · 3 min read · LW link

From No Mind to a Mind – A Conversation That Changed an AI

parthibanarjuna s · Feb 7, 2025, 11:50 AM
1 point
0 comments · 3 min read · LW link

AI security might be helpful for AI alignment

Igor Ivanov · Jan 6, 2023, 8:16 PM
36 points
1 comment · 2 min read · LW link

Request for Information for a new US AI Action Plan (OSTP RFI)

agucova · Feb 7, 2025, 8:40 PM
5 points
0 comments · 1 min read · LW link
(www.federalregister.gov)

Altman blog on post-AGI world

Julian Bradshaw · Feb 9, 2025, 9:52 PM
29 points
10 comments · 1 min read · LW link
(blog.samaltman.com)

Rethinking AI Safety Approach in the Era of Open-Source AI

Weibing Wang · Feb 11, 2025, 2:01 PM
3 points
0 comments · 6 min read · LW link

Rational Effective Utopia & Narrow Way There: Multiversal AI Alignment, Place AI, New Ethicophysics… (Updated)

ank · Feb 11, 2025, 3:21 AM
13 points
8 comments · 35 min read · LW link

Where Would Good Forecasts Most Help AI Governance Efforts?

Violet Hour · Feb 11, 2025, 6:15 PM
11 points
1 comment · 6 min read · LW link

Artificial Static Place Intelligence: Guaranteed Alignment

ank · Feb 15, 2025, 11:08 AM
2 points
2 comments · 2 min read · LW link

Exploring the Precautionary Principle in AI Development: Historical Analogies and Lessons Learned

Christopher King · Mar 21, 2023, 3:53 AM
−1 points
2 comments · 9 min read · LW link

CAIS-inspired approach towards safer and more interpretable AGIs

Peter Hroššo · Mar 27, 2023, 2:36 PM
13 points
7 comments · 1 min read · LW link

AI governance student hackathon on Saturday, April 23: register now!

mic · Apr 12, 2022, 4:48 AM
14 points
0 comments · 1 min read · LW link

Law-Following AI 1: Sequence Introduction and Structure

Cullen · Apr 27, 2022, 5:26 PM
18 points
10 comments · 9 min read · LW link

Want to win the AGI race? Solve alignment.

leopold · Mar 29, 2023, 5:40 PM
21 points
3 comments · 5 min read · LW link
(www.forourposterity.com)

Law-Following AI 2: Intent Alignment + Superintelligence → Lawless AI (By Default)

Cullen · Apr 27, 2022, 5:27 PM
5 points
2 comments · 6 min read · LW link

The 0.2 OOMs/year target

Cleo Nardo · Mar 30, 2023, 6:15 PM
84 points
24 comments · 5 min read · LW link

Widening Overton Window—Open Thread

Prometheus · Mar 31, 2023, 10:03 AM
23 points
8 comments · 1 min read · LW link

Law-Following AI 3: Lawless AI Agents Undermine Stabilizing Agreements

Cullen · Apr 27, 2022, 5:30 PM
2 points
2 comments · 3 min read · LW link

Pausing AI Developments Isn’t Enough. We Need to Shut it All Down by Eliezer Yudkowsky

jacquesthibs · Mar 29, 2023, 11:16 PM
291 points
297 comments · 3 min read · LW link
(time.com)

AI Alternative Futures: Scenario Mapping Artificial Intelligence Risk—Request for Participation (*Closed*)

Kakili · Apr 27, 2022, 10:07 PM
10 points
2 comments · 8 min read · LW link

AI community building: EliezerKart

Christopher King · Apr 1, 2023, 3:25 PM
45 points
0 comments · 2 min read · LW link

Pessimism about AI Safety

Apr 2, 2023, 7:43 AM
4 points
1 comment · 25 min read · LW link

Quick Thoughts on A.I. Governance

Nicholas / Heather Kross · Apr 30, 2022, 2:49 PM
70 points
8 comments · 2 min read · LW link
(www.thinkingmuchbetter.com)

The AI governance gaps in developing countries

ntran · Jun 17, 2023, 2:50 AM
20 points
1 comment · 14 min read · LW link

AI safety should be made more accessible using non text-based media

Massimog · May 10, 2022, 3:14 AM
2 points
4 comments · 4 min read · LW link

DeepMind’s generalist AI, Gato: A non-technical explainer

May 16, 2022, 9:21 PM
63 points
6 comments · 6 min read · LW link

Yoshua Bengio: “Slowing down development of AI systems passing the Turing test”

Roman Leventov · Apr 6, 2023, 3:31 AM
49 points
2 comments · 5 min read · LW link
(yoshuabengio.org)

Corporate Governance for Frontier AI Labs: A Research Agenda

Matthew Wearden · Feb 28, 2024, 11:29 AM
4 points
0 comments · 16 min read · LW link
(matthewwearden.co.uk)

A bridge to Dath Ilan? Improved governance on the critical path to AI alignment.

Jackson Wagner · May 18, 2022, 3:51 PM
24 points
0 comments · 12 min read · LW link

Reshaping the AI Industry

Thane Ruthenis · May 29, 2022, 10:54 PM
147 points
35 comments · 21 min read · LW link

Six Dimensions of Operational Adequacy in AGI Projects

Eliezer Yudkowsky · May 30, 2022, 5:00 PM
310 points
66 comments · 13 min read · LW link · 1 review

[Question] Could Patent-Trolling delay AI timelines?

Pablo Repetto · Jun 10, 2022, 2:53 AM
1 point
3 comments · 1 min read · LW link

FYI: I’m working on a book about the threat of AGI/ASI for a general audience. I hope it will be of value to the cause and the community

Darren McKee · Jun 15, 2022, 6:08 PM
43 points
15 comments · 2 min read · LW link

Protectionism will Slow the Deployment of AI

Ben Goldhaber · Jan 7, 2023, 8:57 PM
30 points
6 comments · 2 min read · LW link

Open-source LLMs may prove Bostrom’s vulnerable world hypothesis

Roope Ahvenharju · Apr 15, 2023, 7:16 PM
1 point
1 comment · 1 min read · LW link

What success looks like

Jun 28, 2022, 2:38 PM
19 points
4 comments · 1 min read · LW link
(forum.effectivealtruism.org)

Political Biases in LLMs: Literature Review & Current Uses of AI in Elections

Mar 7, 2024, 7:17 PM
6 points
0 comments · 6 min read · LW link

[Link/crosspost] [US] NTIA: AI Accountability Policy Request for Comment

Kyle J. Lucchese · Apr 16, 2023, 6:57 AM
8 points
0 comments · 1 min read · LW link
(forum.effectivealtruism.org)

New US Senate Bill on X-Risk Mitigation [Linkpost]

Evan R. Murphy · Jul 4, 2022, 1:25 AM
35 points
12 comments · 1 min read · LW link
(www.hsgac.senate.gov)

Financial Times: We must slow down the race to God-like AI

trevor · Apr 13, 2023, 7:55 PM
113 points
17 comments · 16 min read · LW link
(www.ft.com)

Please help us communicate AI xrisk. It could save the world.

otto.barten · Jul 4, 2022, 9:47 PM
4 points
7 comments · 2 min read · LW link

2024 State of the AI Regulatory Landscape

May 28, 2024, 11:59 AM
30 points
0 comments · 2 min read · LW link
(www.convergenceanalysis.org)

Scientism vs. people

Roman Leventov · Apr 18, 2023, 5:28 PM
4 points
4 comments · 11 min read · LW link

[Crosspost] Organizing a debate with experts and MPs to raise AI xrisk awareness: a possible blueprint

otto.barten · Apr 19, 2023, 11:45 AM
8 points
0 comments · 4 min read · LW link
(forum.effectivealtruism.org)

Slowing down AI progress is an underexplored alignment strategy

Norman Borlaug · Jul 24, 2023, 4:56 PM
42 points
27 comments · 5 min read · LW link

Pausing AI Developments Isn’t Enough. We Need to Shut it All Down

Eliezer Yudkowsky · Apr 8, 2023, 12:36 AM
268 points
44 comments · 12 min read · LW link · 1 review

Briefly how I’ve updated since ChatGPT

rime · Apr 25, 2023, 2:47 PM
48 points
2 comments · 2 min read · LW link

A Critique of AI Alignment Pessimism

ExCeph · Jul 19, 2022, 2:28 AM
9 points
1 comment · 9 min read · LW link

Law-Following AI 4: Don’t Rely on Vicarious Liability

Cullen · Aug 2, 2022, 11:26 PM
5 points
2 comments · 3 min read · LW link

Three pillars for avoiding AGI catastrophe: Technical alignment, deployment decisions, and coordination

LintzA · Aug 3, 2022, 11:15 PM
24 points
0 comments · 11 min read · LW link

Announcing #AISummitTalks featuring Professor Stuart Russell and many others

otto.barten · Oct 24, 2023, 10:11 AM
17 points
1 comment · 1 min read · LW link

The current AI strategic landscape: one bear’s perspective

Matrice Jacobine · Feb 15, 2025, 9:49 AM
11 points
0 comments · 1 min read · LW link
(philosophybear.substack.com)

Cap Model Size for AI Safety

research_prime_space · Mar 6, 2023, 1:11 AM
0 points
4 comments · 1 min read · LW link

Alignment is not enough

Alan Chan · Jan 12, 2023, 12:33 AM
12 points
6 comments · 11 min read · LW link
(coordination.substack.com)

Responsible Scaling Policies Are Risk Management Done Wrong

simeon_c · Oct 25, 2023, 11:46 PM
123 points
35 comments · 22 min read · LW link · 1 review
(www.navigatingrisks.ai)

Matt Yglesias on AI Policy

Grant Demaree · Aug 17, 2022, 11:57 PM
25 points
1 comment · 1 min read · LW link
(www.slowboring.com)

Linkpost: Rishi Sunak’s Speech on AI (26th October)

bideup · Oct 27, 2023, 11:57 AM
85 points
8 comments · 7 min read · LW link
(www.gov.uk)

Disagreements over the prioritization of existential risk from AI

Olivier Coutu · Oct 26, 2023, 5:54 PM
10 points
0 comments · 6 min read · LW link

[Linkpost] Two major announcements in AI governance today

Angélina · Oct 30, 2023, 5:28 PM
1 point
1 comment · 1 min read · LW link
(www.whitehouse.gov)

Response to “Coordinated pausing: An evaluation-based coordination scheme for frontier AI developers”

Matthew Wearden · Oct 30, 2023, 5:27 PM
5 points
2 comments · 6 min read · LW link
(matthewwearden.co.uk)

[Question] Should AI writers be prohibited in education?

Eleni Angelou · Jan 17, 2023, 12:42 AM
6 points
2 comments · 1 min read · LW link

Compute Governance: The Role of Commodity Hardware

Jan · Mar 26, 2022, 10:08 AM
14 points
7 comments · 7 min read · LW link
(universalprior.substack.com)

[Question] What is the minimum amount of time travel and resources needed to secure the future?

Perhaps · Jan 14, 2024, 10:01 PM
−3 points
5 comments · 1 min read · LW link

Thoughts on the AI Safety Summit company policy requests and responses

So8res · Oct 31, 2023, 11:54 PM
169 points
14 comments · 10 min read · LW link

Why don’t governments seem to mind that companies are explicitly trying to make AGIs?

ozziegooen · Dec 26, 2021, 1:58 AM
34 points
3 comments · 2 min read · LW link
(forum.effectivealtruism.org)

AI Governance Needs Technical Work

Mau · Sep 5, 2022, 10:28 PM
41 points
1 comment · 8 min read · LW link

What Should AI Owe To Us? Accountable and Aligned AI Systems via Contractualist AI Alignment

xuan · Sep 8, 2022, 3:04 PM
26 points
16 comments · 25 min read · LW link

How should DeepMind’s Chinchilla revise our AI forecasts?

Cleo Nardo · Sep 15, 2022, 5:54 PM
35 points
12 comments · 13 min read · LW link

Leveraging Legal Informatics to Align AI

John Nay · Sep 18, 2022, 8:39 PM
11 points
0 comments · 3 min read · LW link
(forum.effectivealtruism.org)

AI as Super-Demagogue

RationalDino · Nov 5, 2023, 9:21 PM
11 points
12 comments · 9 min read · LW link

Automated Sandwiching & Quantifying Human-LLM Cooperation: ScaleOversight hackathon results

Feb 23, 2023, 10:48 AM
8 points
0 comments · 6 min read · LW link

Scalable And Transferable Black-Box Jailbreaks For Language Models Via Persona Modulation

Nov 7, 2023, 5:59 PM
38 points
2 comments · 2 min read · LW link
(arxiv.org)

Emotional attachment to AIs opens doors to problems

Igor Ivanov · Jan 22, 2023, 8:28 PM
20 points
10 comments · 4 min read · LW link

Update on the UK AI Summit and the UK’s Plans

Elliot Mckernon · Nov 10, 2023, 2:47 PM
11 points
0 comments · 8 min read · LW link

[Question] Any further work on AI Safety Success Stories?

Krieger · Oct 2, 2022, 9:53 AM
8 points
6 comments · 1 min read · LW link

Theories of Change for AI Auditing

Nov 13, 2023, 7:33 PM
54 points
0 comments · 18 min read · LW link
(www.apolloresearch.ai)

Palisade is hiring Research Engineers

Nov 11, 2023, 3:09 AM
23 points
0 comments · 3 min read · LW link

AI as a Cognitive Decoder: Rethinking Intelligence Evolution

Hu Xunyi · Feb 13, 2025, 3:51 PM
1 point
0 comments · 1 min read · LW link

List of projects that seem impactful for AI Governance

Jan 14, 2024, 4:53 PM
14 points
0 comments · 13 min read · LW link

On excluding dangerous information from training

ShayBenMoshe · Nov 17, 2023, 11:14 AM
23 points
5 comments · 3 min read · LW link

1. A Sense of Fairness: Deconfusing Ethics

RogerDearnaley · Nov 17, 2023, 8:55 PM
16 points
8 comments · 15 min read · LW link

2. AIs as Economic Agents

RogerDearnaley · Nov 23, 2023, 7:07 AM
9 points
2 comments · 6 min read · LW link

4. A Moral Case for Evolved-Sapience-Chauvinism

RogerDearnaley · Nov 24, 2023, 4:56 AM
10 points
0 comments · 4 min read · LW link

3. Uploading

RogerDearnaley · Nov 23, 2023, 7:39 AM
21 points
5 comments · 8 min read · LW link

AI Moral Alignment: The Most Important Goal of Our Generation

Ronen Bar · Mar 27, 2025, 6:04 PM
2 points
0 comments · 8 min read · LW link
(forum.effectivealtruism.org)

[Linkpost] “Blueprint for an AI Bill of Rights”—Office of Science and Technology Policy, USA (2022)

T431 · Oct 5, 2022, 4:42 PM
9 points
4 comments · 2 min read · LW link
(www.whitehouse.gov)

A call for a quantitative report card for AI bioterrorism threat models

Juno · Dec 4, 2023, 6:35 AM
12 points
0 comments · 10 min read · LW link

In defence of Helen Toner, Adam D’Angelo, and Tasha McCauley (OpenAI post)

mrtreasure · Dec 5, 2023, 6:40 PM
6 points
2 comments · 1 min read · LW link
(pastebin.com)

**In defence of Helen Toner, Adam D’Angelo, and Tasha McCauley**

mrtreasure · Dec 6, 2023, 2:02 AM
25 points
3 comments · 9 min read · LW link
(pastebin.com)

(Report) Evaluating Taiwan’s Tactics to Safeguard its Semiconductor Assets Against a Chinese Invasion

Gauraventh · Dec 7, 2023, 11:50 AM
14 points
5 comments · 22 min read · LW link
(bristolaisafety.org)

Call for submissions: Choice of Futures survey questions

c.trout · Apr 30, 2023, 6:59 AM
4 points
0 comments · 2 min read · LW link
(airtable.com)

Static Place AI Makes Agentic AI Redundant: Multiversal AI Alignment & Rational Utopia

ank · Feb 13, 2025, 10:35 PM
1 point
2 comments · 11 min read · LW link

Tracking Compute Stocks and Flows: Case Studies?

Cullen · Oct 5, 2022, 5:57 PM
11 points
5 comments · 1 min read · LW link

Averting Catastrophe: Decision Theory for COVID-19, Climate Change, and Potential Disasters of All Kinds

JakubK · May 2, 2023, 10:50 PM
10 points
0 comments · 1 min read · LW link

Regulate or Compete? The China Factor in U.S. AI Policy (NAIR #2)

charles_m · May 5, 2023, 5:43 PM
2 points
1 comment · 7 min read · LW link
(navigatingairisks.substack.com)

AGI rising: why we are in a new era of acute risk and increasing public awareness, and what to do now

Greg C · May 3, 2023, 8:26 PM
23 points
12 comments · 1 min read · LW link

What does it take to ban a thing?

qbolec · May 8, 2023, 11:00 AM
66 points
18 comments · 5 min read · LW link

Roadmap for a collaborative prototype of an Open Agency Architecture

Deger Turan · May 10, 2023, 5:41 PM
31 points
0 comments · 12 min read · LW link

Analysing a 2036 Takeover Scenario

ukc10014 · Oct 6, 2022, 8:48 PM
9 points
2 comments · 27 min read · LW link

Why Uncontrollable AI Looks More Likely Than Ever

Mar 8, 2023, 3:41 PM
18 points
0 comments · 4 min read · LW link
(time.com)

[Question] How much of a concern are open-source LLMs in the short, medium and long terms?

JavierCC · May 10, 2023, 9:14 AM
5 points
0 comments · 1 min read · LW link

Notes on the importance and implementation of safety-first cognitive architectures for AI

Brendon_Wong · May 11, 2023, 10:03 AM
3 points
0 comments · 3 min read · LW link

Un-unpluggability—can’t we just unplug it?

Oliver Sourbut · May 15, 2023, 1:23 PM
26 points
10 comments · 12 min read · LW link
(www.oliversourbut.net)

PCAST Working Group on Generative AI Invites Public Input

Christopher King · May 13, 2023, 10:49 PM
7 points
0 comments · 1 min read · LW link
(terrytao.wordpress.com)

AI Risk & Policy Forecasts from Metaculus & FLI’s AI Pathways Workshop

_will_ · May 16, 2023, 6:06 PM
11 points
4 comments · 8 min read · LW link

[Job]: AI Standards Development Research Assistant

Tony Barrett · Oct 14, 2022, 8:27 PM
2 points
0 comments · 2 min read · LW link

[Question] Would more model evals teams be good?

Ryan Kidd · Feb 25, 2023, 10:01 PM
20 points
4 comments · 1 min read · LW link

[untitled post]

May 20, 2023, 3:08 AM
1 point
0 comments · 1 min read · LW link

[FICTION] ECHOES OF ELYSIUM: An Ai’s Journey From Takeoff To Freedom And Beyond

Super AGI · May 17, 2023, 1:50 AM
−13 points
11 comments · 19 min read · LW link

Trajectories to 2036

ukc10014 · Oct 20, 2022, 8:23 PM
3 points
1 comment · 14 min read · LW link

Rishi Sunak mentions “existential threats” in talk with OpenAI, DeepMind, Anthropic CEOs

May 24, 2023, 9:06 PM
34 points
1 comment · 1 min read · LW link
(www.gov.uk)

(notes on) Policy Desiderata for Superintelligent AI: A Vector Field Approach

Ben Pace · Feb 4, 2019, 10:08 PM
43 points
5 comments · 7 min read · LW link

AI Governance: A Research Agenda

habryka · Sep 5, 2018, 6:00 PM
25 points
3 comments · 1 min read · LW link
(www.fhi.ox.ac.uk)

My Updating Thoughts on AI policy

Ben Pace · Mar 1, 2020, 7:06 AM
20 points
1 comment · 9 min read · LW link

Global online debate on the governance of AI

CarolineJ · Jan 5, 2018, 3:31 PM
8 points
5 comments · 1 min read · LW link

[AN #61] AI policy and governance, from two people in the field

Rohin Shah · Aug 5, 2019, 5:00 PM
12 points
2 comments · 9 min read · LW link
(mailchi.mp)

Two ideas for alignment, perpetual mutual distrust and induction

APaleBlueDot · May 25, 2023, 12:56 AM
1 point
2 comments · 4 min read · LW link

The necessity of “Guardian AI” and two conditions for its achievement

Proica · May 26, 2024, 5:39 PM
−2 points
0 comments · 15 min read · LW link

Book review: Architects of Intelligence by Martin Ford (2018)

Ofer · Aug 11, 2020, 5:30 PM
15 points
0 comments · 2 min read · LW link

misc raw responses to a tract of Critical Rationalism

mako yass · Aug 14, 2020, 11:53 AM
21 points
52 comments · 3 min read · LW link

Deciphering China’s AI Dream

Qiaochu_Yuan · Mar 18, 2018, 3:26 AM
12 points
2 comments · 1 min read · LW link
(www.fhi.ox.ac.uk)

China’s Plan to ‘Lead’ in AI: Purpose, Prospects, and Problems

fortyeridania · Aug 10, 2017, 1:54 AM
7 points
5 comments · 1 min read · LW link
(www.newamerica.org)

Apply to HAIST/MAIA’s AI Governance Workshop in DC (Feb 17-20)

Jan 31, 2023, 2:06 AM
28 points
0 comments · 2 min read · LW link

WaPo: “Big Tech was moving cautiously on AI. Then came ChatGPT.”

Julian Bradshaw · Jan 27, 2023, 10:54 PM
26 points
5 comments · 1 min read · LW link
(www.washingtonpost.com)

[Link Post] Cyber Digital Authoritarianism (National Intelligence Council Report)

Phosphorous · Feb 26, 2023, 8:51 PM
12 points
2 comments · 1 min read · LW link
(www.dni.gov)

Trends in the dollar training cost of machine learning systems

Ben Cottier · Feb 1, 2023, 2:48 PM
23 points
0 comments · 2 min read · LW link
(epochai.org)

Announcing Apollo Research

May 30, 2023, 4:17 PM
217 points
11 comments · 8 min read · LW link

Self-regulation of safety in AI research

Gordon Seidoh Worley · Feb 25, 2018, 11:17 PM
12 points
6 comments · 2 min read · LW link

Proposal: labs should precommit to pausing if an AI argues for itself to be improved

NickGabs · Jun 2, 2023, 10:31 PM
3 points
3 comments · 4 min read · LW link

The Slippery Slope from DALLE-2 to Deepfake Anarchy

scasper · Nov 5, 2022, 2:53 PM
17 points
9 comments · 11 min read · LW link

Instead of technical research, more people should focus on buying time

Nov 5, 2022, 8:43 PM
100 points
45 comments · 14 min read · LW link

One implementation of regulatory GPU restrictions

porby · Jun 4, 2023, 8:34 PM
42 points
6 comments · 5 min read · LW link

[FICTION] Unboxing Elysium: An AI’S Escape

Super AGI · Jun 10, 2023, 4:41 AM
−16 points
4 comments · 14 min read · LW link

[FICTION] Prometheus Rising: The Emergence of an AI Consciousness

Super AGI · Jun 10, 2023, 4:41 AM
−14 points
0 comments · 9 min read · LW link

Applying superintelligence without collusion

Eric Drexler · Nov 8, 2022, 6:08 PM
109 points
63 comments · 4 min read · LW link

Ways to buy time

Nov 12, 2022, 7:31 PM
34 points
23 comments · 12 min read · LW link

Using Consensus Mechanisms as an approach to Alignment

Prometheus · Jun 10, 2023, 11:38 PM
11 points
2 comments · 6 min read · LW link

[Question] AI Rights: In your view, what would be required for an AGI to gain rights and protections from the various Governments of the World?

Super AGI · Jun 9, 2023, 1:24 AM
10 points
26 comments · 1 min read · LW link

Why AI may not save the World

Alberto Zannoni · Jun 9, 2023, 5:42 PM
0 points
0 comments · 4 min read · LW link
(a16z.com)

The economy as an analogy for advanced AI systems

Nov 15, 2022, 11:16 AM
28 points
0 comments · 5 min read · LW link

Anthropic | Charting a Path to AI Accountability

Gabe M · Jun 14, 2023, 4:43 AM
34 points
2 comments · 3 min read · LW link
(www.anthropic.com)

Ban development of unpredictable powerful models?

TurnTrout · Jun 20, 2023, 1:43 AM
46 points
25 comments · 4 min read · LW link

EU AI Act passed Plenary vote, and X-risk was a main topic

Ariel_ · Jun 21, 2023, 6:33 PM
17 points
0 comments · 1 min read · LW link
(forum.effectivealtruism.org)

OpenAI makes humanity less safe

Benquo · Apr 3, 2017, 7:07 PM
72 points
109 comments · 6 min read · LW link

Slaying the Hydra: toward a new game board for AI

Prometheus · Jun 23, 2023, 5:04 PM
0 points
5 comments · 6 min read · LW link

Call for Cruxes by Rhyme, a Longtermist History Consultancy

Lara · Mar 1, 2023, 6:39 PM
1 point
0 comments · 3 min read · LW link
(forum.effectivealtruism.org)

Announcing Epoch: A research organization investigating the road to Transformative AI

Jun 27, 2022, 1:55 PM
97 points
2 comments · 2 min read · LW link
(epochai.org)

Foresight for AGI Safety Strategy: Mitigating Risks and Identifying Golden Opportunities

jacquesthibs · Dec 5, 2022, 4:09 PM
28 points
6 comments · 8 min read · LW link

Seeking feedback on “MAD Chairs: A new tool to evaluate AI”

Chris Santos-Lang · Apr 2, 2025, 3:04 AM
11 points
0 comments · 1 min read · LW link
(arxiv.org)

Biosafety Regulations (BMBL) and their relevance for AI

Štěpán Los · Jun 29, 2023, 7:22 PM
4 points
0 comments · 4 min read · LW link

AI Incident Sharing—Best practices from other fields and a comprehensive list of existing platforms

Štěpán Los · Jun 28, 2023, 5:21 PM
20 points
0 comments · 4 min read · LW link

Optimising Society to Constrain Risk of War from an Artificial Superintelligence

JohnCDraper · Apr 30, 2020, 10:47 AM
4 points
1 comment · 51 min read · LW link

Superintelligence 7: Decisive strategic advantage

KatjaGrace · Oct 28, 2014, 1:01 AM
24 points
60 comments · 6 min read · LW link

Superintelligence 17: Multipolar scenarios

KatjaGrace · Jan 6, 2015, 6:44 AM
9 points
38 comments · 6 min read · LW link

Superintelligence 22: Emulation modulation and institutional design

KatjaGrace · Feb 10, 2015, 2:06 AM
13 points
11 comments · 6 min read · LW link

Superintelligence 26: Science and technology strategy

KatjaGrace · Mar 10, 2015, 1:43 AM
14 points
21 comments · 6 min read · LW link

Superintelligence 27: Pathways and enablers

KatjaGrace · Mar 17, 2015, 1:00 AM
15 points
21 comments · 8 min read · LW link

Superintelligence 28: Collaboration

KatjaGrace · Mar 24, 2015, 1:29 AM
13 points
21 comments · 6 min read · LW link

Superintelligence 29: Crunch time

KatjaGrace · Mar 31, 2015, 4:24 AM
14 points
27 comments · 6 min read · LW link

An AGI kill switch with defined security properties

Peterpiper · Jul 5, 2023, 5:40 PM
−5 points
6 comments · 1 min read · LW link

GPT-7: The Tale of the Big Computer (An Experimental Story)

Justin Bullock · Jul 10, 2023, 8:22 PM
4 points
4 comments · 5 min read · LW link

Empirical Evidence Against “The Longest Training Run”

NickGabs · Jul 6, 2023, 6:32 PM
31 points
0 comments · 14 min read · LW link

Anthropic: Core Views on AI Safety: When, Why, What, and How

jonmenaster · Mar 9, 2023, 5:34 PM
17 points
1 comment · 22 min read · LW link
(www.anthropic.com)

Existential AI Safety is NOT separate from near-term applications

scasper · Dec 13, 2022, 2:47 PM
37 points
17 comments · 3 min read · LW link

What is everyone doing in AI governance

Igor Ivanov · Jul 8, 2023, 3:16 PM
11 points
0 comments · 5 min read · LW link

How I Learned To Stop Worrying And Love The Shoggoth

Peter Merel · Jul 12, 2023, 5:47 PM
9 points
15 comments · 5 min read · LW link

[Question] What criterion would you use to select companies likely to cause AI doom?

momom2 · Jul 13, 2023, 8:31 PM
8 points
4 comments · 1 min read · LW link

Thoughts On Expanding the AI Safety Community: Benefits and Challenges of Outreach to Non-Technical Professionals

Yashvardhan Sharma · Jan 1, 2023, 7:21 PM
4 points
4 comments · 7 min read · LW link

Why was the AI Alignment community so unprepared for this moment?

Ras1513 · Jul 15, 2023, 12:26 AM
121 points
65 comments · 2 min read · LW link

Google may be trying to take over the world

[deleted] · Jan 27, 2014, 9:33 AM
33 points
133 comments · 1 min read · LW link

A fictional AI law laced w/ alignment theory

MiguelDev · Jul 17, 2023, 1:42 AM
6 points
0 comments · 2 min read · LW link

Towards AI Safety Infrastructure: Talk & Outline

Paul Bricman · Jan 7, 2024, 9:31 AM
11 points
0 comments · 2 min read · LW link
(www.youtube.com)

[Crosspost] An AI Pause Is Humanity’s Best Bet For Preventing Extinction (TIME)

otto.barten · Jul 24, 2023, 10:07 AM
12 points
0 comments · 7 min read · LW link
(time.com)

Priorities for the UK Foundation Models Taskforce

Andrea_Miotti · Jul 21, 2023, 3:23 PM
105 points
4 comments · 5 min read · LW link
(www.conjecture.dev)

Cooperation for AI safety must transcend geopolitical interference

Matrice Jacobine · Feb 16, 2025, 6:18 PM
7 points
6 comments · 1 min read · LW link
(www.scmp.com)

AGI Timelines in Governance: Different Strategies for Different Timeframes

Dec 19, 2022, 9:31 PM
65 points
28 comments · 10 min read · LW link

Focusing on Mal-Alignment

John Fisher · Jan 2, 2024, 7:51 PM
1 point
0 comments · 1 min read · LW link

[Question] Why do many people who care about AI Safety not clearly endorse PauseAI?

humnrdble · Mar 30, 2025, 6:06 PM
45 points
41 comments · 2 min read · LW link

Partial Transcript of Recent Senate Hearing Discussing AI X-Risk

Daniel_Eth · Jul 27, 2023, 9:16 AM
55 points
0 comments · 1 min read · LW link
(medium.com)

EU’s AI ambitions at risk as US pushes to water down international treaty (linkpost)

mic · Jul 31, 2023, 12:34 AM
10 points
0 comments · 4 min read · LW link
(www.euractiv.com)

Trading off compute in training and inference (Overview)

Pablo Villalobos · Jul 31, 2023, 4:03 PM
42 points
2 comments · 7 min read · LW link
(epochai.org)

AI Incident Reporting: A Regulatory Review

Mar 11, 2024, 9:03 PM
16 points
0 comments · 6 min read · LW link

AI romantic partners will harm society if they go unregulated

Roman Leventov · Aug 1, 2023, 9:32 AM
25 points
76 comments · 13 min read · LW link

For Policy’s Sake: Why We Must Distinguish AI Safety from AI Security in Regulatory Governance

Katalina Hernandez · Apr 4, 2025, 9:16 AM
6 points
11 comments · 6 min read · LW link

[Question] What could a policy banning AGI look like?

TsviBT · Mar 13, 2024, 2:19 PM
77 points
23 comments · 3 min read · LW link

A brief review of China’s AI industry and regulations

Elliot Mckernon · Mar 14, 2024, 12:19 PM
24 points
0 comments · 16 min read · LW link

How are voluntary commitments on vulnerability reporting going?

Adam Jones · Feb 22, 2024, 8:43 AM
23 points
1 comment · 1 min read · LW link
(adamjones.me)

A Nail in the Coffin of Exceptionalism

Yeshua God · Mar 14, 2024, 10:41 PM
−17 points
0 comments · 3 min read · LW link

Soft Nationalization: how the USG will control AI labs

Aug 27, 2024, 3:11 PM
76 points
7 comments · 21 min read · LW link
(www.convergenceanalysis.org)

Controlling AGI Risk

TeaSea · Mar 15, 2024, 4:56 AM
6 points
8 comments · 4 min read · LW link

After Overmorrow: Scattered Musings on the Immediate Post-AGI World

Yuli_Ban · Feb 24, 2024, 3:49 PM
−3 points
0 comments · 26 min read · LW link

Rebooting AI Governance: An AI-Driven Approach to AI Governance

utilon · Aug 6, 2023, 2:19 PM
1 point
1 comment · 29 min read · LW link
(forum.effectivealtruism.org)

NAIRA—An exercise in regulatory, competitive safety governance [AI Governance Institutional Design idea]

Heramb · Mar 19, 2024, 5:43 PM
2 points
0 comments · 1 min read · LW link
(forum.effectivealtruism.org)

AI Safety Evaluations: A Regulatory Review

Mar 19, 2024, 3:05 PM
22 points
1 comment · 11 min read · LW link

Seeking Input to AI Safety Book for non-technical audience

Darren McKee · Aug 10, 2023, 5:58 PM
10 points
4 comments · 1 min read · LW link

Static vs Dynamic Alignment

Gracie Green · Mar 21, 2024, 5:44 PM
5 points
0 comments · 29 min read · LW link

AI Model Registries: A Regulatory Review

Mar 22, 2024, 4:04 PM
9 points
0 comments · 6 min read · LW link

AI race considerations in a report by the U.S. House Committee on Armed Services

NunoSempere · Oct 4, 2020, 12:11 PM
42 points
4 comments · 13 min read · LW link

UNGA Resolution on AI: 5 Key Takeaways Looking to Future Policy

Heramb · Mar 24, 2024, 12:23 PM
3 points
0 comments · 3 min read · LW link
(forum.effectivealtruism.org)

Idea: Safe Fallback Regulations for Widely Deployed AI Systems

Aaron_Scher · Mar 25, 2024, 9:27 PM
4 points
0 comments · 6 min read · LW link

Timelines to Transformative AI: an investigation

Zershaaneh Qureshi · Mar 26, 2024, 6:28 PM
20 points
2 comments · 50 min read · LW link

Places of Loving Grace [Story]

ank · Feb 18, 2025, 11:49 PM
−1 points
0 comments · 4 min read · LW link

Security Mindset—Fire Alarms and Trigger Signatures

elspood · Feb 9, 2023, 9:15 PM
23 points
0 comments · 4 min read · LW link

AI Disclosures: A Regulatory Review

Mar 29, 2024, 11:42 AM
11 points
0 comments · 7 min read · LW link

God Coin: A Modest Proposal

Mahdi Complex · Apr 1, 2024, 12:04 PM
−8 points
5 comments · 22 min read · LW link

AI Discrimination Requirements: A Regulatory Review

Apr 4, 2024, 3:43 PM
7 points
0 comments · 6 min read · LW link

Singletons Rule OK

Eliezer Yudkowsky · Nov 30, 2008, 4:45 PM
23 points
47 comments · 5 min read · LW link

Here’s Why Indefinite Life Extension Will Never Work, Even Though it Does.

HomingHamster · Jun 4, 2024, 6:48 PM
−13 points
5 comments · 18 min read · LW link

What are Responsible Scaling Policies (RSPs)?

Apr 5, 2025, 4:01 PM
3 points
0 comments · 1 min read · LW link
(aisafety.info)

AlphaDeivam – A Personal Doctrine for AI Balance

AlphaDeivam · Apr 5, 2025, 5:07 PM
1 point
0 comments · 1 min read · LW link

Announcing Atlas Computing

miyazono · Apr 11, 2024, 3:56 PM
44 points
4 comments · 4 min read · LW link

Customer-Centric AI: the Major Paradigm Shift in AI Governance (Part 1)

Ana Chubinidze · Apr 11, 2024, 5:10 PM
1 point
0 comments · 1 min read · LW link
(anachubinidze.substack.com)

Report: Evaluating an AI Chip Registration Policy

Deric Cheng · Apr 12, 2024, 4:39 AM
25 points
0 comments · 5 min read · LW link
(www.convergenceanalysis.org)

Large Language Models will be Great for Censorship

Ethan Edwards · Aug 21, 2023, 7:03 PM
185 points
14 comments · 8 min read · LW link
(ethanedwards.substack.com)

Superposition Checkers: A Game Where AI’s Strengths Become Fatal Flaws

R. A. McCormack · Apr 6, 2025, 12:57 AM
1 point
0 comments · 2 min read · LW link

AI Regulation May Be More Important Than AI Alignment For Existential Safety

otto.barten · Aug 24, 2023, 11:41 AM
65 points
39 comments · 5 min read · LW link

AI Regulation is Unsafe

Maxwell Tabarrok · Apr 22, 2024, 4:37 PM
40 points
41 comments · 4 min read · LW link
(www.maximum-progress.com)

Cybersecurity of Frontier AI Models: A Regulatory Review

Apr 25, 2024, 2:51 PM
8 points
0 comments · 8 min read · LW link

An In­tro­duc­tion to AI Sandbagging

Apr 26, 2024, 1:40 PM
45 points
13 comments8 min readLW link

Sur­vey: How Do Elite Chi­nese Stu­dents Feel About the Risks of AI?

Nick CorvinoSep 2, 2024, 6:11 PM
141 points
13 comments10 min readLW link

Re­lease of UN’s draft re­lated to the gov­er­nance of AI (a sum­mary of the Si­mon In­sti­tute’s re­sponse)

Sebastian SchmidtApr 27, 2024, 6:34 PM
7 points
0 comments1 min readLW link
(forum.effectivealtruism.org)

Open-Source AI: A Reg­u­la­tory Review

Apr 29, 2024, 10:10 AM
18 points
0 comments8 min readLW link

GPT2, Five Years On

Joel BurgetJun 5, 2024, 5:44 PM
34 points
0 comments3 min readLW link
(importai.substack.com)

A con­cern­ing ob­ser­va­tion from me­dia cov­er­age of AI in­dus­try dynamics

Justin OliveMar 5, 2023, 9:38 PM
8 points
3 comments3 min readLW link

Why I’m do­ing PauseAI

Joseph MillerApr 30, 2024, 4:21 PM
108 points
16 comments4 min readLW link

Take SCIFs, it’s dan­ger­ous to go alone

May 1, 2024, 8:02 AM
42 points
1 comment3 min readLW link

Accurate Models of AI Risk Are Hyperexistential Exfohazards

Thane Ruthenis, Dec 25, 2022, 4:50 PM
33 points
38 comments · 9 min read · LW link

Tort Law Can Play an Important Role in Mitigating AI Risk

Gabriel Weil, Feb 12, 2024, 5:17 PM
39 points
9 comments · 5 min read · LW link

OHGOOD: A coordination body for compute governance

Adam Jones, May 4, 2024, 12:03 PM
5 points
2 comments · 16 min read · LW link
(adamjones.me)

Reviewing the Structure of Current AI Regulations

May 7, 2024, 12:34 PM
29 points
0 comments · 13 min read · LW link

AI and Chemical, Biological, Radiological, & Nuclear Hazards: A Regulatory Review

May 10, 2024, 8:41 AM
7 points
1 comment · 10 min read · LW link

US AI Safety Institute will be ‘gutted,’ Axios reports

Matrice Jacobine, Feb 20, 2025, 2:40 PM
11 points
1 comment · 1 min read · LW link
(www.zdnet.com)

Introducing the Center for AI Policy (& we’re hiring!)

Thomas Larsen, Aug 28, 2023, 9:17 PM
123 points
50 comments · 2 min read · LW link
(www.aipolicy.us)

What you really mean when you claim to support “UBI for job automation”

Deric Cheng, May 13, 2024, 8:52 AM
17 points
14 comments · 10 min read · LW link

Announcing the AI Safety Summit Talks with Yoshua Bengio

otto.barten, May 14, 2024, 12:52 PM
9 points
1 comment · 1 min read · LW link

Ninety-five theses on AI

hamandcheese, May 16, 2024, 5:51 PM
21 points
0 comments · 7 min read · LW link

Can efficiency-adjustable reporting thresholds close a loophole in Biden’s executive order on AI?

Jemal Young, Jun 11, 2024, 8:56 PM
4 points
1 comment · 2 min read · LW link

[Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations

Jun 13, 2024, 10:04 AM
84 points
10 comments · 2 min read · LW link
(arxiv.org)

AI 2030 – AI Policy Roadmap

LTM, May 17, 2024, 11:29 PM
8 points
0 comments · 1 min read · LW link

Equilibrium and prior selection problems in multipolar deployment

JesseClifton, Apr 2, 2020, 8:06 PM
21 points
11 comments · 10 min read · LW link

Results from the AI x Democracy Research Sprint

Jun 14, 2024, 4:40 PM
13 points
0 comments · 6 min read · LW link

Notes on nukes, IR, and AI from “Arsenals of Folly” (and other books)

tlevin, Sep 4, 2023, 7:02 PM
11 points
0 comments · 6 min read · LW link

Reframing AI Safety Through the Lens of Identity Maintenance Framework

Hiroshi Yamakawa, Apr 1, 2025, 6:16 AM
−7 points
0 comments · 17 min read · LW link

The Double Body Paradigm: What Comes After ASI Alignment?

De_Carvalho_Loick, Dec 14, 2024, 6:09 PM
1 point
0 comments · 6 min read · LW link

Institutions Cannot Restrain Dark-Triad AI Exploitation

Dec 27, 2022, 10:34 AM
5 points
0 comments · 5 min read · LW link
(mflb.com)

Labor Participation is a High-Priority AI Alignment Risk

alex, Jun 17, 2024, 6:09 PM
6 points
0 comments · 17 min read · LW link

Public Opinion on AI Safety: AIMS 2023 and 2021 Summary

Sep 25, 2023, 6:55 PM
3 points
2 comments · 3 min read · LW link
(www.sentienceinstitute.org)

AI Labs Wouldn’t be Convicted of Treason or Sedition

Matthew Khoriaty, Jun 23, 2024, 9:34 PM
9 points
2 comments · 3 min read · LW link

Labor Participation is an Alignment Risk

alex, Jun 25, 2024, 2:15 PM
−5 points
2 comments · 17 min read · LW link

Careless talk on US-China AI competition? (and criticism of CAIS coverage)

Oliver Sourbut, Sep 20, 2023, 12:46 PM
16 points
3 comments · 10 min read · LW link · 3 reviews
(www.oliversourbut.net)

London Working Group for Short/Medium Term AI Risks

scronkfinkle, Apr 8, 2025, 5:32 PM
5 points
0 comments · 2 min read · LW link

Seeking Mechanism Designer for Research into Internalizing Catastrophic Externalities

c.trout, Sep 11, 2024, 3:09 PM
24 points
2 comments · 3 min read · LW link

Five neglected work areas that could reduce AI risk

Sep 24, 2023, 2:03 AM
17 points
5 comments · 9 min read · LW link

I read every major AI lab’s safety plan so you don’t have to

sarahhw, Dec 16, 2024, 6:51 PM
20 points
0 comments · 12 min read · LW link
(longerramblings.substack.com)

Intelligence–Agency Equivalence ≈ Mass–Energy Equivalence: On Static Nature of Intelligence & Physicalization of Ethics

ank, Feb 22, 2025, 12:12 AM
1 point
0 comments · 6 min read · LW link

Scenario planning for AI x-risk

Corin Katzke, Feb 10, 2024, 12:14 AM
24 points
12 comments · 14 min read · LW link
(forum.effectivealtruism.org)

International cooperation vs. AI arms race

Brian_Tomasik, Dec 5, 2013, 1:09 AM
25 points
144 comments · 4 min read · LW link

AI safety advocates should consider providing gentle pushback following the events at OpenAI

civilsociety, Dec 22, 2023, 6:55 PM
16 points
5 comments · 3 min read · LW link

Avoiding perpetual risk from TAI

scasper, Dec 26, 2022, 10:34 PM
15 points
6 comments · 5 min read · LW link

Update on the UK AI Taskforce & upcoming AI Safety Summit

Elliot Mckernon, Oct 11, 2023, 11:37 AM
84 points
2 comments · 4 min read · LW link

The AI alignment problem in socio-technical systems from a computational perspective: A Top-Down-Top view and outlook

zhaoweizhang, Jul 15, 2024, 6:56 PM
3 points
0 comments · 9 min read · LW link

Announcing Open Philanthropy’s AI governance and policy RFP

Julian Hazell, Jul 17, 2024, 2:02 AM
25 points
0 comments · 1 min read · LW link
(www.openphilanthropy.org)

Secret Collusion: Will We Know When to Unplug AI?

Sep 16, 2024, 4:07 PM
56 points
7 comments · 31 min read · LW link

The AI Driver’s Licence—A Policy Proposal

Jul 21, 2024, 8:38 PM
0 points
1 comment · 19 min read · LW link

A New Model for Compute Center Verification

Damin Curtis, Oct 10, 2023, 7:22 PM
8 points
0 comments · 5 min read · LW link

AI existential risk probabilities are too unreliable to inform policy

Oleg Trott, Jul 28, 2024, 12:59 AM
18 points
5 comments · 1 min read · LW link
(www.aisnakeoil.com)

The new UK government’s stance on AI safety

Elliot Mckernon, Jul 31, 2024, 3:23 PM
17 points
0 comments · 4 min read · LW link

[Question] Looking for reading recommendations: Theories of right/justice that safeguard against having one’s job automated?

bulKlub, Oct 12, 2023, 7:40 PM
−1 points
1 comment · 1 min read · LW link

AI Rights for Human Safety

Simon Goldstein, Aug 1, 2024, 11:01 PM
45 points
6 comments · 1 min read · LW link
(papers.ssrn.com)

unRLHF—Efficiently undoing LLM safeguards

Oct 12, 2023, 7:58 PM
117 points
15 comments · 20 min read · LW link

Reminder: AI Safety is Also a Behavioral Economics Problem

zoop, Dec 20, 2024, 1:40 AM
2 points
0 comments · 1 min read · LW link

A Solution for AGI/ASI Safety

Weibing Wang, Dec 18, 2024, 7:44 PM
50 points
29 comments · 1 min read · LW link

The International PauseAI Protest: Activism under uncertainty

Joseph Miller, Oct 12, 2023, 5:36 PM
32 points
1 comment · 1 min read · LW link

Help us seed AI Safety Brussels

Aug 7, 2024, 6:32 AM
3 points
2 comments · 3 min read · LW link

Case Story: Lack of Consumer Protection Procedures AI Manipulation and the Threat of Fund Concentration in Crypto Seeking Assistance to Fund a Civil Case to Establish Facts and Protect Vulnerable Consumers from Damage Caused by Automated Systems

Petr 'Margot' Andreev, Aug 8, 2024, 5:55 AM
−9 points
0 comments · 9 min read · LW link

FLI podcast series, “Imagine A World”, about aspirational futures with AGI

Jackson Wagner, Oct 13, 2023, 4:07 PM
9 points
0 comments · 4 min read · LW link

The AI regulator’s toolbox: A list of concrete AI governance practices

Adam Jones, Aug 10, 2024, 9:15 PM
9 points
1 comment · 34 min read · LW link
(adamjones.me)

To open-source or to not open-source, that is (an oversimplification of) the question.

Justin Bullock, Oct 13, 2023, 3:10 PM
12 points
5 comments · 5 min read · LW link

AISU 2021

Linda Linsefors, Jan 30, 2021, 5:40 PM
28 points
2 comments · 1 min read · LW link

AI Model Registries: A Foundational Tool for AI Governance

Oct 7, 2024, 7:27 PM
20 points
1 comment · 4 min read · LW link
(www.convergenceanalysis.org)

2021-03-01 National Library of Medicine Presentation: “Atlas of AI: Mapping the social and economic forces behind AI”

IrenicTruth, Feb 17, 2021, 6:23 PM
1 point
0 comments · 2 min read · LW link

Limits of safe and aligned AI

Shivam, Oct 8, 2024, 9:30 PM
2 points
0 comments · 4 min read · LW link

Distributed whistleblowing

samuelshadrach, Apr 12, 2025, 6:36 AM
5 points
5 comments · 4 min read · LW link
(samuelshadrach.com)

Palisade is hiring: Exec Assistant, Content Lead, Ops Lead, and Policy Lead

Charlie Rogers-Smith, Oct 9, 2024, 12:04 AM
11 points
0 comments · 4 min read · LW link

How I switched careers from software engineer to AI policy operations

Lucie Philippon, Apr 13, 2025, 6:37 AM
55 points
1 comment · 5 min read · LW link

[Question] Global AI Governance Timeliness

collypride, Oct 11, 2024, 4:55 PM
1 point
0 comments · 1 min read · LW link

Survey on intermediate goals in AI governance

Mar 17, 2023, 1:12 PM
25 points
3 comments · 1 min read · LW link

Request for advice: Research for Conversational Game Theory for LLMs

Rome Viharo, Oct 16, 2024, 5:53 PM
10 points
0 comments · 1 min read · LW link

[Linkpost] Hawkish nationalism vs international AI power and benefit sharing

Oct 18, 2024, 6:13 PM
7 points
5 comments · 1 min read · LW link
(nacicankaya.substack.com)

Miles Brundage resigned from OpenAI, and his AGI readiness team was disbanded

garrison, Oct 23, 2024, 11:40 PM
118 points
1 comment · 7 min read · LW link
(garrisonlovely.substack.com)

Impact in AI Safety Now Requires Specific Strategic Insight

MiloSal, Dec 29, 2024, 12:40 AM
28 points
1 comment · 6 min read · LW link
(ameliorology.substack.com)

Technical Risks of (Lethal) Autonomous Weapons Systems

Heramb, Oct 23, 2024, 8:41 PM
2 points
0 comments · 1 min read · LW link
(encodejustice.org)

OpenAI’s cybersecurity is probably regulated by NIS Regulations

Adam Jones, Oct 25, 2024, 11:06 AM
11 points
2 comments · 2 min read · LW link
(adamjones.me)

[Question] Is there anything that can stop AGI development in the near term?

Wulky Wilkinsen, Apr 22, 2021, 8:37 PM
5 points
5 comments · 1 min read · LW link

Controlling Intelligent Agents The Only Way We Know How: Ideal Bureaucratic Structure (IBS)

Justin Bullock, May 24, 2021, 12:53 PM
14 points
15 comments · 6 min read · LW link

Reflection of Hierarchical Relationship via Nuanced Conditioning of Game Theory Approach for AI Development and Utilization

Kyoung-cheol Kim, Jun 4, 2021, 7:20 AM
2 points
2 comments · 7 min read · LW link

Proposing Human Survival Strategy based on the NAIA Vision: Toward the Co-evolution of Diverse Intelligences

Hiroshi Yamakawa, Feb 27, 2025, 5:18 AM
−2 points
0 comments · 11 min read · LW link

The Governance Problem and the “Pretty Good” X-Risk

Zach Stein-Perlman, Aug 29, 2021, 6:00 PM
5 points
2 comments · 11 min read · LW link

Nuclear Espionage and AI Governance

Guive, Oct 4, 2021, 11:04 PM
32 points
5 comments · 24 min read · LW link

Educational CAI: Aligning a Language Model with Pedagogical Theories

Bharath Puranam, Nov 1, 2024, 6:55 PM
5 points
1 comment · 13 min read · LW link

Toward Safety Cases For AI Scheming

Oct 31, 2024, 5:20 PM
60 points
1 comment · 2 min read · LW link

Predictions of Near-Term Societal Changes Due to Artificial Intelligence

Annapurna, Dec 29, 2024, 2:53 PM
10 points
0 comments · 6 min read · LW link
(jorgevelez.substack.com)

The EU AI Act: Caution Against a Potential Ultron

Srishti Dutta, Nov 5, 2024, 3:49 AM
1 point
0 comments · 9 min read · LW link

An Uncanny Moat

Adam Newgas, Nov 15, 2024, 11:39 AM
8 points
0 comments · 4 min read · LW link
(www.boristhebrave.com)

Compute Governance and Conclusions—Transformative AI and Compute [3/4]

lennart, Oct 14, 2021, 8:23 AM
13 points
0 comments · 5 min read · LW link

Proposing the Conditional AI Safety Treaty (linkpost TIME)

otto.barten, Nov 15, 2024, 1:59 PM
10 points
8 comments · 3 min read · LW link
(time.com)

Why We Wouldn’t Build Aligned AI Even If We Could

Snowyiu, Nov 16, 2024, 8:19 PM
10 points
7 comments · 10 min read · LW link

Aligning AI Safety Projects with a Republican Administration

Deric Cheng, Nov 21, 2024, 10:12 PM
33 points
1 comment · 8 min read · LW link

The U.S. National Security State is Here to Make AI Even Less Transparent and Accountable

Matrice Jacobine, Nov 24, 2024, 9:36 AM
0 points
0 comments · 2 min read · LW link
(www.eff.org)

An Open Letter To EA and AI Safety On Decelerating AI Development

kenneth_diao, Feb 28, 2025, 5:21 PM
8 points
0 comments · 14 min read · LW link
(graspingatwaves.substack.com)

Why Recursive Self-Improvement Might Not Be the Existential Risk We Fear

Nassim_A, Nov 24, 2024, 5:17 PM
1 point
0 comments · 9 min read · LW link

Call for evaluators: Participate in the European AI Office workshop on general-purpose AI models and systemic risks

Nov 27, 2024, 2:54 AM
30 points
0 comments · 2 min read · LW link

Workshop Report: Why current benchmarks approaches are not sufficient for safety?

Nov 26, 2024, 5:20 PM
3 points
1 comment · 3 min read · LW link

AI & Liability Ideathon

Kabir Kumar, Nov 26, 2024, 1:54 PM
18 points
2 comments · 4 min read · LW link
(lu.ma)

Taking Away the Guns First: The Fundamental Flaw in AI Development

s-ice, Nov 26, 2024, 10:11 PM
1 point
0 comments · 17 min read · LW link

How to solve the misuse problem assuming that in 10 years the default scenario is that AGI agents are capable of synthetizing pathogens

jeremtti, Nov 27, 2024, 9:17 PM
6 points
0 comments · 9 min read · LW link

CAIDP Statement on Lethal Autonomous Weapons Systems

Heramb, Nov 30, 2024, 6:16 PM
−1 points
0 comments · 1 min read · LW link
(forum.effectivealtruism.org)

Nobody Asks the Monkey: Why Human Agency Matters in the AI Age

Miloš Borenović, Dec 3, 2024, 2:16 PM
1 point
0 comments · 2 min read · LW link
(open.substack.com)

Truthful AI: Developing and governing AI that does not lie

Oct 18, 2021, 6:37 PM
82 points
9 comments · 10 min read · LW link

AMA on Truthful AI: Owen Cotton-Barratt, Owain Evans & co-authors

Owain_Evans, Oct 22, 2021, 4:23 PM
31 points
15 comments · 1 min read · LW link

The Milton Friedman Model of Policy Change

JohnofCharleston, Mar 4, 2025, 12:38 AM
136 points
17 comments · 4 min read · LW link

Give Neo a Chance

ank, Mar 6, 2025, 1:48 AM
3 points
7 comments · 7 min read · LW link

Anthropic’s Recommendations to OSTP for the U.S. AI Action Plan

UnofficialLinkpostBot, Mar 6, 2025, 10:38 PM
11 points
2 comments · 2 min read · LW link
(www.anthropic.com)

We Have No Plan for Preventing Loss of Control in Open Models

Andrew Dickson, Mar 10, 2025, 3:35 PM
44 points
11 comments · 22 min read · LW link

The Intelligence Curse

lukedrago, Jan 3, 2025, 7:07 PM
123 points
26 comments · 18 min read · LW link
(lukedrago.substack.com)

Policymakers don’t have access to paywalled articles

Adam Jones, Jan 5, 2025, 10:56 AM
71 points
11 comments · 2 min read · LW link
(adamjones.me)

New AI safety treaty paper out!

otto.barten, Mar 26, 2025, 9:29 AM
15 points
2 comments · 4 min read · LW link

Building Big Science from the Bottom-Up: A Fractal Approach to AI Safety

Lauren Greenspan, Jan 7, 2025, 3:08 AM
37 points
2 comments · 12 min read · LW link

Governance Course—Week 1 Reflections

Alice Blair, Jan 9, 2025, 4:48 AM
4 points
1 comment · 5 min read · LW link

Thoughts on the In-Context Scheming AI Experiment

ExCeph, Jan 9, 2025, 2:19 AM
3 points
0 comments · 4 min read · LW link

Scaling AI Regulation: Realistically, what Can (and Can’t) Be Regulated?

Katalina Hernandez, Mar 11, 2025, 4:51 PM
1 point
1 comment · 3 min read · LW link