Speculation on Path-Dependence in Large Language Models.

NickyP · 15 Jan 2023 20:42 UTC
16 points
2 comments · 7 min read · LW link

Underspecification of Oracle AI

15 Jan 2023 20:10 UTC
30 points
12 comments · 19 min read · LW link

[Question] How Does the Human Brain Compare to Deep Learning on Sample Efficiency?

DragonGod · 15 Jan 2023 19:49 UTC
10 points
6 comments · 1 min read · LW link

Deceptive failures short of full catastrophe.

Alex Lawsen · 15 Jan 2023 19:28 UTC
33 points
5 comments · 9 min read · LW link

Non-directed conceptual founding

TsviBT · 15 Jan 2023 14:56 UTC
12 points
3 comments · 1 min read · LW link

Panopticons aren’t enough

Program Den · 15 Jan 2023 12:55 UTC
−10 points
7 comments · 1 min read · LW link

[Question] Is this ChatGPT rewrite of my post better?

Yair Halberstadt · 15 Jan 2023 9:47 UTC
2 points
5 comments · 1 min read · LW link

A simple proposal for preserving free speech on Twitter

Yair Halberstadt · 15 Jan 2023 9:42 UTC
−2 points
13 comments · 1 min read · LW link

Core Concept Conversation: What is technology?

Adam Zerner · 15 Jan 2023 9:40 UTC
8 points
1 comment · 1 min read · LW link

Language Ex Machina

janus · 15 Jan 2023 9:19 UTC
41 points
23 comments · 24 min read · LW link
(generative.ink)

Core Concept Conversation: What is wealth?

Adam Zerner · 15 Jan 2023 9:07 UTC
13 points
30 comments · 3 min read · LW link

Core Concept Conversations

Adam Zerner · 15 Jan 2023 7:17 UTC
14 points
1 comment · 1 min read · LW link

Incentives considered harmful

Ulisse Mini · 15 Jan 2023 6:38 UTC
6 points
0 comments · 1 min read · LW link
(uli.rocks)

Consider paying for literature or book reviews using bounties and dominant assurance contracts

Arjun Panickssery · 15 Jan 2023 3:56 UTC
57 points
7 comments · 2 min read · LW link

Podcast with Divia Eden on operant conditioning

DanielFilan · 15 Jan 2023 2:44 UTC
14 points
0 comments · 1 min read · LW link
(youtu.be)

We Need Holistic AI Macrostrategy

NickGabs · 15 Jan 2023 2:13 UTC
39 points
4 comments · 8 min read · LW link

[Question] When to mention irrelevant accusations?

philh · 14 Jan 2023 21:58 UTC
20 points
50 comments · 1 min read · LW link

World-Model Interpretability Is All We Need

Thane Ruthenis · 14 Jan 2023 19:37 UTC
35 points
22 comments · 21 min read · LW link

Current AI Models Seem Sufficient for Low-Risk, Beneficial AI

harsimony · 14 Jan 2023 18:55 UTC
17 points
1 comment · 2 min read · LW link

[Question] Basic Question about LLMs: how do they know what task to perform?

Garak · 14 Jan 2023 13:13 UTC
1 point
3 comments · 1 min read · LW link

Aligned with what?

Program Den · 14 Jan 2023 10:28 UTC
3 points
41 comments · 1 min read · LW link

Wokism, rethinking priorities and the Bostrom case

Arturo Macias · 14 Jan 2023 2:27 UTC
−31 points
2 comments · 4 min read · LW link

A general comment on discussions of genetic group differences

anonymous8101 · 14 Jan 2023 2:11 UTC
70 points
46 comments · 3 min read · LW link

Abstractions as morphisms between (co)algebras

Erik Jenner · 14 Jan 2023 1:51 UTC
17 points
1 comment · 8 min read · LW link

Concrete Reasons for Hope about AI

Zac Hatfield-Dodds · 14 Jan 2023 1:22 UTC
100 points
13 comments · 1 min read · LW link

Negative Expertise

Jonas Kgomo · 14 Jan 2023 0:51 UTC
4 points
0 comments · 1 min read · LW link
(twitter.com)

Mid-Atlantic AI Alignment Alliance Unconference

Quinn · 13 Jan 2023 20:33 UTC
7 points
2 comments · 1 min read · LW link

Smallpox vaccines are widely available, for now

David Hornbein · 13 Jan 2023 20:02 UTC
26 points
5 comments · 1 min read · LW link

How does GPT-3 spend its 175B parameters?

Robert_AIZI · 13 Jan 2023 19:21 UTC
41 points
14 comments · 6 min read · LW link
(aizi.substack.com)

[ASoT] Simulators show us behavioural properties by default

Jozdien · 13 Jan 2023 18:42 UTC
35 points
3 comments · 3 min read · LW link

Wheel of Consent Theory for Rationalists and Effective Altruists

adamwilder · 13 Jan 2023 17:59 UTC
1 point
0 comments · 2 min read · LW link

Money is a way of thanking strangers

DirectedEvolution · 13 Jan 2023 17:06 UTC
13 points
5 comments · 4 min read · LW link

Tracr: Compiled Transformers as a Laboratory for Interpretability | DeepMind

DragonGod · 13 Jan 2023 16:53 UTC
62 points
12 comments · 1 min read · LW link
(arxiv.org)

How we could stumble into AI catastrophe

HoldenKarnofsky · 13 Jan 2023 16:20 UTC
71 points
18 comments · 18 min read · LW link
(www.cold-takes.com)

Robustness & Evolution [MLAISU W02]

Esben Kran · 13 Jan 2023 15:47 UTC
10 points
0 comments · 3 min read · LW link
(newsletter.apartresearch.com)

On Cooking With Gas

Zvi · 13 Jan 2023 14:20 UTC
38 points
60 comments · 6 min read · LW link
(thezvi.wordpress.com)

Beware safety-washing

Lizka · 13 Jan 2023 13:59 UTC
43 points
2 comments · 4 min read · LW link

Some Arguments Against Strong Scaling

Joar Skalse · 13 Jan 2023 12:04 UTC
26 points
21 comments · 16 min read · LW link

[Question] Where do you find people who actually do things?

Ulisse Mini · 13 Jan 2023 6:57 UTC
7 points
12 comments · 1 min read · LW link

[Question] Could Simulating an AGI Taking Over the World Actually Lead to an LLM Taking Over the World?

simeon_c · 13 Jan 2023 6:33 UTC
15 points
1 comment · 1 min read · LW link

Burning Uptime: When your Sandbox of Empathy is Leaky and also an Hourglass

Cedar · 13 Jan 2023 5:18 UTC
12 points
2 comments · 3 min read · LW link

Disentangling Shard Theory into Atomic Claims

Leon Lang · 13 Jan 2023 4:23 UTC
86 points
6 comments · 18 min read · LW link

AGISF adaptation for in-person groups

13 Jan 2023 3:24 UTC
44 points
2 comments · 3 min read · LW link

Actions and Flows

Alok Singh · 13 Jan 2023 3:20 UTC
5 points
0 comments · 1 min read · LW link
(alok.github.io)

A Thorough Introduction to Abstraction

RohanS · 13 Jan 2023 0:30 UTC
9 points
1 comment · 18 min read · LW link

The AI Control Problem in a wider intellectual context

philosophybear · 13 Jan 2023 0:28 UTC
11 points
3 comments · 12 min read · LW link

The Alignment Problems

Martín Soto · 12 Jan 2023 22:29 UTC
20 points
0 comments · 4 min read · LW link

Proposal for Inducing Steganography in LMs

Logan Riggs · 12 Jan 2023 22:15 UTC
22 points
3 comments · 2 min read · LW link

Announcing the 2023 PIBBSS Summer Research Fellowship

12 Jan 2023 21:31 UTC
32 points
0 comments · 1 min read · LW link

Victoria Krakovna on AGI Ruin, The Sharp Left Turn and Paradigms of AI Alignment

Michaël Trazzi · 12 Jan 2023 17:09 UTC
40 points
3 comments · 4 min read · LW link
(www.theinsideview.ai)