9 Jan 2024 23:29 UTC

10 points

6 comments 25 min read LW link

On the Contrary, Steelmanning Is Normal; ITT-Passing Is Niche

Zack_M_Davis 9 Jan 2024 23:12 UTC

44 points

31 comments 4 min read LW link

[Question] What’s the protocol for if a novice has ML ideas that are unlikely to work, but might improve capabilities if they do work?

drocta 9 Jan 2024 22:51 UTC

6 points

2 comments 2 min read LW link

Goodbye, Shoggoth: The Stage, its Animatronics, & the Puppeteer – a New Metaphor

RogerDearnaley 9 Jan 2024 20:42 UTC

47 points

8 comments 36 min read LW link

Bent or Blunt Hoods?

jefftk 9 Jan 2024 20:10 UTC

23 points

0 comments 1 min read LW link

(www.jefftk.com)

2024 ACX Predictions: Blind/Buy/Sell/Hold

Zvi 9 Jan 2024 19:30 UTC

33 points

2 comments 31 min read LW link

(thezvi.wordpress.com)

Announcing the Double Crux Bot

sanyer, Sofia Vanhanen and sarah.bluhm

9 Jan 2024 18:54 UTC

52 points

8 comments 3 min read LW link

Does AI risk “other” the AIs?

Joe Carlsmith 9 Jan 2024 17:51 UTC

59 points

3 comments 8 min read LW link

AI demands unprecedented reliability

Jono 9 Jan 2024 16:30 UTC

22 points

5 comments 2 min read LW link

Uncertainty in all its flavours

Cleo Nardo 9 Jan 2024 16:21 UTC

27 points

6 comments 35 min read LW link

Compensating for Life Biases

Jonathan Moregård 9 Jan 2024 14:39 UTC

24 points

6 comments 3 min read LW link

(honestliving.substack.com)

Can Morality Be Quantified?

Julius 9 Jan 2024 6:35 UTC

3 points

0 comments 5 min read LW link

Learning Math in Time for Alignment

Nicholas / Heather Kross 9 Jan 2024 1:02 UTC

32 points

3 comments 3 min read LW link

Brief Thoughts on Justifications for Paternalism

Srdjan Miletic 9 Jan 2024 0:36 UTC

4 points

0 comments 4 min read LW link

(dissent.blog)

Hiring decisions are not suitable for prediction markets

SimonM 8 Jan 2024 21:11 UTC

12 points

6 comments 1 min read LW link

Better Anomia

jefftk 8 Jan 2024 18:40 UTC

8 points

0 comments 1 min read LW link

(www.jefftk.com)

A starter guide for evals

Marius Hobbhahn, Jérémy Scheurer, Mikita Balesni, rusheb and AlexMeinke

8 Jan 2024 18:24 UTC

50 points

2 comments 12 min read LW link

(www.apolloresearch.ai)

Is it justifiable for non-experts to have strong opinions about Gaza?

Yair Halberstadt and Adam Zerner

8 Jan 2024 17:31 UTC

23 points

12 comments 30 min read LW link

Project ideas: Backup plans & Cooperative AI

Lukas Finnveden 8 Jan 2024 17:19 UTC

18 points

0 comments 1 min read LW link

(lukasfinnveden.substack.com)

Hackathon and Staying Up-to-Date in AI

jacobhaimes 8 Jan 2024 17:10 UTC

11 points

0 comments 1 min read LW link

(into-ai-safety.github.io)

When “yang” goes wrong

Joe Carlsmith 8 Jan 2024 16:35 UTC

72 points

6 comments 13 min read LW link

Task vectors & analogy making in LLMs

Sergii 8 Jan 2024 15:17 UTC

9 points

1 comment 4 min read LW link

(grgv.xyz)

[Question] How to find translations of a book?

Viliam 8 Jan 2024 14:57 UTC

9 points

8 comments 1 min read LW link

[Question] Why aren’t Yudkowsky & Bostrom getting more attention now?

JoshuaFox 8 Jan 2024 14:42 UTC

14 points

8 comments 1 min read LW link

2023 Prediction Evaluations

Zvi 8 Jan 2024 14:40 UTC

47 points

0 comments 28 min read LW link

(thezvi.wordpress.com)

There is no sharp boundary between deontology and consequentialism

quetzal_rainbow 8 Jan 2024 11:01 UTC

8 points

2 comments 1 min read LW link

Reflections on my first year of AI safety research

Jay Bailey 8 Jan 2024 7:49 UTC

52 points

3 comments 1 min read LW link

Why There Is Hope For An Alignment Solution

Darklight 8 Jan 2024 6:58 UTC

10 points

0 comments 12 min read LW link

Sledding Among Hazards

jefftk 8 Jan 2024 3:30 UTC

19 points

5 comments 1 min read LW link

(www.jefftk.com)

Utility is relative

CrimsonChin 8 Jan 2024 2:31 UTC

2 points

4 comments 2 min read LW link

A model of research skill

L Rudolf L 8 Jan 2024 0:13 UTC

55 points

6 comments 12 min read LW link

(www.strataoftheworld.com)

We shouldn’t fear superintelligence because it already exists

Spencer Chubb 7 Jan 2024 17:59 UTC

−22 points

14 comments 1 min read LW link

(Partial) failure in replicating deceptive alignment experiment

claudia.biancotti 7 Jan 2024 17:56 UTC

1 point

0 comments 1 min read LW link

Project ideas: Sentience and rights of digital minds

Lukas Finnveden 7 Jan 2024 17:34 UTC

20 points

0 comments 1 min read LW link

(lukasfinnveden.substack.com)

Deceptive AI ≠ Deceptively-aligned AI

Steven Byrnes 7 Jan 2024 16:55 UTC

96 points

19 comments 6 min read LW link

Bayesians Commit the Gambler’s Fallacy

Kevin Dorst 7 Jan 2024 12:54 UTC

46 points

28 comments 8 min read LW link

(kevindorst.substack.com)

Towards AI Safety Infrastructure: Talk & Outline

Paul Bricman 7 Jan 2024 9:31 UTC

11 points

0 comments 2 min read LW link

(www.youtube.com)

Defending against hypothetical moon life during Apollo 11

eukaryote 7 Jan 2024 4:49 UTC

57 points

9 comments 32 min read LW link

(eukaryotewritesblog.com)

The Sequences on YouTube

Neil 7 Jan 2024 1:44 UTC

26 points

9 comments 2 min read LW link

AI Risk and the US Presidential Candidates

Zane 6 Jan 2024 20:18 UTC

41 points

22 comments 6 min read LW link

A Challenge to Effective Altruism’s Premises

False Name 6 Jan 2024 18:46 UTC

−26 points

3 comments 3 min read LW link

Lack of Spider-Man is evidence against the simulation hypothesis

RamblinDash 6 Jan 2024 18:17 UTC

7 points

22 comments 1 min read LW link

A Land Tax For Britain

A.H. 6 Jan 2024 15:52 UTC

6 points

9 comments 4 min read LW link

Book review: Trick or treatment (2008)

Fleece Minutia 6 Jan 2024 15:40 UTC

1 point

0 comments 2 min read LW link

Are we inside a black hole?

Jay 6 Jan 2024 13:30 UTC

2 points

5 comments 1 min read LW link

Survey of 2,778 AI authors: six parts in pictures

KatjaGrace 6 Jan 2024 4:43 UTC

80 points

1 comment 2 min read LW link

Project ideas: Epistemics

Lukas Finnveden 5 Jan 2024 23:41 UTC

43 points

4 comments 1 min read LW link

(lukasfinnveden.substack.com)

Almost everyone I’ve met would be well-served thinking more about what to focus on

Henrik Karlsson 5 Jan 2024 21:01 UTC

95 points

8 comments 11 min read LW link

(www.henrikkarlsson.xyz)

The Next ChatGPT Moment: AI Avatars

kolmplex and southpaw

5 Jan 2024 20:14 UTC

43 points

10 comments 1 min read LW link

AI Impacts 2023 Expert Survey on Progress in AI

habryka 5 Jan 2024 19:42 UTC

28 points

1 comment 7 min read LW link

(wiki.aiimpacts.org)