On Privilege

shminux18 May 2024 22:36 UTC
10 points
1 comment · 2 min read · LW link

Fund me please—I Work so Hard that my Feet start Bleeding and I Need to Infiltrate University

Johannes C. Mayer18 May 2024 19:53 UTC
35 points
6 comments · 6 min read · LW link

[Question] What’s the risk that AI tortures us all?

Justus18 May 2024 19:36 UTC
0 points
0 comments · 1 min read · LW link

To Limit Impact, Limit KL-Divergence

J Bostock18 May 2024 18:52 UTC
3 points
1 comment · 5 min read · LW link

[Crosspost] Introducing the Save State Paradox

Suzie. EXE18 May 2024 17:00 UTC
1 point
0 comments · 7 min read · LW link

[Question] Are There Other Ideas as Generally Applicable as Natural Selection

Amin Sennour18 May 2024 16:37 UTC
1 point
0 comments · 1 min read · LW link

Scientific Notation Options

jefftk18 May 2024 15:10 UTC
18 points
5 comments · 1 min read · LW link
(www.jefftk.com)

“If we go extinct due to misaligned AI, at least nature will continue, right? … right?”

plex18 May 2024 14:09 UTC
39 points
7 comments · 2 min read · LW link
(aisafety.info)

What Are Non-Zero-Sum Games?—A Primer

James Stephen Brown18 May 2024 9:19 UTC
4 points
1 comment · 3 min read · LW link

DeepMind’s “Frontier Safety Framework” is weak and unambitious

Zach Stein-Perlman18 May 2024 3:00 UTC
114 points
9 comments · 4 min read · LW link

International Scientific Report on the Safety of Advanced AI: Key Information

Aryeh Englander18 May 2024 1:45 UTC
23 points
0 comments · 13 min read · LW link

Goodhart in RL with KL: Appendix

Thomas Kwa18 May 2024 0:40 UTC
11 points
0 comments · 6 min read · LW link

AI 2030 – AI Policy Roadmap

LTM17 May 2024 23:29 UTC
8 points
0 comments · 1 min read · LW link

MIT FutureTech are hiring for an Operations and Project Management role.

peterslattery17 May 2024 23:21 UTC
2 points
0 comments · 3 min read · LW link

Language Models Model Us

eggsyntax17 May 2024 21:00 UTC
79 points
18 comments · 7 min read · LW link

Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

Joar Skalse17 May 2024 19:13 UTC
45 points
1 comment · 2 min read · LW link

DeepMind: Frontier Safety Framework

Zach Stein-Perlman17 May 2024 17:30 UTC
60 points
0 comments · 3 min read · LW link
(deepmind.google)

Identifying Functionally Important Features with End-to-End Sparse Dictionary Learning

17 May 2024 16:25 UTC
49 points
2 comments · 4 min read · LW link
(publications.apolloresearch.ai)

AISafety.com – Resources for AI Safety

17 May 2024 15:57 UTC
65 points
2 comments · 1 min read · LW link

Berlin AI Alignment Open Meetup May 2024

GuyP17 May 2024 13:27 UTC
2 points
0 comments · 1 min read · LW link