rough draft on what happens in the brain when you have an insight

Emrik

21 May 2024 18:02 UTC

9 points

2 comments

1 min read

LW link

On Dwarkesh’s Podcast with OpenAI’s John Schulman

Zvi

21 May 2024 17:30 UTC

55 points

1 comment

20 min read

LW link

(thezvi.wordpress.com)

[Question] Is deleting capabilities still a relevant research question?

tailcalled

21 May 2024 13:24 UTC

8 points

1 comment

1 min read

LW link

My Dating Heuristic

Declan Molony

21 May 2024 5:28 UTC

13 points

4 comments

2 min read

LW link

Scorable Functions: A Format for Algorithmic Forecasting

ozziegooen

21 May 2024 4:14 UTC

19 points

0 comments

1 min read

LW link

The Problem With the Word ‘Alignment’

peligrietzer and particlemania

21 May 2024 3:48 UTC

53 points

4 comments

6 min read

LW link

Some perspectives on the discipline of Physics

Tahp

20 May 2024 18:19 UTC

12 points

3 comments

13 min read

LW link

(quark.rodeo)

Interpretability: Integrated Gradients is a decent attribution method

Lucius Bushnaq, jake_mendel, StefanHex and Kaarel

20 May 2024 17:55 UTC

12 points

7 comments

6 min read

LW link

The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks

Lucius Bushnaq, jake_mendel, Dan Braun, StefanHex, Nicholas Goldowsky-Dill, Kaarel, Avery, Joern Stoehler, debrevitatevitae, Magdalena Wache and Marius Hobbhahn

20 May 2024 17:53 UTC

77 points

2 comments

3 min read

LW link

Infra-Bayesian haggling

hannagabor

20 May 2024 12:23 UTC

17 points

0 comments

20 min read

LW link

Jaan Tallinn’s 2023 Philanthropy Overview

jaan

20 May 2024 12:11 UTC

138 points

3 comments

1 min read

LW link

(jaan.info)

D&D.Sci (Easy Mode): On The Construction Of Impossible Structures [Evaluation and Ruleset]

abstractapplic

20 May 2024 9:38 UTC

24 points

1 comment

1 min read

LW link

Why I find Davidad’s plan interesting

Paul W

20 May 2024 8:13 UTC

18 points

0 comments

6 min read

LW link

Anthropic: Reflections on our Responsible Scaling Policy

Zac Hatfield-Dodds

20 May 2024 4:14 UTC

40 points

21 comments

10 min read

LW link

(www.anthropic.com)

The consistent guessing problem is easier than the halting problem

jessicata

20 May 2024 4:02 UTC

28 points

5 comments

4 min read

LW link

(unstableontology.com)

Against Computers (infinite play)

rogersbacon

20 May 2024 0:43 UTC

−11 points

1 comment

14 min read

LW link

(www.secretorum.life)

[Question] Can environmental laws/NEPA be used for decelism?

Alex K. Chen (parrot)

19 May 2024 18:43 UTC

−4 points

0 comments

1 min read

LW link

Testing for parallel reasoning in LLMs

meemi and Olli Järviniemi

19 May 2024 15:28 UTC

2 points

7 comments

9 min read

LW link

Some “meta-cruxes” for AI x-risk debates

Aryeh Englander

19 May 2024 0:21 UTC

20 points

2 comments

3 min read

LW link

On Privilege

shminux

18 May 2024 22:36 UTC

15 points

10 comments

2 min read

LW link