LLMs seem (relatively) safe

JustisMills · 25 Apr 2024 22:13 UTC
53 points
24 comments · 7 min read · LW link
(justismills.substack.com)

Losing Faith In Contrarianism

omnizoid · 25 Apr 2024 20:53 UTC
38 points
44 comments · 5 min read · LW link

Why I stopped being into basin broadness

tailcalled · 25 Apr 2024 20:47 UTC
16 points
3 comments · 2 min read · LW link

AXRP Episode 29 - Science of Deep Learning with Vikrant Varma

DanielFilan · 25 Apr 2024 19:10 UTC
20 points
1 comment · 63 min read · LW link

Improving Dictionary Learning with Gated Sparse Autoencoders

25 Apr 2024 18:43 UTC
63 points
38 comments · 1 min read · LW link
(arxiv.org)

“Why I Write” by George Orwell (1946)

Arjun Panickssery · 25 Apr 2024 16:02 UTC
58 points
2 comments · 9 min read · LW link
(www.orwellfoundation.com)

Knowledge Base 8: The truth as an attractor in the information space

iwis · 25 Apr 2024 15:28 UTC
−8 points
0 comments · 2 min read · LW link

Cybersecurity of Frontier AI Models: A Regulatory Review

25 Apr 2024 14:51 UTC
8 points
0 comments · 8 min read · LW link

The first future and the best future

KatjaGrace · 25 Apr 2024 6:40 UTC
106 points
12 comments · 1 min read · LW link
(worldspiritsockpuppet.com)

NIH Cancer Myths Myths

25 Apr 2024 5:43 UTC
15 points
1 comment · 2 min read · LW link

social lemon markets

bhauth · 25 Apr 2024 2:18 UTC
22 points
6 comments · 3 min read · LW link
(www.bhauth.com)

Bayesian inference without priors

DanielFilan · 24 Apr 2024 23:50 UTC
26 points
8 comments · 8 min read · LW link
(danielfilan.com)

The Inner Ring by C. S. Lewis

Saul Munn · 24 Apr 2024 22:48 UTC
69 points
6 comments · 13 min read · LW link
(www.lewissociety.org)

This is Water by David Foster Wallace

Nathan Young · 24 Apr 2024 21:21 UTC
58 points
16 comments · 13 min read · LW link
(fs.blog)

Is being a trans woman (or just low-T) +20 IQ?

lemonhope · 24 Apr 2024 20:04 UTC
6 points
29 comments · 1 min read · LW link

Betadine oral rinses for covid and other viral infections

Elizabeth · 24 Apr 2024 17:50 UTC
22 points
3 comments · 5 min read · LW link
(acesounderglass.com)

At last! ChatGPT does, shall we say, interesting imitations of “Kubla Khan”

Bill Benzon · 24 Apr 2024 14:56 UTC
−3 points
0 comments · 4 min read · LW link

Magic by forgetting

avturchin · 24 Apr 2024 14:32 UTC
18 points
39 comments · 4 min read · LW link

Changes in College Admissions

Zvi · 24 Apr 2024 13:50 UTC
50 points
11 comments · 39 min read · LW link
(thezvi.wordpress.com)

1-page outline of Carlsmith’s otherness and control series

Nathan Young · 24 Apr 2024 11:25 UTC
22 points
3 comments · 3 min read · LW link

How to use and interpret activation patching

24 Apr 2024 8:35 UTC
12 points
0 comments · 18 min read · LW link

AI Generated Music as a Method of Installing Essential Rationalist Skills

keltan · 24 Apr 2024 7:48 UTC
13 points
3 comments · 1 min read · LW link

Electronic Harp Mandolin Prototype

jefftk · 24 Apr 2024 2:20 UTC
9 points
0 comments · 1 min read · LW link
(www.jefftk.com)

[Question] Examples of Highly Counterfactual Discoveries?

johnswentworth · 23 Apr 2024 22:19 UTC
194 points
101 comments · 1 min read · LW link

[Question] Is there software to practice reading expressions?

lsusr · 23 Apr 2024 21:53 UTC
37 points
10 comments · 1 min read · LW link

Let’s Design A School, Part 1

Sable · 23 Apr 2024 21:50 UTC
55 points
5 comments · 11 min read · LW link
(affablyevil.substack.com)

WSJ: Inside Amazon’s Secret Operation to Gather Intel on Rivals

trevor · 23 Apr 2024 21:33 UTC
37 points
5 comments · 5 min read · LW link
(www.wsj.com)

On Minicircle

Metacelsus · 23 Apr 2024 21:28 UTC
10 points
0 comments · 1 min read · LW link
(docs.google.com)

Simple probes can catch sleeper agents

23 Apr 2024 21:10 UTC
133 points
21 comments · 1 min read · LW link
(www.anthropic.com)

Manifold “exploring real cash prizes”

Rana Dexsin · 23 Apr 2024 21:07 UTC
7 points
0 comments · 1 min read · LW link
(manifoldmarkets.notion.site)

[Question] (When) Should you work through the night when inspiration strikes you?

Chi Nguyen · 23 Apr 2024 21:07 UTC
21 points
4 comments · 1 min read · LW link

Book review: Deep Utopia

PeterMcCluskey · 23 Apr 2024 19:55 UTC
45 points
14 comments · 4 min read · LW link
(bayesianinvestor.com)

On what research policymakers actually need

MondSemmel · 23 Apr 2024 19:50 UTC
38 points
0 comments · 3 min read · LW link
(www.slowboring.com)

Dequantifying first-order theories

jessicata · 23 Apr 2024 19:04 UTC
40 points
9 comments · 8 min read · LW link
(unstableontology.com)

Vector Planning in a Lattice Graph

23 Apr 2024 16:58 UTC
20 points
7 comments · 2 min read · LW link

ProLU: A Nonlinearity for Sparse Autoencoders

Glen Taggart · 23 Apr 2024 14:09 UTC
44 points
4 comments · 9 min read · LW link

Subjective Questions Require Subjective information

Ben · 23 Apr 2024 13:16 UTC
7 points
4 comments · 4 min read · LW link

Rejecting Television

Declan Molony · 23 Apr 2024 4:59 UTC
85 points
10 comments · 6 min read · LW link

LW Frontpage Experiments! (aka “Take the wheel, Shoggoth!”)

23 Apr 2024 3:58 UTC
71 points
27 comments · 5 min read · LW link

Thoughts on Zero Points

depressurize · 23 Apr 2024 2:22 UTC
31 points
1 comment · 4 min read · LW link
(sexandchicago.substack.com)

Funny Anecdote of Eliezer From His Sister

Noah Birnbaum · 22 Apr 2024 22:05 UTC
202 points
6 comments · 2 min read · LW link

How LLMs Work, in the Style of The Economist

utilistrutil · 22 Apr 2024 19:06 UTC
0 points
0 comments · 2 min read · LW link

Measuring Coherence and Goal-Directedness in RL Policies

dx26 · 22 Apr 2024 18:26 UTC
10 points
0 comments · 7 min read · LW link

AI Regulation is Unsafe

Maxwell Tabarrok · 22 Apr 2024 16:37 UTC
40 points
41 comments · 4 min read · LW link
(www.maximum-progress.com)

Priors and Prejudice

MathiasKB · 22 Apr 2024 15:00 UTC
150 points
31 comments · 7 min read · LW link

Forget Everything (Statistical Mechanics Part 1)

J Bostock · 22 Apr 2024 13:33 UTC
39 points
6 comments · 3 min read · LW link

On Llama-3 and Dwarkesh Patel’s Podcast with Zuckerberg

Zvi · 22 Apr 2024 13:10 UTC
63 points
4 comments · 47 min read · LW link
(thezvi.wordpress.com)

Motivation gaps: Why so much EA criticism is hostile and lazy

titotal · 22 Apr 2024 11:49 UTC
69 points
5 comments · 1 min read · LW link
(titotal.substack.com)

Should we break up Google DeepMind?

Hauke Hillebrandt · 22 Apr 2024 9:16 UTC
3 points
0 comments · 1 min read · LW link

What should our containers do?

Richard Henage · 22 Apr 2024 6:17 UTC
1 point
1 comment · 2 min read · LW link