E.T. Jaynes Probability Theory: The Logic of Science I

27 Dec 2023 23:47 UTC
62 points
20 comments · 21 min read · LW link

Free agents

Michele Campolo · 27 Dec 2023 20:20 UTC
6 points
19 comments · 13 min read · LW link

Merry Christmas Everyone!

johnlawrenceaspden · 27 Dec 2023 19:49 UTC
14 points
1 comment · 1 min read · LW link

Natural Latents: The Math

27 Dec 2023 19:03 UTC
120 points
37 comments · 12 min read · LW link

NYT is suing OpenAI & Microsoft for alleged copyright infringement; some quick thoughts

Mikhail Samin · 27 Dec 2023 18:44 UTC
42 points
17 comments · 1 min read · LW link

Extropy magazine review

Peter lawless · 27 Dec 2023 18:37 UTC
1 point
0 comments · 1 min read · LW link

The Progress Paradox

Ben Turtel · 27 Dec 2023 18:26 UTC
3 points
3 comments · 4 min read · LW link
(bturtel.substack.com)

The virtuous circle: twelve conjectures about female reproductive agency and cultural self-determination

Miles Saltiel · 27 Dec 2023 18:25 UTC
0 points
2 comments · 14 min read · LW link

MSP Article Discussion Meetup: The EMH, Long-Term Investing, and Leveraged ETFs

25Hour · 27 Dec 2023 16:50 UTC
3 points
1 comment · 1 min read · LW link

In Defense of Epistemic Empathy

Kevin Dorst · 27 Dec 2023 16:27 UTC
55 points
19 comments · 6 min read · LW link
(kevindorst.substack.com)

Critical review of Christiano’s disagreements with Yudkowsky

Vanessa Kosoy · 27 Dec 2023 16:02 UTC
172 points
40 comments · 15 min read · LW link

AGI will be made of heterogeneous components, Transformer and Selective SSM blocks will be among them

Roman Leventov · 27 Dec 2023 14:51 UTC
33 points
9 comments · 4 min read · LW link

5. Moral Value for Sentient Animals? Alas, Not Yet

RogerDearnaley · 27 Dec 2023 6:42 UTC
33 points
41 comments · 23 min read · LW link

Differential Optimization Reframes and Generalizes Utility-Maximization

J Bostock · 27 Dec 2023 1:54 UTC
30 points
2 comments · 3 min read · LW link

More Thoughts on the Human-AGI War

Seth Ahrenbach · 27 Dec 2023 1:03 UTC
−3 points
4 comments · 7 min read · LW link

METR is hiring!

Beth Barnes · 26 Dec 2023 21:00 UTC
65 points
1 comment · 1 min read · LW link

Environmental allergies are curable? (Sublingual immunotherapy)

Chipmonk · 26 Dec 2023 19:05 UTC
47 points
10 comments · 1 min read · LW link

Picasso in the Gallery of Babel

samhealy · 26 Dec 2023 16:25 UTC
12 points
12 comments · 4 min read · LW link

Flagging Potentially Unfair Parenting

jefftk · 26 Dec 2023 12:40 UTC
69 points
1 comment · 1 min read · LW link
(www.jefftk.com)

Link Collection: Impact Markets

Saul Munn · 26 Dec 2023 9:01 UTC
27 points
0 comments · 2 min read · LW link
(www.brasstacks.blog)

How Emergency Medicine Solves the Alignment Problem

StrivingForLegibility · 26 Dec 2023 5:24 UTC
41 points
4 comments · 6 min read · LW link

Rationality outreach vs. rationality teaching

Lenmar · 26 Dec 2023 0:37 UTC
7 points
2 comments · 1 min read · LW link

Exploring the Residual Stream of Transformers for Mechanistic Interpretability — Explained

Zeping Yu · 26 Dec 2023 0:36 UTC
7 points
1 comment · 11 min read · LW link

[Question] Anki setup best practices?

Sinclair Chen · 25 Dec 2023 22:34 UTC
11 points
4 comments · 1 min read · LW link

[Question] Why does expected utility matter?

Marco Discendenti · 25 Dec 2023 14:47 UTC
18 points
21 comments · 4 min read · LW link

Freeze Dried Raspberry Truffles

jefftk · 25 Dec 2023 14:10 UTC
14 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Pornographic and semi-pornographic ads on mainstream websites as an instance of the AI alignment problem?

greenrd · 25 Dec 2023 13:19 UTC
−1 points
5 comments · 12 min read · LW link

Defense Against The Dark Arts: An Introduction

Lyrongolem · 25 Dec 2023 6:36 UTC
24 points
36 comments · 20 min read · LW link

Occlusions of Moral Knowledge

herschel · 25 Dec 2023 5:55 UTC
−1 points
0 comments · 2 min read · LW link
(brothernin.substack.com)

[Question] Would you have a baby in 2024?

martinkunev · 25 Dec 2023 1:52 UTC
24 points
76 comments · 1 min read · LW link

align your latent spaces

bhauth · 24 Dec 2023 16:30 UTC
27 points
8 comments · 2 min read · LW link
(www.bhauth.com)

Viral Guessing Game

jefftk · 24 Dec 2023 13:10 UTC
19 points
0 comments · 1 min read · LW link
(www.jefftk.com)

The Sugar Alignment Problem

Adam Zerner · 24 Dec 2023 1:35 UTC
5 points
3 comments · 7 min read · LW link

A Crisper Explanation of Simulacrum Levels

Thane Ruthenis · 23 Dec 2023 22:13 UTC
89 points
13 comments · 13 min read · LW link

Hyperbolic Discounting and Pascal’s Mugging

Andrew Keenan Richardson · 23 Dec 2023 21:55 UTC
9 points
0 comments · 7 min read · LW link

AISN #28: Center for AI Safety 2023 Year in Review

23 Dec 2023 21:31 UTC
30 points
1 comment · 5 min read · LW link
(newsletter.safe.ai)

“Inftoxicity” and other new words to describe malicious information and communication thereof

Jáchym Fibír · 23 Dec 2023 18:15 UTC
−1 points
6 comments · 3 min read · LW link

AI’s impact on biology research: Part I, today

octopocta · 23 Dec 2023 16:29 UTC
31 points
6 comments · 2 min read · LW link

AI Girlfriends Won’t Matter Much

Maxwell Tabarrok · 23 Dec 2023 15:58 UTC
42 points
22 comments · 2 min read · LW link
(maximumprogress.substack.com)

The Next Right Token

jefftk · 23 Dec 2023 3:20 UTC
14 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Fact Finding: Do Early Layers Specialise in Local Processing? (Post 5)

23 Dec 2023 2:46 UTC
18 points
0 comments · 4 min read · LW link

Fact Finding: How to Think About Interpreting Memorisation (Post 4)

23 Dec 2023 2:46 UTC
22 points
0 comments · 9 min read · LW link

Fact Finding: Trying to Mechanistically Understand Early MLPs (Post 3)

23 Dec 2023 2:46 UTC
10 points
0 comments · 16 min read · LW link

Fact Finding: Simplifying the Circuit (Post 2)

23 Dec 2023 2:45 UTC
25 points
3 comments · 14 min read · LW link

Fact Finding: Attempting to Reverse-Engineer Factual Recall on the Neuron Level (Post 1)

23 Dec 2023 2:44 UTC
108 points
9 comments · 22 min read · LW link · 1 review

Measurement tampering detection as a special case of weak-to-strong generalization

23 Dec 2023 0:05 UTC
57 points
10 comments · 4 min read · LW link

How does a toy 2 digit subtraction transformer predict the difference?

Evan Anders · 22 Dec 2023 21:17 UTC
12 points
0 comments · 10 min read · LW link
(evanhanders.blog)

Thoughts on Max Tegmark’s AI verification

Johannes C. Mayer · 22 Dec 2023 20:38 UTC
10 points
0 comments · 3 min read · LW link

Idealized Agents Are Approximate Causal Mirrors (+ Radical Optimism on Agent Foundations)

Thane Ruthenis · 22 Dec 2023 20:19 UTC
74 points
14 comments · 6 min read · LW link

AI safety advocates should consider providing gentle pushback following the events at OpenAI

civilsociety · 22 Dec 2023 18:55 UTC
16 points
5 comments · 3 min read · LW link