hy­dro­gen tube transport

bhauth 18 Apr 2024 22:47 UTC
34 points
12 comments · 5 min read · LW link
(www.bhauth.com)

LessOn­line Fes­ti­val Up­dates Thread

Ben Pace 18 Apr 2024 21:55 UTC
59 points
26 comments · 1 min read · LW link

A Re­view of In-Con­text Learn­ing Hy­pothe­ses for Au­to­mated AI Align­ment Research

alamerton 18 Apr 2024 18:29 UTC
25 points
4 comments · 16 min read · LW link

I’m open for pro­jects (sort of)

cousin_it 18 Apr 2024 18:05 UTC
46 points
13 comments · 1 min read · LW link

Blessed in­for­ma­tion, garbage in­for­ma­tion, cursed information

tailcalled 18 Apr 2024 16:56 UTC
23 points
8 comments · 3 min read · LW link

[Fic­tion] A Confession

Arjun Panickssery 18 Apr 2024 16:28 UTC
37 points
2 comments · 5 min read · LW link
(arjunpanickssery.substack.com)

Discrim­i­nat­ing Be­hav­iorally Iden­ti­cal Clas­sifiers: a model prob­lem for ap­ply­ing in­ter­pretabil­ity to scal­able oversight

Sam Marks 18 Apr 2024 16:17 UTC
107 points
10 comments · 12 min read · LW link

Co­op­er­a­tion is op­ti­mal, with weaker agents too  -  tldr

Ryo 18 Apr 2024 15:03 UTC
12 points
22 comments · 4 min read · LW link
(medium.com)

How to co­or­di­nate de­spite our bi­ases? - tldr

Ryo 18 Apr 2024 15:03 UTC
3 points
2 comments · 3 min read · LW link
(medium.com)

Knowl­edge Base 7: Long-tail knowl­edge and col­lec­tive intelligence

iwis 18 Apr 2024 14:21 UTC
−6 points
0 comments · 1 min read · LW link

AI #60: Oh the Humanity

Zvi 18 Apr 2024 14:10 UTC
44 points
7 comments · 62 min read · LW link
(thezvi.wordpress.com)

UDT1.01: Logical Inductors and Implicit Beliefs (5/10)

Diffractor 18 Apr 2024 8:39 UTC
33 points
2 comments · 19 min read · LW link

An examination of GPT-2's boring yet effective glitch

MiguelDev 18 Apr 2024 5:26 UTC
5 points
3 comments · 3 min read · LW link

[Question] What if Ethics is Prov­ably Self-Con­tra­dic­tory?

Yitz 18 Apr 2024 5:12 UTC
3 points
7 comments · 2 min read · LW link

The Mom Test: Sum­mary and Thoughts

Adam Zerner 18 Apr 2024 3:34 UTC
48 points
3 comments · 10 min read · LW link

Ex­press in­ter­est in an “FHI of the West”

habryka 18 Apr 2024 3:32 UTC
268 points
41 comments · 3 min read · LW link

Why Would Belief-States Have A Frac­tal Struc­ture, And Why Would That Mat­ter For In­ter­pretabil­ity? An Explainer

18 Apr 2024 0:27 UTC
184 points
21 comments · 7 min read · LW link

AXRP Epi­sode 28 - Su­ing Labs for AI Risk with Gabriel Weil

DanielFilan 17 Apr 2024 21:42 UTC
12 points
0 comments · 65 min read · LW link

LLM Eval­u­a­tors Rec­og­nize and Fa­vor Their Own Generations

17 Apr 2024 21:09 UTC
44 points
1 comment · 3 min read · LW link
(tiny.cc)

SFS: Foun­da­tions of Forecasting

MAD2 17 Apr 2024 17:46 UTC
3 points
0 comments · 1 min read · LW link

An eth­i­cal frame­work to su­per­sede Utilitarianism

metalcrow 17 Apr 2024 17:18 UTC
1 point
4 comments · 4 min read · LW link

Mov­ing on from com­mu­nity living

Vika 17 Apr 2024 17:02 UTC
63 points
7 comments · 3 min read · LW link
(vkrakovna.wordpress.com)

Staged release

Zach Stein-Perlman 17 Apr 2024 16:00 UTC
9 points
4 comments · 2 min read · LW link

[Question] Dis­com­fort Stacking

Lewis O’Brien 17 Apr 2024 14:49 UTC
5 points
12 comments · 1 min read · LW link

FHI (Fu­ture of Hu­man­ity In­sti­tute) has shut down (2005–2024)

gwern 17 Apr 2024 13:54 UTC
176 points
22 comments · 1 min read · LW link
(www.futureofhumanityinstitute.org)

Child­hood and Ed­u­ca­tion Roundup #5

Zvi 17 Apr 2024 13:00 UTC
36 points
4 comments · 25 min read · LW link
(thezvi.wordpress.com)

Should we max­i­mize the Geo­met­ric Ex­pec­ta­tion of Utility?

A.H. 17 Apr 2024 10:37 UTC
5 points
17 comments · 9 min read · LW link

Claude 3 Opus can op­er­ate as a Tur­ing machine

Gunnar_Zarncke 17 Apr 2024 8:41 UTC
36 points
2 comments · 1 min read · LW link
(twitter.com)

When is a mind me?

Rob Bensinger 17 Apr 2024 5:56 UTC
135 points
125 comments · 15 min read · LW link

Mid-con­di­tional love

KatjaGrace 17 Apr 2024 4:00 UTC
76 points
21 comments · 2 min read · LW link
(worldspiritsockpuppet.com)

Spend­ing Up­date 2024

jefftk 17 Apr 2024 2:30 UTC
20 points
2 comments · 3 min read · LW link
(www.jefftk.com)

Anti MMAcevedo Protocol

Logan Zoellner 16 Apr 2024 22:32 UTC
1 point
1 comment · 8 min read · LW link

Trans­form­ers Rep­re­sent Belief State Geom­e­try in their Resi­d­ual Stream

Adam Shai 16 Apr 2024 21:16 UTC
411 points
100 comments · 12 min read · LW link

Tinker

Richard_Ngo 16 Apr 2024 18:26 UTC
38 points
0 comments · 1 min read · LW link
(press.asimov.com)

Paul Chris­ti­ano named as US AI Safety In­sti­tute Head of AI Safety

Joel Burget 16 Apr 2024 16:22 UTC
256 points
58 comments · 1 min read · LW link
(www.commerce.gov)

Creat­ing un­re­stricted AI Agents with Com­mand R+

Simon Lermen 16 Apr 2024 14:52 UTC
77 points
13 comments · 5 min read · LW link

What should the EA community learn from the FTX / SBF disaster? An in-depth discussion with Will MacAskill on the Clearer Thinking podcast

spencerg 16 Apr 2024 13:11 UTC
20 points
0 comments · 1 min read · LW link
(podcast.clearerthinking.org)

{Book Sum­mary} The Art of Gathering

Tristan Williams 16 Apr 2024 10:48 UTC
28 points
0 comments · 1 min read · LW link

Es­say com­pe­ti­tion on the Au­toma­tion of Wis­dom and Philos­o­phy — $25k in prizes

16 Apr 2024 10:10 UTC
82 points
12 comments · 8 min read · LW link
(blog.aiimpacts.org)

An­nounc­ing SPAR Sum­mer 2024!

laurenmarie12 16 Apr 2024 8:30 UTC
30 points
2 comments · 1 min read · LW link

The ar­gu­ment for near-term hu­man dis­em­pow­er­ment through AI

Chris_Leong 16 Apr 2024 4:50 UTC
21 points
2 comments · 1 min read · LW link
(link.springer.com)

My ex­pe­rience us­ing fi­nan­cial com­mit­ments to over­come akrasia

William Howard 15 Apr 2024 22:57 UTC
137 points
31 comments · 18 min read · LW link

An eval­u­a­tion of cir­cuit eval­u­a­tion metrics

15 Apr 2024 19:38 UTC
18 points
0 comments · 4 min read · LW link

Ex­per­i­ments with an al­ter­na­tive method to pro­mote spar­sity in sparse autoencoders

Eoin Farrell 15 Apr 2024 18:21 UTC
29 points
7 comments · 12 min read · LW link

Effec­tively Han­dling Disagree­ments—In­tro­duc­ing a New Workshop

Camille Berger 15 Apr 2024 16:33 UTC
37 points
2 comments · 7 min read · LW link

Four Lo­cal Gigs

jefftk 15 Apr 2024 16:00 UTC
8 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Tak­ing into ac­count prefer­ences of past selves

Jacob G-W 15 Apr 2024 13:15 UTC
14 points
9 comments · 7 min read · LW link

Monthly Roundup #17: April 2024

Zvi 15 Apr 2024 12:10 UTC
54 points
4 comments · 76 min read · LW link
(thezvi.wordpress.com)

Re­con­sider the anti-cav­ity bac­te­ria if you are Asian

Lao Mein 15 Apr 2024 7:02 UTC
168 points
43 comments · 4 min read · LW link

An­thropic AI made the right call

bhauth 15 Apr 2024 0:39 UTC
22 points
20 comments · 1 min read · LW link