MIT FutureTech are hiring for a Technical Associate role

peterslattery 9 Sep 2024 20:16 UTC
3 points
0 comments · 3 min read · LW link

AI forecasting bots incoming

9 Sep 2024 19:14 UTC
29 points
44 comments · 4 min read · LW link
(www.safe.ai)

My takes on SB-1047

leogao 9 Sep 2024 18:38 UTC
151 points
8 comments · 4 min read · LW link

[Question] Building an Inexpensive, Aesthetic, Private Forum

Aaron Graifman 9 Sep 2024 17:10 UTC
13 points
15 comments · 1 min read · LW link

[Linkpost] Interpretable Analysis of Features Found in Open-source Sparse Autoencoder (partial replication)

Fernando Avalos 9 Sep 2024 3:33 UTC
6 points
1 comment · 1 min read · LW link
(forum.effectivealtruism.org)

[Question] Has Anyone Here Consciously Changed Their Passions?

Spade 9 Sep 2024 1:36 UTC
11 points
12 comments · 1 min read · LW link

Pollsters Should Publish Question Translations

jefftk 8 Sep 2024 22:10 UTC
60 points
3 comments · 2 min read · LW link
(www.jefftk.com)

On Fables and Nuanced Charts

Niko_McCarty 8 Sep 2024 17:09 UTC
35 points
2 comments · 8 min read · LW link
(www.asimov.press)

Contra Yudkowsky on 2-4-6 Game Difficulty Explanations

Josh Hickman 8 Sep 2024 16:13 UTC
6 points
1 comment · 2 min read · LW link
(xn--2r8hmb.ws)

Attachment Theory and the Effects of Secure Attachment on Child Development

Mihriban Temel 8 Sep 2024 16:09 UTC
−8 points
0 comments · 9 min read · LW link

Fictional parasites very different from our own

Abhishaike Mahajan 8 Sep 2024 14:59 UTC
25 points
0 comments · 4 min read · LW link
(www.owlposting.com)

My Number 1 Epistemology Book Recommendation: Inventing Temperature

adamShimi 8 Sep 2024 14:30 UTC
116 points
18 comments · 3 min read · LW link
(epistemologicalfascinations.substack.com)

[Question] I want a good multi-LLM API-powered chatbot

rotatingpaguro 8 Sep 2024 9:40 UTC
10 points
3 comments · 1 min read · LW link

That Alien Message - The Animation

Writer 7 Sep 2024 14:53 UTC
144 points
9 comments · 8 min read · LW link
(youtu.be)

Jonathan Gorard: The territory is isomorphic to an equivalence class of its maps

Daniel C 7 Sep 2024 10:04 UTC
17 points
18 comments · 2 min read · LW link
(x.com)

Pay Risk Evaluators in Cash, Not Equity

Adam Scholl 7 Sep 2024 2:37 UTC
202 points
19 comments · 1 min read · LW link

Excerpts from “A Reader’s Manifesto”

Arjun Panickssery 6 Sep 2024 22:37 UTC
72 points
1 comment · 13 min read · LW link
(arjunpanickssery.substack.com)

Fun With CellxGene

sarahconstantin 6 Sep 2024 22:00 UTC
30 points
2 comments · 7 min read · LW link
(sarahconstantin.substack.com)

[Question] Is this voting system strategy proof?

Donald Hobson 6 Sep 2024 20:44 UTC
17 points
9 comments · 1 min read · LW link

Adam Optimizer Causes Privileged Basis in Transformer LM Residual Stream

6 Sep 2024 17:55 UTC
70 points
7 comments · 4 min read · LW link

Backdoors as an analogy for deceptive alignment

6 Sep 2024 15:30 UTC
104 points
2 comments · 8 min read · LW link
(www.alignment.org)

A Cable Holder for 2 Cent

Johannes C. Mayer 6 Sep 2024 11:01 UTC
1 point
1 comment · 1 min read · LW link

Perhaps Try a Little Therapy, As a Treat?

segfault 6 Sep 2024 8:51 UTC
−178 points
61 comments · 16 min read · LW link

Investigating Sensitive Directions in GPT-2: An Improved Baseline and Comparative Analysis of SAEs

6 Sep 2024 2:28 UTC
28 points
0 comments · 12 min read · LW link

Distinguish worst-case analysis from instrumental training-gaming

5 Sep 2024 19:13 UTC
37 points
0 comments · 5 min read · LW link

AI x Human Flourishing: Introducing the Cosmos Institute

Brendan McCord 5 Sep 2024 18:23 UTC
14 points
5 comments · 6 min read · LW link
(cosmosinstitute.substack.com)

What is SB 1047 *for*?

Raemon 5 Sep 2024 17:39 UTC
61 points
8 comments · 3 min read · LW link

instruction tuning and autoregressive distribution shift

nostalgebraist 5 Sep 2024 16:53 UTC
40 points
5 comments · 5 min read · LW link

Conflating value alignment and intent alignment is causing confusion

Seth Herd 5 Sep 2024 16:39 UTC
48 points
18 comments · 5 min read · LW link

A bet for Samo Burja

Nathan Helm-Burger 5 Sep 2024 16:01 UTC
13 points
2 comments · 2 min read · LW link

Universal basic income isn’t always AGI-proof

Kevin Kohler 5 Sep 2024 15:39 UTC
5 points
3 comments · 7 min read · LW link
(machinocene.substack.com)

Why Reflective Stability is Important

Johannes C. Mayer 5 Sep 2024 15:28 UTC
19 points
2 comments · 1 min read · LW link

Why Swiss watches and Taylor Swift are AGI-proof

Kevin Kohler 5 Sep 2024 13:23 UTC
17 points
11 comments · 6 min read · LW link
(machinocene.substack.com)

Is Redistributive Taxation Justifiable? Part 1: Do the Rich Deserve their Wealth?

Alexander de Vries 5 Sep 2024 10:23 UTC
7 points
20 comments · 10 min read · LW link
(2ndhandecon.substack.com)

What program structures enable efficient induction?

Daniel C 5 Sep 2024 10:12 UTC
21 points
5 comments · 3 min read · LW link

How to Fake Decryption

ohmurphy 5 Sep 2024 9:18 UTC
12 points
0 comments · 4 min read · LW link
(ohmurphy.substack.com)

We Should Try to Directly Measure the Value of Scientific Papers

ohmurphy 5 Sep 2024 9:08 UTC
1 point
0 comments · 5 min read · LW link
(ohmurphy.substack.com)

on Science Beakers and DDT

bhauth 5 Sep 2024 3:21 UTC
23 points
13 comments · 9 min read · LW link
(bhauth.com)

Massive Activations and why <bos> is important in Tokenized SAE Unigrams

Louka Ewington-Pitsos 5 Sep 2024 2:19 UTC
1 point
0 comments · 3 min read · LW link

The Forging of the Great Minds: An Unfinished Tale

Aryeh Englander 5 Sep 2024 0:58 UTC
−3 points
0 comments · 5 min read · LW link

The Chatbot of Babble

Aryeh Englander 5 Sep 2024 0:56 UTC
−3 points
0 comments · 7 min read · LW link

[Question] Is it Legal to Maintain Turing Tests using Data Poisoning, and would it work?

Double 5 Sep 2024 0:35 UTC
8 points
9 comments · 1 min read · LW link

Executable philosophy as a failed totalizing meta-worldview

jessicata 4 Sep 2024 22:50 UTC
93 points
40 comments · 4 min read · LW link
(unstableontology.com)

Against Explosive Growth

c.trout 4 Sep 2024 21:45 UTC
14 points
1 comment · 5 min read · LW link

The Fragility of Life Hypothesis and the Evolution of Cooperation

KristianRonn 4 Sep 2024 21:04 UTC
50 points
6 comments · 11 min read · LW link

Emotion-Informed Valuation Mechanism for Improved AI Alignment in Large Language Models

Javier Marin Valenzuela 4 Sep 2024 17:00 UTC
2 points
4 comments · 6 min read · LW link

What happens if you present 500 people with an argument that AI is risky?

4 Sep 2024 16:40 UTC
102 points
7 comments · 3 min read · LW link
(blog.aiimpacts.org)

Automating LLM Auditing with Developmental Interpretability

4 Sep 2024 15:50 UTC
17 points
0 comments · 3 min read · LW link

Michael Dickens’ Caffeine Tolerance Research

niplav 4 Sep 2024 15:41 UTC
46 points
3 comments · 2 min read · LW link
(mdickens.me)

[Question] Are UV-C Air purifiers so useful?

JohnBuridan 4 Sep 2024 14:16 UTC
9 points
0 comments · 1 min read · LW link