
Mechanistically Eliciting Latent Behaviors in Language Models

30 Apr 2024 18:51 UTC
143 points
27 comments · 45 min read · LW link

“Open Source AI” is a lie, but it doesn’t have to be

jacobhaimes · 30 Apr 2024 23:10 UTC
17 points
1 comment · 6 min read · LW link
(jacob-haimes.github.io)

Increasing IQ is trivial

George3d6 · 1 Mar 2024 22:43 UTC
37 points
50 comments · 6 min read · LW link
(epistemink.substack.com)

Mechanistic Interpretability Workshop Happening at ICML 2024!

3 May 2024 1:18 UTC
47 points
3 comments · 1 min read · LW link

The Rationalists of the 1950s (and before) also called themselves “Rationalists”

Owain_Evans · 28 Nov 2021 20:17 UTC
187 points
31 comments · 3 min read · LW link · 1 review

“AI Safety for Fleshy Humans”, an AI Safety explainer by Nicky Case

habryka · 3 May 2024 18:10 UTC
48 points
10 comments · 4 min read · LW link
(aisafety.dance)

ACX Covid Origins Post convinced readers

ErnestScribbler · 1 May 2024 13:06 UTC
74 points
7 comments · 2 min read · LW link

My hour of memoryless lucidity

Eric Neyman · 4 May 2024 1:40 UTC
77 points
2 comments · 5 min read · LW link
(ericneyman.wordpress.com)

[Question] How does the ever-increasing use of AI in the military for the direct purpose of murdering people affect your p(doom)?

Justausername · 6 Apr 2024 6:31 UTC
15 points
16 comments · 1 min read · LW link

Key takeaways from our EA and alignment research surveys

3 May 2024 18:10 UTC
66 points
3 comments · 21 min read · LW link

Why I’m doing PauseAI

Joseph Miller · 30 Apr 2024 16:21 UTC
93 points
11 comments · 4 min read · LW link

KAN: Kolmogorov-Arnold Networks

Gunnar_Zarncke · 1 May 2024 16:50 UTC
10 points
11 comments · 1 min read · LW link
(arxiv.org)

An Introduction to AI Sandbagging

26 Apr 2024 13:40 UTC
41 points
1 comment · 8 min read · LW link

Transformers Represent Belief State Geometry in their Residual Stream

Adam Shai · 16 Apr 2024 21:16 UTC
353 points
79 comments · 12 min read · LW link

If you weren’t such an idiot...

2 Mar 2024 0:01 UTC
119 points
60 comments · 2 min read · LW link
(markxu.com)

[Question] Which skincare products are evidence-based?

Vanessa Kosoy · 2 May 2024 15:22 UTC
83 points
24 comments · 1 min read · LW link

LLM+Planners hybridisation for friendly AGI

installgentoo · 3 May 2024 8:40 UTC
6 points
2 comments · 1 min read · LW link

[Question] Were there any ancient rationalists?

OliverHayman · 3 May 2024 18:26 UTC
11 points
3 comments · 1 min read · LW link

Why is AGI/ASI Inevitable?

DeathlessAmaranth · 2 May 2024 18:27 UTC
14 points
6 comments · 1 min read · LW link

A list of core AI safety problems and how I hope to solve them

davidad · 26 Aug 2023 15:12 UTC
161 points
26 comments · 5 min read · LW link