Why comparative advantage does not help horses

Sherrinford · 30 Sep 2024 22:27 UTC
101 points
15 comments · 3 min read · LW link

Intelligence explosion: a rational assessment.

p4rziv4l · 30 Sep 2024 21:17 UTC
1 point
0 comments · 1 min read · LW link
(docs.google.com)

Peak Human Capital

PeterMcCluskey · 30 Sep 2024 21:13 UTC
69 points
3 comments · 5 min read · LW link
(bayesianinvestor.com)

Sam Altman’s Business Negging

Julian Bradshaw · 30 Sep 2024 21:06 UTC
13 points
0 comments · 1 min read · LW link
(www.bloomberg.com)

In-Context Learning: An Alignment Survey

alamerton · 30 Sep 2024 18:44 UTC
8 points
0 comments · 20 min read · LW link
(docs.google.com)

Not Just For Therapy Chatbots: The Case For Compassion In AI Moral Alignment Research

kenneth_diao · 30 Sep 2024 18:37 UTC
2 points
0 comments · 12 min read · LW link

Exploring Decomposability of SAE Features

Vikram_N · 30 Sep 2024 18:28 UTC
1 point
0 comments · 3 min read · LW link

Knowledge Base 1: Could it increase intelligence and make it safer?

iwis · 30 Sep 2024 16:00 UTC
−4 points
0 comments · 4 min read · LW link

Point of Failure: Semiconductor-Grade Quartz

Annapurna · 30 Sep 2024 15:57 UTC
41 points
8 comments · 2 min read · LW link
(jorgevelez.substack.com)

on bacteria, on teeth

bhauth · 30 Sep 2024 15:56 UTC
62 points
9 comments · 6 min read · LW link
(bhauth.com)

SB 1047 gets vetoed

ryan_b · 30 Sep 2024 15:49 UTC
25 points
1 comment · 1 min read · LW link
(www.reuters.com)

Of Birds and Bees

RussellThor · 30 Sep 2024 10:52 UTC
7 points
9 comments · 2 min read · LW link

A new process for mapping discussions

Nathan Young · 30 Sep 2024 8:57 UTC
28 points
7 comments · 6 min read · LW link
(open.substack.com)

MATS Alumni Impact Analysis

30 Sep 2024 2:35 UTC
61 points
7 comments · 11 min read · LW link

[Question] Most capable publicly available agents?

Gabe · 30 Sep 2024 0:04 UTC
2 points
0 comments · 1 min read · LW link

the case for CoT unfaithfulness is overstated

nostalgebraist · 29 Sep 2024 22:07 UTC
245 points
40 comments · 11 min read · LW link

0.202 Bits of Evidence In Favor of Futarchy

niplav · 29 Sep 2024 21:57 UTC
38 points
0 comments · 1 min read · LW link

Pomodoro Method Randomized Self Experiment

niplav · 29 Sep 2024 21:55 UTC
14 points
2 comments · 1 min read · LW link

Toy Models of Superposition: Simplified by Hand

Axel Sorensen · 29 Sep 2024 21:19 UTC
9 points
3 comments · 8 min read · LW link

LLMs are likely not conscious

research_prime_space · 29 Sep 2024 20:57 UTC
6 points
9 comments · 1 min read · LW link

A Policy Proposal

phdead · 29 Sep 2024 20:45 UTC
10 points
4 comments · 4 min read · LW link

Do Sparse Autoencoders (SAEs) transfer across base and finetuned language models?

29 Sep 2024 19:37 UTC
26 points
8 comments · 25 min read · LW link

Models of life

Abhishaike Mahajan · 29 Sep 2024 19:24 UTC
8 points
0 comments · 16 min read · LW link
(www.asimov.press)

Interpreting the effects of Jailbreak Prompts in LLMs

Harsh Raj · 29 Sep 2024 19:01 UTC
8 points
0 comments · 5 min read · LW link

New Capabilities, New Risks? - Evaluating Agentic General Assistants using Elements of GAIA & METR Frameworks

Tej Lander · 29 Sep 2024 18:58 UTC
5 points
0 comments · 29 min read · LW link

Developmental Stages in Multi-Problem Grokking

James Sullivan · 29 Sep 2024 18:58 UTC
4 points
0 comments · 6 min read · LW link

A Psychoanalytic Explanation of Sam Altman’s Irrational Actions

Gabe · 29 Sep 2024 18:58 UTC
1 point
3 comments · 3 min read · LW link

Building Safer AI from the Ground Up: Steering Model Behavior via Pre-Training Data Curation

Antonio Clarke · 29 Sep 2024 18:48 UTC
4 points
0 comments · 23 min read · LW link

Cryonics is free

Mati_Roy · 29 Sep 2024 17:58 UTC
190 points
42 comments · 2 min read · LW link

Runner’s High On Demand: A Story of Luck & Persistence

Shoshannah Tekofsky · 29 Sep 2024 17:15 UTC
14 points
6 comments · 5 min read · LW link
(shoshanigans.substack.com)

You can, in fact, bamboozle an unaligned AI into sparing your life

David Matolcsi · 29 Sep 2024 16:59 UTC
97 points
171 comments · 27 min read · LW link

Base LLMs refuse too

29 Sep 2024 16:04 UTC
60 points
20 comments · 10 min read · LW link

My Methodological Turn

adamShimi · 29 Sep 2024 15:01 UTC
29 points
0 comments · 1 min read · LW link
(formethods.substack.com)

Linkpost: Hypocrisy standoff

Chris_Leong · 29 Sep 2024 14:27 UTC
5 points
1 comment · 1 min read · LW link
(x.com)

[Question] Any real toeholds for making practical decisions regarding AI safety?

lemonhope · 29 Sep 2024 12:03 UTC
27 points
6 comments · 1 min read · LW link

Review: Dr Stone

ProgramCrafter · 29 Sep 2024 10:35 UTC
18 points
9 comments · 4 min read · LW link

AXRP Episode 36 - Adam Shai and Paul Riechers on Computational Mechanics

DanielFilan · 29 Sep 2024 5:50 UTC
25 points
0 comments · 55 min read · LW link

DunCon @Lighthaven

Duncan Sabien (Deactivated) · 29 Sep 2024 4:56 UTC
41 points
0 comments · 1 min read · LW link

San Francisco ACX Meetup “First Saturday”

Nate Sternberg · 29 Sep 2024 3:13 UTC
3 points
0 comments · 1 min read · LW link

Exploring Shard-like Behavior: Empirical Insights into Contextual Decision-Making in RL Agents

Alejandro Aristizabal · 29 Sep 2024 0:32 UTC
6 points
0 comments · 15 min read · LW link

Jailbreaking language models with user roleplay

loops · 28 Sep 2024 23:43 UTC
8 points
0 comments · 3 min read · LW link
(iter.ca)

“Slow” takeoff is a terrible term for “maybe even faster takeoff, actually”

Raemon · 28 Sep 2024 23:38 UTC
214 points
69 comments · 1 min read · LW link

Contextual Constitutional AI

aksh-n · 28 Sep 2024 23:24 UTC
12 points
2 comments · 12 min read · LW link

Explore More: A Bag of Tricks to Keep Your Life on the Rails

Shoshannah Tekofsky · 28 Sep 2024 21:38 UTC
234 points
15 comments · 11 min read · LW link
(shoshanigans.substack.com)

2024 Petrov Day Retrospective

28 Sep 2024 21:30 UTC
93 points
25 comments · 10 min read · LW link

[Question] Any Trump Supporters Want to Dialogue?

k64 · 28 Sep 2024 19:41 UTC
14 points
80 comments · 1 min read · LW link

Evaluating LLaMA 3 for political sycophancy

alma.liezenga · 28 Sep 2024 19:02 UTC
2 points
2 comments · 6 min read · LW link

Two new datasets for evaluating political sycophancy in LLMs

alma.liezenga · 28 Sep 2024 18:29 UTC
8 points
0 comments · 9 min read · LW link

COT Scaling implies slower takeoff speeds

Logan Zoellner · 28 Sep 2024 16:20 UTC
37 points
56 comments · 1 min read · LW link

Thoughts on Evo-Bio Math and Mesa-Optimization: Maybe We Need To Think Harder About “Relative” Fitness?

Lorec · 28 Sep 2024 14:07 UTC
6 points
6 comments · 1 min read · LW link