Distillation Of DeepSeek-Prover V1.5

IvanLin · 15 Oct 2024 18:53 UTC
−2 points
0 comments · 3 min read · LW link

Improving Model-Written Evals for AI Safety Benchmarking

15 Oct 2024 18:25 UTC
12 points
0 comments · 18 min read · LW link

Taking nonlogical concepts seriously

Kris Brown · 15 Oct 2024 18:16 UTC
7 points
0 comments · 18 min read · LW link
(topos.site)

Rashomon—A newsbetting site

ideasthete · 15 Oct 2024 18:15 UTC
4 points
0 comments · 1 min read · LW link

On the Practical Applications of Interpretability

Nick Jiang · 15 Oct 2024 17:18 UTC
1 point
0 comments · 7 min read · LW link

[Question] When is reward ever the optimization target?

Noosphere89 · 15 Oct 2024 15:09 UTC
21 points
2 comments · 1 min read · LW link

An Opinionated Evals Reading List

15 Oct 2024 14:38 UTC
33 points
0 comments · 13 min read · LW link
(www.apolloresearch.ai)

[Intuitive self-models] 5. Dissociative Identity (Multiple Personality) Disorder

Steven Byrnes · 15 Oct 2024 13:31 UTC
23 points
2 comments · 11 min read · LW link

Inverse Problems In Everyday Life

silentbob · 15 Oct 2024 11:42 UTC
10 points
1 comment · 8 min read · LW link

Thinking LLMs: General Instruction Following with Thought Generation

Bogdan Ionut Cirstea · 15 Oct 2024 9:21 UTC
7 points
0 comments · 1 min read · LW link
(arxiv.org)

Thoughts On the Nature of Capability Elicitation via Fine-tuning

Theodore Chapman · 15 Oct 2024 8:39 UTC
4 points
0 comments · 8 min read · LW link

Minimal Motivation of Natural Latents

14 Oct 2024 22:51 UTC
37 points
1 comment · 3 min read · LW link

How long should political (and other) terms be?

ohmurphy · 14 Oct 2024 21:38 UTC
5 points
0 comments · 1 min read · LW link
(ohmurphy.substack.com)

Examples of How I Use LLMs

jefftk · 14 Oct 2024 17:10 UTC
27 points
2 comments · 2 min read · LW link
(www.jefftk.com)

Mechanistic Exploration of Gemma 2 List Generation

Gerard Boxo · 14 Oct 2024 17:04 UTC
7 points
0 comments · 4 min read · LW link
(gboxo.github.io)

[Question] LW resources on childhood experiences?

nahir91595 · 14 Oct 2024 17:04 UTC
10 points
7 comments · 1 min read · LW link

Free Will, Neurotypical Dominance, and the Path to ASI and Neuralinks: Evolving Beyond Scarcity

j_passeri · 14 Oct 2024 16:54 UTC
−2 points
0 comments · 3 min read · LW link

Breakthroughs, Neurodivergence, and Working Outside the System

j_passeri · 14 Oct 2024 16:54 UTC
1 point
2 comments · 2 min read · LW link

The case for unlearning that removes information from LLM weights

Fabien Roger · 14 Oct 2024 14:08 UTC
72 points
1 comment · 6 min read · LW link

Circuits in Superposition: Compressing many small neural networks into one

14 Oct 2024 13:06 UTC
104 points
7 comments · 13 min read · LW link