In Partial, Pugnacious Defense of Functional Decision Theory

Mikewins · 1 Jul 2026 17:49 UTC
4 points
0 comments · 1 min read · LW link

How to read tableaux, a formal system for modal logic with Kripke models

transhumanist_atom_understander · 1 Jul 2026 17:37 UTC
7 points
0 comments · 6 min read · LW link

Consistency Training while Mitigating Obfuscation via Rate Matching

1 Jul 2026 17:26 UTC
15 points
0 comments · 12 min read · LW link

Discovering Concept-Editing Algorithms With LLM Agents

1 Jul 2026 16:07 UTC
22 points
0 comments · 1 min read · LW link
(dmodel.ai)

Model access for third-parties — it’s a big deal!

Cleo Nardo · 1 Jul 2026 13:09 UTC
68 points
2 comments · 6 min read · LW link

Most Current Model Organisms Leak: Perplexity Differencing Often Reveals Finetuning Objectives

1 Jul 2026 10:07 UTC
15 points
0 comments · 7 min read · LW link

A Black Box Made Less Opaque (part 4)

Matthew McDonnell · 1 Jul 2026 7:30 UTC
8 points
0 comments · 9 min read · LW link

When should you know the point?

KatjaGrace · 1 Jul 2026 6:31 UTC
23 points
1 comment · 1 min read · LW link
(worldspiritsockpuppet.substack.com)

A CERN for AI is a distraction; push for an IAEA instead

Charbel-Raphaël · 1 Jul 2026 6:30 UTC
35 points
2 comments · 4 min read · LW link

Why aren’t there more AlphaFolds?

nimakeivan · 1 Jul 2026 3:42 UTC
19 points
1 comment · 17 min read · LW link

Green

Adam Zerner · 1 Jul 2026 1:10 UTC
29 points
5 comments · 2 min read · LW link

Cluelessness: Summary of the argument, why it matters, and counterarguments

Anthony DiGiovanni · 30 Jun 2026 20:54 UTC
20 points
3 comments · 9 min read · LW link

Is it ethical to work on general-purpose robots given the risk of cyberhacking?

Master Chief · 30 Jun 2026 19:27 UTC
6 points
1 comment · 1 min read · LW link

That Which Cannot Be Poked With A Stick Is The Mind-Killer

Firinn · 30 Jun 2026 19:01 UTC
22 points
3 comments · 32 min read · LW link

Connect to your past selves

PatrickDFarley · 30 Jun 2026 17:01 UTC
8 points
1 comment · 5 min read · LW link

Preliminary investigation: KL penalties in RL can increase CoT unfaithfulness

30 Jun 2026 13:08 UTC
73 points
1 comment · 13 min read · LW link

Structural Proxies

Raymond Douglas · 30 Jun 2026 12:38 UTC
33 points
0 comments · 8 min read · LW link

Why Prefer Any Decision Theory?

J Bostock · 30 Jun 2026 12:06 UTC
25 points
22 comments · 6 min read · LW link

What Capable Agents Must Know: Why AI Consciousness May Be an Inevitable Byproduct of Capability

Aran Nayebi · 30 Jun 2026 11:48 UTC
35 points
5 comments · 12 min read · LW link

Agency is not a natural kind (and why that might matter for alignment)

SJ_Beard · 30 Jun 2026 8:50 UTC
35 points
8 comments · 4 min read · LW link