Johannes Treutlein

Karma: 1,388

All opinions are my own. Homepage: johannestreutlein.com

Alignment Faking in Large Language Models

18 Dec 2024 17:19 UTC
482 points
74 comments
10 min read
LW link

Connecting the Dots: LLMs can Infer & Verbalize Latent Structure from Training Data

21 Jun 2024 15:54 UTC
163 points
13 comments
8 min read
LW link
(arxiv.org)

Report on modeling evidential cooperation in large worlds

Johannes Treutlein
12 Jul 2023 16:37 UTC
45 points
3 comments
1 min read
LW link
(arxiv.org)

Conditional Prediction with Zero-Sum Training Solves Self-Fulfilling Prophecies

26 May 2023 17:44 UTC
88 points
13 comments
24 min read
LW link