Guillaume Corlouer

Karma: 116

An information-theoretic study of lying in LLMs

2 Aug 2024 10:06 UTC
16 points
0 comments · 4 min read · LW link

Degeneracies are sticky for SGD

16 Jun 2024 21:19 UTC
56 points
1 comment · 16 min read · LW link

Understanding mesa-optimization using toy models

7 May 2023 17:00 UTC
43 points
2 comments · 10 min read · LW link

Metalignment: Deconfusing metaethics for AI alignment.

Guillaume Corlouer · 23 Aug 2019 10:25 UTC
13 points
7 comments · 3 min read · LW link