Kei Nishimura-Gasparian

Karma: 555

Research note on window shifting training

17 Mar 2026 15:58 UTC
26 points
1 comment · 15 min read

Appendices: Supervised finetuning on low-harm reward hacking generalises to high-harm reward hacking

22 Dec 2025 19:33 UTC
17 points
0 comments · 1 min read

Supervised finetuning on low-harm reward hacking generalises to high-harm reward hacking

22 Dec 2025 19:32 UTC
15 points
0 comments · 30 min read

Can you find the steganographically hidden message?

Kei Nishimura-Gasparian · 20 Oct 2025 17:29 UTC
49 points
2 comments · 7 min read