Michael Soareverix

Karma: 93

Detecting AI Agent Failure Modes in Simulations

Michael Soareverix · Feb 11, 2025, 11:10 AM
16 points
0 comments · 8 min read

Pivotal Acts are easier than Alignment?

Michael Soareverix · Jul 21, 2024, 12:15 PM
1 point
4 comments · 1 min read

[Question] Optimizing for Agency?

Michael Soareverix · Feb 14, 2024, 8:31 AM
10 points
9 comments · 2 min read

The Virus—Short Story

Michael Soareverix · Apr 13, 2023, 6:18 PM
4 points
0 comments · 4 min read

Gold, Silver, Red: A color scheme for understanding people

Michael Soareverix · Mar 13, 2023, 1:06 AM
17 points
2 comments · 4 min read

A Good Future (rough draft)

Michael Soareverix · Oct 24, 2022, 8:45 PM
10 points
5 comments · 3 min read

A rough idea for solving ELK: An approach for training generalist agents like GATO to make plans and describe them to humans clearly and honestly.

Michael Soareverix · Sep 8, 2022, 3:20 PM
2 points
2 comments · 2 min read