
Martín Soto

Karma: 1,370

Doing AI Safety research for ethical reasons.

My webpage.

Leave me anonymous feedback.

I operate by Crocker’s Rules.

Tell me about yourself: LLMs are aware of their learned behaviors

Jan 22, 2025, 12:47 AM
129 points
5 comments · 6 min read · LW link

Near- and medium-term AI Control Safety Cases

Martín Soto · Dec 23, 2024, 5:37 PM
9 points
0 comments · 6 min read · LW link

The Information: OpenAI shows ‘Strawberry’ to feds, races to launch it

Martín Soto · Aug 27, 2024, 11:10 PM
145 points
15 comments · 3 min read · LW link

The need for multi-agent experiments

Martín Soto · Aug 1, 2024, 5:14 PM
43 points
3 comments · 9 min read · LW link

OpenAI releases GPT-4o, natively interfacing with text, voice and vision

Martín Soto · May 13, 2024, 6:50 PM
54 points
23 comments · 1 min read · LW link
(openai.com)

Conflict in Posthuman Literature

Martín Soto · Apr 6, 2024, 10:26 PM
40 points
1 comment · 2 min read · LW link
(twitter.com)

Comparing Alignment to other AGI interventions: Extensions and analysis

Martín Soto · Mar 21, 2024, 5:30 PM
7 points
0 comments · 4 min read · LW link

Comparing Alignment to other AGI interventions: Basic model

Martín Soto · Mar 20, 2024, 6:17 PM
12 points
4 comments · 7 min read · LW link

How disagreements about Evidential Correlations could be settled

Martín Soto · Mar 11, 2024, 6:28 PM
11 points
3 comments · 4 min read · LW link

Evidential Correlations are Subjective, and it might be a problem

Martín Soto · Mar 7, 2024, 6:37 PM
26 points
6 comments · 14 min read · LW link

Why does generalization work?

Martín Soto · Feb 20, 2024, 5:51 PM
43 points
16 comments · 4 min read · LW link

Natural abstractions are observer-dependent: a conversation with John Wentworth

Martín Soto · Feb 12, 2024, 5:28 PM
39 points
13 comments · 7 min read · LW link

The lattice of partial updatelessness

Martín Soto · Feb 10, 2024, 5:34 PM
23 points
5 comments · 5 min read · LW link

Updatelessness doesn’t solve most problems

Martín Soto · Feb 8, 2024, 5:30 PM
135 points
45 comments · 12 min read · LW link

Sources of evidence in Alignment

Martín Soto · Jul 2, 2023, 8:38 PM
20 points
0 comments · 11 min read · LW link

Quantitative cruxes in Alignment

Martín Soto · Jul 2, 2023, 8:38 PM
19 points
0 comments · 23 min read · LW link

Why are counterfactuals elusive?

Martín Soto · Mar 3, 2023, 8:13 PM
14 points
6 comments · 2 min read · LW link

Martín Soto’s Shortform

Martín Soto · Feb 11, 2023, 11:38 PM UTC
3 points
46 comments · 1 min read · LW link

The Alignment Problems

Martín Soto · Jan 12, 2023, 10:29 PM UTC
20 points
0 comments · 4 min read · LW link

Brute-forcing the universe: a non-standard shot at diamond alignment

Martín Soto · Nov 22, 2022, 10:36 PM UTC
9 points
2 comments · 20 min read · LW link