Marc Carauleanu

Karma: 423

AI Safety Researcher @AE Studio

Currently researching self-other overlap, a neglected prior for cooperation and honesty inspired by the cognitive neuroscience of altruism, in state-of-the-art ML models.

Previously SRF @ SERI '21, MLSS & Student Researcher @ CAIS '22, and LTFF grantee.

Self-Other Overlap: A Neglected Approach to AI Alignment

30 Jul 2024 16:22 UTC
193 points
43 comments, 12 min read, LW link

The ‘Neglected Approaches’ Approach: AE Studio’s Alignment Agenda

18 Dec 2023 20:35 UTC
168 points
21 comments, 12 min read, LW link

Towards empathy in RL agents and beyond: Insights from cognitive science for AI Alignment

Marc Carauleanu, 3 Apr 2023 19:59 UTC
15 points
6 comments, 1 min read, LW link
(clipchamp.com)

Lessons learned and review of the AI Safety Nudge Competition

Marc Carauleanu, 17 Jan 2023 17:13 UTC
3 points
0 comments, 1 min read, LW link

Winners of the AI Safety Nudge Competition

Marc Carauleanu, 15 Nov 2022 1:06 UTC
4 points
0 comments, 1 min read, LW link

Announcing the AI Safety Nudge Competition to Help Beat Procrastination

Marc Carauleanu, 1 Oct 2022 1:49 UTC
10 points
0 comments, 1 min read, LW link

Should we rely on the speed prior for safety?

Marc Carauleanu, 14 Dec 2021 20:45 UTC
14 points
5 comments, 5 min read, LW link