Vika

Karma: 3,197

Victoria Krakovna. Research scientist at DeepMind working on AI safety and cofounder of the Future of Life Institute. Website and blog: vkrakovna.wordpress.com

A short course on AGI safety from the GDM Alignment team

Feb 14, 2025, 3:43 PM
99 points
1 comment · 1 min read · LW link
(deepmindsafetyresearch.medium.com)

Moving on from community living

Vika · Apr 17, 2024, 5:02 PM
63 points
7 comments · 3 min read · LW link
(vkrakovna.wordpress.com)

When discussing AI risks, talk about capabilities, not intelligence

Vika · Aug 11, 2023, 1:38 PM
124 points
7 comments · 3 min read · LW link
(vkrakovna.wordpress.com)

[Linkpost] Some high-level thoughts on the DeepMind alignment team’s strategy

Mar 7, 2023, 11:55 AM
128 points
13 comments · 5 min read · LW link
(drive.google.com)

Power-seeking can be probable and predictive for trained agents

Feb 28, 2023, 9:10 PM
56 points
22 comments · 9 min read · LW link
(arxiv.org)

Refining the Sharp Left Turn threat model, part 2: applying alignment techniques

Nov 25, 2022, 2:36 PM
39 points
9 comments · 6 min read · LW link
(vkrakovna.wordpress.com)

Threat Model Literature Review

Nov 1, 2022, 11:03 AM
78 points
4 comments · 25 min read · LW link

Clarifying AI X-risk

Nov 1, 2022, 11:03 AM
127 points
24 comments · 4 min read · LW link · 1 review

DeepMind alignment team opinions on AGI ruin arguments

Vika · Aug 12, 2022, 9:06 PM
395 points
37 comments · 14 min read · LW link · 1 review

Refining the Sharp Left Turn threat model, part 1: claims and mechanisms

Aug 12, 2022, 3:17 PM
86 points
4 comments · 3 min read · LW link · 1 review
(vkrakovna.wordpress.com)

Paradigms of AI alignment: components and enablers

Vika · Jun 2, 2022, 6:19 AM
53 points
4 comments · 8 min read · LW link

ELK contest submission: route understanding through the human ontology

Mar 14, 2022, 9:42 PM
21 points
2 comments · 2 min read · LW link

Optimization Concepts in the Game of Life

Oct 16, 2021, 8:51 PM
75 points
16 comments · 10 min read · LW link

Tradeoff between desirable properties for baseline choices in impact measures

Vika · Jul 4, 2020, 11:56 AM
37 points
24 comments · 5 min read · LW link

Possible takeaways from the coronavirus pandemic for slow AI takeoff

Vika · May 31, 2020, 5:51 PM
135 points
36 comments · 3 min read · LW link · 1 review

Specification gaming: the flip side of AI ingenuity

May 6, 2020, 11:51 PM
66 points
9 comments · 6 min read · LW link

Classifying specification problems as variants of Goodhart’s Law

Vika · Aug 19, 2019, 8:40 PM
72 points
5 comments · 5 min read · LW link · 1 review

Designing agent incentives to avoid side effects

Mar 11, 2019, 8:55 PM
29 points
0 comments · 2 min read · LW link
(medium.com)

New safety research agenda: scalable agent alignment via reward modeling

Vika · Nov 20, 2018, 5:29 PM
34 points
12 comments · 1 min read · LW link
(medium.com)

Discussion on the machine learning approach to AI safety

Vika · Nov 1, 2018, 8:54 PM
27 points
3 comments · 4 min read · LW link