Vika

Karma: 3,197

Victoria Krakovna. Research scientist at DeepMind working on AI safety and cofounder of the Future of Life Institute. Website and blog: vkrakovna.wordpress.com

A short course on AGI safety from the GDM Alignment team

Feb 14, 2025, 3:43 PM
99 points
1 comment · 1 min read · LW link
(deepmindsafetyresearch.medium.com)

Moving on from community living

Vika · Apr 17, 2024, 5:02 PM
63 points
7 comments · 3 min read · LW link
(vkrakovna.wordpress.com)

When discussing AI risks, talk about capabilities, not intelligence

Vika · Aug 11, 2023, 1:38 PM
124 points
7 comments · 3 min read · LW link
(vkrakovna.wordpress.com)

[Linkpost] Some high-level thoughts on the DeepMind alignment team’s strategy

Mar 7, 2023, 11:55 AM
128 points
13 comments · 5 min read · LW link
(drive.google.com)

Power-seeking can be probable and predictive for trained agents

Feb 28, 2023, 9:10 PM
56 points
22 comments · 9 min read · LW link
(arxiv.org)

Refining the Sharp Left Turn threat model, part 2: applying alignment techniques

Nov 25, 2022, 2:36 PM
39 points
9 comments · 6 min read · LW link
(vkrakovna.wordpress.com)

Threat Model Literature Review

Nov 1, 2022, 11:03 AM
78 points
4 comments · 25 min read · LW link

Clarifying AI X-risk

Nov 1, 2022, 11:03 AM
127 points
24 comments · 4 min read · LW link · 1 review

DeepMind alignment team opinions on AGI ruin arguments

Vika · Aug 12, 2022, 9:06 PM
395 points
37 comments · 14 min read · LW link · 1 review

Refining the Sharp Left Turn threat model, part 1: claims and mechanisms

Aug 12, 2022, 3:17 PM
86 points
4 comments · 3 min read · LW link · 1 review
(vkrakovna.wordpress.com)

Paradigms of AI alignment: components and enablers

Vika · Jun 2, 2022, 6:19 AM
53 points
4 comments · 8 min read · LW link

ELK contest submission: route understanding through the human ontology

Mar 14, 2022, 9:42 PM
21 points
2 comments · 2 min read · LW link

Optimization Concepts in the Game of Life

Oct 16, 2021, 8:51 PM
75 points
16 comments · 10 min read · LW link

Tradeoff between desirable properties for baseline choices in impact measures

Vika · Jul 4, 2020, 11:56 AM
37 points
24 comments · 5 min read · LW link

Possible takeaways from the coronavirus pandemic for slow AI takeoff

Vika · May 31, 2020, 5:51 PM
135 points
36 comments · 3 min read · LW link · 1 review

Specification gaming: the flip side of AI ingenuity

May 6, 2020, 11:51 PM
66 points
9 comments · 6 min read · LW link

Classifying specification problems as variants of Goodhart’s Law

Vika · Aug 19, 2019, 8:40 PM
72 points
5 comments · 5 min read · LW link · 1 review

Designing agent incentives to avoid side effects

Mar 11, 2019, 8:55 PM
29 points
0 comments · 2 min read · LW link
(medium.com)

New safety research agenda: scalable agent alignment via reward modeling

Vika · Nov 20, 2018, 5:29 PM
34 points
12 comments · 1 min read · LW link
(medium.com)

Discussion on the machine learning approach to AI safety

Vika · Nov 1, 2018, 8:54 PM
27 points
3 comments · 4 min read · LW link