Vivek Hebbar

Karma: 1,150

Political sycophancy as a model organism of scheming

Alex Mallen and Vivek Hebbar

May 12, 2025, 5:49 PM

39 points

0 comments, 14 min read

How can we solve diffuse threats like research sabotage with AI control?

Vivek Hebbar

Apr 30, 2025, 7:23 PM

52 points

1 comment, 8 min read

How training-gamers might function (and win)

Vivek Hebbar

Apr 11, 2025, 9:26 PM

107 points

5 comments, 13 min read

Different senses in which two AIs can be “the same”

Vivek Hebbar and Buck

Jun 24, 2024, 3:16 AM

69 points

2 comments, 4 min read

Thomas Kwa’s MIRI research experience

Thomas Kwa, peterbarnett, Vivek Hebbar, Jeremy Gillen, Bird Concept and Raemon

Oct 2, 2023, 4:42 PM

173 points

53 comments, 1 min read

Infinite-width MLPs as an “ensemble prior”

Vivek Hebbar

May 12, 2023, 11:45 AM

46 points

0 comments, 5 min read

[Question] Is EDT correct? Does “EDT” == “logical EDT” == “logical CDT”?

Vivek Hebbar

May 8, 2023, 2:07 AM

13 points

2 comments, 1 min read

Vivek Hebbar’s Shortform

Vivek Hebbar

Nov 24, 2022, 2:57 AM

4 points

5 comments

Path dependence in ML inductive biases

Vivek Hebbar and evhub

Sep 10, 2022, 1:38 AM

68 points

13 comments, 10 min read

Hessian and Basin volume

Vivek Hebbar

Jul 10, 2022, 6:59 AM

35 points

10 comments, 4 min read

[Short version] Information Loss --> Basin flatness

Vivek Hebbar

May 21, 2022, 12:59 PM

12 points

0 comments, 1 min read

Information Loss --> Basin flatness

Vivek Hebbar

May 21, 2022, 12:58 PM

62 points

31 comments, 7 min read

Org announcement: [AC]RC

Vivek Hebbar

Apr 17, 2022, 5:24 PM

82 points

11 comments, 1 min read

[Question] When people ask for your P(doom), do you give them your inside view or your betting odds?

Vivek Hebbar

Mar 26, 2022, 11:08 PM

11 points

11 comments, 1 min read

Transformer inductive biases & RASP

Vivek Hebbar

Feb 24, 2022, 12:42 AM

15 points

4 comments, 1 min read

(proceedings.mlr.press)

[Question] Favorite / most obscure research on understanding DNNs?

Vivek Hebbar

Feb 21, 2022, 5:49 AM

16 points

1 comment, 1 min read

How complex are myopic imitators?

Vivek Hebbar

Feb 8, 2022, 12:00 PM

26 points

1 comment, 15 min read