Mark Xu

Karma: 4,493

Learn more about me at markxu.com/about

Estimating Tail Risk in Neural Networks

Mark Xu

Sep 13, 2024, 8:00 PM

68 points

9 comments · 23 min read · LW link

(www.alignment.org)

Backdoors as an analogy for deceptive alignment

Jacob_Hilton and Mark Xu

Sep 6, 2024, 3:30 PM

104 points

2 comments · 8 min read · LW link

(www.alignment.org)

If you weren’t such an idiot...

kave and Mark Xu

Mar 2, 2024, 12:01 AM

157 points

76 comments · 2 min read · LW link

(markxu.com)

ARC is hiring theoretical researchers

paulfchristiano, Jacob_Hilton and Mark Xu

Jun 12, 2023, 6:50 PM

126 points

12 comments · 4 min read · LW link

(www.alignment.org)

How to do theoretical research, a personal perspective

Mark Xu

Aug 19, 2022, 7:41 PM

91 points

6 comments · 15 min read · LW link

ELK prize results

paulfchristiano and Mark Xu

Mar 9, 2022, 12:01 AM

138 points

50 comments · 21 min read · LW link

ELK First Round Contest Winners

Mark Xu and paulfchristiano

Jan 26, 2022, 2:56 AM

65 points

6 comments · 1 min read · LW link

ARC’s first technical report: Eliciting Latent Knowledge

paulfchristiano, Mark Xu and Ajeya Cotra

Dec 14, 2021, 8:09 PM

228 points

90 comments · 1 min read · LW link · 3 reviews

(docs.google.com)

ARC is hiring!

paulfchristiano and Mark Xu

Dec 14, 2021, 8:09 PM

64 points

2 comments · 1 min read · LW link

Your Time Might Be More Valuable Than You Think

Mark Xu

Oct 18, 2021, 12:55 AM

57 points

10 comments · 6 min read · LW link

(markxu.com)

The Simulation Hypothesis Undercuts the SIA/Great Filter Doomsday Argument

Mark Xu and CarlShulman

Oct 1, 2021, 10:23 PM

43 points

11 comments · 7 min read · LW link

Fractional progress estimates for AI timelines and implied resource requirements

Mark Xu and CarlShulman

Jul 15, 2021, 6:43 PM

55 points

6 comments · 7 min read · LW link

Intermittent Distillations #4: Semiconductors, Economics, Intelligence, and Technological Progress.

Mark Xu

Jul 8, 2021, 10:14 PM

81 points

9 comments · 10 min read · LW link

Anthropic Effects in Estimating Evolution Difficulty

Mark Xu

Jul 5, 2021, 4:02 AM

13 points

2 comments · 3 min read · LW link

An Intuitive Guide to Garrabrant Induction

Mark Xu

Jun 3, 2021, 10:21 PM

149 points

20 comments · 24 min read · LW link

Rogue AGI Embodies Valuable Intellectual Property

Mark Xu and CarlShulman

Jun 3, 2021, 8:37 PM

71 points

9 comments · 3 min read · LW link

Intermittent Distillations #3

Mark Xu

May 15, 2021, 7:13 AM

21 points

1 comment · 11 min read · LW link

Pre-Training + Fine-Tuning Favors Deception

Mark Xu

May 8, 2021, 6:36 PM

27 points

3 comments · 3 min read · LW link

Less Realistic Tales of Doom

Mark Xu

May 6, 2021, 11:01 PM

113 points

13 comments · 4 min read · LW link

Agents Over Cartesian World Models

Mark Xu and evhub

Apr 27, 2021, 2:06 AM

67 points

4 comments · 27 min read · LW link