I don’t really want to argue about language. I’ll defend “almost no individual has a pretty substantial effect on capabilities.” I think publishing norms could have a pretty substantial effect on capabilities, and also a pretty substantial effect on interpretability, and I currently think the suggested norms have a tradeoff that’s bad-on-net for x-risk.
Yep, makes sense. No need to argue about language. In that case I do think Gwern is a pretty interesting datapoint, and seems worth maybe digging more into.
I would be surprised if lots of ML engineers thought that Olah’s work was in the top 10 best things to read to become a better ML engineer. I have weaker beliefs about the top 100. I would take even odds (and believe something closer to 4:1 or whatever) that if you surveyed good ML engineers and asked for top 10 lists, not a single Olah interpretability piece would be among the 10 most-mentioned things. I think most of the stuff will be random things about e.g. debugging workflow, how to deal with computers, how to use libraries effectively, etc. If anyone is good at ML engineering and wants to chime in, that would be neat.
I would take a bet at 2:1 in my favor for the top 10 thing. Top 10 is a pretty high bar, so I am not at even odds.
Idk, I have the same prior about trying to e.g. prove various facts about ML stuff, or do statistical learning theory type things, or a bunch of other stuff. It’s just like, if you’re not trying to eke out more oomph from SGD, then probably the stuff you’re doing isn’t going to allow you to eke out more oomph from SGD, because it’s kinda hard to do that and people are trying many things.
Hmm, yeah, I do think I disagree with the generator here, but I don’t feel super confident and this perspective seems at least plausible to me. I don’t believe it with enough probability to make me think that there is negligible net risk, and I feel like I have a relatively easy time coming up with counterexamples from science and other industries (the nuclear scientists working on nuclear fission did indeed not work on making weapons, and many people were working on making weapons).
Not sure how much it’s worth digging more into this here.