A few relevant comments for anybody trying some of the workshops...
The applied linear algebra lecture series covers some material directly relevant to the parts people found difficult in Experiment Week. Lectures 2 and 3 are particularly relevant. (Unfortunately, I didn’t record those lectures in time for this MATS cohort, but the cohort’s confusions did inform the lecture content.)
As Rohan noticed, a lot of the exercises probably needed more time/attention than I gave for people to figure things out. Also, different workshops will connect for different people, mostly depending on what background skills/knowledge people already have and how much they’ve already thought about alignment. Unfortunately, if you just read through the list, there are exercises which you will probably not expect to be very relevant to you but which would in fact be high-value if you did them, so it’s hard to avoid just trying them all and seeing what works.
Other than additional time/attention and some expected variation in the extent to which different workshops connect for different people, I think the conjecture workshop was the only workshop where I qualitatively messed up the implementation. I’d previously run conjecture workshops with differently-selected people, and it turns out the things they needed were very different from the things this cohort needed. In particular, I should have put much more emphasis on the fact that a conjecture usually needs two sets of properties: one set is assumed, and the other set is derived from the first. Lots of people in this cohort ended up coming up with an operationalization of some intuitive concept, but never got around to conjecturing what properties were implied by that operationalization; they didn’t have an actual claim.
In addition to the workshops, two of the weeks had optional bonus exercises, which I expect are high-expected-value but didn’t really fit in the schedule. Experiment week:
Optional Bonus Exercise for this week: go through the code for both your MNIST classifier and the Hessian/behavioral gradient eigenstuff calculation. First, without running the code, say what the shape of each variable is (e.g. scalar, vector of length 40k, 100 by 1000 matrix, etc.), then run the code and check that the actual shapes match what you expected. Second, again without running the code, do a Fermi estimate of the runtime of each part of the code, then run the code and check how close your Fermi estimates were. (For the Fermi estimates, assuming you’re not running on a GPU, a reasonable estimate for your CPU’s speed is 1-10 billion operations per second, and you should aim to get your estimate within a factor of 10.)
If you had trouble following what was going on in any of the coding during today’s exercise, then the bonus exercise will probably help; shapes and runtime fermi estimates are things which I usually track in my head when writing numerical code. It’s also a relatively fun exercise, since the feedback loop is very tight.
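For anyone who wants a concrete picture of that exercise, here is a minimal sketch of the predict-then-check loop in NumPy. It is not the workshop code: the layer sizes, the variable names, and the `matmul_ops` helper are all made up for illustration, and the operation count only covers the two matrix multiplies (it ignores the ReLU and memory traffic).

```python
import time
import numpy as np


def matmul_ops(a_shape, b_shape):
    """Rough operation count for (m, k) @ (k, n): one multiply and one add per term.

    Hypothetical helper for illustration only.
    """
    m, k = a_shape
    k2, n = b_shape
    assert k == k2, "inner dimensions must match"
    return 2 * m * k * n


# Assumed sizes: a batch of flattened 28x28 images through one hidden layer.
batch, n_in, n_hidden, n_out = 1000, 784, 100, 10

rng = np.random.default_rng(0)
x = rng.standard_normal((batch, n_in))       # 1000 by 784 matrix
w1 = rng.standard_normal((n_in, n_hidden))   # 784 by 100 matrix
w2 = rng.standard_normal((n_hidden, n_out))  # 100 by 10 matrix

# Step 1: write down the predicted shapes before running anything.
predicted = {"h": (batch, n_hidden), "logits": (batch, n_out)}

# Step 2: Fermi-estimate the runtime from the operation count.
ops = matmul_ops(x.shape, w1.shape) + matmul_ops((batch, n_hidden), w2.shape)
ops_per_second = 1e9  # low end of the 1-10 billion ops/second range
estimate_s = ops / ops_per_second

# Step 3: run the code and compare against the predictions.
start = time.perf_counter()
h = np.maximum(x @ w1, 0.0)  # hidden activations with ReLU
logits = h @ w2              # output scores
elapsed_s = time.perf_counter() - start

actual = {"h": h.shape, "logits": logits.shape}
print("shapes match:", actual == predicted)
print(f"ops ~ {ops:.1e}, estimate ~ {estimate_s:.3f} s, measured {elapsed_s:.4f} s")
```

Expect the measured time to land well under the estimate on most machines, since NumPy hands matrix multiplies to optimized (often multi-threaded) linear algebra routines. Working out why the estimate missed is part of what makes the exercise useful.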
… and writing week:
Optional Bonus Exercise for this coming week: look up either Shannon’s paper introducing information theory, Turing’s paper on morphogenesis, or any of Einstein’s four annus mirabilis papers. Read through it, paying attention mainly to writing style/techniques. These were all highly influential papers on complicated technical topics which nonetheless had a lot of reach. How did the author make things understandable? How does the style differ from e.g. a typical paper today? What takeaways could you incorporate into your own writing, to write more like Shannon/Turing/Einstein?
I tried the Shannon/Turing/Einstein writing style exercise in the Distillation for Alignment Practicum and didn’t find it very useful. The Einstein paper I read seemed reasonably good at communicating its ideas, but I didn’t find many useful techniques besides obvious things like “describe one idea per paragraph” and “define the symbols in your equations.”
I bet there are some better papers for learning communication techniques. Maybe from “What is the best scientific paper you have read?”, “Any fun, easy to read scientific papers you’d suggest?”, or “Lists of important publications in science”. (The first link has a lot of Shannon/Turing/Einstein fans, so maybe I’m crazy.)
Another idea I’m considering is that scientific papers are fundamentally worse at communicating ideas than other media like textbooks, videos, or more casual writing.