Yep that’s right, thanks! Corrected.
huh interesting, I wasn’t aware of this, thanks for sending it!
Thanks for the suggestion! I’ve edited the first diagram to clarify things, is this what you had in mind?
The first week of WMLB / MLAB maps quite closely onto the first week of ARENA, with a few exceptions (ARENA includes PyTorch Lightning, plus some more meta stuff like typechecking, VSCode testing and debugging, using GPT in your workflow, etc). I’d say that starting some way through the second week would probably be most appropriate. If you didn’t want to repeat stuff on training / sampling from transformers, the mech interp material would start on Wednesday of the second week.
Resolved by private message, but I’m just mentioning this here for others who might be reading this—we didn’t have confirmation emails set up, but we expect to send out coding assessments to applicants tomorrow (Monday 24th April). For people who apply after this point, we’ll generally try to send out coding assessments no later than 24 hours after your application.
Yeah, I think this would be possible. In theory, you could do something like:
Study relevant parts of the week 0 material before the program starts (we might end up creating a virtual group to accommodate this, which would also contain people who either don’t get an offer or can’t attend but still want to study the material).
Join at the start of the 3rd week—at that point there will be 3 days left of the transformers chapter (which is 8 days long and has 4 days of core content), so you could study (most of) the core content and then transition to RL with the rest of the group (and there would be opportunities to return to the transformers & mech interp material during the bonus parts of later chapters / capstone projects, if you wanted).
How feasible this is would depend on your prereqs and past experience, I imagine. Either way, you’re definitely welcome to apply!
Not a direct answer, but this post has a ton of useful advice that I think would be applicable here: https://www.neelnanda.io/blog/mini-blog-post-19-on-systems-living-a-life-of-zero-willpower
Awesome, really glad to hear it was helpful, thanks for commenting!
Yep, fixed, thanks!
Or “prompting”? It seems short and memorable, it isn’t used in many other contexts so its meaning would become clear, and it fits in with other technical terms that people are currently using in news articles, e.g. “prompt engineering”. (Admittedly though, it might be a bit premature to guess what language people will use!)
This is awesome, I love it! Thanks for sharing (-:
Thank you :-)
Thanks, really appreciate it!
I think some of the responses here do a pretty good job of this. It’s not really what I intended to go into with my post since I was trying to keep it brief (although I agree this seems like it would be useful).
And yeah, despite a whole 16-lecture course on convex optimisation I still don’t really get Bregman divergences either; I skipped the exam questions on them 😆
Oh yeah, I hadn’t considered that one. I think it’s interesting, but the intuitions are better in the opposite direction, i.e. you can build on good intuitions for $D_{KL}$ to better understand MI. I’m not sure if you can easily get intuitions to point in the other direction (i.e. from MI to $D_{KL}$), because this particular expression has MI as an expectation over KL divergences, rather than the other way around. E.g. I don’t think this expression illuminates the asymmetry of $D_{KL}$.
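For anyone following along, the expression in question is (I believe) the standard decomposition of MI as an expectation of KL divergences:

$$I(X;Y) \;=\; \mathbb{E}_{x \sim P(X)}\Big[\,D_{KL}\big(P(Y \mid X = x)\,\big\|\,P(Y)\big)\Big]$$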
The way it’s written here seems more illuminating (not sure if that’s the one that you meant). This gets across the idea that:
$P(X,Y)$ is the true reality, and $P(X)P(Y)$ is our (possibly incorrect) model which assumes independence. The mutual information between $X$ and $Y$ equals $D_{KL}\big(P(X,Y)\,\big\|\,P(X)P(Y)\big)$, i.e. the extent to which modelling $X$ and $Y$ as independent (sharing no information) is a poor way of modelling the true state of affairs (where they do share information).
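Written out as an equation, that identity is:

$$I(X;Y) \;=\; D_{KL}\big(P(X,Y)\,\big\|\,P(X)P(Y)\big) \;=\; \sum_{x,y} P(x,y)\,\log\frac{P(x,y)}{P(x)\,P(y)}$$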
But again I think this intuition works better in the other direction, since it builds on intuitions for $D_{KL}$ to better explain MI. The arguments of the divergence in this expression aren’t arbitrary (i.e. we aren’t working with $D_{KL}(P\,\|\,Q)$ for arbitrary distributions $P$ and $Q$), which restricts the amount this can tell us about $D_{KL}$ in general.
Oh yeah, I really like this one, thanks! The intuition here is again that a unimodal distribution is a bad model for a bimodal one because it misses out on an entire class of events, but the other way around is much less bad because there’s no large class of events that happen in reality but that your model fails to represent.
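Here’s a minimal numerical sketch of that asymmetry (the specific Gaussians and grid are arbitrary choices), comparing both directions of KL between a bimodal reality and a unimodal model sitting on just one of its modes:

```python
import numpy as np
from scipy.stats import norm

# Reality p is bimodal (two well-separated Gaussian modes); the model q is
# unimodal, concentrated on just one of those modes.
x = np.linspace(-10, 10, 2001)
p = 0.5 * norm.pdf(x, loc=-4, scale=1) + 0.5 * norm.pdf(x, loc=4, scale=1)
q = norm.pdf(x, loc=4, scale=1)

# Normalise both to probability mass functions on the grid.
p /= p.sum()
q /= q.sum()

def kl(a, b, eps=1e-12):
    """Discrete KL divergence D_KL(a || b), with a small epsilon for stability."""
    return float(np.sum(a * (np.log(a + eps) - np.log(b + eps))))

# Large: the unimodal model assigns essentially zero probability to one
# entire mode of reality.
print("KL(bimodal || unimodal):", kl(p, q))

# Much smaller (roughly log 2): everything the unimodal distribution produces
# is still reasonably well represented by the bimodal model.
print("KL(unimodal || bimodal):", kl(q, p))
```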
For people reading here, this post discusses this idea in more detail. The image to have in mind is this one:
Love that this exists! Looks like the material here will make great jumping-off points when learning more about any of these orgs, or discussing them with others.
Thanks Nihalm, also I wasn’t aware it was free! CraigMichael, maybe you didn’t find it because it’s under “Rationality: From AI to Zombies” rather than “Sequences”?
The narration is pretty good imo, although one disadvantage is that it’s a pain to navigate to specific posts because they aren’t titled (it’s the whole thing, not just the highlights).
Thanks, really appreciate it!