AI researcher
Jessica Rumbelow
Not yet, but there’s no reason why it wouldn’t be possible. You can imagine microscope AI for language models. It’s on our to-do list.
Good to know. Thanks!
Yep: aside from running forward prop n times to generate an output of length n, we can just optimise the mean probability of the target tokens at each position in the output; it’s already implemented in the code. It does take way longer to find optimal completions, though.
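In case it helps, here’s a minimal sketch of what that looks like: optimising a continuous prompt so that a frozen GPT-2 (via HuggingFace transformers) assigns high mean log-probability to every token of a fixed multi-token target, scored in a single teacher-forced forward pass rather than n generation steps. The model choice, target string, prompt length, and hyperparameters are illustrative assumptions, not the actual repo code, and it skips the step of projecting the optimised embeddings back onto real tokens.

```python
# Illustrative sketch: optimise a continuous "soft prompt" so a frozen GPT-2
# assigns high mean log-probability to each token of a fixed target completion.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").to(device).eval()
for p in model.parameters():
    p.requires_grad_(False)  # the model stays frozen; only the prompt is trained

target = " the quick brown fox"  # assumed example target completion
target_ids = tok(target, return_tensors="pt").input_ids.to(device)  # (1, n)
n_prompt, n_target = 8, target_ids.shape[1]

# Learnable prompt embeddings, initialised near the embedding scale.
emb = model.get_input_embeddings()
prompt_embs = torch.nn.Parameter(
    0.02 * torch.randn(1, n_prompt, emb.weight.shape[1], device=device)
)
opt = torch.optim.Adam([prompt_embs], lr=0.1)
target_embs = emb(target_ids)  # (1, n, d), constant

for step in range(200):
    opt.zero_grad()
    # One forward pass over [prompt ; target]: the logits at position i
    # predict token i+1, so positions n_prompt-1 .. n_prompt+n-2 score the target.
    inputs = torch.cat([prompt_embs, target_embs], dim=1)
    logits = model(inputs_embeds=inputs).logits
    pred = logits[:, n_prompt - 1 : n_prompt - 1 + n_target, :]
    logp = torch.log_softmax(pred, dim=-1)
    token_logp = logp.gather(-1, target_ids.unsqueeze(-1)).squeeze(-1)
    loss = -token_logp.mean()  # maximise mean target log-probability
    loss.backward()
    opt.step()
```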
More detail on this phenomenon here: https://www.lesswrong.com/posts/aPeJE8bSo6rAFoLqg/solidgoldmagikarp-plus-prompt-generation
Yeah, I think it could be! I’m considering pursuing it after SERI-MATS. I’ll need a couple of cofounders.
Guardian AI (Misaligned systems are all around us.)
The Ground Truth Problem (Or, Why Evaluating Interpretability Methods Is Hard)
Why I’m Working On Model Agnostic Interpretability
“Being able to reorganise a question in the form of a model-appropriate game” seems like something we’ve already built a set of reasonable heuristics around: categorising different types of problems and their appropriate translations into ML-able tasks. There are well-established ML approaches to, e.g., image captioning, time-series prediction, audio segmentation, and so on. Is the bottleneck you’re concerned with the lack of breadth and granularity of these problem sets, OP? If so, can we mark progress (to some extent) by the number of problem sets we have robust ML translations for?
What’s an SCP?