phd student in comp neuroscience @ mpi brain research frankfurt. https://twitter.com/janhkirchner and https://universalprior.substack.com/
Jan
Hi, thanks for the response! I apologize, the “Left as an exercise” line was mine, and written kind of tongue-in-cheek. The rough sketch of the proposition we had in the initial draft did not spell out sufficiently clearly what it was I want to demonstrate here and was also (as you point out correctly) wrong in the way it was stated. That wasted people’s time and I feel pretty bad about it. Mea culpa.
I think/hope the current version of the statement is more complete and less wrong. (Although I also wouldn’t be shocked if there are mistakes in there). Regarding your points:
The limit now shows up on both sides of the equation (as it should)! The dependence on on the RHS does actually kind of drop away at some point, but I’m not showing that here. I’d previously just sloppily substituted “chose as a large number” and then rewritten the proposition in the way indicated at the end of the Note for Proposition 2. That’s the way these large deviation principles are typically used.
Yeah, that should have been an rather than a . Sorry, sloppy.
True. Thinking more about it now, perhaps framing the proposition in terms of “bridges” was a confusing choice; if I revisit this post again (in a month or so 🤦♂️) I will work on cleaning that up.
Hmm there was a bunch of back and forth on this point even before the first version of the post, with @Michael Oesterle and @metasemi arguing what you are arguing. My motivation for calling the token the state is that A) the math gets easier/cleaner that way and B) it matches my geometric intuitions. In particular, if I have a first-order dynamical system then is the state, not the trajectory of states . In this situation, the dynamics of the system only depend on the current state (that’s because it’s a first-order system). When we move to higher-order systems, , then the state is still just , but the dynamics of the system depend not only on the current state but also on the “direction from which we entered it”. That’s the first derivative (in a time-continuous system) or the previous state (in a time-discrete system).
At least I think that’s what’s going on. If someone makes a compelling argument that defuses my argument then I’m happy to concede!
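To make the state-versus-history point a bit more concrete, here is a minimal Python sketch. The update rules and the names `first_order_step` and `second_order_step` are made up for illustration; they are not the transition rule from the post. The only thing the sketch is meant to show is that in a second-order time-discrete system the state is still a single value, while the next value also depends on where we came from.

```python
def first_order_step(x):
    # First-order system: the next state depends only on the current state x.
    return 0.5 * x + 1.0


def second_order_step(x, x_prev):
    # Second-order system: the next state depends on the current state x
    # and on the previous state x_prev, i.e. on the direction of entry.
    return x + 0.5 * (x - x_prev)


# Two trajectories that sit at the same current state (1.0) but arrived
# from different directions.
from_below = second_order_step(1.0, 0.0)
from_above = second_order_step(1.0, 2.0)

# The first-order system cannot tell the two histories apart.
same_either_way = first_order_step(1.0)
```

In this toy example `from_below` and `from_above` differ even though the current state is identical, which is the sense in which the "direction" enters the dynamics without becoming part of the state.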
Thanks for pointing this out! This argument made it into the revised version. I think because of finite precision it’s reasonable to assume that such an always exists in practice (if we also assume that the probability gets rounded to something < 1).
Technically correct, thanks for pointing that out! This comment (and the ones like it) was the motivation for introducing the “non-degenerate” requirement into the text. In practice, the proposition holds pretty well, although I agree it would be nice to have a deeper understanding of when to expect the transition rule to be “non-degenerate”.
Thanks for sharing your thoughts Shos! :)
Hmmm good point. I originally made that decision because loading the image from the server was actually kind of slow. But then I figured out asynchronicity, so could totally change it… I’ll see if I find some time later today to push an update! (to make an ‘all vs all’ mode in addition to the ‘King of the hill’)
Hi Jennifer!
Awesome, thank you for the thoughtful comment! The links are super interesting, reminds me of some of the research in empirical aesthetics I read forever ago.
On the topic of circular preferences: It turns out that the type of reward model I am training here handles non-transitive preferences in a “sensible” fashion. In particular, if you’re “non-circular on average” (i.e. you only make accidental “mistakes” in your rating) then the model averages that out. And if you consistently have a loopy utility function, then the reward model will map all the elements of the loop onto the same reward value.
Finally: Yes, totally, feel free to send me the guest ID either here or via DM!
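Here is a rough sketch of that behavior. Big caveat: this is not the reward model from the post. It is a tabular Bradley-Terry model (one scalar reward per item, fitted by plain gradient ascent) that I am swapping in because it fits in a few lines. The function name `fit_rewards`, the item labels and the learning-rate settings are all assumptions made for illustration.

```python
import math


def fit_rewards(items, comparisons, steps=2000, lr=0.1):
    """Fit one scalar reward per item from (winner, loser) pairs.

    Uses gradient ascent on the Bradley-Terry log-likelihood,
    where P(winner beats loser) = sigmoid(r_winner - r_loser).
    """
    # Start from deliberately unequal rewards so that any collapse
    # onto a common value is driven by the data, not by the init.
    reward = {item: float(i) for i, item in enumerate(items)}
    for _ in range(steps):
        grad = {item: 0.0 for item in items}
        for winner, loser in comparisons:
            p_win = 1.0 / (1.0 + math.exp(reward[loser] - reward[winner]))
            grad[winner] += 1.0 - p_win
            grad[loser] -= 1.0 - p_win
        for item in items:
            reward[item] += lr * grad[item]
    return reward


items = ["A", "B", "C"]

# Consistently loopy preferences: A > B, B > C, C > A.
loopy = fit_rewards(items, [("A", "B"), ("B", "C"), ("C", "A")])

# "Non-circular on average": A > B > C with occasional mistakes.
noisy_data = (
    [("A", "B")] * 9 + [("B", "A")] * 1
    + [("B", "C")] * 9 + [("C", "B")] * 1
)
noisy = fit_rewards(items, noisy_data)
```

Under these assumptions the three rewards in `loopy` should end up (numerically) equal, while `noisy` should recover the ordering A > B > C with the occasional mistakes averaged out.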
Hi Erik! Thank you for the careful read, this is awesome!
Regarding proposition 1 - I think you’re right, that counter-example disproves the proposition. The proposition we were actually going for was , i.e. the probability without the end of the bridge! I’ll fix this in the post.
Regarding proposition II—janus had the same intuition and I tried to explain it with the following argument: When the distance between tokens becomes large enough, then eventually all bridges between the first token and an arbitrary second token end up with approximately the same “cost”. At that point, only the prior likelihood of the token will decide which token gets sampled. So Proposition II implies something like , or that in the limit “the probability of the most likely sequence ending in will be (when appropriately normalized) proportional to the probability of ”, which seems sensible? (assuming something like ergodicity). Although I’m now becoming a bit suspicious about the sign of the exponent, perhaps there is a “log” or a minus missing on the RHS… I’ll think about that a bit more.
Uhhh exciting! Thanks for sharing!
Huh, thanks for spotting that! Yes, should totally be ELK 😀 Fixed it.
This work by Michael Aird and Justin Shovelain might also be relevant: “Using vector fields to visualise preferences and make them consistent”
And I have a post where I demonstrate that reward modeling can extract utility functions from non-transitive preference orderings: “Inferring utility functions from locally non-transitive preferences”
(Extremely cool project ideas btw)
Hey Ben! :) Thanks for the comment and the careful reading!
Yes, we only added the missing arXiv papers after clustering, but then we repeat the dimensionality reduction and show that the original clustering still holds up even with the new papers (Figure 4 bottom right). I think that’s pretty neat (especially since the dimensionality reduction doesn’t “know” about the clustering) but of course the clusters might look slightly different if we also re-run k-means on the extended dataset.
There’s an important caveat here:
The visual stimuli are presented 8 degrees over the visual field for 100ms followed by a 100ms grey mask as in a standard rapid serial visual presentation (RSVP) task.
I’d be willing to bet that if you give the macaque more than 100ms they’ll get it right. That’s at least how it is for humans!
(Not trying to shift the goalpost, it’s a cool result! Just pointing at the next step.)
Great points, thanks for the comment! :) I agree that there are potentially some very low-hanging fruits. I could even imagine that some of these methods work better in artificial networks than in biological networks (less noise, more controlled environment).
But I believe one of the major bottlenecks might be that the weights and activations of an artificial neural network are just so difficult to access? Putting the weights and activations of a large model like GPT-3 under the microscope requires impressive hardware (running forward passes, storing the activations, transforming everything into a useful form, …) and then there are so many parameters to look at.
Giving researchers structured access to the model via a research API could solve a lot of those difficulties and appears like something that totally should exist (although there is of course the danger of accelerating progress on the capabilities side also).
Great point! And thanks for the references :)
I’ll change your background to Computational Cognitive Science in the table! (unless you object or think a different field is even more appropriate)
Thank you for the comment and the questions! :)
This is not clear from how we wrote the paper but we actually do the clustering in the full 768-dimensional space! If you look closely at the clustering plot you can see that the clusters are slightly overlapping. That would be impossible with k-means in 2D, since in that setting membership is determined by distance from the 2D centroid.
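A toy illustration of why that overlap can happen, as a sketch only: I am using 3D points instead of 768-dimensional embeddings, hand-picked centroids instead of fitted k-means centroids, and "drop the last coordinate" as a stand-in for the actual dimensionality reduction. The names `nearest` and `project` are hypothetical helpers.

```python
import math


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def project(point):
    """Toy stand-in for dimensionality reduction: keep the first two coordinates."""
    return point[:2]


# Two centroids that are far apart in the full space,
# but close together once the third coordinate is dropped.
centroids_full = [(0.0, 0.0, 0.0), (1.0, 0.0, 5.0)]
point = (0.9, 0.0, 0.5)

# Membership as assigned in the full space (this is what the clustering uses).
label_full = nearest(point, centroids_full)

# Membership the same point would get from distances in the 2D plot alone.
label_2d = nearest(project(point), [project(c) for c in centroids_full])
```

Here the point belongs to the first cluster in the full space but lies closer to the second centroid in the projection, so in a 2D plot colored by the full-space labels the two clusters appear to overlap.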
Oh true, I completely overlooked that! (if I keep collecting mistakes like this I’ll soon have enough for a “My mistakes” page)
Yes, good point! I had that in an earlier draft and then removed it for simplicity and for the other argument you’re making!
This sounds right to me! In particular, I just (re-)discovered this old post by Yudkowsky and this newer post by Alex Flint that both go a lot deeper on the topic. I think the optimal control perspective is a nice complement to those posts and if I find the time to look more into this then that work is probably the right direction.
Neuroscience and Natural Abstractions
Similarities in structure and function abound in biology: individual neurons that respond exclusively to stimuli of a particular orientation exist in animals from Drosophila (Strother et al. 2017) via pigeons (Li et al. 2007) and turtles (Ammermueller et al. 1995) to macaques (De Valois et al. 1982). The universality of major functional response classes in biology suggests that the neural systems underlying information processing in biology might be highly stereotyped (Van Hooser 2007, Scholl et al. 2013). In line with this hypothesis, a wide range of neural phenomena emerge as optimal solutions to their respective functional requirements (Poggio 1981, Wolf 2003, Todorov 2004, Gardner 2019). Intriguingly, recent studies on artificial neural networks that approach human-level performance reveal surprising similarities between the representations that emerge in artificial and biological brains (Kriegeskorte 2015, Yamins et al. 2016, Zhuang et al. 2020).
Despite the commonalities across different animal species, there is also substantial variability (Van Hooser 2007). Prominent examples of functional neural structures that are present in some animals but absent in others are the orientation pinwheel in the primary visual cortex (Meng et al. 2012), synaptic clustering with respect to orientation selectivity (Kirchner et al. 2021), and the distinct three-layered cortex in reptiles (Tosches et al. 2018). These examples demonstrate that while general organization principles might be universal, the details of exactly how and where in the brain the principles manifest are highly dependent on anatomical factors (Keil et al. 2012, Kirchner et al. 2021), genetic lineage (Tosches et al. 2018), and ecological factors (Roeth et al. 2021). Thus, the universality hypothesis as applied to biological systems does not imply perfect replication of a given feature across all instances of the system. Rather, it suggests that there are broad principles or abstractions that underlie the function of cognitive systems, which are conserved across different species and contexts.