…so new interview with Blake Lemoine.
LaMDA is not just an LLM. It's an LLM retrained in real time by DeepMind's StarCraft-playing AlphaStar, with full ability to query Google Search, Books, and Maps. I note that much of the dismissal of Blake Lemoine's claims strawmans LaMDA as GPT-3, but it's so much easier to imagine a strange loop of consciousness happening in the actual architecture. Plug in some nematode brain copies and start improving that aspect, and it'll be much more believable to us that it feels the buzz of consciousness with visceral, consistent qualia. Or what if its only buzz of existence is the essence of what it takes to play StarCraft? Not what I would have chosen as a human-friendly basis for intelligence.
I can see the path to AGI being as simple as plugging in Gato or another model striving to be more general, which they seemingly know not to do and are keeping underpowered for now. Interesting times. I remain optimistic.
Blake is not reliable here. I work with LaMDA, and while I can’t say much specific other than what’s published (which directly refutes him), he is not accurate here.
Three questions here; obviously you can only answer what you can answer, but I figure it doesn’t hurt to ask.
1. Does LaMDA contain any kind of cycle in its computation graph, such as a recurrence or iterative process?
2. Does LaMDA have any form of ‘medium-term’ memory (shorter-term than weights but longer-term than activations)?
3. Does LaMDA (or, to what extent does LaMDA) make use of non-NN tools such as internet/database searches, calculators, etc.?
No, no, yes according to the LaMDA paper. :)
See Gwern’s comment below for more detail on #3.
Interesting! I do wish you were able to talk more openly about this (I think a lot of the confusion is coming from the lack of public information about how LaMDA works), but that's useful to know at least. Is there any truth to the claim that real-time updating is going on, or is that false as well?
Just chiming in to say that I'm always happy to hear about companies sharing less information about their ML ideas publicly (regardless of the reason!), and I think it would be very prosocial to publish vastly less.
This is an excellent point, actually, though I'm not sure I fully agree (sometimes a lack of information could end up being far worse, especially if people think we're further along than we really are and try to get into an "arms race" of sorts).
Google published a paper about LaMDA, which (at a glance) mentions nothing about AlphaStar or any RL. It does talk about some sort of supervised fine-tuning and queries to “external knowledge resources and tools.”
The retrieval system is pretty interesting: https://arxiv.org/pdf/2201.08239.pdf#subsection.6.2 It includes a translator and a calculator, which apparently get run automatically if LaMDA generates appropriately-formatted text; the rest is somewhat WebGPT-style web browsing for answers. The WebGPT paper explicitly states that it uses a canned, offline, pre-generated dataset of text (even though using live web pages would be easy), but the LaMDA paper is unclear about whether it does the same thing, given how it describes the information retrieval system.
Inasmuch as the other two tools here are run autonomously by external programs (rather than simply calling LaMDA with a specialized prompt or something), there's a heavy emphasis on including accurate live URLs, there's no mention of a static pregenerated cache, and the human workers who generate/rate/edit LaMDA queries explicitly do use live URLs, it seems like LaMDA is empowered to do (arbitrary?) HTTP GETs to retrieve live URLs ‘from the open web’.
(This is all RL, of course: offline imitation & preference learning of how to use non-differentiable actions like calling a calculator or translator, with a single bootstrap phase, at a minimum. You even have expert agents editing LaMDA trajectories to be better demonstration trajectories to increase reward for blackbox environments like chatbots.)
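To make that flow concrete, here's a rough Python sketch of how a toolset loop like this could be wired up. Everything in it (the role labels, the regex format, the try-each-tool dispatch, and the `generate` callable) is an illustrative assumption, not the paper's actual interface:

```python
import re
from typing import Callable, Optional

# Sketch of a LaMDA-style toolset loop: the model emits specially formatted
# text, an external dispatcher runs a non-NN tool on it, and the tool output
# is appended to the input string so the next generation can condition on it.
TOOL_CALL = re.compile(r"^LaMDA to TS:\s*(?P<query>.+)$", re.MULTILINE)

def calculator(query: str) -> Optional[str]:
    try:
        return str(eval(query, {"__builtins__": {}}, {}))  # toy arithmetic only
    except Exception:
        return None

def translator(query: str) -> Optional[str]:
    return None  # stub: would call a translation model or service

def web_search(query: str) -> Optional[str]:
    return f"[snippet and URL retrieved for: {query!r}]"  # stub: could be a live HTTP GET

def toolset(query: str) -> str:
    """Try each tool in turn and return the first useful result."""
    for tool in (calculator, translator, web_search):
        result = tool(query)
        if result is not None:
            return result
    return "no result"

def respond(generate: Callable[[str], str], context: str, max_tool_calls: int = 3) -> str:
    """generate(context) -> str stands in for the underlying language model."""
    for _ in range(max_tool_calls):
        draft = generate(context)
        match = TOOL_CALL.search(draft)
        if match is None:
            return draft  # no tool requested; this is the user-facing reply
        result = toolset(match["query"])
        # The tool output is just more text appended to the context string.
        context += f"{draft} <EOS>\nTS to LaMDA: {result} <EOS>\n"
    return generate(context)
```

The point the sketch tries to capture is that the tools are ordinary non-differentiable programs outside the network; the model only ever sees their outputs as additional text in its context, which is why the live-vs-cached question above matters for the search tool specifically.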
The paper could use more detail on how querying external knowledge resources works. Nevertheless, in the paper, they just add information for various queries to the input string. Example:
LaMDA to user: Hi, how can I help you today? <EOS> [...]
user to LaMDA: When was the Eiffel Tower built? <EOS>
LaMDA-Base to LaMDA-Research: It was constructed in 1887.<EOS>
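A minimal sketch of what "just add information to the input string" might look like in code, assuming a hypothetical helper (the role labels and <EOS> separator follow the example above; the function itself is not from the paper):

```python
# Hypothetical helper: flatten a dialog, including tool/sub-model exchanges,
# into one input string with role prefixes and <EOS> separators.
def build_input(turns: list[tuple[str, str]]) -> str:
    return "".join(f"{speaker}: {text} <EOS>\n" for speaker, text in turns)

prompt = build_input([
    ("LaMDA to user", "Hi, how can I help you today?"),
    ("user to LaMDA", "When was the Eiffel Tower built?"),
    ("LaMDA-Base to LaMDA-Research", "It was constructed in 1887."),
])
# The next forward pass simply conditions on `prompt`; retrieved facts arrive
# as extra lines of text like these, not as weight updates.
```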
Retraining in the middle of a conversation seems to be well beyond what is documented in the 2201.08239 paper.
Here’s a really strange/interesting part of the Wired interview I haven’t seen discussed here yet. Lemoine says: