For what it’s worth, I was thoroughly underwhelmed by Gato, to the point of feeling confused what the paper was even trying to demonstrate.
I’m not the only ML researcher who had this reaction. In the Eleuther discord server, I said “i don’t get what i’m supposed to take away from this gato paper,” and responses from regulars included
“nothing, this was 3 years over-due”
“Yep. I didn’t update much on this paper. I think the ‘general’ in the title is making people panic lol” (with two “this” reacts)
Or see this tweet. I’m not trying to convince you by saying “lots of people agree with me!”, but I think this may be useful context.
A key thing to remember when evaluating Gato is that it was trained on data from many RL models that were themselves very impressive. So there are 2 very different questions we can ask:
1. Does Gato successfully distill a large number of learned RL policies into a single, small collection of params?
2. Does Gato do anything except distillation? Is there significant beneficial transfer between tasks or data types? Is Gato any more of a “generalist agent” than, like, a big cloud storage bucket with all of those RL models in it, and a little script that lets you pick which one to load and run?
And the answers are a pretty clear, stark “yes” and “no,” respectively.
For #2, note that every time the paper investigates transfer, it gets results that are mostly or entirely negative (see Figs 9 and 17). For example, including stuff like text data makes Gato seem more sexily “generalist” but does not actually seem to help anything—it’s like uploading a (low-quality) LM to the same cloud bucket as the RL policies. It just sits there.
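To make the “cloud bucket plus a little script” comparison concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `PolicyBucket` class, the task names, and the stand-in policies are illustrations, not anything taken from the Gato paper.

```python
from typing import Callable, Dict, Sequence

# A "policy" here is just a function from an observation to an action.
# Real RL policies would be loaded from checkpoints; these are stand-ins.
Policy = Callable[[Sequence[float]], int]


class PolicyBucket:
    """Hypothetical stand-in for a storage bucket of per-task expert policies."""

    def __init__(self) -> None:
        self._policies: Dict[str, Policy] = {}

    def upload(self, task: str, policy: Policy) -> None:
        self._policies[task] = policy

    def act(self, task: str, observation: Sequence[float]) -> int:
        # The "little script": pick the policy by task name and run it.
        # No parameters are shared, so nothing transfers between tasks.
        if task not in self._policies:
            raise KeyError(f"no policy for {task!r}; a novel task still needs finetuning")
        return self._policies[task](observation)


bucket = PolicyBucket()
bucket.upload("atari_pong", lambda obs: 0 if obs[0] < 0.5 else 1)
bucket.upload("block_stacking", lambda obs: 2)

print(bucket.act("atari_pong", [0.9]))
```

The question in #2 is whether a single set of weights buys anything that this lookup table does not.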
In the particular case of the robot stacking experiment, I don’t think your read is accurate, for reasons related to the above. Neither the transfer to real robotics, nor the effectiveness of offline finetuning, are new to Gato—the researchers are sticking as close as they can to what was done in Lee et al 2022, which used the same stacking task + offline finetuning + real robots, and getting (I think?) broadly similar results. That is, this is yet another success of distillation, without a clear value-add beyond distillation.
In the specific case of Lee et al’s “Skill Generalization” task, it’s important to note that the “expert” line is not reflective of “SOTA RL expert models.”
The stacking task is partitioned here (over object shapes/colors) into train and test subsets. The “expert” is trained only on the train subset, and then Lee et al (and the Gato authors) investigate models that are additionally tuned on the test subset in some way or other. So the “expert” is really a baseline here, and the task consists of trying to beat it.
(This distinction is made somewhat clearer in an appendix of the Gato paper: see Fig. 17, and note that the “expert” lines there match the “Dataset” lines from Fig. 3 in Lee et al 2022.)
FWIW I agree with this take & basically said as much in my post; Gato is about what I would have expected given past progress. I think people are right to freak out now about oncoming AGI, but I think they should have been freaking out already, and Gato just had a sufficiently sexy title and abstract. It’s like how people should have been freaking out about COVID early on but only actually started freaking out when hospitals started getting crowded in their own country.
As for the transfer, I would actually have been a bit surprised if there was significant positive transfer given the small number of tasks trained on and the small model size. I’m curious to hear if there was negative transfer though and if so how much. I think the reason why it being a unified agent matters is that we should expect significant positive transfer to happen eventually as we scale up the model and train it longer on more tasks. Do you not?
Sure, this might happen.
But remember, to train “a Gato,” we have to first train all the RL policies that generate its training data. So we have access to all of them too. Instead of training Gato, we could just find the one policy that seems closest to the target task, and spend all our compute on just finetuning it. (Yes, related tasks transfer—and the most related tasks transfer most!)
This approach doesn’t have to spend any compute on the “train Gato” step before finetuning, which gives it a head start. Plus, the individual policy models are generally much smaller than Gato, so they take less compute per step.
Would this work? In the case of the Lee et al robot problem, yes (this is roughly what Lee et al originally did, albeit with various caveats). In general, I don’t know, but this is the baseline that Gato should be comparing itself against.
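As a rough illustration of the head-start argument, here is a back-of-the-envelope sketch in Python. All numbers are made up, and the cost model (cost per step proportional to parameter count) is a crude assumption of mine, not something from the paper. The cost of training the expert policies is left out because both routes pay it.

```python
from dataclasses import dataclass


@dataclass
class Route:
    name: str
    params: float          # model parameter count (hypothetical)
    pretrain_steps: float  # extra steps spent before finetuning
    finetune_steps: float  # steps spent on the target task

    def total_cost(self) -> float:
        # Crude assumption: cost per step scales linearly with parameter count.
        return self.params * (self.pretrain_steps + self.finetune_steps)


# All numbers below are invented for illustration only.
single_policy = Route("finetune nearest expert", params=1e8,
                      pretrain_steps=0, finetune_steps=1e5)
generalist = Route("train generalist, then finetune", params=1e9,
                   pretrain_steps=1e6, finetune_steps=1e5)

ratio = generalist.total_cost() / single_policy.total_cost()
print(f"generalist route costs about {ratio:.0f}x the single-policy route")
```

Under these invented numbers the generalist route has to make up a large compute gap through transfer before it reaches the frontier; with different numbers the gap shrinks or grows, which is why the comparison needs to be run empirically.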
The question isn’t “will it improve with scale?”—it’s 2022, anything worth doing improves with scale—but “will it ever reach the Pareto frontier? will I ever have a reason to do it?”
As an ML practitioner, it feels like the paper is telling me, “hey, think of a thing you can already do. What if I told you a way to do the same thing, equally well, with an extra step in the middle?” Like, uh, sure, but . . . why?
By contrast, when I read papers like AlphaGo, BERT, CLIP, OpenAI diffusion, Chinchilla . . . this is a type of paper where I say, “holy shit, this Fucking Works™, this moves the Pareto frontier.” In several of these cases I went out and immediately used the method in the real world and reaped great rewards.
IMO, the “generalist agent” framing is misleading, insofar as it obscures this second-best quality of Gato. It’s not really any more an “agent” than my hypothetical cloud drive with a bunch of SOTA models on it. Prompting Gato is the equivalent of picking a file from the drive; if I want to do a novel task, I still have to finetune, just as I would with the drive. (A real AGI, even a weak one, would know how to finetune itself, or do the equivalent.)
We are not talking about an autonomous thing; we’re still in the world where there’s a human practitioner and “Gato” is one method they can use or not use. And I don’t see why I would want to use it.
No, you don’t have to, nor do you have guaranteed access, nor would you necessarily want to use them rather than Gato if you did. As Daniel points out, this is obviously untrue of all of the datasets it’s simply doing self-supervised learning on (how did we ‘train the RL policy’ for photographs?). It is also not true of it because it’s off-policy and offline: the experts could be human, or they could be the output of non-RL algorithms which are infeasible to run much like large search processes (eg chess endgame tables) or brittle non-generalizable expert-hand-engineered algorithms, or they could be RL policies you don’t have direct access to (because they’ve bitrotten or their owners won’t let you), or even RL policies which no longer exist because the agents were deleted but their data remains, or they could be RL policies from an oracle setting where you can’t run the original policy in the meaningful real world context (eg in robotics sim2real where you train the expert with oracle access to the simulation’s ground truth to get a good source of demonstrations, but at the end you need a policy which doesn’t use that oracle so you can run it in a real robot) or more broadly any kind of meta-learning context where you have data from RL policies for some problems in a family of problems and want to induce general solving, or they are filtered high-reward episodes from large numbers of attempts by brute force dumb (even random) agents where you trivially have ‘access to all of them’ but that is useless, or… Those RL policies may also not be better than a Gato or DT to begin with, because imitation learning can exceed observed experts and the ‘RL policies’ here might be, say, random baselines which merely have good coverage of the state-space. Plus, nothing at all stops Decision Transformer from doing its own exploration (planning was already demonstrated by DT/Trajectory Transformer, and there’s been work afterwards like Online Decision Transformer).
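One of the cases above, filtered high-reward data from dumb random agents, can be sketched in a few lines of Python. This is a toy illustration under my own assumptions (a made-up one-step task with 4 states and 4 actions, and a tabular majority-vote imitator), not a Decision Transformer and not anything from the Gato paper.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy task: in each of 4 states exactly one of 4 actions pays off.
BEST_ACTION = {0: 2, 1: 0, 2: 3, 3: 1}


def random_agent_logs(n: int, rng: random.Random):
    """Log (state, action, reward) triples from a uniformly random agent."""
    logs = []
    for _ in range(n):
        state = rng.randrange(4)
        action = rng.randrange(4)
        reward = 1.0 if action == BEST_ACTION[state] else 0.0
        logs.append((state, action, reward))
    return logs


def clone_filtered(logs, min_reward: float = 1.0):
    """Keep only high-reward samples, then imitate them by majority vote."""
    counts = defaultdict(Counter)
    for state, action, reward in logs:
        if reward >= min_reward:
            counts[state][action] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}


logs = random_agent_logs(2000, random.Random(0))
policy = clone_filtered(logs)

# Average reward of the random data source versus the cloned policy.
behaviour_return = sum(r for _, _, r in logs) / len(logs)
cloned_return = sum(policy[s] == BEST_ACTION[s] for s in BEST_ACTION) / len(BEST_ACTION)
print(behaviour_return, cloned_return)
```

Here the imitator ends up better than the agent that generated its data, and having “access” to that random agent would be no substitute for the trained model.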
I thought some of the “experts” Gato was trained on were not from-scratch models but rather humans—e.g. images and text generated by humans.
Relatedly, instead of using a model as the “expert” couldn’t you use a human demonstrator? Like, suppose you are training it to control a drone flying through a warehouse. Couldn’t you have humans fly the drones for a bit and then have it train on those demonstrations?
This is false if significant transfer/generalization starts to happen, right? A drive full of a bunch of SOTA models, plus a rule for deciding what to use, is worse than Gato to the extent that Gato is able to generalize few-shot or zero-shot to new tasks and/or insofar as Gato gets gains from transfer.
EDIT: Meta-comment: I think we are partially just talking past each other here. For example, you think that the question is ‘will it ever reach the Pareto frontier,’ which is definitely not the question I care about.
Meta-comment of my own: I’m going to have to tap out of this conversation after this comment. I appreciate that you’re asking questions in good faith, and this isn’t your fault, but I find this type of exchange stressful and tiring to conduct.
Specifically, I’m writing at the level of exactness/explicitness that I normally expect in research conversations, but it seems like that is not enough here to avoid misunderstandings. It’s tough for me to find the right level of explicitness while avoiding the urge to put thousands of very pedantic words in every comment, just in case.
Re: non-RL training data.
Above, I used “RL policies” as a casual synecdoche for “sources of Gato training data,” for reasons similar to the reasons that this post by Oliver Sourbut focuses on RL/control.
Yes, Gato had other sources of training data, but (1) the RL/control results are the ones everyone is talking about, and (2) the paper shows that the RL/control training data is driving those results (they get even better RL/control outcomes when they drop the other data sources).
Re: gains from transfer.
Yes, if Gato outperforms a particular RL/control policy that generated training data for it, then having Gato is better than merely having that policy, in the case where you want to do its target task.
However, training a Gato is not the only way of reaping gains from transfer. Every time we finetune any model, or use multi-task training, we are reaping gains from transfer. The literature (incl. this paper) robustly shows that we get the biggest gains from transfer when transferring between similar tasks, while distant or unrelated tasks yield no transfer or even negative transfer.
So you can imagine a spectrum ranging from
1. “pretrain only on one very related task” (i.e. finetuning a single narrow task model), to
2. “pretrain on a collection of similar tasks” (i.e. multi-task pretraining followed by finetuning), to
3. “pretrain on every task, even those where you expect no or negative transfer” (i.e. Gato)
The difference between Gato (3) and ordinary multi-task pretraining (2) is that, where the latter would only train with a few closely related tasks, Gato also trains on many other less related tasks.
It would be cool if this helped, and sometimes it does help, as in this paper about training on many modalities at once for multi-modal learning with small transformers. But this is not what the Gato authors found—indeed it’s basically the opposite of what they found.
We could use a bigger model in the hope that will get us some gains from distant transfer (and there is some evidence that this will help), but with the same resources, we could also restrict ourselves to less-irrelevant data and then train a smaller (or same-sized) model on more of it. Gato is at one extreme end of this spectrum, and everything suggests the optimum is somewhere in the interior.
Oliver’s post, which I basically agree with, has more details on the transfer results.
A single network is solving 600 different tasks spanning different areas. 100+ of the tasks are solved at 100% human performance. Let that sink in.
While not a breakthrough in arbitrary scalable generality, the fact that so many tasks can be fitted into one architecture is surprising and novel. For many real-life applications, being good at 100-1000 tasks makes an AI general enough to be deployed as an error-tolerant robot, say in a warehouse.
The main point imho is that this architecture may be enough to be scaled (10-1000x parameters) in a few years to a useful proto-AGI product.
But the mere fact that one network may be useful for many tasks at once has been extensively investigated since the 1990s.
Having just seen this paper, while still recovering from DALL-E 2 and PaLM, and then re-reading Eliezer’s now incredibly prescient “dying with dignity” post, I really have to ask: What are we supposed to do? I myself work on ML in a fairly boring corporate capacity, and when reading these papers and posts I get a massive urge to drop everything and do something equivalent to a PhD in alignment. But the timelines that now seem possible make that seem like a totally pointless exercise: I’d be writing my dissertation as nanobots liquefy my body into raw materials for paperclip manufacturing. Do we just carry on and hope someone somewhere stumbles upon a miracle solution and we happen to have enough heads in the space to implement it? Do I tell my partner we can’t have kids because the probability they will be born into some unknowable hellscape is far too high? Do I become a prepper and move to a cabin in the woods? I’m actually at a loss on how to proceed, and frankly Eliezer’s article made things muddier for me.
As I understand it, the empirical ML alignment community is bottlenecked on good ML engineers, and so people with your stated background without any further training are potentially very valuable in alignment!
I agree. You can even get career advice here at https://www.aisafetysupport.org/resources/career-coaching
Or feel free to message me for a short call. I bet you could get paid to do alignment work, so it’s worth looking into at least.
What’s the best job board for that kind of job?
You should take a look at Anthropic and Redwood’s careers pages for engineer roles!
Lots of other positions on the 80,000 Hours job board (“Jobs in AI safety & policy”) too! E.g., from the Fund for Alignment Research and Aligned AI. But note that the 80,000 Hours job board lists positions from OpenAI, DeepMind, Baidu, etc. which aren’t actually alignment-related.
Things are a lot easier for me, given that I know that I couldn’t contribute to Alignment research directly, and the other option, monetarily, is at least not bottlenecked by money so much as prime talent. A doctor unfortunate enough to reside in the Third World, who happens to have emigration plans and a large increase in absolute discretionary income that will only pay off in tens of years has little scope to do more than signal boost.
As such, I intend to live the rest of my life primarily as a hedge against the world in which AGI isn’t imminent in the coming decade or two, and do all the usual things humans do, like keeping a job, having fun, raising a family.
That’s despite the fact that I think it’s more likely than not that I or my kids won’t make it out of the 21st century, but at the least it’ll be a quick and painless death, with the dispassionate dispatch of a bulldozer running over an anthill, not any actual malice.
Outright sadism is unlikely to be a terminal or contingent goal for any AGI we make, however unaligned; and I doubt that the life expectancy of anyone on a planet rapidly being disassembled for parts will be large enough for serious suffering. In slower circumstances, such as an Aligned AI that only caters to the needs of a cabal of creators, leaving the rest of us to starve, I have enough confidence that I can make the end quick.
Thus, I’ve made my peace with likely joining the odd 97 billion anatomically modern humans in oblivion, plus another 8 or 9 concurrently departing with me, but it doesn’t really spark anxiety or despair. It’s good to be alive, and I probably wouldn’t prefer to have been born at any earlier a time in history. Hoping for the best and expecting the worst really, assuming your psyche can handle it.
Then again, I’m not you, and someone with a decent foundation in ML is also in the 0.01% of people who could feasibly make an impact in the time we have, and I selfishly hope that you can do what I never could. And if not, at least enjoy the time you have!
Thanks for the reflection, it is how a part of me feels (I usually never post on LessWrong, being just a lurker, but your comment inspired me a bit).
Actually, I do have some background that could, maybe, be useful in alignment, and I did just complete the AGISF program. Right now, I’m applying to some positions (particularly, I’m focusing now on the SERIMATS application, which is in an area where I may be differentially talented), and just honestly trying to do my best. After all, it would be outrageous if I could do something, but I simply did not.
But I recognize the possibility that I’m simply not good enough, and there is no way for me to actually do anything beyond just, as you said, signal boosting, so I can introduce more capable people into the field, while living my life and hoping that Humanity solves this.
But, if Humanity does not, well, it is what it is. There was the dream of success, and building a future Utopia, with future technology facilitated by aligned AI, but that may have been just that, a dream. Maybe alignment is unsolvable, and it is the natural order of any advanced civilization to destroy itself by its own AI. Or maybe alignment is solvable, but given the incentives of our world as they are, it was always a fact that unsafe AGI would be created before we would solve alignment.
Or maybe, we will solve alignment in the end, or we were all wrong about the risks from AI in the first place.
As for me, for now, I’m going to keep trying, keep studying, just because, if the world comes to an end, I don’t want to conclude that I could’ve done more. While hoping that I never have to wonder about that in the first place.
EDIT: To be clear, I’m not that sure about short timelines, in the sense that, insofar as I know (and I may be very, very wrong), the AGIs we are creating right now don’t seem to be very agentic, and it may be that creating agency from current techniques is much harder than creating general intelligence. But again, “not so sure” is something like a 20%-30% chance of timelines being really short, so the point mostly stands.
Develop a training set for alignment via brute force. We can’t defer alignment to the ubernerds. If enough ordinary people (millions? tens of millions?) contribute billions or trillions of tokens, maybe we can increase the chance of alignment. It’s almost like we need to offer prayers of kindness and love to the future AGI: writing alignment essays of kindness that are posted to reddit, or videos extolling the virtue of love that are uploaded to youtube.
What’s your plan for signal boosting?
Primarily talking about it in rat-adjacent communities that are both open to such discussion, but also contain a large number of people who aren’t immersed in AI X-risk. A pertinent example would be either the SSC subreddit or its spinoff, The Motte.
The ideal target is someone with the intellectual curiosity to want to know more about such matters, while also not having encountered them beyond glancing summaries. Below that threshold, people are hard to sway because they’re going off the usual pop culture tropes about AI, and significantly above that, you have the LW crowd, and me trying to teach them anything novel would be trying to teach my grandma to suck eggs.
If I can find people who are mildly aware of such possibilities, then it’s easier to dispel any particular misconceptions they have, such as the tendency to anthropomorphize AI, the question of “why not shut it off” etc. Showing them the blistering pace of progress in ML is a reliable eye-opener in my experience.
Engaging with naysayers is also effective; there’s a certain stentorian type who not only has said misunderstandings, but loudly shares them to dismiss X-risk altogether. Dismantling such arguments is always good, even if the odds of convincing them are minimal. There’s always a crowd of undecided but curious people who are somewhat swayed.
There’s also the topic of automation-induced unemployment, which is what I usually bring up in medical circles that would otherwise be baffled by AI X-risk. That’s the most concrete and imminent danger any skilled professional faces, even if the current timelines indicate that the period between the widespread adoption of near-human AI and actual Superhuman AGI is going to be tiny.
That’s about as much as I can do. I don’t have the money to donate anything but pocket change, and my access to high-flying ML engineers is mostly restricted to this very forum. I’m acutely aware that I’m not good enough at math to produce original work in the field, so given those constraints, I consider it a victory if I can sway people wealthier and better positioned by virtue of living in the First World on the matter!
That seems like an excellent strategy and I’m glad someone is focusing on that. Would you be interested in chatting about this sometime?
Absolutely! I haven’t used the messaging features here much, but I’m open to a conversation in any medium of your choice.
Regarding the arguments for doom, they are quite logical, but they don’t quite have the same confidence as e.g. an argument that if you are in a burning, collapsing building, your life is in peril. There are a few too many profound unknowns that have a bearing on the consequences of superhuman AI, to know that the default outcome really is the equivalent of a paperclip maximizer.
However, I definitely agree that that is a very logical scenario, and also that the human race (or the portion of it that works on AI) is taking a huge gamble by pushing towards superhuman AI, without making its central priority that this superhuman AI is ‘friendly’ or ‘aligned’.
In that regard, I keep saying that the best plan I have seen, is June Ku’s “meta-ethical AI”, which falls into the category of AI proposals that construct an overall goal by aggregating idealized versions of the current goals of all human individuals. I want to make a post about it, but I haven’t had time… So I would suggest, check it out, and see if you can contribute technically or critically or by spreading awareness of this kind of proposal.
I think “train a single transformer to imitate the performance of lots of narrow models” is perhaps the least satisfying way to get to a general agent. The fact that this works is disturbing; I shudder thinking of what is possible with an actual Theory of Deep Learning, and not the bag of rusty tools this field consists of right now. With our luck, I wouldn’t be surprised to find that somehow grafting MCTS to this model gets DeepMind all the way there to human-level.
Nevertheless… maybe now would be a good time to buy Google and Nvidia stock? There’s no sense in dying poor...
How long would you expect to be able to enjoy your newfound fortune in Google stock before death? I could maybe see an AGI starting to disassemble the planet before the stock even has a chance to rise all that much...
In the modal future, no time at all, we probably all die at the same time before any economic effects are felt. That was partly a joke, and partly said to maximize value in those futures where we do get some time of economic growth before catastrophe.
Curious: Do you have a considered model of the regulatory bottleneck on short-term AGI inputs into the economy?
I mention in a footnote above that Eliezer thinks that, in the worlds where we could make lots of money from Google stock shortly before doomsday, the developed countries would already have implemented something like open borders in order to pick up those trillion dollar bills from labor mobility. If we can’t take human-level natural-general-intelligence inputs into our economy, we can’t take subhuman-level AGI economic inputs either, Eliezer argues.
But I’m trying not to defer, and I don’t know what I object-level-think … that argument seems pretty solid I guess.
I don’t expect Google’s stock price to rise because AGI actually impacts the economy in meaningful ways in the short term; rather, I think that demonstrating AGI will finally make investors understand the eventual impact, and the stock will go crazy, just like Tesla’s, even if the actual fundamental metrics aren’t that impressive. Economies as a whole might not be competent enough to take AGI inputs, but specific companies certainly are, and I don’t doubt that Google will be trying to solve every problem they can find with AGI, just like DeepMind is now trying to do for medicine. You can bet on the spread between Google and the S&P 500 to take advantage of the inefficiency of the economy as a whole in integrating AGI.
What’s the EA UCLA AI Timelines Workshop? Might be interested in running something similar at Georgia Tech.
Ah EA UCLA just wrote a post about it at We Ran an AI Timelines Retreat—EA Forum (effectivealtruism.org)