Interpretation is also ambiguous because this was near-simultaneous with the merger with Google Brain; the general view is that GB was the one that lost out in the merger and was the one being dissolved due to insufficient productivity compared to DM. (And we do see a lot of ex-GBers now.)
Thanks for the details. To me, the issue is that a large budget slash like this sounds pretty detrimental in EV terms. You could have gotten this kind of savings during the Manhattan Project by cutting 2 of the 3 enrichment methods, for example.
Sure, we know in hindsight that all 3 methods worked, but the expected value of "bomb before the end of the war" drops a lot when everything is riding on whichever method you kept.
I would assume DeepMind is now going to focus on massive transformers and has much less to spare for any other routes.
This also, like you said, sends out many 'B team' members who still know almost everything the people not fired know, spreading that knowledge around to all the competition. (Imagine if the Manhattan Project staff who were fired had been able to join the Axis powers. They would bring strategically relevant information with them, even if none of them were the most talented physicists.)
It depends on what that '40% staff cost' means, really. Was it just accounting shenanigans related to RSUs and GOOG stock fluctuations? Then it means pretty much nothing of interest to us here at LW. Did it come from shedding a few superstars with multi-million-dollar compensation packages? Hard to say; it depends on how much you think superstars matter at this point compared to rank-and-file researchers. It could be a very big deal: I remain convinced that search for LLMs may be the Next Big Thing and that everyone reinventing RL from scratch for LLMs is botching the job, so a few superstar researchers leaving DM could be critical. (But maybe you think the opposite, because it's now all about big press-gangs of researchers whipping a model into shape.) Did it come from shedding a lot of lower-level people who are obscure and unheard of? The inverse of the superstar case.
If the cut is inflated by Edmonton people getting the axe, then I personally would consider this cut to be irrelevant: I have been largely unimpressed by their work, and I think Sutton's 'Edmonton plan' or whatever he was calling it is not an interesting line of work compared to more mainstream RL scaling approaches. (In general, I think Sutton has completely missed the boat on deep learning, and especially on DL scaling. I realize the irony of saying this about the author of "The Bitter Lesson", but if you look at his actual work, he's committed to basically antiquated model-free tweaks and small models, rather than the future of large-scale model-based DRL; all of his stuff on continual learning, for example, is a waste of time when scaling just plain solves that!)