gwern comments on Making every researcher seek grants is a broken model

gwern 27 Jan 2024 16:48 UTC
7 points
4

Google’s private management thought they were being insufficiently productive, and fired 40 percent of the staff in October 2022

No, they cut staff costs by 40%. Not the same thing at all. (You would have noticed a lot more ex-DMers if they had fired half the place!)

They shut down the Edmonton office with Sutton*, so they clearly shed some people, but it’s not clear what percentage by headcount; because of how compensation works, a lot of that 40% could reflect, say, high stock grants and high share prices in the COVID tech bubble followed by slashing offered bonuses to return to baseline. The DM budget was unusually high for a while, I think, and interpreting the official public numbers is hard because of issues like the purchase of their extremely expensive London HQ.

Interpretation is also ambiguous because this was near-simultaneous with the merger with Google Brain; the general view is that GB was the one that lost out in the merger and was the one being dissolved due to insufficient productivity compared to DM. (And we do see a lot of ex-GBers now.)

* which is probably relevant to why Sutton is now partnering with Carmack’s Keen Technologies AGI startup.
- Gerald Monroe 27 Jan 2024 17:05 UTC
  2 points
  0
  Interpretation is also ambiguous because this was near-simultaneous with the merger with Google Brain; the general view is that GB was the one that lost out in the merger and was the one being dissolved due to insufficient productivity compared to DM. (And we do see a lot of ex-GBers now.)
  Thanks for the details. To me, the issue is that a large budget slash like this sounds pretty detrimental in EV. You could get this kind of savings during the Manhattan Project if you decided to cut 2 of the 3 enrichment methods, for example.
  Sure, we know in hindsight that all 3 methods worked, but the expected value of “bomb before the end of the war” drops a lot because everything is now riding on whichever method you kept.
  I would assume DeepMind is now going to be focused on massive transformers and has much less to spare for any other routes.
  This also, like you said, sends out many ‘B team’ members who still know almost everything the people not fired know, spreading the knowledge around to all the competition. (Imagine if the Manhattan Project staff who were fired were able to join the Axis powers. They would be bringing with them strategically relevant info, even if none of them were the most talented physicists.)
  - gwern 31 Jan 2024 1:35 UTC
    7 points
    0
    It depends on what that ‘40% staff cost’ means, really. Was it just accounting shenanigans related to RSUs and GOOG stock fluctuations? Then it means pretty much nothing of interest to us here at LW. Did it come from shedding a few superstars with multi-million-dollar compensation packages? Hard to say, depends on how much you think superstars matter at this point compared to researchers. Could be a very big deal: I remain convinced that search for LLMs may be the Next Big Thing and everyone who is reinventing RL from scratch for LLMs is botching the job, and so a few superstar researchers leaving DM could be critical. (But maybe you think the opposite because it’s now all about big pressgangs of researchers whipping a model into shape.) Did it come from shedding a lot of lower-level people who are obscure and unheard of? Inverse of the former.
    
    If the cut is inflated by Edmonton people getting the axe, then I personally would consider this cut to be irrelevant: I have been largely unimpressed by their work, and I think Sutton’s ‘Edmonton plan’ or whatever he was calling it is not an interesting line of work compared to more mainstream RL scaling approaches. (In general, I think Sutton has completely missed the boat on deep learning & especially DL scaling. I realize the irony of saying this about the author of “The Bitter Lesson”, but if you look at his actual work, he’s committed to basically antiquated model-free tweaks and small models, rather than the future of large-scale model-based DRL—like all of his stuff on continual learning is a waste of time, when scaling just plain solves that!)