Daniel Kokotajlo comments on ML is now automating parts of chip R&D. How big a deal is this?

Daniel Kokotajlo Aug 24, 2021, 9:41 AM
2 points
Thanks! As before, this was helpful & I have some follow-up questions. :) Feel free to not reply if you don’t want to.
1. Can verification be automated too, in the next 10 years?
2. Quantitatively, about how much time + money does a good version of this automated chip design save? E.g. “It normally takes 1 year to design a chip and 2 years to actually scale up production; this tech turns that 1 year into 1 month (when you include verification), for an overall time savings of 33%. As for cost, design is a small fraction of the cost (even a research team of hundreds for a year is nothing compared to the cost of a manufacturing line or whatever) so the effect is negligible.”
3. y = 2? That’s way lower y than I expected, especially considering that you “rebuff my original point that this isn’t that big of a deal.” A 2x improvement in 3 years is NOT a big deal, right? Isn’t that slightly slower than the historical rate of progress from e.g. Moore’s law etc.? Or are you saying it’s going to be a 2x improvement on top of the regular progress from other sources? Oh… maybe you are specifically talking about speed improvements rather than all-things-considered cost to train a model of a given size on a given dataset? It’s the latter that I’m interested in; I probably misspoke.
4. What is post-silicon fabrication? When I google it, it redirects to “post-silicon validation.” If creating the design and verifying it is the barrier to entry, then won’t this AI tech help reduce the barrier to entry since it automates the design part? I guess I just don’t understand your point 3.
5. “Thus, suppose you completely eliminate post-silicon fabrication times. Where would this extra time go? I highly doubt we would change our society-accepted cadence of hardware rotations. Most definitely, it would go right back into creating new designs—human brains.” I’m particularly keen to hear what you mean by this.
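For reference, the arithmetic behind questions 2 and 3 can be sketched as below. The 12-, 1- and 24-month figures are the hypothetical ones from the quoted example in question 2. The two-year doubling period is an assumed, commonly quoted reading of Moore's law and is not stated in the thread. The function names are illustrative only.

```python
def fraction_saved(design_before, design_after, rest_of_cycle):
    """Share of the total cycle saved when only the design phase shrinks."""
    total_before = design_before + rest_of_cycle
    return (design_before - design_after) / total_before


def growth_over(years, doubling_period_years):
    """Improvement factor after `years` of steady exponential progress."""
    return 2 ** (years / doubling_period_years)


# Hypothetical figures from question 2: 12 months of design cut to 1 month,
# followed by 24 months of scaling up production.
saved = fraction_saved(design_before=12, design_after=1, rest_of_cycle=24)

# Assumption: performance doubles every 2 years (one common reading of Moore's law).
baseline = growth_over(years=3, doubling_period_years=2.0)

print(f"time saved: {saved:.1%}")
print(f"assumed 3-year baseline: {baseline:.2f}x")
```

Under these assumptions the design speedup saves 11 of 36 months, roughly the one third used in the example, and a 2x gain over 3 years falls short of the assumed baseline of about 2.8x.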
- ljh2 Aug 24, 2021, 4:50 PM
  5 points
  1. Definitely not in the next 10 years. In some sense, that’s what formal verification is all about. There’s progress, but from my perspective, the growth is very linear.
    The tools that I have seen (e.g. out of the RISC-V Summit, or DVCon) are difficult to adopt, and there’s a large inertia you have to overcome since many big Semi companies already have their own custom flows built up over decades.
    I think it’ll take a young plucky startup to adopt and push for the usage of these tools—but even then, you need the talent to learn these tools, and frankly hardware is filled with old people.
  2. I think we have different interpretations of “design”. You consider chip design in the aggregate, but I’m subdividing it into multiple areas. There are several aspects of chip design, some of which can be automated, but I’m claiming never to such an extreme extent as, e.g., 1 month. This technology in particular really only helps in determining where to place “buildings”, not in actually building the “buildings” themselves. While valuable, there’s only so much “placing” can do.
    My view is that the time and money spent won’t go down; they’ll just be reallocated, which may or may not increase quality.
  3. Sorry, I guess I meant the former, where I incorporate every source, at least on the hardware side. Were you to isolate just the ML chip placement gain… again, hard to say. It’s just indicative of a release of resources, but who knows if those extra resources can/will be properly directed to something better?
  4. + 5.: Sorry! I guess I meant post-design fabrication, which is really just a term I came up with to mean “shipping it to TSMC once you’re done designing”. A better term, in hindsight, is just “tapeout”, but I was hesitant to use the term time-to-tapeout since that feels cumulative rather than isolating the one period of time I mean.
    
    See: https://anysilicon.com/verification-validation-testing-asic-soc-designs-differences/
    
    What I mean is that this technology addresses the “Physical Design” blob of time shown at the link above. Notice that the whole critical path to “Shipping” (getting the chips out there) goes “Verification” --> “Tapeout” --> “Validation”/Testing.
    
    Suppose the “Physical Design” time gets eliminated. These freed resources will most definitely go into “RTL Design” and not “Verification”. That’s what I mean by “creating new designs”: it gives us more time to think of cool stuff, but again, it depends on whether that stuff is good or not.
    
    Why will extra resources not be devoted to verification? That’s a whole can of worms. Industry inertia, overlapping talent skillset, business models, design complexity—but I guess most of all I’d say inertia.
    
    On inertia—as I said, this cadence takes about 1-2 years. We are so so so very accustomed to this cadence, I can’t see it changing barring massive changes in our needs. If you told me you could reduce our verification time from 1 year to 11 months, I’d just spend that extra month iterating on my RTL design instead, or use that extra time to run more simulations, because 11 vs. 12 months doesn’t mean much.
    
    If you told me I could reduce it from 1 year --> 6 months? I’d maaaaybe throw money at you. It has potential to double my income, but that depends.
    
    Imagine new iPhones came out every 6 months instead of yearly. Isn’t that super weird? Well… That depends on how well Apple can market to me that I absolutely need it.
    
    Perhaps that differs for AI use cases… but even there, I’d argue this yearly cadence is ingrained already.
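The reallocation argument in the reply above can be sketched as a toy model. The 12-, 11- and 6-month verification figures come from the reply; the 6-month RTL budget, the stage names, and both function names are hypothetical and exist only for illustration. It also assumes that income scales with the number of releases per year, which the reply itself hedges with "that depends".

```python
def reallocate(schedule, from_stage, to_stage, months):
    """Move freed months between stages; the overall total is unchanged."""
    updated = dict(schedule)
    moved = min(months, updated[from_stage])
    updated[from_stage] -= moved
    updated[to_stage] += moved
    return updated


def releases_per_year(cadence_months):
    """Number of product releases per year at a given release cadence."""
    return 12 / cadence_months


# Hypothetical schedule in months; only the 12 for verification is from the thread.
schedule = {"rtl_design": 6, "verification": 12}

# Verification drops from 12 to 11 months and the spare month goes to RTL design.
after = reallocate(schedule, "verification", "rtl_design", months=1)
print(after, sum(after.values()) == sum(schedule.values()))

# Only a change in cadence itself changes output: yearly vs. every 6 months.
print(releases_per_year(12), releases_per_year(6))
```

In this sketch a one-month saving changes where the time is spent but not the total, while halving the cadence doubles the releases per year, which is the case where the reply says it might "throw money at you".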