Did the industry predict these problems and their consequences?
People in the industry were well aware of these limitations, long before they actually became critical. However, whether solutions and workarounds would be found was a matter of much greater uncertainty.
Robert Dennard et al’s seminal 1974 paper Design of Ion-Implanted MOSFETs with Very Small Physical Dimensions, that described the very favourable scaling properties of MOSFET transistors and gave rise to the term “Dennard scaling”, explicitly mentions the scaling limitations posed by subthreshold leakage:
One area in which the device characteristics fail to scale is in the subthreshold or weak inversion region of the turn-on characteristic. (…) In order to design devices for operation at room temperature and above, one must accept the fact that the subthreshold behavior does not scale as desired. This nonscaling property of the subthreshold characteristic is of particular concern to miniature dynamic memory circuits which require low source-to-drain leakage currents.
This influential 1995 paper by Davari, Dennard and Shahidi presents guidelines for transistor scaling for the years up to 2004. This paper contains a subsection titled “Performance/Power Tradeoff and Nonscalability of the Threshold Voltage”, which explains the problems described above in a lot more detail than I have. The paper also mentions tunnelling through the gate oxide layer, concluding on both issues that they would remain relatively unproblematic up until 2004.
Subthreshold leakage is textbook material. The main textbook I have consulted is Digital Integrated Circuits by Jan Rabaey, and I have compared some aspects of the 1995 and 2003 editions. Both contained this sentence in the “History of…” chapter:
Interestingly enough, power consumption concerns are rapidly becoming dominant in CMOS design as well, and this time there does not seem to be a new technology around the corner to alleviate the problem.
Regarding gate oxide leakage, this comparatively very accessible 2007 article from the IEEE spectrum recounts the story of how engineers at Intel and elsewhere have developed transistors that use high-kappa dielectrics as a way of maintaining the shrinking gate oxide’s capacitance even as its thickness would cease to be reduced to prevent excessive tunnelling. According to this article, work on such solutions began in the mid-1990s, and Intel eventually launched new chips that made use of this technology in 2007. The main impression this article leaves me with is that the problem was very easy to foresee, but that finding out which solutions might work was a matter of extensive tinkering with highly unpredictable results.
The leakage issues are all mentioned in 2001 ITRS roadmap, the earliest edition that is available online. One example from the Executive Summary:
For low power logic (mainly for portable applications), the main issue is low leakage current, which is absolutely necessary in order to extend battery life. Device performance is then maximized according to the low leakage current requirements. Gate leakage current must be controlled, as well as sub-threshold leakage and junction leakage, including band-to-band tunneling. Preliminary analysis indicates that, balancing the gate leakage control requirements against performance requirements, high kappa may be required for the gate dielectric by around the year 2005.
From reading the reports, it is hard to make out whether the implications of these issues were correctly understood, and I have had to draw on a lot of other literature to get a better sense of where the industry stood on this. Getting a hold of earlier editions (the 1999 one in particular) and talking to industry insiders might shed a lot more light on the weight that was given to the different issues that were flagged as part of the “Red Brick Wall” I’ve mentioned above, i.e. as issues that had no known manufacturable solutions (I did not receive an answer to my inquiry about this from the contact person at the ITRS website). The Executive Summary of the 2001 edition states:
The 1999 ITRS warned that there was a wide range of solutions needed but unavailable to meet the technology requirements corresponding to 100 nm technology node. The 1999 ITRS edition also reported the presence of a potential “Red Brick Wall” or “100 nm Wall” (as indicated by the red cells in the technology requirements) that, by 2005, could block further scaling as predicted by Moore’s Law. However, technological progress continues to accelerate. In the process of compiling information for 2001 ITRS, it was clarified that this “Red Brick Wall” could be reached as early as 2003.
Two accessiblearticles from 2000 give a clearer impression of how this Red Brick Wall was perceived in the industry at the time. Both particularly emphasise gate oxide leakage.
As I have mentioned at the beginning, the reports up to 2005 contained highly overoptimistic projections for on-chip frequency and supply voltage, which became dramatically more pessimistic in the 2007 edition. The reports clearly state, however, that these numbers are meant as targets and are not necessarily “on the road to sure implementation”, especially where it has been highlighted that solutions were needed and not yet known. They can therefore not necessarily serve as a clear indictment of the ITRS’ predictive powers, but I remain puzzled by some of their projections and comments on these before 2007. Getting clarification on this from industry insiders was the next thing I had planned for this project before we paused it.
Specifically, tables 4c and 4d in the Overall Roadmap Technology Characteristics, found in a subsection of the Executive Summary titled Performance of Packaged Chips, contain on-chip frequency forecasts in MHz, which became dramatically more pessimistic in 2007 than they had been in the previous 3 editions. A footnote in the 2007 edition states:
after 2007, the PIDS model fundamental reduction rate of ~ −14.7% for the transistor delay results in an individual transistor frequency performance rate increase of ~17.2% per year growth. In the 2005 roadmap, the trend of the on-chip frequency was also increased at the same rate of the maximum transistor performance through 2022. Although the 17% transistor performance trend target is continued in the PIDS TWG outlook, the Design TWG has revised the long-range on-chip frequency trend to be only about 8% growth rate per year. This is to reflect recent on-chip frequency slowing trends and anticipated speed-power design tradeoffs to manage a maximum 200 watts/chip affordable power management tradeoff.
Later editions seem to have reduced the expected scaling factor even further (1.04 in the 2011 edition), but there were also changes made to the metric employed, so I am not sure how to interpret the numbers (though I would expect the scaling factor to be unaffected by those changes).
Relatedly, a paragraph in the System Drivers document titled Maximum on-chip (global) clock frequency states that the on-chip clock frequency would not continue scaling at a factor of 2 per generation for several reasons. The 2001 edition states 3 reasons for this, the 2003 and 2005 edition state 4. But only in 2007 was the limitation from maximum allowable power dissipation added to this list of reasons. This strikes me as very puzzling.
The paragraph, as it appears in the 2007 edition, is (emphasis added):
Maximum on-chip (global) clock frequency—(...) Through the 2000 ITRS, the MPU maximum on-chip clock frequency was modeled to increase by a factor of 2 per generation. Of this, approximately 1.4× was historically realized by device scaling (17%/year improvement in CV/I metric); the other 1.4× was obtained by reduction in number of logic stages in a pipeline stage (e.g., equivalent of 32 fanout-of-4 inverter (FO4 INV) delays13 at 180 nm, going to 24–26 FO4 INV delays at 130 nm). As noted in the 2001 ITRS, there are several reasons why this historical trend could not continue: 1) well-formed clock pulses cannot be generated with period below 6–8 FO4 INV delays; 2) there is increased overhead (diminishing returns) in pipelining (2–3 FO4 INV delays per flip-flop, 1–1.5 FO4 INV delays per pulse-mode latch); 3) thermal envelopes imposed by affordable packaging discourage very deep pipelining, and 4) architectural and circuit innovations increasingly defer the impact of worsening interconnect RCs (relative to devices) rather than contribute directly to frequency improvements. Recent editions of the ITRS flattened the MPU clock period at 12 FO4 INV delays at 90 nm (a plot of historical MPU clock period data is provided online at public.itrs.net), so that clock frequencies advanced only with device performance in the absence of novel circuit and architectural approaches. In 2007, we recognize the additional limitation from maximum allowable power dissipation. Modern MPU platforms have stabilized maximum power dissipation at approximately 120W due to package cost, reliability, and cooling cost issues. With a flat power requirement, the updated MPU clock frequency model starts with 4.7 GHz in 2007 and is projected to increase by a factor of at most 1.25× per technology generation, despite aggressive development and deployment of low-power design techniques.
Finally, the Overall Roadmap Technology Characteristics tables 6a and 6b (found in a subsection titled Power Supply and Power Dissipation in the Executive Summary) contains projected values of the supply power (V_{dd}) which also became dramatically more pessimistic in the 2007 edition.
I have indicated my puzzlement at these points in an email I have sent out to a number of industry insiders, then asking:
Do the 3 revisions made to the roadmap in 2007 that I’ve pointed out reflect a failure of previous editions to predict the “breakdown in the serial speed version of Moore’s Law” and the relevant issues that would cause it? Or do they merely reflect the ambitiousness and aggressiveness of the targets that were set before admitting defeat became inevitable?
I have received some very kind replies to those emails, but most have focused on the technical reasons for the “breakdown” in Dennard scaling. The only comment I have received on this last question was from Robert Dennard, who sent me a particularly thoughtful email that came with 4 attachments (which mainly provided more technical detail on transistor design, however). At the end of his email, he wrote:
I cannot comment on wishful thinking vs hard facts. Predicting the future is difficult. Betting against Moore’s Law was often a losing game.
Texas Instruments quit way to early.
Indeed, which bets it is most rational to make depends on expected payoff ratios as well as on probability estimates. This distinction between targets and mere predictions complicates the question quite a bit.
This was an interesting project, it would be great to pick it up again.
(cont’d from previous comment)
Did the industry predict these problems and their consequences?
People in the industry were well aware of these limitations, long before they actually became critical. However, whether solutions and workarounds would be found was a matter of much greater uncertainty.
Robert Dennard et al’s seminal 1974 paper Design of Ion-Implanted MOSFETs with Very Small Physical Dimensions, that described the very favourable scaling properties of MOSFET transistors and gave rise to the term “Dennard scaling”, explicitly mentions the scaling limitations posed by subthreshold leakage:
This influential 1995 paper by Davari, Dennard and Shahidi presents guidelines for transistor scaling for the years up to 2004. This paper contains a subsection titled “Performance/Power Tradeoff and Nonscalability of the Threshold Voltage”, which explains the problems described above in a lot more detail than I have. The paper also mentions tunnelling through the gate oxide layer, concluding on both issues that they would remain relatively unproblematic up until 2004.
Subthreshold leakage is textbook material. The main textbook I have consulted is Digital Integrated Circuits by Jan Rabaey, and I have compared some aspects of the 1995 and 2003 editions. Both contained this sentence in the “History of…” chapter:
Regarding gate oxide leakage, this comparatively very accessible 2007 article from the IEEE spectrum recounts the story of how engineers at Intel and elsewhere have developed transistors that use high-kappa dielectrics as a way of maintaining the shrinking gate oxide’s capacitance even as its thickness would cease to be reduced to prevent excessive tunnelling. According to this article, work on such solutions began in the mid-1990s, and Intel eventually launched new chips that made use of this technology in 2007. The main impression this article leaves me with is that the problem was very easy to foresee, but that finding out which solutions might work was a matter of extensive tinkering with highly unpredictable results.
The leakage issues are all mentioned in 2001 ITRS roadmap, the earliest edition that is available online. One example from the Executive Summary:
From reading the reports, it is hard to make out whether the implications of these issues were correctly understood, and I have had to draw on a lot of other literature to get a better sense of where the industry stood on this. Getting a hold of earlier editions (the 1999 one in particular) and talking to industry insiders might shed a lot more light on the weight that was given to the different issues that were flagged as part of the “Red Brick Wall” I’ve mentioned above, i.e. as issues that had no known manufacturable solutions (I did not receive an answer to my inquiry about this from the contact person at the ITRS website). The Executive Summary of the 2001 edition states:
Two accessible articles from 2000 give a clearer impression of how this Red Brick Wall was perceived in the industry at the time. Both particularly emphasise gate oxide leakage.
(cont’d)
(cont’d from previous comment)
As I have mentioned at the beginning, the reports up to 2005 contained highly overoptimistic projections for on-chip frequency and supply voltage, which became dramatically more pessimistic in the 2007 edition. The reports clearly state, however, that these numbers are meant as targets and are not necessarily “on the road to sure implementation”, especially where it has been highlighted that solutions were needed and not yet known. They can therefore not necessarily serve as a clear indictment of the ITRS’ predictive powers, but I remain puzzled by some of their projections and comments on these before 2007. Getting clarification on this from industry insiders was the next thing I had planned for this project before we paused it.
Specifically, tables 4c and 4d in the Overall Roadmap Technology Characteristics, found in a subsection of the Executive Summary titled Performance of Packaged Chips, contain on-chip frequency forecasts in MHz, which became dramatically more pessimistic in 2007 than they had been in the previous 3 editions. A footnote in the 2007 edition states:
Later editions seem to have reduced the expected scaling factor even further (1.04 in the 2011 edition), but there were also changes made to the metric employed, so I am not sure how to interpret the numbers (though I would expect the scaling factor to be unaffected by those changes).
Relatedly, a paragraph in the System Drivers document titled Maximum on-chip (global) clock frequency states that the on-chip clock frequency would not continue scaling at a factor of 2 per generation for several reasons. The 2001 edition states 3 reasons for this, the 2003 and 2005 edition state 4. But only in 2007 was the limitation from maximum allowable power dissipation added to this list of reasons. This strikes me as very puzzling. The paragraph, as it appears in the 2007 edition, is (emphasis added):
Finally, the Overall Roadmap Technology Characteristics tables 6a and 6b (found in a subsection titled Power Supply and Power Dissipation in the Executive Summary) contains projected values of the supply power (V_{dd}) which also became dramatically more pessimistic in the 2007 edition.
I have indicated my puzzlement at these points in an email I have sent out to a number of industry insiders, then asking:
I have received some very kind replies to those emails, but most have focused on the technical reasons for the “breakdown” in Dennard scaling. The only comment I have received on this last question was from Robert Dennard, who sent me a particularly thoughtful email that came with 4 attachments (which mainly provided more technical detail on transistor design, however). At the end of his email, he wrote:
Indeed, which bets it is most rational to make depends on expected payoff ratios as well as on probability estimates. This distinction between targets and mere predictions complicates the question quite a bit.
This was an interesting project, it would be great to pick it up again.
Update: see also Erik DeBenedictis’ comments on this topic.