Report Overview
"From Chips to Systems: How AI Is Revolutionizing Compute and Infrastructure" is an industry report published by William Blair on September 18, 2024. The report provides a comprehensive analysis of how artificial intelligence, particularly generative AI, is transforming computing architectures, processor design, and systems engineering.
Key Insight: AI represents the next generational shift in computing, following mainframes, PCs, mobile, and cloud. This shift requires a fundamental rethinking of computing architectures, moving from chips to comprehensive systems where integration of hardware and software plays a critical role in performance improvements.
Key Insights Summary
It's Not Just Better Chips, It's Better Systems
Traditional semiconductor companies are shifting focus from discrete chips to comprehensive computing systems. Companies like Nvidia now view themselves as builders of entire computers, where integration of chips, storage, networking, and software drives performance.
AI Represents the Next Generational Shift
AI follows prior shifts from mainframes to PCs in the 1980s and from PCs to mobile in the late 2000s. Although AI is still in its infancy, it is expected to create a massive, multitrillion-dollar market opportunity over the next decade.
Parallel Computing Takes Center Stage
The rise of AI marks a shift from serial computing (CPU) to parallel computing (GPU/AI accelerator) in data centers. Roughly 30% of new data center chips in 2023 were GPUs, a share expected to surpass 50% in the coming years.
Vertical Integration Keeps Moore's Law Alive
As traditional transistor scaling slows, semiconductor companies are developing vertically integrated computing systems that combine expertise in compute, storage, networking, and software to continue driving performance improvements.
Compute Systems Garner Higher Margins
As semiconductor companies expand their IP from chip design to systems engineering and software, they capture more value in the tech stack. Nvidia's CUDA software stack helps it achieve industry-leading gross margins in the mid-70% range.
Custom Chip Demand Highlights Verticalization
Hyperscale technology companies like Meta, AWS, Microsoft, and Google are designing their own chips, highlighting the benefits of building entire systems for specific use cases but also creating new competitive dynamics.
Content Overview
Introduction
AI, particularly generative AI, represents the next generational shift in computing. Similar to prior waves that took us from mainframes to PCs to mobile phones to cloud data centers, each shift has required a rethinking of computing architectures, processor design, and systems engineering.
Technical capabilities like parallel computing, system-on-chip (SoC) design, software-defined infrastructure, and systems science are not new, but AI has emerged as a new, large-scale workload that exploits all of these capabilities at once. This report examines how the rise of AI affects the computing layer and the entirety of data center infrastructure technology.
Key Takeaways
The report identifies several critical trends reshaping the computing landscape:
- Systems Over Chips: The core unit of compute is shifting from the chip to the broader computing system, with companies like Nvidia building entire computers rather than just chips.
- Parallel Computing Dominance: AI marks a shift from serial computing (CPU) to parallel computing (GPU/AI accelerator) in data centers.
- Vertical Integration: Semiconductor companies are vertically integrating to continue driving performance improvements as Moore's Law slows.
- Higher Margins: Compute systems with integrated software command higher margins than chips alone.
- Custom Chip Demand: Hyperscalers are increasingly designing their own chips, creating both opportunities and competitive challenges.
A Brief History of Computing
The report traces the evolution of computing from the invention of the transistor in 1947 through the mainframe era, PC revolution, mobile computing, and now the AI wave. Each shift has brought changes in architectures, business models, and market leaders.
In the PC era, the CPU became the primary volume driver for semiconductors, with Intel emerging as the dominant player. The mobile era shifted focus to application processors (APs) and system-on-chip (SoC) designs. Now, the AI wave is transforming data centers and driving demand for parallel processing capabilities.
We Need More Processing!
AI requires handling and processing massive datasets, with computational demands that far exceed traditional computing tasks. The scale of AI workloads puts increased demand on core processing capabilities, memory, storage, networking, and power.
The difference between serial processing (CPU) and parallel processing (GPU) is fundamental to understanding AI computing needs. While CPUs excel at sequential tasks with low latency, GPUs provide much higher throughput for parallelizable workloads, making them ideal for AI training and inference.
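As a rough illustration of this latency-versus-throughput trade-off, here is a minimal Python sketch (not from the report) that contrasts an element-by-element loop, the serial CPU style, with the same operation applied to a whole array at once; NumPy vectorization stands in here for the data-parallel execution model of a GPU, and the array size and operation are arbitrary choices for illustration.

```python
import time
import numpy as np

x = np.random.rand(10_000_000)  # 10M elements; size chosen only for illustration

# Serial, CPU-style: process one element at a time, in order.
t0 = time.perf_counter()
serial = [2.0 * v + 1.0 for v in x]
t_serial = time.perf_counter() - t0

# Data-parallel, GPU-style: apply the same operation to every element at once.
t0 = time.perf_counter()
parallel = 2.0 * x + 1.0
t_parallel = time.perf_counter() - t0

print(f"serial loop: {t_serial:.2f} s")
print(f"vectorized:  {t_parallel:.4f} s")
```

AI training and inference are dominated by exactly this kind of work, dense linear algebra applied uniformly across enormous tensors, which is why parallel throughput matters more than single-thread latency.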
As AI models grow larger and more complex, their computational requirements increase exponentially. For example, while the original Transformer model, released in 2017, required roughly 7,400 petaFLOPs to train, Google's Gemini Ultra required an estimated 50 billion petaFLOPs.
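Putting those two figures on the same scale (50 billion petaFLOPs is 5 × 10^25 FLOPs; 7,400 petaFLOPs is 7.4 × 10^18 FLOPs) gives a back-of-the-envelope growth factor:

```latex
\frac{5 \times 10^{25}\ \text{FLOPs (Gemini Ultra)}}{7.4 \times 10^{18}\ \text{FLOPs (2017 Transformer)}} \approx 6.8 \times 10^{6}
```

That is roughly a seven-million-fold increase in training compute in about six years.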
It's Not Just a Chip, It's a System
As performance improvements from transistor scaling become harder and more expensive, semiconductor vendors are shifting focus to broader systems architectures. The key unit of compute is no longer the processor, but rather a system that connects multiple processors with storage and networking, integrated with critical software.
This shift is driving a pendulum swing from horizontal to vertical integration in the semiconductor industry. Companies like Nvidia are building complete computing systems rather than just chips, creating more substantial technical moats and capturing more value in the tech stack.
Amdahl's Law, which states that the speedup of a task using multiple processors is constrained by the portion that cannot be parallelized, has pushed semiconductor companies to focus on system-level optimizations and software that can maximize parallelization benefits.
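In closed form, Amdahl's Law gives the speedup as S(p, N) = 1 / ((1 − p) + p/N), where p is the parallelizable fraction of the work and N is the number of processors. A minimal Python sketch of the bound (the 95% figure below is an illustrative assumption, not a number from the report):

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Upper bound on speedup for a workload whose fraction p is
    parallelizable, running on n processors (Amdahl's Law)."""
    return 1.0 / ((1.0 - p) + p / n)

# Even with 1,000 processors, a workload that is 95% parallelizable
# cannot speed up by more than ~20x: the 5% serial portion dominates.
print(amdahl_speedup(0.95, 1_000))    # ~19.6
print(amdahl_speedup(0.95, 100_000))  # ~20.0 (asymptote is 1 / 0.05)
```

This is why system-level engineering that shrinks the serial portion, through faster interconnects, smarter scheduling, and better software, can unlock more performance than simply adding processors.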
Designing AI Clusters
AI data centers might look similar to traditional data centers, but they differ considerably in hardware, software, power, and cooling needs. An AI cluster is a group of interconnected servers dedicated to specific AI workloads.
Key components of AI clusters include (a hypothetical node-spec sketch follows the list):
- Compute: High-performance parallel processing chips (GPUs or specialized ASICs)
- Memory/Storage: Specialized memory hierarchies optimized for AI workloads
- Networking: High-bandwidth, low-latency connectivity between servers and components
- Power/Energy: Significantly higher power requirements than traditional data centers
- Cooling: Advanced cooling solutions to handle high-density computing
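To make the component list concrete, the sketch below expresses a hypothetical cluster-node specification as a Python dataclass; every value in it (8 GPUs per node, a 400 Gb/s fabric, liquid cooling) is an illustrative assumption reflecting common AI-cluster configurations, not a figure from the report.

```python
from dataclasses import dataclass

@dataclass
class AIClusterNodeSpec:
    """Hypothetical sketch of one node in an AI cluster; all values illustrative."""
    gpus_per_node: int = 8            # compute: parallel accelerators per server
    gpu_memory_gb: int = 80           # memory: high-bandwidth memory per GPU
    fabric: str = "InfiniBand"        # networking: or high-speed Ethernet
    fabric_bandwidth_gbps: int = 400  # networking: per-link bandwidth
    node_power_kw: float = 10.0       # power: far above a traditional server
    cooling: str = "liquid"           # cooling: air is often insufficient at this density

spec = AIClusterNodeSpec()
print(f"{spec.gpus_per_node}x GPU node on {spec.fabric_bandwidth_gbps} Gb/s {spec.fabric}")
```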
The debate between Ethernet and InfiniBand for AI networking is ongoing, with each technology having its advantages and trade-offs for different AI workloads and cluster sizes.
GPUs, Software, and Structurally Higher Margins
As semiconductor companies move up the tech stack to optimize system performance and capture more value, they have been able to drive better gross margins. The integration of software and systems design IP into chip solutions allows for more differentiation and value creation.
Nvidia is a prime example, with its gross margins increasing from the low- to mid-60% range to the mid-70% range. While some of this improvement comes from pricing power, the integration of valuable software like CUDA creates sustainable margin advantages even as core GPU technology becomes more commoditized.
Conclusion
AI is refocusing the entire semiconductor ecosystem on system-level efficiencies for data centers. Discrete chip manufacturers are moving up the stack and embracing verticalization to remain competitive.
While the rapid pace of spending on GPUs and AI infrastructure may create near-term volatility, the long-term impact of AI on computing is profound. As technology leaders have noted, the risk of underinvesting in AI is dramatically greater than the risk of overinvesting.
The largest companies remain committed to this AI investment cycle, seeing it as existential to remain competitive in this next technology wave.
Note: The above is only a summary of the report content. The complete document contains extensive data, charts, and detailed analysis. We recommend downloading the full PDF for in-depth reading.