CS294-252: Architectures and Systems for Hyperscale Cloud Datacenters in the Era of Agentic AI

Fall 2025, UC Berkeley

Location: Tuesdays and Thursdays from 2pm-3:30pm in 405 Soda

Course Overview: Warehouse-Scale Computers (WSCs) host hyperscale cloud services relied on by billions of daily users and power the latest advances in AI/ML, data processing, and web services. While classical WSCs were built as homogeneous collections of servers and networking hardware, modern hardware scaling trends and exponential increases in demand for AI/ML compute have necessitated the introduction of specialized hardware in datacenter environments, including ML accelerators and ML “supercomputer pods”, SmartNICs, GPUs, and custom server SoCs. The challenge of designing these HW/SW systems is vast and ever-growing, but also critical to enabling the continued advancement of AI-powered applications.

This graduate-level course will explore two major themes:

  • How do we architect hardware-software systems at scale to support efficient, practical, AI-powered application pipelines, end-to-end? (i.e., more than just the math)
  • How can AI help us wrangle complexity in designing these HW/SW systems to meet exponential demand, from chip to datacenter and beyond?

Prerequisites: Students must satisfy the following requirements to enroll:

Completion of at least one of: CS252, CS262, CS268, EECS251.

OR

Completion of at least two of: CS152, EECS151, CS162, CS168.

Additionally, if you are an undergraduate, 5th-year master’s, or concurrent enrollment student, please fill out the following form to be considered for enrollment: https://forms.gle/qWfFdmeVUGK2PpJT9.

Calendar / Reading List

August 28
Intro to Warehouse-Scale Computers
Reading 1
L. Barroso, et al. The Datacenter as a Computer, Third Edition.

September 2
Datacenter-Wide Trends and Workloads, Pt. 1
Reading 1
S. Kanev, et al. Profiling a Warehouse-Scale Computer.
Reading 2
W. Su, et al. DCPerf: An Open-Source, Battle-Tested Performance Benchmark Suite for Datacenter Workloads.

September 11
Datacenter-Wide Trends and Workloads, Pt. 2
Reading 1
J. Dean, et al. The Tail at Scale. +
L. Barroso, et al. Attack of the Killer Microseconds.
Reading 2
K. Seemakhupt, et al. A Cloud-Scale Characterization of Remote Procedure Calls.

September 16
Accelerators in WSCs, Pt. 1
Reading 1
I. Magaki, et al. ASIC Clouds: Specializing the Datacenter.
Reading 2
N. Jouppi, et al. TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings.

September 23
Accelerators in WSCs, Pt. 3
Reading 1
M. D. Hill, et al. Accelerator-Level Parallelism. +
A. Saidi. Powering Amazon EC2: Deep dive on the AWS Nitro System.
Reading 2
S. Karandikar, et al. A Hardware Accelerator for Protocol Buffers.

September 25
Agile Hardware Design at Scale
Reading 1
P. Ranganathan, et al. Warehouse-Scale Video Acceleration: Co-design and Deployment in the Wild.
Reading 2
S. Karandikar, et al. FireSim: FPGA-Accelerated Cycle-Exact Scale-Out System Simulation in the Public Cloud.

September 30
Memory and Disaggregation, Pt. 1
Reading 1
J. Weiner, et al. TMO: Transparent Memory Offloading in Datacenters.
Reading 2
K. Zhao, et al. Contiguitas: The Pursuit of Physical Memory Contiguity in Datacenters.

October 7
Project Proposal Presentations, Pt. 1

October 9
Project Proposal Presentations, Pt. 2

October 14
Silent Data Corruption
Reading 1
H. D. Dixit, et al. Silent Data Corruptions at Scale.
Reading 2
P. H. Hochschild, et al. Cores that don’t count.

October 16
Memory and Disaggregation, Pt. 2
Reading 1
P. Duraisamy, et al. Towards an Adaptable Systems Architecture for Memory Tiering at Warehouse-Scale.
Reading 2
D. Berger, et al. Octopus: Scalable Low-Cost CXL Memory Pooling.

October 21
Server Design
Reading 1
G. Ayers, et al. Memory Hierarchy for Web Search.
Reading 2
A. Sriraman, et al. SoftSKU: Optimizing Server Architectures for Microservice Diversity @Scale.

October 23
Sustainability, Pt. 2
Reading 1
C. Elsworth, et al. Measuring the Environmental Impact of Delivering AI at Google Scale.
Reading 2
J. Wang, et al. Designing Cloud Servers for Lower Carbon.

October 30
Data Analytics, Pt. 1
Reading 1
A. Gonzalez, et al. Profiling Hyperscale Big Data Processing.

November 4
Project Lightning Status Updates

November 6
Data Analytics, Pt. 2
Reading 1
L. Wu, et al. Q100: The Architecture and Design of a Database Processing Unit.
Reading 2
D. B. Johnston and A. Caldwell. AWS Redshift reimagined / AQUA Accelerator.

November 11
Holiday/No Class

November 13
Operating Systems
Reading 1
J. T. Humphries, et al. ghOSt: Fast & Flexible User-Space Delegation of Linux Scheduling.
Reading 2
J. T. Humphries, et al. A Case Against (Most) Context Switches.

November 20
Cluster Management
Reading 1
A. Verma, et al. Large-Scale Cluster Management at Google with Borg.
Reading 2
C. Tang, et al. Twine: A Unified Cluster Management System for Shared Infrastructure.

November 25
Project Lightning Status Updates
Remote attendance/presentation OK.

November 27
Holiday/No Class

December 2
Feedback-Directed Optimization
Reading 1
G. Ayers, et al. AsmDB: Understanding and Mitigating Front-End Stalls in Warehouse-Scale Computers.
Reading 2
Y. Zhang, et al. OCOLOS: Online COde Layout OptimizationS.

December 8 to 12
N/A (RRR Week)

December TBD
Final Project Presentations (Finals Week)

Weekly Schedule

  • Lecture/Discussion: Tuesdays and Thursdays from 2pm-3:30pm in 405 Soda
  • Weekly Reading Reviews: See Ed for submission links.
    • Due Mondays @ noon Pacific for Tuesday lecture papers.
    • Due Wednesdays @ noon Pacific for Thursday lecture papers.
  • Weekly Student Presenter Slides: Check your email for submission instructions.
    • Due Fridays @ noon Pacific for Tuesday lecture presentations.
    • Due Tuesdays @ noon Pacific for Thursday lecture presentations.

Assignments and Grading

The course workload will consist of the following:

  • 25% of grade: For each class, students are required to read and review the two assigned papers, and to attend and participate in the class discussion.
    • You can drop two classes’ worth of reviews, no questions asked.
    • After project proposal presentations (week of Oct 7 and 9), students are required to read and submit a review for only one paper per class by the usual pre-class deadline, and to submit the second review during class based on the in-class discussion.
  • 25% of grade: Each student will lead the discussion of a few papers during the semester.
  • 50% of grade: Students will complete a semester-long research project, in groups of 2 or 3, related to the course material.

Instructor