Machine Learning Systems (Fall 2019)

When: Mondays and Fridays from 2:00 to 3:30
Where: Soda 310
Instructor: Joseph E. Gonzalez
- Office Hours: Wednesdays from 4:00 to 5:00 in 773 Soda Hall.
Announcements: Piazza
Sign-up to Present: Google Spreadsheet. Every student should sign up to present in at least three rows, in a different role each time. Note that the Backup/Scribe presenter may be asked to fill in for one of the other roles with little notice.
If you have reading suggestions, please send a pull request to this course website on GitHub by modifying the index.md file.

Course Description

The recent success of AI has been due in large part to advances in hardware and software systems. These systems have enabled training increasingly complex models on ever larger datasets. In the process, these systems have also simplified model development, enabling the rapid growth of the machine learning community. These new hardware and software systems include a new generation of GPUs and hardware accelerators (e.g., TPU and Nervana), open source frameworks such as Theano, TensorFlow, PyTorch, MXNet, Apache Spark, Clipper, Horovod, and Ray, and a myriad of systems deployed internally at companies, just to name a few. At the same time, we are witnessing a flurry of ML/RL applications to improve hardware and system designs, job scheduling, program synthesis, and circuit layouts.

In this course, we will describe the latest trends in systems design to better support the next generation of AI applications, and applications of AI to optimize the architecture and the performance of systems. The format of this course will be a mix of lectures, seminar-style discussions, and student presentations. Students will be responsible for paper readings and for completing a hands-on project. For projects, we will strongly encourage teams that contain both AI and systems students.

New Course Format

A previous version of this course was offered in Spring 2019. The format of this second offering is slightly different. Each week will cover a different research area in AI-Systems. The Monday lecture will be presented by Professor Gonzalez and will cover the context of the topic as well as a high-level overview of the reading for the week. The Friday lecture will be organized around a mini program committee meeting for the week's readings. Students will be required to submit detailed reviews for a subset of the papers and lead the paper review discussions. The goal of this new format is both to build mastery of the material and to develop a deeper understanding of how to evaluate and review research, and hopefully to provide insight into how to write better papers.

Course Syllabus

This is a tentative schedule. Specific readings are subject to change as new material is published.

Week	Date (Lec.)	Topic
1	8/30/19 ( 1 )	Introduction and Course Overview This lecture will be an overview of the class, requirements, and an introduction to the history of machine learning and systems research. Lecture slides: [pdf, pptx] How to read a paper provides some pretty good advice on how to read papers effectively. Timothy Roscoe’s writing reviews for systems conferences will also help you in the reviewing process.
2	9/2/19 ( 2 )	Holiday (Labor Day) There will be no class, but please sign up for the weekly discussion slots.
2	9/6/19 ( 3 )	Big Ideas and How to Evaluate ML Systems Research Submit your review before 1:00PM. Lecture slides: [pdf, pptx] SysML: The New Frontier of Machine Learning Systems Read Chapter 1 of Principles of Computer System Design. You will need to be on campus or use the Library VPN to obtain a free PDF. A Few Useful Things to Know About Machine Learning A Berkeley View of Systems Challenges for AI Additional Machine Learning Reading Kevin Murphy’s Textbook Introduction to Machine Learning. This provides a very high-level overview of machine learning. You should probably know all of this. Stanford CS231n Tutorial on Neural Networks. I recommend reading Module 1 for a quick crash course in machine learning and some of the techniques used in this class. Additional Systems Reading Hints for Computer System Design Open Debate about the Field Rich Sutton’s Post on Compute in ML and the corresponding Shimon Whiteson twitter debate
3	9/9/19 ( 4 )	Machine Learning Life-cycle This lecture will discuss the machine learning life-cycle, spanning model development, training, and serving. It will outline some of the technical machine learning and systems challenges at each stage and how these challenges interact. Lecture slides: [pdf, pptx] Template Slide Format for PC Meeting [Google Drive]
3	9/13/19 ( 5 )	Discussion of Papers on Machine Learning Life-cycle Submit your review before 1:00PM. Slides and scribe notes from the PC Meeting. (These are only accessible to students enrolled in the class.) Hidden Technical Debt in Machine Learning Systems TFX: A TensorFlow-Based Production-Scale Machine Learning Platform Towards Unified Data and Lifecycle Management for Deep Learning Data Engineering Bulletin: Machine Learning Life-cycle Management Context: The Missing Piece in the Machine Learning Lifecycle Software 2.0 Blog Post Doing Machine Learning the Uber Way: Five Lessons From the First Three Years of Michelangelo Introducing FBLearner Flow: Facebook’s AI backbone DeepBird: Twitters ML Deployment Framework Demonstration of Mlflow: A System to Accelerate the Machine Learning Lifecycle Software: KubeFlow: Kubernetes Pipeline Orchestration Framework MLflow
4	9/16/19 ( 6 )	Database Systems and Machine Learning In the previous lecture we saw that data and feature engineering is often the dominant hurdle in model development. Database systems are often the source of data and the platform in which feature engineering takes place. This lecture will cover some of the big ideas in database systems and how they relate to work on machine learning in databases. Lecture slides: [pdf, pptx] Project Proposal Sign-up doc. You must be enrolled in the class or on the waitlist to access this document. Please add any projects you are thinking about starting and list yourself as interested in anyone else’s projects.
4	9/20/19 ( 7 )	Discussion of Database Systems and Machine Learning Submit your review before 1:00PM. Slides for PC Meeting posted. (These slides will only be accessible to students enrolled in the class.) Towards a Unified Architecture for in-RDBMS Analytics Materialization Optimizations for Feature Selection Workloads Learning Generalized Linear Models Over Normalized Data Learning Linear Regression Models over Factorized Joins MauveDB: Supporting Model-based User Views in Database Systems The MADlib Analytics Library or MAD Skills, the SQL
5	9/23/19 ( 8 )	Machine Learning Frameworks and Automatic Differentiation This week we will discuss recent developments in model development and training frameworks. While there is a long history of machine learning frameworks, we will focus on frameworks for deep learning and automatic differentiation. In class we will review some of the big trends in machine learning framework design and basic ideas in forward and backward automatic differentiation. Lecture slides: [pdf, pptx] Project proposals are due next Monday.
5	9/27/19 ( 9 )	Machine Learning Frameworks and Automatic Differentiation Update: Two of the readings were changed to reflect a focus on deep learning frameworks. The previous readings on SystemML and KeystoneML have been moved to optional reading. Submit your review before 1:00PM. Slides for PC Meeting (These slides will only be accessible to students enrolled in the class.) Automatic differentiation in ML: Where we are and where we should be going TensorFlow: A System for Large-Scale Machine Learning JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Pipeline Training Frameworks (Classical) KeystoneML: Optimizing Pipelines for Large-Scale Advanced Analytics SystemML: Declarative Machine Learning on Spark Automatic Differentiation and Differentiable Programming Automatic Differentiation in Machine Learning: a Survey Roger Grosse’s Lecture Notes on Automatic Differentiation A Differentiable Programming System to Bridge Machine Learning and Scientific Computing Deep Learning Frameworks with Automatic Differentiation Caffe: Convolutional Architecture for Fast Feature Embedding Theano: A Python Framework for Fast Computation of Mathematical Expressions and Theano: A CPU and GPU Math Compiler in Python Automatic differentiation in PyTorch MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems TensorFlow Eager: A Multi-Stage, Python-Embedded DSL for Machine Learning TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems Deep Learning Primitives cuDNN: Efficient Primitives for Deep Learning
6	9/30/19 ( 10 )	Distributed Model Training This week we will discuss developments in distributed training. We will quickly review the statistical query model pushed by early map-reduce machine learning frameworks and then discuss advances in parameter servers and distributed neural network training. Lecture slides: [pdf, pptx] Project Proposals Due! One-page project description due at 11:59 PM. Check out the suggested projects. Submit a link to your one-page Google document containing your project description to this Google form. You only need one submission per team, but please list all team members’ email addresses. You can also update your submission if needed.
6	10/4/19 ( 11 )	Discussion on Distributed Model Training Submit your review before 1:00PM. Slides for PC Meeting (These slides will only be accessible to students enrolled in the class.) Scaling Distributed Machine Learning with the Parameter Server PipeDream: Generalized Pipeline Parallelism for DNN Training Adaptive Communication Strategies to Achieve the Best Error-Runtime Trade-off in Local-Update SGD Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis Integrated Model, Batch, and Domain Parallelism in Training Neural Networks Effect of batch size on training dynamics Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent [pdf] Large Scale Distributed Deep Networks Scaling Distributed Machine Learning with In-Network Aggregation ImageNet in X Minutes Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour Now anyone can train Imagenet in 18 minutes Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes All-Reduce Baidu Ring All-Reduce Blog Post The original Ring All-Reduce Paper “Bandwidth Optimal All-reduce Algorithms for Clusters of Workstations” Visual intuition on ring-Allreduce for distributed Deep Learning Double Binary Trees
7	10/7/19 ( 12 )	Prediction Serving Until recently, much of the focus of systems research was on model training. However, there has been growing interest in addressing the challenges of prediction serving. This lecture will frame the challenges of prediction serving and cover some of the recent advances. Lecture slides: [pdf, pptx]
7	10/11/19 ( 13 )	Power Outage Related Holiday Unfortunately, class was canceled and so the PC Meeting has been moved to Monday. Note that early project presentations are also due next Friday.
8	10/14/19 ( 14 )	Discussion on Prediction Serving Submit your review before 1:00PM. Slides for PC Meeting (These slides will only be accessible to students enrolled in the class.) Pretzel: Opening the Black Box of Machine Learning Prediction Serving Systems InferLine: ML Inference Pipeline Composition Framework (pre-print) Focus: Querying Large Video Datasets with Low Latency and Low Cost The Prediction-Serving Systems: What happens when we wish to actually deploy a machine learning model to production? ACM Queue article provides a nice overview. Systems Reading: Live Video Analytics at Scale with Approximation and Delay-Tolerance LASER: A Scalable Response Prediction Platform For Online Advertising TensorFlow-Serving: Flexible, High-Performance ML Serving Clipper: A Low-Latency Online Prediction Serving System Deep Learning Inference in Facebook Data Centers: Characterization, Performance Optimizations and Hardware Implications The Missing Piece in Complex Analytics: Low Latency, Scalable Model Management and Serving with Velox The Case for Predictive Database Systems: Opportunities and Challenges. More Efficient Models: Paul Viola and Michael Jones Rapid Object Detection using a Boosted Cascade of Simple Features CVPR 2001. Performance Breakdown of various models Benchmark Analysis of Representative Deep Neural Network Architectures
8	10/18/19 ( 15 )	Project Presentations
9	10/21/19 ( 16 )	Finish Project Presentations and Start Model Compilation This week we will explore the process of compiling/optimizing deep neural network computation graphs. This reading will span both graph level optimization as well as the compilation and optimization of individual tensor operations. Lecture slides: [pdf, pptx]
9	10/25/19 ( 17 )	Discussion of Model Compilation Submit your review before 1:00PM. Slides for PC Meeting (These slides will only be accessible to students enrolled in the class.) Optimizing DNN Computation with Relaxed Graph Substitutions TVM: An Automated End-to-End Optimizing Compiler for Deep Learning Learning to Optimize Halide with Tree Search and Random Programs Learning to Optimize Tensor Programs: The TVM story is twofold. There’s a System for ML story (the TVM paper above), and this paper is their ML for System story. Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks TensorComprehensions Supporting Very Large Models using Automatic Dataflow Graph Partitioning
10	10/28/19 ( 18 )	PG&E and Fire Related Cancellation Unfortunately, due to the power outage, lecture is canceled today. To make up for lost lecture(s) and accommodate our guest speakers, we will skip the overview lecture this week and start with the PC meeting on Machine Learning Applied to Systems. However, this will put a little extra pressure on the neutral presenters to provide additional context. We will then cover the discussion on machine learning hardware the following Monday.
10	11/1/19 ( 19 )	Discussion of Machine Learning Applied to Systems Day 1 Submit your review before 1:00PM. Slides for PC Meeting (These slides will only be accessible to students enrolled in the class.) Resource Central: Understanding and Predicting Workloads for Improved Resource Management in Large Cloud Platforms Device Placement Optimization with Reinforcement Learning The Case for Learned Index Structures Quasar: Resource-Efficient and QoS-Aware Cluster Management
11	11/4/19 ( 20 )	Hardware Acceleration for Machine Learning This lecture will be presented by Kurt Keutzer and Suresh Krishna who are experts in processor design as well as network and architecture co-design. Guest lecture slides: [pdf, pptx]
11	11/8/19 ( 21 )	Discussion Hardware Acceleration for Machine Learning Submit your review before 1:00PM. Slides for PC Meeting (These slides will only be accessible to students enrolled in the class.) A Configurable Cloud-Scale DNN Processor for Real-Time AI In-Datacenter Performance Analysis of a Tensor Processing Unit Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks Efficient Processing of Deep Neural Networks: A Tutorial and Survey A great spreadsheet analysis of the power and performance characteristics of all the publicly available hardware accelerators for deep learning (GPUs, CPU, TPUs). Nvidia post comparing different GPUs across a wide range of networks.
12	11/11/19 ( 22 )	(11/11) Administrative Holiday
12	11/15/19 ( 23 )	Discussion of Machine Learning Applied to Systems Day 2 Submit your review before 1:00PM. Slides for PC Meeting coming soon. (These slides will only be accessible to students enrolled in the class.) AuTO: Scaling Deep Reinforcement Learning to Enable Datacenter-Scale Automatic Traffic Optimization Neural Adaptive Video Streaming with Pensieve Neural Adaptive Content-aware Internet Video Delivery
13	11/18/19 ( 24 )	Learning with Adversaries This week we will discuss machine learning in adversarial settings. This includes secure federated learning, differential privacy, and adversarial examples. Lecture slides: [pdf, pptx]
13	11/22/19 ( 25 )	Discussion on Learning with Adversaries Submit your review before 1:00PM. Slides for PC Meeting coming soon. (These slides will only be accessible to students enrolled in the class.) Communication-Efficient Learning of Deep Networks from Decentralized Data Privacy Accounting and Quality Control in the Sage Differentially Private ML Platform Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware Helen: Maliciously Secure Coopetitive Learning for Linear Models Faster CryptoNets: Leveraging Sparsity for Real-World Encrypted Inference Rendered Insecure: GPU Side Channel Attacks are Practical The Algorithmic Foundations of Differential Privacy Federated Learning: Collaborative Machine Learning without Centralized Training Data Federated Learning at Google … A comic strip? SecureML: A System for Scalable Privacy-Preserving Machine Learning More reading coming soon …
14	11/25/19 ( 26 )	Autonomous Driving Autonomous vehicles will likely transform society in the next decade and are fundamentally AI enabled systems. In this lecture we will discuss the AI-Systems challenges around autonomous driving. Lecture slides: [pdf, pptx]
14	11/29/19 ( 27 )	(11/29) Holiday (Thanksgiving)
15	12/2/19 ( 28 )	Discussion on Autonomous Driving Everyone must do one of the readings (you pick). Submit your review before 1:00PM. Slides for PC Meeting coming soon. (These slides will only be accessible to students enrolled in the class.) Self-Driving Cars: A Survey. This is a slightly longer survey, so focus on the first few pages, which give an overview and framing of the autonomous driving problem and common solutions. The Architectural Implications of Autonomous Driving: Constraints and Acceleration ChauffeurNet: Learning to Drive by Imitating the Best and Synthesizing the Worst An Open Approach to Autonomous Vehicles End-to-End Learning of Driving Models with Surround-View Cameras and Route Planners DARPA Grand Challenges Software Infrastructure for an Autonomous Ground Vehicle Stanley: The Robot that Won the DARPA Grand Challenge Tartan Racing: A Multi-Modal Approach to the DARPA Urban Challenge Towards a Viable Autonomous Driving Research Platform Engineering Autonomous Driving Software
15	12/6/19 ( 29 )	Conclusion!
16	12/9/19 ( 30 )	(12/9) RRR Week
16	12/13/19 ( 31 )	(12/13) RRR Week
17	12/16/19 ( 32 )	(12/16) Poster Presentations
17	12/20/19 ( 33 )	(12/20) No Class Don’t forget to submit your final reports. As noted on Piazza, the final report should be 6 pages plus references (2-column, 10pt font, unlimited appendix). Please submit your report using this form: Submit Your Report Here. You only need to submit the project once per team. The write-up should discuss the problem formulation, related work, your approach, and your results.

Projects

Detailed candidate project descriptions will be posted shortly. However, students are encouraged to find projects that relate to their ongoing research.

Grading

Grades will be largely based on class participation and projects. In addition, we will require weekly paper summaries submitted before class.

Projects: 60%
Weekly Summaries: 20%
Class Participation: 20%
