This is a table containing items covered in the class.

Week Date Topic
1 08/29/2016

Course Overview (Ion, Raluca, Joe, and Joey)

In the first class we will outline the focus of the RISE Lab course, identify some of the key questions we will explore, and review some of the key solutions available today.

Slides
2 09/5/2016

Class Canceled for Holiday

Topic Presentations Selected

3 09/12/2016

Data-Centric Programming (Joe)

Computer scientists have figured out how to build scalable distributed systems for analyzing data in parallel. But we haven't really cracked the general-purpose distributed programming problem. Could we translate our success with distributed data analytics back to solve the "hard parts" of distributed programming? Today we'll dig deep into that question.

4 09/19/2016

Trusted Hardware and Cloud Security (Raluca)

We will learn about a new trusted hardware proposal from Intel, SGX. Moreover, we will see how one can protect the confidentiality and integrity of the data and computation from a compromised cloud provider using Intel SGX (as in the Haven system). This area is fascinating because it enables a (compromised) cloud provider to run complex systems such as databases without ever seeing the data or the computation it is running, and without being able to cheat in the process. Even if attackers obtain root access on the cloud provider's machines, they won't see the data or affect the correctness of the computation.

5 09/26/2016

Cluster Management Systems (Ion)

In this lecture we will learn about cluster management systems. The role of a cluster management system is to allow different cluster computing frameworks (e.g., Hadoop MR, Spark, Web services) to share the same cluster. Challenges these systems need to address include performance isolation, the ability to implement various allocation policies, and the ability to scale to thousands of servers and beyond.
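As a rough illustration of what an allocation policy can look like, the sketch below implements max-min fair sharing of a single resource. Max-min fairness is only one common policy, chosen here as an example and not necessarily one covered in the lecture. The function name `max_min_fair`, the capacity, and the per-framework demands are all hypothetical.

```python
def max_min_fair(capacity, demands):
    """Split `capacity` units of one resource among frameworks (max-min fairness).

    `demands` maps a framework name to the amount it asks for.
    All names and numbers used with this function are illustrative assumptions.
    """
    allocation = {}
    remaining = float(capacity)
    # Serve the smallest demands first so that unused share flows to larger ones.
    pending = sorted(demands.items(), key=lambda item: item[1])
    for i, (name, demand) in enumerate(pending):
        # Equal split of what is left among the frameworks not yet served.
        fair_share = remaining / (len(pending) - i)
        allocation[name] = min(demand, fair_share)
        remaining -= allocation[name]
    return allocation


# Hypothetical cluster with 100 slots and three frameworks.
shares = max_min_fair(100, {"hadoop": 20, "spark": 50, "web": 70})
print(shares)
```

In this example the smallest demand is met in full, and the two larger frameworks split the remaining capacity equally.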

Project Ideas Posted

A complete set of project ideas will be posted here. If you have ideas that you would like collaborators on, please add them by sending a pull request.

6 10/3/2016

Prediction Serving and the Machine Learning Life-cycle (Joey)

In this lecture we will explore the challenges of serving low-latency predictions at scale and managing the machine learning life-cycle spanning model training and management.

7 10/10/2016

Guest Speaker: Moritz Hardt

Moritz will talk about work on new tools for reliable data science.

Project Team Selections

Please enter your project selections in this form.

8 10/17/2016

Project Proposal Presentations

9 10/24/2016

Securing Distributed Computation and Leakage Vectors (Raluca)

We will discuss how to use SGX to secure distributed systems such as MapReduce (the VC3 system): the cloud performs the MapReduce computations without seeing the data or being able to tamper with the computation. Then, we will discuss various types of leakage in SGX and approaches (such as ORAM and GhostRider) to address these issues.

10 10/31/2016

Overview of Deep Learning (Joey)

This lecture will give an overview of developments in deep learning. We will then dive into several recent papers on model compression, time-series modeling, and serving deep models.

11 11/7/2016

Encrypted data analytics and learning (Raluca)

In this lecture, we will look at a form of leakage that occurs in a distributed setting and at how we can build systems that are not susceptible to it. We will then discuss two systems, one for performing data analytics and one for performing machine learning, both on sensitive data that the service provider cannot see.

12 11/14/2016

Data Context and Metadata Management (Joe)

We are missing so much rich contextual metadata in our data analysis projects: what data we have, why we have it, how and by whom it gets used, and how all these aspects evolve over time. This *data context* problem raises interesting systems challenges while also suggesting opportunities for new applications and algorithms to significantly improve the efficiency of data analysts.

13 11/21/2016

Dealing with Feedback (Joey)

This lecture covers the key ideas in bandits and reinforcement learning.
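One of the basic ideas in bandits is the trade-off between exploring arms and exploiting the best arm found so far. The sketch below shows epsilon-greedy, a simple strategy picked here for illustration and not necessarily the one presented in the lecture. The function name, the Bernoulli reward model, and the arm probabilities are assumptions.

```python
import random


def epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit (illustrative sketch).

    `true_means` holds each arm's hypothetical success probability,
    which the learner does not get to see.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms       # number of pulls per arm
    estimates = [0.0] * n_arms  # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated mean.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return counts, estimates, total_reward


counts, estimates, total_reward = epsilon_greedy([0.1, 0.9])
print(counts, estimates, total_reward)
```

With a small `epsilon` the learner spends most pulls on the arm it currently believes is best, while still sampling the other arms often enough to correct a bad early estimate.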

14 11/28/2016

Realtime Systems (Ion)

Much of the research on real-time systems focuses on resource allocation and scheduling that ensure jobs meet their timing guarantees. In this lecture, we will go over several papers that touch on the main challenges and tradeoffs. These include handling bursty (interactive) workloads, heterogeneous traffic, parallel jobs, and multitenancy.

15 12/5/2016

RRR Week

16 12/12/2016

Final Project Presentations and Posters (3 Hours+)

17 12/17/2016

Projects Due