This lecture begins with an overview of developments in deep learning and then dives into several recent papers on convolutional networks, model compression, time-series modeling, and distributed deep learning.

Lecture Slides:

  1. Joseph Gonzalez Overview of Deep Learning [pptx][pdf]

  2. Francois Belletti Principles of Neural Network Design [pdf]

  3. Xin Wang Deep Model Compression [pdf]

  4. Sammy Sidhu Scalable Deep Learning [pdf]


Notes

For those who want a good introduction to the big ideas in deep learning, check out the following links (a minimal NumPy sketch of the convolution operation covered in these tutorials appears after the list):

  1. The Stanford CS231n Convolutional Neural Networks for Visual Recognition course has excellent Exercises and Tutorials and some good slides as well.
    1. Linear Regression and Softmax
    2. Gradient Descent
    3. Backpropagation as Dynamic Programming
    4. Network Architecture and Terminology
    5. Convolution
  2. Chris Olah’s blog has some good tutorials as well:
    1. Backpropagation
    2. Types of Neural Networks
    3. Recurrent Neural Networks and LSTMs and More advanced RNNs and Turing Machines
    4. Convolutional Networks
  3. The WildML Blog and Glossary
    1. Recurrent Neural Networks
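To make the convolution tutorials above concrete, here is a minimal NumPy sketch of a single-channel 2D convolution (strictly, the cross-correlation that deep learning frameworks compute); the function and example sizes are illustrative, and real convolutional layers add channels, padding, stride, and learned filters.

```python
import numpy as np

def conv2d(image, kernel):
    """Single-channel 2D convolution ('valid' padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Each output value is the sum of an elementwise product
            # between the kernel and one patch of the image.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Example: a 3x3 edge-detecting kernel applied to a random 8x8 "image"
image = np.random.rand(8, 8)
kernel = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype=float)
print(conv2d(image, kernel).shape)  # (6, 6)
```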

Reading List:

Important Neural Architectures [Francois Belletti]

  1. Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks, NIPS’12.

  2. Blog post: Recurrent Neural Networks and LSTMs.
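To complement the LSTM blog post above (and the Gers et al. forget-gate paper under Optional Reading), here is a minimal NumPy sketch of a single LSTM step; the stacked weight layout and variable names are illustrative assumptions rather than anything prescribed by the readings.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W: (4*hidden, input_dim + hidden), b: (4*hidden,)."""
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    f = sigmoid(z[0 * hidden:1 * hidden])  # forget gate: how much old cell state to keep
    i = sigmoid(z[1 * hidden:2 * hidden])  # input gate: how much new information to write
    o = sigmoid(z[2 * hidden:3 * hidden])  # output gate: how much cell state to expose
    g = np.tanh(z[3 * hidden:4 * hidden])  # candidate cell update
    c = f * c_prev + i * g                 # new cell state
    h = o * np.tanh(c)                     # new hidden state
    return h, c

# One step on random data: input of size 5, hidden state of size 3
x, h, c = np.random.rand(5), np.zeros(3), np.zeros(3)
W, b = 0.1 * np.random.randn(12, 8), np.zeros(12)
h, c = lstm_step(x, h, c, W, b)
```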

Optional Reading

  1. Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel. Backpropagation applied to handwritten zip code recognition. Neural Comput. 1, 4 (December 1989)

  2. Szegedy et al. Going Deeper with Convolutions, CVPR’15. There is also a shorter inception blog post.

  3. Jiquan Ngiam, Aditya Khosla, Mingyu Kim, Juhan Nam, Honglak Lee, and Andrew Y. Ng. Multimodal Deep Learning, ICML’11.

  4. Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. Dropout: A Simple Way to Prevent Neural Networks from Overfitting, JMLR’14.

  5. Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, NIPS’15.

  6. Felix A. Gers, Jürgen A. Schmidhuber, and Fred A. Cummins. Learning to Forget: Continual Prediction with LSTM. Neural Comput. 12, 10 (October 2000), 2451-2471.

  7. Junyoung Chung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, NIPS’14 Deep Learning Workshop.

Scalable Training Systems [Sammy Sidhu]

  1. Martín Abadi et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, arXiv only 2016.

Optional Reading

  1. Martín Abadi et al. TensorFlow: A system for large-scale machine learning, arXiv only 2016.

  2. For those interested in learning more about TensorFlow, check out the tutorials (a minimal example appears after this list).

  3. Forrest N. Iandola, Khalid Ashraf, Matthew W. Moskewicz, and Kurt Keutzer. FireCaffe: near-linear acceleration of deep neural network training on compute clusters, CVPR’16. [Forrest’s Slides], arXiv only

  4. Caffe Tutorial
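As a taste of the programming model the TensorFlow whitepaper and tutorials above describe, here is a minimal graph-and-session example. It assumes the TensorFlow 1.x-era API; under TensorFlow 2.x it would need `tf.compat.v1` or a rewrite in the eager style.

```python
import tensorflow as tf

# Build a small static dataflow graph: y = xW + b for one linear layer
x = tf.placeholder(tf.float32, shape=[None, 4])
W = tf.Variable(tf.random_normal([4, 2]))
b = tf.Variable(tf.zeros([2]))
y = tf.matmul(x, W) + b

# Launch the graph in a session and run one forward pass
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0, 4.0]]}))
```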

Model Quantization and Compression [Xin Wang]

  1. Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distilling the Knowledge in a Neural Network, NIPS’14 Deep Learning Workshop.
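The core idea of the distillation paper above is to train a small "student" network to match a large "teacher" network's temperature-softened class probabilities in addition to the true labels. Below is a minimal PyTorch-style sketch of such a combined loss; the temperature and mixing weight are illustrative choices, not values from the paper.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Hard-label term: ordinary cross-entropy against the true labels
    hard = F.cross_entropy(student_logits, labels)
    # Soft-label term: KL divergence between temperature-softened distributions,
    # scaled by T^2 so its gradient magnitude stays comparable (as in the paper)
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * hard + (1.0 - alpha) * soft
```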

Optional Reading

  1. Song Han, Huizi Mao, and William J. Dally. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, ICLR’16.

  2. Wenlin Chen, James T. Wilson, Stephen Tyree, Kilian Q. Weinberger, and Yixin Chen. Compressing Neural Networks with the Hashing Trick, ICML’15.

  3. Forrest N. Iandola, Song Han, Matthew W. Moskewicz, Khalid Ashraf, William J. Dally, and Kurt Keutzer. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size, arXiv only.

Questions