About This Specialization
This specialization gives an introduction to deep learning, reinforcement learning, natural language understanding, computer vision and Bayesian methods. Top Kaggle machine learning practitioners and CERN scientists will share their experience of solving real-world problems and help you fill the gaps between theory and practice. Upon completing the 7 courses, you will be able to apply modern machine learning methods in enterprise and understand the caveats of real-world data and settings.
Projects Overview
You will master your skills by solving a wide variety of real-world problems, such as image captioning and automatic game playing, throughout the course projects. You will gain hands-on experience applying advanced machine learning techniques that provide the foundation for the current state of the art in AI.
Introduction to Deep Learning
Course can be found here
Lecture slides can be found here
About this course: The goal of this course is to give learners a basic understanding of modern neural networks and their applications in computer vision and natural language understanding. The course starts with a recap of linear models and a discussion of stochastic optimization methods that are crucial for training deep neural networks. Learners will study all popular building blocks of neural networks, including fully connected, convolutional and recurrent layers.
Learners will use these building blocks to define complex modern architectures in the TensorFlow and Keras frameworks. In the course project, learners will implement a deep neural network for image captioning, the task of producing a text description for an input image.
The prerequisites for this course are:
1) Basic knowledge of Python.
2) Basic linear algebra and probability.
Please note that this is an advanced course and we assume basic knowledge of machine learning. You should understand:
1) Linear regression: mean squared error, analytical solution.
2) Logistic regression: model, cross-entropy loss, class probability estimation.
3) Gradient descent for linear models. Derivatives of MSE and cross-entropy loss functions.
4) The problem of overfitting.
5) Regularization for linear models.
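As a quick self-check on items 1) and 3), here is a minimal sketch of one-feature linear regression with MSE loss, fitted both by the closed-form least-squares solution and by full-batch gradient descent. The toy data, function names, learning rate and step count are illustrative assumptions and do not come from the course materials.

```python
# Minimal sketch: fit y ~ w*x + b on toy data in two ways.
# All names and hyperparameters here are illustrative assumptions.

def mse(w, b, xs, ys):
    """Mean squared error of the line w*x + b on the sample."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def analytical_fit(xs, ys):
    """Closed-form least-squares solution for one feature plus a bias."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b

def gradient_descent_fit(xs, ys, lr=0.05, steps=2000):
    """Full-batch gradient descent on MSE, starting from w = b = 0."""
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # dMSE/dw = 2/n * sum(error * x), dMSE/db = 2/n * sum(error)
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]
w_exact, b_exact = analytical_fit(xs, ys)
w_gd, b_gd = gradient_descent_fit(xs, ys)
print(w_exact, b_exact, w_gd, b_gd)
```

On this small, well-conditioned sample both routes should agree closely; the learning rate was picked by hand for this data and would need tuning elsewhere.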
Who is this class for: Developers, analysts and researchers who are faced with tasks involving complex structure understanding such as image, sound and text analysis.
Week 1 Introduction to optimization
Welcome to the “Introduction to Deep Learning” course! In the first week you’ll learn about linear models and stochastic optimization methods. Linear models are basic building blocks for many deep architectures, and stochastic optimization is used to train every model that we’ll discuss in this course.
Learning Objectives
 Train a linear model for a classification or regression task using stochastic gradient descent
 Tune SGD optimization using different techniques
 Apply regularization to train better models
 Use linear models for classification and regression tasks
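The objectives above can be illustrated with a small sketch: binary logistic regression trained with per-sample stochastic gradient descent and an L2 weight penalty. This is not the course assignment; the toy data, the function names (sgd_logistic, predict) and the hyperparameters are assumptions chosen for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sgd_logistic(data, lr=0.1, l2=0.01, epochs=200, seed=0):
    """Per-sample SGD for binary logistic regression.

    data: list of (features, label) pairs with label in {0, 1}.
    l2:   strength of the weight penalty (the bias is not penalized).
    """
    rng = random.Random(seed)
    data = list(data)  # copy so the caller's list is not shuffled
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        rng.shuffle(data)  # visit samples in a new random order each epoch
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the cross-entropy loss w.r.t. the logit
            w = [wi - lr * (err * xi + l2 * wi) for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    """Predict class 1 when the logit is positive."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Toy, linearly separable sample (made up for this sketch).
toy = [([0.0, 0.2], 0), ([0.3, 0.1], 0), ([0.1, 0.4], 0),
       ([1.0, 0.9], 1), ([0.8, 1.1], 1), ([1.2, 0.7], 1)]
w, b = sgd_logistic(toy)
accuracy = sum(predict(w, b, x) == y for x, y in toy) / len(toy)
print(w, b, accuracy)
```

Tuning SGD in this sketch amounts to changing lr, l2 and epochs; momentum and adaptive step sizes are left out to keep it short.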
Course intro
Welcome! 5 min
Linear model as the simplest neural network
Linear regression 9 min
Linear classification 10 min
Gradient descent 5 min
Quiz: Linear models 3 questions
To pass: 80% or higher
Attempts: 3 every 8 hours
Deadline: November 26, 11:59 PM PST
1 point
1. Consider the vector (1, −2, 0.5). Apply the softmax transform to it and enter the first component (accurate to 2 decimal places).
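For reference, the softmax transform maps a vector of real scores to probabilities that sum to 1. The sketch below is one common way to write it in plain Python; subtracting the maximum before exponentiating is a standard numerical-stability trick and is not something the quiz itself requires.

```python
import math

def softmax(values):
    """Map real-valued scores to positive numbers that sum to 1."""
    shift = max(values)  # subtracting the max avoids overflow in exp
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, -2.0, 0.5])
print([round(p, 2) for p in probs])
```

Larger scores receive larger probabilities, and shifting all scores by the same constant leaves the result unchanged.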
1 point
2. Suppose you are solving a 5-class classification problem with 10 features. How many parameters would a linear model have? Don’t forget bias terms!
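One way to reason about this count: a multiclass linear model keeps one weight per (class, feature) pair and one bias per class. The sketch below builds such a model as plain lists and counts its entries; the helper names are hypothetical.

```python
def count_linear_params(n_features, n_classes):
    """One weight per (class, feature) pair plus one bias per class."""
    return n_classes * n_features + n_classes

def init_linear_model(n_features, n_classes):
    """Zero-initialized weight matrix (classes x features) and bias vector."""
    weights = [[0.0] * n_features for _ in range(n_classes)]
    biases = [0.0] * n_classes
    return weights, biases

weights, biases = init_linear_model(n_features=10, n_classes=5)
n_params = sum(len(row) for row in weights) + len(biases)
print(n_params == count_linear_params(10, 5))
```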
1 point
3. There is an analytical solution for linear regression parameters under MSE loss, but we usually prefer gradient descent optimization over it. What are the reasons?
Gradient descent is more scalable and can be applied to problems with a large number of features
Gradient descent is a method developed especially for MSE loss
Gradient descent can find parameter values that give a lower MSE value than the parameters from the analytical solution
Gradient descent doesn’t require inverting a matrix
Regularization in machine learning
Overfitting problem and model validation 6 min
Model regularization 5 min
Quiz: Overfitting and regularization 4 questions
To pass: 80% or higher
Attempts: 3 every 8 hours
Deadline: November 26, 11:59 PM PST
1 point
1. Select the correct statements about overfitting:
Overfitting is a situation where a model gives lower quality on new data than on the training sample
Overfitting happens when the model is too simple for the problem
Overfitting is a situation where a model gives comparable quality on new data and on the training sample
Large model weights can indicate that the model is overfitted
1 point
2. What disadvantages does model validation on a holdout sample have?
It requires fitting the model multiple times
It is sensitive to the particular split of the sample into training and test parts
It can give biased quality estimates for small samples
1 point
3. Suppose you are using k-fold cross-validation to assess model quality. How many times should you train the model during this procedure?
1
k
k(k−1)/2
k^2
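To make the procedure in this question concrete, here is a minimal sketch of k-fold cross-validation in plain Python. The "model" is a deliberately trivial mean predictor and the folds are contiguous and unshuffled; both are simplifying assumptions, and in practice the sample is usually shuffled first. The fit_calls list only exists to show how often training happens.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(xs, ys, k, fit, score):
    """Hold out each fold once, train on the rest, score on the held-out fold."""
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train_idx = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(score(model,
                            [xs[i] for i in held_out],
                            [ys[i] for i in held_out]))
    return scores

fit_calls = []  # records the training-set size of every fit

def fit_mean(train_x, train_y):
    """Toy model: always predict the mean training target."""
    fit_calls.append(len(train_y))
    return sum(train_y) / len(train_y)

def mse_score(model, test_x, test_y):
    return sum((model - y) ** 2 for y in test_y) / len(test_y)

xs = list(range(10))
ys = [2.0 * x for x in xs]
scores = cross_validate(xs, ys, k=5, fit=fit_mean, score=mse_score)
print(len(fit_calls), scores)
```

The reported quality is typically the average of the per-fold scores.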
1 point
4. Select the correct statements about regularization:
Weight penalty reduces the number of model parameters and leads to faster model training
Reducing the training sample size makes data simpler and then leads to better quality
Regularization restricts model complexity (namely the scale of the coefficients) to reduce overfitting
Weight penalty drives model parameters closer to zero and prevents the model from being too sensitive to small changes in features
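The effect of a weight penalty on the scale of the coefficients can be seen in a small experiment. The sketch below fits a two-feature linear model without a bias term by gradient descent on MSE plus l2 * sum(w_j^2), once with l2 = 0 and once with l2 = 0.5. The data, the penalty strength and the helper names are assumptions made up for this illustration.

```python
def fit_weights(rows, targets, l2=0.0, lr=0.1, steps=2000):
    """Gradient descent on MSE + l2 * sum(w_j^2) for a model without bias."""
    n, d = len(rows), len(rows[0])
    w = [0.0] * d
    for _ in range(steps):
        errors = [sum(wj * xj for wj, xj in zip(w, x)) - y
                  for x, y in zip(rows, targets)]
        # Gradient of the MSE term plus gradient of the penalty term.
        grad = [2.0 / n * sum(e * x[j] for e, x in zip(errors, rows))
                + 2.0 * l2 * w[j] for j in range(d)]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def norm(w):
    """Euclidean length of the weight vector."""
    return sum(v * v for v in w) ** 0.5

rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
targets = [2.0, -1.0, 1.2, 2.9]
w_plain = fit_weights(rows, targets)
w_ridge = fit_weights(rows, targets, l2=0.5)
print(norm(w_plain), norm(w_ridge))
```

Because a prediction changes by at most norm(w) times the size of a feature perturbation, a smaller weight norm bounds how strongly the model can react to small changes in the features. Note that the number of parameters is the same in both fits.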
Stochastic methods for optimization
Stochastic gradient descent 5 min
Gradient descent extensions 9 min
Linear models and optimization
Programming Assignment: Linear models and optimization 3h


Week 2 Introduction to neural networks
This module is an introduction to the concept of a deep neural network. You’ll begin with the linear model in numpy and finish with writing your very first deep network.
Learning Objectives
Explain the mechanics of basic building blocks for neural networks
Apply the backpropagation algorithm to train deep neural networks using automatic differentiation
Implement, train and test neural networks using TensorFlow and Keras
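As a rough illustration of what backpropagation computes, the sketch below writes the chain rule by hand for a tiny network (one tanh hidden layer, one linear output, squared-error loss) and compares a single gradient entry against a finite-difference estimate. Frameworks such as TensorFlow derive these gradients automatically; this hand-written version, its parameter values and its function names are assumptions for illustration only.

```python
import math

def forward(params, x):
    """One hidden tanh layer followed by a single linear output unit."""
    W1, b1, w2, b2 = params
    hidden = [math.tanh(sum(wij * xj for wij, xj in zip(row, x)) + bi)
              for row, bi in zip(W1, b1)]
    output = sum(wi * hi for wi, hi in zip(w2, hidden)) + b2
    return hidden, output

def loss_and_grads(params, x, target):
    """Squared-error loss and its gradients via the chain rule."""
    W1, b1, w2, b2 = params
    hidden, output = forward(params, x)
    loss = 0.5 * (output - target) ** 2
    d_out = output - target                      # dL/d(output)
    grad_w2 = [d_out * h for h in hidden]
    grad_b2 = d_out
    # Through tanh: d tanh(z)/dz = 1 - tanh(z)^2
    d_pre = [d_out * wi * (1.0 - h * h) for wi, h in zip(w2, hidden)]
    grad_W1 = [[d * xj for xj in x] for d in d_pre]
    grad_b1 = list(d_pre)
    return loss, (grad_W1, grad_b1, grad_w2, grad_b2)

params = ([[0.1, -0.2], [0.4, 0.3]], [0.0, 0.1], [0.5, -0.6], 0.2)
x, target = [1.0, 2.0], 1.0
loss, grads = loss_and_grads(params, x, target)

# Finite-difference check for the single weight W1[0][1].
eps = 1e-6
W1_plus = [[0.1, -0.2 + eps], [0.4, 0.3]]
W1_minus = [[0.1, -0.2 - eps], [0.4, 0.3]]
loss_plus, _ = loss_and_grads((W1_plus, params[1], params[2], params[3]), x, target)
loss_minus, _ = loss_and_grads((W1_minus, params[1], params[2], params[3]), x, target)
numeric = (loss_plus - loss_minus) / (2 * eps)
print(grads[0][0][1], numeric)
```

If the two printed numbers agree to several decimal places, the hand-derived gradient for that weight is very likely correct.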
Multilayer perceptron, or the basic principles of deep learning
Multilayer perceptron 6 min
Training a neural network 7 min
Backpropagation primer 7 min
Practice Quiz: Multilayer perceptron 4 questions
To pass: 100% or higher
Deadline: December 3, 11:59 PM PST
1 point
1. The best nonlinearity functions to use in a multilayer perceptron are step functions, as they allow reconstructing the decision boundary with better precision.
1 point
2. A dense layer applies a linear transformation to its input
1 point
3. For an MLP to work, the nonlinearity function must have a finite upper bound
1 point
4. How many dimensions will the derivative of a 1D vector with respect to a 2D matrix have?
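Two of the questions above concern shapes, which are easy to inspect directly. The sketch below implements a dense layer as the affine map y = W x + b on nested lists and builds the derivative of the output vector with respect to the weight matrix entry by entry. The helper names and the example sizes (3 inputs, 2 outputs) are illustrative assumptions.

```python
def dense(x, W, b):
    """Affine map y = W x + b, with W of shape (m, n) and x of length n."""
    return [sum(W[i][k] * x[k] for k in range(len(x))) + b[i]
            for i in range(len(W))]

def jacobian_wrt_weights(x, m):
    """J[i][j][k] = d y_i / d W[j][k], which is x[k] if i == j and 0 otherwise."""
    n = len(x)
    return [[[x[k] if i == j else 0.0 for k in range(n)]
             for j in range(m)]
            for i in range(m)]

def shape(nested):
    """Shape of a rectangular nested list, as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)

x = [1.0, 2.0, 3.0]
W = [[0.5, -1.0, 0.0],
     [2.0, 0.0, 1.0]]
b = [0.1, -0.2]
y = dense(x, W, b)
J = jacobian_wrt_weights(x, m=len(W))
print(y, shape(J))
```

Each output component needs one derivative per matrix entry, so the result carries one index for the output and two for the matrix.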
TensorFlow
Tensorflow_task.ipynb
Going deeper with TensorFlow 11 min
Practice Programming Assignment: MSE in TensorFlow 15 min
Gradients & optimization in TensorFlow 8 min
Programming Assignment: Logistic regression in TensorFlow 30 min


my1stNN boilerplate
Peer-graded Assignment: my1stNN 1h


Review Your Peers: my1stNN
Keras
Kerastask.ipynb
Keras introduction 10 min
Programming Assignment: my1stNN - Keras this time 1h
Philosophy of deep learning
What Deep Learning is and is not 8 min
Deep learning as a language 6 min
Optional Honors Content
Neural networks the hard way
NumpyNN (honor).ipynb
Peer-graded Assignment: Your very own neural network 2h
Review Your Peers: Your very own neural network
How to Win a Data Science Competition: Learn from Top Kagglers
Bayesian Methods for Machine Learning
Introduction to Reinforcement Learning
Deep Learning in Computer Vision
Natural Language Processing
Addressing Large Hadron Collider Challenges by Machine Learning