[TOC]

Welcome to the Nanodegree

Get started with learning about your Nanodegree: an introduction to Decision Trees, Naive Bayes, Linear and Logistic Regression, and Support Vector Machines. You can join the MLND student community by registering your email at https://mlnd-slack.udacity.com

WELCOME TO THE NANODEGREE

Welcome to MLND

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/HG5IYufgDAo.mp4

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/dc9CmcGTnr0.mp4

What is Machine Learning?

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/K45QM8Wi7BU.mp4

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/_N2iIB_bLXA.mp4

Applications of Machine Learning

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/kIM5D_W6Mh8.mp4

Connections to GA Tech

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/DysCmGKRpvs.mp4

Program Outline

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/m0cIDrRWyLw.mp4

What is ML

Introduction to Machine Learning

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/bYeteZQrUcE.mp4

Decision Trees

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/1RonLycEJ34.mp4

Decision Trees Quiz

QUIZ QUESTION

Between Gender and Age, which one seems more decisive for predicting which app the users will download?

• Gender
• Age

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/h8zH47iFhCo.mp4

Naive Bayes

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/jsLkVYXmr3E.mp4

Naive Bayes Quiz

QUIZ QUESTION

If an e-mail contains the word “cheap”, what is the probability of it being spam?

• 40%
• 60%
• 80%
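The quiz boils down to Bayes' rule: among all e-mails containing "cheap", what fraction are spam? A minimal sketch with made-up counts (these are NOT the figures from the video, just an illustration of the ratio):

```python
# Hypothetical counts -- not the video's actual figures.
spam_with_cheap = 12   # spam e-mails containing the word "cheap"
ham_with_cheap = 8     # non-spam e-mails containing the word "cheap"

# Among all e-mails containing "cheap", the fraction that are spam:
p_spam_given_cheap = spam_with_cheap / (spam_with_cheap + ham_with_cheap)
print(p_spam_given_cheap)  # 0.6
```

With real data the counts come from a labeled corpus, but the arithmetic is the same.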

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/YKN-fjuZ1VU.mp4

Linear Regression

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/BEC0uH1fuGU.mp4

Linear Regression Quiz

QUIZ QUESTION

What’s the best estimate for the price of a house?

• 80k
• 120k
• 190k

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/L5QBqYDNJn0.mp4

Logistic Regression Quiz

QUIZ QUESTION

Does the student get Accepted?

• Yes
• No

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/JuAJd9Qvs6U.mp4

Support Vector Machines

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/Fwnjx0s_AIw.mp4

Support Vector Machines Quiz

QUIZ QUESTION

Which one is a better line?

• The yellow line
• The blue line

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/JrUtTwfnsfM.mp4

Neural Networks

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/xFu1_2K2D2U.mp4

Kernel Method

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/x0JqH6-Dhvw.mp4

Kernel Method Quiz

QUIZ QUESTION

Which equation could come to our rescue?

• x+y
• xy
• x^2

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/dRFd6HaAXys.mp4
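To make the kernel-method idea concrete, here is a minimal sketch (with made-up 1-D points, not the lesson's data) of how mapping x to x^2 turns a non-separable arrangement into one a single threshold can split:

```python
# Four 1-D points: the two outer points belong to one class, the two
# inner points to the other, so no single threshold on x separates them.
xs = [-2.0, -1.0, 1.0, 2.0]
labels = [1, 0, 0, 1]

# After the mapping x -> x**2 the classes pull apart...
mapped = [x ** 2 for x in xs]          # [4.0, 1.0, 1.0, 4.0]

# ...and one threshold now classifies every point correctly.
preds = [1 if m > 2.5 else 0 for m in mapped]
print(preds == labels)  # True
```

This is the essence of the kernel trick: add a transformed feature so a linear separator works in the new space.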

Recap and Challenge

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/ecREasTrKu4.mp4

K-means Clustering

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/pv_i08zjpQw.mp4

Hierarchical Clustering

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/1PldDT8AwMA.mp4

Practice Project: Detect Spam

Practice Project: Using Naive Bayes to detect spam.

From time to time you will be encouraged to work on practice projects, which are aimed at deepening your understanding of the concepts being taught. In this practice project, you will implement the Naive Bayes algorithm to detect spam text messages (as taught by Luis earlier in the lesson) using an open-source dataset.

Here is the notebook; the solutions are included.
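As a rough sketch of the approach the notebook walks through (the actual dataset and helper code live in the notebook; the toy messages below are invented), scikit-learn's MultinomialNB pairs with a bag-of-words vectorizer:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny invented corpus standing in for the SMS spam dataset.
messages = ["win cash now",
            "cheap meds cheap cheap",
            "are we meeting today",
            "lunch tomorrow?"]
labels = [1, 1, 0, 0]   # 1 = spam, 0 = ham

vectorizer = CountVectorizer()            # bag-of-words token counts
X = vectorizer.fit_transform(messages)

clf = MultinomialNB()                     # Naive Bayes with Laplace smoothing
clf.fit(X, labels)

print(clf.predict(vectorizer.transform(["cheap cash now"])))
```

On the real dataset you would also split into training and test sets and score the classifier, as the notebook does.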

Summary

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/hJEuaOUu2yA.mp4

MLND Program Orientation

Before the Program Orientation

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/73CdKtS-IwU.mp4

Introduction

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/fxNSn63xFvA.mp4

Projects and Progress

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/Z9ZLMQWsbsk.mp4

Career Development

Being enrolled in one of Udacity’s Nanodegree programs has many careers-based perks. Our goal is to help you take your learning from this program and apply it in the real world and in your career.

As you venture through the Machine Learning Engineer Nanodegree program, you’ll have the opportunity to

• Update your resume through a peer-reviewed process using conventions that recruiters expect and get tips on how to best represent yourself to pass the “6 second screen”;
• Create a cover letter that portrays your soft and hard skills, and most importantly your passion for a particular job that you are interested in applying to;
• Get your GitHub and LinkedIn profiles reviewed through the lens of a recruiter or hiring manager, focusing on how your profile, projects, code, education and past experiences represent you as a potential candidate;
• Practice for a technical interview with a professional reviewer on a variety of topics;
• And more!

You can also find career workshops that Udacity has hosted over the years, where you can gain a plethora of information to prepare you for your ventures into a career. Udacity also provides job placement opportunities with many of our industry partners. To take advantage of this opportunity, fill out the career section of your Udacity professional profile, so we know more about you and your career goals! If all else fails, you can always default to emailing the career team at career-support@udacity.com.

Your Nanodegree community will play a huge role in supporting you when you get stuck and in helping you deepen your learning. Getting to know your fellow students will also make your experience a lot more fun!

To ask and answer questions, and to contribute to discussions, head to your program forum. You can get there by clicking the Discussion link in the classroom and in the Resources tab in your Udacity Home. You can search to see if someone has already asked a question related to yours, or you can make a new post if no one has. Chances are, someone else is wondering about the same thing you are, so don’t be shy!

In addition, students may connect with one another through Slack, a team-oriented chat program. You can join the MLND Slack student community by following this link and registering your email. There are many content-related channels where you can speak with students about a particular concept, and even discuss your first week in the program using the #first-week-experience channel. In addition, you can talk with MLND graduates and alumni to get a live perspective on the program in the #ask-alumni channel! You can find the student-run community wiki here.

Support from the Udacity Team

The Udacity team is here to help you reach your Nanodegree program goals! You can interact with us in the following ways:

• Forums: Along with your student community, the Udacity team maintains a strong presence in the forum to help make sure your questions get answered and to connect you with other useful resources.
• 1-on-1 Appointments: If you get stuck working on a project in the program, our mentors are here to help! You can set up a half-hour appointment with a mentor available for the project at a time you choose to get assistance.
• Project Reviews: During the project submission process, your submissions will be reviewed by a qualified member of our support team, who will provide comments and helpful feedback on where your submission is strongest, and where your submission needs improvement. The reviews team will support your submissions all the way up to meeting specifications!
• By email: You can always contact the Machine Learning team with support-related questions using machine-support@udacity.com. Please make sure that you have exhausted all other options before doing so!
Find out more about the support we offer using the Resources tab in your Udacity Nanodegree Home.

How Does Project Submission Work?

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/jCJa_VP6qgg.mp4

Integrity and Mindset

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/zCOr3O50gQM.mp4

How Do I Find Time for My Nanodegree?

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/d-VfUw7wNEQ.mp4
All calendar applications now let you set up a weekly reminder. I have included a screen capture below of how to set one up in Google Calendar. We recommend coming into the classroom at least twice a week. It is a best practice to set up at least one repeating weekly reminder to continue the Nanodegree program.

Final Tips

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/1ZVBvM54hQw.mp4

Wrapping Up the Program Orientation

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/xujb3Rqxuog.mp4
You now have all the info you need to proceed on your Nanodegree journey!

• If you have any further questions, perhaps about payment or enrollment, read your Nanodegree Student Handbook for more details.
• Remember to put in time consistently, engage with your community, take advantage of the resources available to you, and give us feedback throughout the program.

(Optional) Exploratory Project

Software Requirements

1. Press Windows + R and open a command prompt.
2. Type pip install --user pandas jupyter.
3. Oops, an error: Microsoft Visual C++ 9.0 is required. Get it from http://aka.ms/vcpython27.
4. Install it, re-run the command, and it completes successfully.

Starting the Project

First try

1. Press Windows + R.
2. Type cd <path>; my path is G:\Udacity\MLND\machine-learning-master\projects\titanic_survival_exploration.
3. Type g: to switch to that drive.
4. Type bash jupyter notebook titanic_survival_exploration.ipynb; Windows shows "'bash' is not recognized as an internal or external command, or an operable program".
5. Failed.

Second try

1. Open Git Bash.
2. Type cd <path>, using / between directories: /G:/Udacity/MLND/machine-learning-master/projects/titanic_survival_exploration.
3. Type bash jupyter notebook titanic_survival_exploration.ipynb.
4. Failed.

Third try

1. Install Anaconda.
2. Press Windows + R.
3. Type cd <path>; my path is G:\Udacity\MLND\machine-learning-master\projects\titanic_survival_exploration.
4. Type g: to switch to that drive.
5. Type jupyter notebook titanic_survival_exploration.ipynb.
6. Done.

Fourth try

1. Open Git Bash.
2. Type cd <path>, using / between directories: /G:/Udacity/MLND/machine-learning-master/projects/titanic_survival_exploration.
3. Type jupyter notebook titanic_survival_exploration.ipynb.
4. Done.

Question 4 (stay tuned):

1. Pclass == 3

Career: Orientation

Throughout your Nanodegree program, you will see Career Development Lessons and Projects that will help ensure you’re presenting your new skills best during your job search. In this short lesson, meet the Careers team and learn about the career resources available to you as a Nanodegree student.

If you are a Nanodegree Plus student, Career Content and Career Development Projects are required for graduation.

If you are enrolled in a standard Nanodegree program, Career Content and Career Development Projects are optional and do not affect your graduation.

ORIENTATION

Career Services Available to You

Meet the Careers Team

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/oR1IxPTTz0U.mp4

Resources

Connect to Hiring Partners through your Udacity Professional Profile

In addition to the Career Lessons and Projects you’ll find in your Nanodegree program, you have a Udacity Professional Profile linked in the left sidebar.
Your Udacity Professional Profile features important, professional information about yourself. When you make your profile public, it becomes accessible to our Hiring Partners, as well as to recruiters and hiring managers who come to Udacity to hire skilled Nanodegree graduates.

As you complete projects in your Nanodegree program, they will be automatically added to your Udacity Professional Profile to ensure you’re able to show employers the skills you’ve gained through the program. In order to differentiate yourself from other candidates, make sure to go in and customize those project cards. In addition to these projects, be sure to:

• Keep your profile updated with your basic info and job preferences, such as location
• Return regularly to your Profile to update your projects and ensure you’re showcasing your best work

If you are looking for a job, make sure to keep your Udacity Professional Profile updated and visible to recruiters!

Model Evaluation and Validation

Apply statistical analysis tools to model observed data, and gauge how well your models perform.

Project: Predicting Boston Housing Prices

For most students, this project takes approximately 8 - 15 hours to complete (about 1 - 3 weeks).

P1 Predicting Boston Housing Prices

STATISTICAL ANALYSIS

Intro: Model Evaluation and Validation

Intro to Model Evaluation and Validation

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/cseqEWRDs5Q.mp4

Model Evaluation What You’ll Watch

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/jYZO17CeZDI.mp4

Model Evaluation What You’ll Learn

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/ZLOucNwuqCk.mp4

Course Outline - Fitting It All Together

The ultimate goal of Machine Learning is to have data models that can learn and improve over time. In essence, machine learning is making inferences on data from previous examples.

In this first section we review some basic statistics and numerical tools to manipulate & process our data.

Then we will move on to modeling data; reviewing different data types and seeing how they play out in the case of one specific dataset. The section ends by introducing the basic tool of a supervised learning algorithm.

Next, we’ll see how to use our dataset for both training and testing data, and review various tools for how to evaluate how well an algorithm performs.

Finally, we’ll look at the reasons that errors arise, and the relationship between adding more data and adding more complexity in getting good predictions. The last section ends by introducing cross validation, a powerful meta-tool for helping us use our tools correctly.

Model Evaluation What You’ll Do

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/kJCAuHjWiOA.mp4

Prerequisites

Statistics Review & Supporting Libraries

In this section we will go over some prerequisites for this course, review basic statistics concepts and problem sets, and finally teach you how to use some useful data analysis Python libraries to explore real-life datasets using the concepts you reviewed earlier.

Prerequisites

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/0ANDJ8i_deE.mp4
Here are shortcuts to the prerequisite statistics courses:

You will also need to have just a little bit of git experience — enough to check out our code repository. If you’ve ever used git before, you should be fine. If this is truly your first time with git, once you get to the first mini-project, you may want to quickly look at the first lesson of Udacity’s git course.

1. mode, mean
2. variance, standard deviation
3. Bessel's correction: use n-1 instead of n
4. sample standard deviation (SD)
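The third and fourth notes above can be made concrete. A minimal sketch of sample variance and sample standard deviation with Bessel's correction (dividing by n-1 rather than n, which makes the variance estimate unbiased):

```python
import math

def sample_variance(xs):
    """Unbiased sample variance: divide by n - 1 (Bessel's correction)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # mean is 5.0
var = sample_variance(data)   # 32 / 7, about 4.571
sd = math.sqrt(var)           # sample SD, about 2.138
print(var, sd)
```

Dividing by n instead would give the (biased) population variance, 32 / 8 = 4.0 for this data.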

Measures of Central Tendency

Introduction: Topics Covered

Measures of Central Tendency

In this lesson, we will cover the following topics:

• Mean
• Median
• Mode

This lesson is meant to be a refresher for those with no statistics background; if you are already familiar with these concepts, you may skip this lesson.

Which Major?

Quiz: Which Major?

Software and Libraries

This project uses the following software and Python libraries:

Python 2.7
NumPy
pandas
scikit-learn (v0.17)
matplotlib

You will also need to have software installed to run and execute a Jupyter Notebook.

If you do not have Python installed yet, it is highly recommended that you install the Anaconda distribution of Python, which already has the above packages and more included. Make sure that you select the Python 2.7 installer and not the Python 3.x installer.

Starting the Project

For this assignment, you can find the finding_donors folder containing the necessary project files on the Machine Learning projects GitHub, under the projects folder. You may download all of the files for projects we’ll use in this Nanodegree program directly from this repo. Please make sure that you use the most recent version of project files when completing a project!

This project contains three files:

• finding_donors.ipynb: This is the main file where you will be performing your work on the project.
• census.csv: The project dataset. You'll load this data in the notebook.
• visuals.py: This Python script provides supplementary visualizations for the project. Do not modify.

In the Terminal or Command Prompt, navigate to the folder containing the project files, and then use the command jupyter notebook finding_donors.ipynb to open up a browser window or tab to work with your notebook. Alternatively, you can use the command jupyter notebook or ipython notebook and navigate to the notebook file in the browser window that opens. Follow the instructions in the notebook and answer each question presented to successfully complete the project. A README file has also been provided with the project files which may contain additional necessary information or instruction for the project.

Submitting the Project

Evaluation

Your project will be reviewed by a Udacity reviewer against the Finding Donors for CharityML project rubric. Be sure to review this rubric thoroughly and self-evaluate your project before submission. All criteria found in the rubric must be meeting specifications for you to pass.

Submission Files

When you are ready to submit your project, collect the following files and compress them into a single archive for upload. Alternatively, you may supply the following files on your GitHub Repo in a folder named finding_donors for ease of access:

• The finding_donors.ipynb notebook file with all questions answered and all code cells executed and displaying output.
• An HTML export of the project notebook with the name report.html. This file must be present for your project to be evaluated.
Once you have collected these files and reviewed the project rubric, proceed to the project submission page.

Submission

Finding Donors for CharityML

CharityML is a fictitious charity organization located in the heart of Silicon Valley that was established to provide financial support for people eager to learn machine learning. After nearly 32,000 letters sent to people in the community, CharityML determined that every donation they received came from someone that was making more than $50,000 annually. To expand their potential donor base, CharityML has decided to send letters to residents of California, but only to those most likely to donate to the charity. With nearly 15 million working Californians, CharityML has brought you on board to help build an algorithm to best identify potential donors and reduce the overhead cost of sending mail. Your goal will be to evaluate and optimize several different supervised learners to determine which algorithm will provide the highest donation yield while also reducing the total number of letters being sent.

Project Files

For this assignment, you can find the finding_donors folder containing the necessary project files on the Machine Learning projects GitHub, under the projects folder. You may download all of the files for projects we'll use in this Nanodegree program directly from this repo. Please make sure that you use the most recent version of project files when completing a project!

Evaluation

Your project will be reviewed by a Udacity reviewer against the Finding Donors for CharityML project rubric. Be sure to review this rubric thoroughly and self-evaluate your project before submission. All criteria found in the rubric must be meeting specifications for you to pass.

Submission Files

When you are ready to submit your project, collect the following files and compress them into a single archive for upload. Alternatively, you may supply the following files on your GitHub Repo in a folder named finding_donors for ease of access:

• The finding_donors.ipynb notebook file with all questions answered and all code cells executed and displaying output.
• An HTML export of the project notebook with the name report.html. This file must be present for your project to be evaluated.

I'm Ready!

When you're ready to submit your project, click on the Submit Project button at the bottom of the page. If you are having any problems submitting your project or wish to check on the status of your submission, please email us at machine-support@udacity.com or visit us in the discussion forums.

What's Next?

You will get an email as soon as your reviewer has feedback for you. In the meantime, review your next project and feel free to get started on it or the courses supporting it!

PROJECT

Unsupervised Learning

Learn how to find patterns and structures in unlabeled data, perform feature transformations and improve the predictive performance of your models.

Project: Creating Customer Segments

For most students, this project takes approximately 10 - 15 hours to complete (about 1 - 2 weeks).

P3 Creating Customer Segments

CLUSTERING

Introduction to Unsupervised Learning

Unsupervised Learning

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/8oZpT6Hekhk.mp4

What You'll Watch and Learn

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/1a68kAJAgIU.mp4

Clustering

Unsupervised Learning

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/Mx9f99bRB3Q.mp4

Clustering Movies

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/g8PKffm8IRY.mp4

More Clustering

Single Linkage Clustering

Quiz: Single Linkage Clustering

Please use a comma to separate the two objects that will be linked in your answer. For instance, to describe a link from a to b, write "a,b" as your answer in the box.

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/vytc9CsjjAs.mp4

Single Linkage Clustering Two

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/aojgUed9M0w.mp4

Clustering Mini-Project

Clustering Mini-Project Video

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/68EGMItJiNM.mp4

K-Means Clustering Mini-Project

In this project, we'll apply k-means clustering to our Enron financial data.
Our final goal, of course, is to identify persons of interest; since we have labeled data, this is not a question that particularly calls for an unsupervised approach like k-means clustering. Nonetheless, you'll get some hands-on practice with k-means in this project, and play around with feature scaling, which will give you a sneak preview of the next lesson's material. The Enron dataset can be found here.

Clustering Features

The starter code can be found in k_means/k_means_cluster.py, which reads in the email + financial (E+F) dataset and gets us ready for clustering. You'll start with performing k-means based on just two financial features; take a look at the code, and determine which features the code uses for clustering. Run the code, which will create a scatterplot of the data. Think a little bit about what clusters you would expect to arise if 2 clusters are created.

Deploying Clustering

Deploy k-means clustering on the financial_features data, with 2 clusters specified as a parameter. Store your cluster predictions to a list called pred, so that the Draw() command at the bottom of the script works properly. In the scatterplot that pops up, are the clusters what you expected?
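The "Deploying Clustering" step can be sketched as follows; the 2-D points below are invented stand-ins for the two financial features the starter code extracts from the E+F dataset:

```python
import numpy as np
from sklearn.cluster import KMeans

# Invented stand-in for the two financial features (e.g. salary vs.
# exercised stock options) that the starter code prepares.
finance_features = np.array([[1.0, 1.2],
                             [0.9, 1.0],
                             [8.0, 9.0],
                             [8.5, 8.8]])

# k-means with 2 clusters, as the mini-project specifies.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
pred = kmeans.fit_predict(finance_features)   # one cluster label per point
print(pred)
```

In the actual mini-project, `pred` is what the script's Draw() command consumes to color the scatterplot.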
FEATURE ENGINEERING

Feature Scaling

Chris's T-Shirt Size (Intuition)

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/oaqjLyiKOIA.mp4

A Metric for Chris

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/O0bvLU4l0is.mp4

Feature Selection

Introduction

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/UAMwTr3cnok.mp4

Feature Selection

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/8CpRLplmdqE.mp4

DIMENSIONALITY REDUCTION

PCA

Data Dimensionality

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/gg7SAMMl4kM.mp4

Trickier Data Dimensionality

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/-dcNhrSPmoY.mp4

PCA Mini-Project

PCA Mini-Project Intro

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/rR68JXwKBxE.mp4

PCA Mini-Project

Our discussion of PCA spent a lot of time on theoretical issues, so in this mini-project we'll ask you to play around with some sklearn code. The eigenfaces code is interesting and rich enough to serve as the testbed for this entire mini-project. The starter code can be found in pca/eigenfaces.py. This was mostly taken from the example found here, on the sklearn documentation.

Take note when running the code that there are changes in one of the parameters for the SVC function called on line 94 of pca/eigenfaces.py. For the 'class_weight' parameter, the argument string "auto" is a valid value for sklearn version 0.16 and prior, but will be deprecated by 0.19. If you are running sklearn version 0.17 or later, the expected argument string should be "balanced". If you get an error or warning when running pca/eigenfaces.py, make sure that you have the correct argument on line 98 that matches your installed version of sklearn.
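Separate from the eigenfaces starter code, a minimal sklearn PCA sketch on synthetic data shows the pattern the mini-project exercises (fit, inspect explained variance, transform); the data and dimensions below are invented for illustration:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.randn(100, 5)
X[:, 1] = 2.0 * X[:, 0] + 0.1 * rng.randn(100)   # plant one dominant direction

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)   # project onto the top 2 components

print(X_reduced.shape)                 # (100, 2)
print(pca.explained_variance_ratio_)  # components sorted by variance explained
```

The eigenfaces code does the same thing on face images, then feeds the reduced features to an SVC classifier.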
Feature Transformation

Introduction

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/J9JsMNownYM.mp4

Feature Transformation

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/B6mPphwAXZk.mp4

Summary

What we have learned

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/74oyGTdFp0Y.mp4

UNSUPERVISED LEARNING PROJECT

Things you will learn by completing this project:

• How to apply preprocessing techniques such as feature scaling and outlier detection.
• How to interpret data points that have been scaled, transformed, or reduced from PCA.
• How to analyze PCA dimensions and construct a new feature space.
• How to optimally cluster a set of data to find hidden patterns in a dataset.
• How to assess information given by cluster data and use it in a meaningful way.

jupyter notebook customer_segments.ipynb

Notes: Creating Customer Segments project rubric; submit project; review; second review; seaborn.heatmap()

Git notes for submitting the project:

1. Open Git Bash and type cd <path> (escape spaces with \ and use / between directories).
2. git init, git status, git add <file>, git commit -m "description"
3. Create a repo on GitHub.
4. git remote add origin URL
5. git push origin master
6. For later updates: git add <file>, git commit -m "second submit", git push origin master

Identifying customers by clustering them.

Overview

Project Overview

In this project you will apply unsupervised learning techniques on product spending data collected for customers of a wholesale distributor in Lisbon, Portugal to identify customer segments hidden in the data. You will first explore the data by selecting a small subset to sample and determine if any product categories highly correlate with one another. Afterwards, you will preprocess the data by scaling each product category and then identifying (and removing) unwanted outliers. With the good, clean customer spending data, you will apply PCA transformations to the data and implement clustering algorithms to segment the transformed customer data.
Finally, you will compare the segmentation found with an additional labeling and consider ways this information could assist the wholesale distributor with future service changes.

Project Highlights

This project is designed to give you a hands-on experience with unsupervised learning and work towards developing conclusions for a potential client on a real-world dataset. Many companies today collect vast amounts of data on customers and clientele, and have a strong desire to understand the meaningful relationships hidden in their customer base. Being equipped with this information can assist a company engineer future products and services that best satisfy the demands or needs of their customers.

Things you will learn by completing this project:

• How to apply preprocessing techniques such as feature scaling and outlier detection.
• How to interpret data points that have been scaled, transformed, or reduced from PCA.
• How to analyze PCA dimensions and construct a new feature space.
• How to optimally cluster a set of data to find hidden patterns in a dataset.
• How to assess information given by cluster data and use it in a meaningful way.

Software Requirements

Description

A wholesale distributor recently tested a change to their delivery method for some customers, by moving from a morning delivery service five days a week to a cheaper evening delivery service three days a week. Initial testing did not discover any significant unsatisfactory results, so they implemented the cheaper option for all customers. Almost immediately, the distributor began getting complaints about the delivery service change and customers were canceling deliveries, losing the distributor more money than what was being saved. You've been hired by the wholesale distributor to find what types of customers they have to help them make better, more informed business decisions in the future.
Your task is to use unsupervised learning techniques to see if any similarities exist between customers, and how to best segment customers into distinct categories.

Software and Libraries

This project uses the following software and Python libraries:

Python 2.7
NumPy
pandas
scikit-learn (v0.17)
matplotlib

You will also need to have software installed to run and execute a Jupyter Notebook.

If you do not have Python installed yet, it is highly recommended that you install the Anaconda distribution of Python, which already has the above packages and more included. Make sure that you select the Python 2.7 installer and not the Python 3.x installer.

Starting the Project

For this assignment, you can find the customer_segments folder containing the necessary project files on the Machine Learning projects GitHub, under the projects folder. You may download all of the files for projects we'll use in this Nanodegree program directly from this repo. Please make sure that you use the most recent version of project files when completing a project!

This project contains three files:

• customer_segments.ipynb: This is the main file where you will be performing your work on the project.
• customers.csv: The project dataset. You'll load this data in the notebook.
• visuals.py: This Python script provides supplementary visualizations for the project. Do not modify.

In the Terminal or Command Prompt, navigate to the folder containing the project files, and then use the command jupyter notebook customer_segments.ipynb to open up a browser window or tab to work with your notebook. Alternatively, you can use the command jupyter notebook or ipython notebook and navigate to the notebook file in the browser window that opens. Follow the instructions in the notebook and answer each question presented to successfully complete the project. A README file has also been provided with the project files which may contain additional necessary information or instruction for the project.
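The preprocessing pipeline the notebook walks through (log-scaling, outlier removal, PCA, clustering) can be sketched on synthetic spending data; the column names, Tukey 1.5*IQR rule, and use of KMeans below are assumptions for illustration, not the notebook's exact code:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.RandomState(42)
# Synthetic stand-in for customers.csv: heavy-tailed spending per category.
data = pd.DataFrame(rng.lognormal(mean=8.0, sigma=1.0, size=(200, 3)),
                    columns=["Fresh", "Milk", "Grocery"])

log_data = np.log(data)   # log-scale the heavy-tailed spending features

# Tukey's rule: keep rows inside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] in every column.
q1, q3 = log_data.quantile(0.25), log_data.quantile(0.75)
iqr = q3 - q1
keep = ((log_data >= q1 - 1.5 * iqr) & (log_data <= q3 + 1.5 * iqr)).all(axis=1)
good_data = log_data[keep]

# Reduce to 2 principal components, then cluster the reduced data.
reduced = PCA(n_components=2).fit_transform(good_data)
pred = KMeans(n_clusters=2, n_init=10, random_state=42).fit_predict(reduced)
print(len(pred), reduced.shape)
```

The notebook additionally compares Gaussian mixture models against k-means and interprets the cluster centers in the original spending units.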
Submitting the Project

Evaluation

Your project will be reviewed by a Udacity reviewer against the Creating Customer Segments project rubric. Be sure to review this rubric thoroughly and self-evaluate your project before submission. All criteria found in the rubric must be meeting specifications for you to pass.

Submission Files

When you are ready to submit your project, collect the following files and compress them into a single archive for upload. Alternatively, you may supply the following files on your GitHub Repo in a folder named customer_segments for ease of access:

• The customer_segments.ipynb notebook file with all questions answered and all code cells executed and displaying output.
• An HTML export of the project notebook with the name report.html. This file must be present for your project to be evaluated.

Once you have collected these files and reviewed the project rubric, proceed to the project submission page.

Submission

Creating Customer Segments

A wholesale distributor recently tested a change to their delivery method for some customers, by moving from a morning delivery service five days a week to a cheaper evening delivery service three days a week. Initial testing did not discover any significant unsatisfactory results, so they implemented the cheaper option for all customers. Almost immediately, the distributor began getting complaints about the delivery service change and customers were canceling deliveries, losing the distributor more money than what was being saved. You've been hired by the wholesale distributor to find what types of customers they have to help them make better, more informed business decisions in the future. Your task is to use unsupervised learning techniques to see if any similarities exist between customers, and how to best segment customers into distinct categories.
Project Files
For this assignment, you can find the customer_segments folder containing the necessary project files on the Machine Learning projects GitHub, under the projects folder. You may download all of the files for projects we'll use in this Nanodegree program directly from this repo. Please make sure that you use the most recent version of project files when completing a project!

Evaluation
Your project will be reviewed by a Udacity reviewer against the Creating Customer Segments project rubric. Be sure to review this rubric thoroughly and self-evaluate your project before submission. All criteria found in the rubric must meet specifications for you to pass.

Submission Files
When you are ready to submit your project, collect the following files and compress them into a single archive for upload. Alternatively, you may supply the following files on your GitHub repo in a folder named customer_segments for ease of access:
• The customer_segments.ipynb notebook file with all questions answered and all code cells executed and displaying output.
• An HTML export of the project notebook with the name report.html. This file must be present for your project to be evaluated.

I'm Ready!
When you're ready to submit your project, click on the Submit Project button at the bottom of the page. If you are having any problems submitting your project or wish to check on the status of your submission, please email us at machine-support@udacity.com or visit us in the discussion forums.

What's Next?
You will get an email as soon as your reviewer has feedback for you. In the meantime, review your next project and feel free to get started on it or the courses supporting it!

Supporting Materials
Videos Zip File

Career: Networking
In the following lesson, you will learn how to tell your unique story to recruiters in a succinct and professional but relatable way.
After completing these lessons, be sure to complete the online profile review projects, such as LinkedIn Profile Review. If you are a Nanodegree Plus student, Career Content and Career Development Projects are required for graduation. If you are enrolled in a standard Nanodegree program, Career Content and Career Development Projects are optional and do not affect your graduation.

NETWORKING

Develop Your Personal Brand

Why Network?
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/exjEm9Paszk.mp4

Elevator Pitch
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/S-nAHPrkQrQ.mp4

Personal Branding

How to Stand Out
Imagine you're a hiring manager for a company, and you need to pick 5 people to interview for a role. But you get 50 applications, and everyone seems pretty qualified. How do you compare job candidates? You'll probably pick the candidates that stand out the most to you.

Personal Stories
The thing that always makes a job candidate unique is their personal story: their passion and how they got there. Employers aren't just looking for someone with the skills; they're looking for someone who can drive the company's mission and be a part of innovation. That's why they need to know your work ethic and what drives you. As someone wanting to impress an employer, you need to tell your personal story. You want employers to know how you solve problems, overcome challenges, and achieve results. You want employers to know what excites you, what motivates you, what drives you forward. All of this can be achieved through effective storytelling and effective branding. I'll let you know I've branded and rebranded myself many times. That's okay: people are complex and have multiple interests that change over time. In this next video, we'll meet my coworker Chris, who will show us how he used personal branding to help him in his recent career change.
Resources
Blog post: Storytelling, Personal Branding, and Getting Hired

Meet Chris
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/0ccflD9x5WU.mp4

Resources
Blog post: Overcome Imposter Syndrome

Elevator Pitch
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/0QtgTG49E9I.mp4

Pitching to a Recruiter
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/LxAdWaA-qTQ.mp4

Use Your Elevator Pitch
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/e-v60ieggSs.mp4

Optimize Your LinkedIn Profile

Why LinkedIn
LinkedIn is the most popular professional networking platform out there, so most recruiters use it to find job seekers. It's so common for hiring teams to use LinkedIn to find and look at candidates that it's almost a red flag if they're unable to find a LinkedIn profile for you. It's also a great platform for you to connect with other people in your field. Udacity, for example, has an Alumni LinkedIn group where graduates can collaborate on projects, practice job interviews, or discuss new trends in the industry together. Connecting with a fellow alum and asking for a referral would increase your chances of getting an interview.

Find Connections
The best way to use your LinkedIn effectively, however, is to have over 500 connections. This may seem like a lot, but once you get rolling, you'll get to that number fast. After you actively start using it, by joining groups and going to networking events, your number of connections will climb. You are more likely to show up in search results on LinkedIn if you have more connections, which means you'll be more visible to recruiters.

Join Groups
Increasing the group of people you're connected with also exposes you to what they're working on or have done. For example, if you move to a new city, you can search your network to see who lives in the area, and ask for recommendations on apartment hunting, job leads, or other advice on adjusting to life in another city.
Also, if you're active in a LinkedIn group or if you frequently write LinkedIn blog posts, you'll increase your visibility on the platform and the likelihood that a recruiter will find your profile.

How to Build Your LinkedIn Profile
LinkedIn guides you well when filling out your profile. It tells you if your profile is strong and offers recommendations on how to improve it. We recommend you follow LinkedIn's advice because it'll increase your visibility on the network, thus increasing the number of opportunities you may come across.

Tips for an Awesome LinkedIn Profile
In the lessons on conducting a successful job search and resume writing, we talk about how you can describe your work experiences in a way that targets a specific job. Use what you learn to describe your experiences in LinkedIn's projects and work sections. You can even copy and paste the bullet points in your resume over to the work or project sections of LinkedIn. Making sure your resume and LinkedIn are consistent helps build your personal brand.

Find Other Networking Platforms
Remember that LinkedIn isn't the only professional networking platform out there. If you do have a great LinkedIn profile, that means you can also build an amazing profile on other platforms. Find some recommendations for online profiles on the Career Resource Center.

Up Next
By now, you know how to target your job profile to your dream job. You know how to market yourself effectively through building off your elevator pitch. Being confident in this will help you network naturally, whether on LinkedIn or at an event in person. Move on to the LinkedIn Profile Review and get personalized feedback on your online presence.

GitHub Profile Review
LinkedIn Profile Review
Udacity Professional Profile Review

Reinforcement Learning
DUE OCT 19
Use Reinforcement Learning algorithms like Q-Learning to train artificial agents to take optimal actions in an environment.
Project: Train a Smartcab to Drive
For most students, this project takes approximately 15 - 21 hours to complete (about 2 - 3 weeks).

P4 Train a Smartcab to Drive

Markov Decision Processes
• Further details on this quiz can be found in Chapter 17 of Artificial Intelligence: A Modern Approach

REINFORCEMENT LEARNING
• Andrew Moore's slides on Zero-Sum Games
• Andrew Moore's slides on Non-Zero-Sum Games
• This paper offers a summary and an investigation of the field of reinforcement learning. It's long, but chock-full of information!

PROJECT
Software Requirements
Common Problems with PyGame
Train a Smartcab to Drive project rubric
submit
review
(Windows: press Windows + R, then type pip install pygame)

REINFORCEMENT LEARNING

Introduction to Reinforcement Learning

Reinforcement Learning
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/PeAHckcWFS0.mp4

What You'll Watch and Learn
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/Z6ATPu4b9nc.mp4

Reinforcement Learning: What You'll Do
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/1vQQphPLnkM.mp4

Markov Decision Processes: Introduction
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/_ocNerSvh5Y.mp4

Reinforcement Learning
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/HeYSFWPX_4k.mp4

Rat Dinosaurs
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/h7ExhVneBDU.mp4

GAME THEORY

Game Theory
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/vYHk1SPpnmQ.mp4

What Is Game Theory?
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/jwlteKFyiHU.mp4

PROJECT
Train a cab to drive itself.

Overview

Software Requirements

Description
In the not-so-distant future, taxicab companies across the United States no longer employ human drivers to operate their fleet of vehicles. Instead, the taxicabs are operated by self-driving agents, known as smartcabs, to transport people from one location to another within the cities those companies operate.
In major metropolitan areas, such as Chicago, New York City, and San Francisco, an increasing number of people have come to depend on smartcabs to get to where they need to go as safely and reliably as possible. Although smartcabs have become the transport of choice, concerns have arisen that a self-driving agent might not be as safe or reliable as human drivers, particularly when considering city traffic lights and other vehicles. To alleviate these concerns, your task as an employee for a national taxicab company is to use reinforcement learning techniques to construct a demonstration of a smartcab operating in real-time to prove that both safety and reliability can be achieved.

Software Requirements
This project uses the following software and Python libraries:
• Python 2.7
• NumPy
• pandas
• matplotlib
• PyGame

If you do not have Python installed yet, it is highly recommended that you install the Anaconda distribution of Python, which already has the above packages and more included. Make sure that you select the Python 2.7 installer and not the Python 3.x installer. PyGame can then be installed using one of the following commands:

Mac: conda install -c https://conda.anaconda.org/quasiben pygame
Linux: conda install -c https://conda.anaconda.org/tlatorre pygame
Windows: conda install -c https://conda.anaconda.org/prkrekel pygame

Please note that installing PyGame can be done using pip as well. You can run an example to make sure PyGame is working before actually performing the project by running:

python -m pygame.examples.aliens

Common Problems with PyGame

Fixing Common PyGame Problems
The PyGame library can in some cases require a bit of troubleshooting to work correctly for this project. While the PyGame aspect of the project is not required for a successful submission (you can complete the project without a visual simulation, although it is more difficult), it is very helpful to have it working!
If you encounter an issue with PyGame, first see these helpful links below that are developed by communities of users working with the library:

Problems most often reported by students

"PyGame won't install on my machine; there was an issue with the installation."
Solution: As has been recommended for previous projects, Udacity suggests that you use the Anaconda distribution of Python, which then allows you to install PyGame through the conda-specific command.

"I'm seeing a black screen when running the code; output says that it can't load car images."
Solution: The code will not operate correctly unless it is run from the top-level directory for smartcab. The top-level directory is the one that contains the README and the project notebook.

If you continue to have problems with the project code in regards to PyGame, you can also use the discussion forums to find posts from students that encountered issues that you may be experiencing. Additionally, you can seek help from a swath of students in the MLND Student Slack Community.

Starting the Project
For this assignment, you can find the smartcab folder containing the necessary project files on the Machine Learning projects GitHub, under the projects folder. You may download all of the files for projects we'll use in this Nanodegree program directly from this repo. Please make sure that you use the most recent version of project files when completing a project! This project contains three directories:
• /logs/: This folder will contain all log files that are given from the simulation when specific prerequisites are met.
• /images/: This folder contains various images of cars to be used in the graphical user interface. You will not need to modify or create any files in this directory.
• /smartcab/: This folder contains the Python scripts that create the environment, graphical user interface, the simulation, and the agents. You will not need to modify or create any files in this directory except for agent.py.
It also contains two files:
• smartcab.ipynb: This is the main file where you will answer questions and provide an analysis for your work.
• visuals.py: This Python script provides supplementary visualizations for the analysis. Do not modify.

Finally, in /smartcab/ are the following four files:
• Modify:
  • agent.py: This is the main Python file where you will be performing your work on the project.
• Do not modify:
  • environment.py: This Python file will create the smartcab environment.
  • planner.py: This Python file creates a high-level planner for the agent to follow towards a set goal.
  • simulator.py: This Python file creates the simulation and graphical user interface.

Running the Code
In a terminal or command window, navigate to the top-level project directory smartcab/ (that contains the three project directories) and run one of the following commands:

python smartcab/agent.py
or
python -m smartcab.agent

This will run the agent.py file and execute your implemented agent code in the environment. Additionally, use the command jupyter notebook smartcab.ipynb from this same directory to open up a browser window or tab to work with your analysis notebook. Alternatively, you can use the command jupyter notebook or ipython notebook and navigate to the notebook file in the browser window that opens. Follow the instructions in the notebook and answer each question presented to successfully complete the implementation necessary for your agent.py agent file. A README file has also been provided with the project files which may contain additional necessary information or instruction for the project.

Definitions

Environment
The smartcab operates in an ideal, grid-like city (similar to New York City), with roads going in the North-South and East-West directions. Other vehicles will certainly be present on the road, but there will be no pedestrians to be concerned with.
At each intersection there is a traffic light that either allows traffic in the North-South direction or the East-West direction. U.S. Right-of-Way rules apply:
• On a green light, a left turn is permitted if there is no oncoming traffic making a right turn or coming straight through the intersection.
• On a red light, a right turn is permitted if no oncoming traffic is approaching from your left through the intersection.

To understand how to correctly yield to oncoming traffic when turning left, you may refer to this official drivers' education video, or this passionate exposition.

Inputs and Outputs
Assume that the smartcab is assigned a route plan based on the passengers' starting location and destination. The route is split at each intersection into waypoints, and you may assume that the smartcab, at any instant, is at some intersection in the world. Therefore, the next waypoint to the destination, assuming the destination has not already been reached, is one intersection away in one direction (North, South, East, or West). The smartcab has only an egocentric view of the intersection it is at: it can determine the state of the traffic light for its direction of movement, and whether there is a vehicle at the intersection for each of the oncoming directions. For each action, the smartcab may either idle at the intersection, or drive to the next intersection to the left, right, or ahead of it. Finally, each trip has a time to reach the destination which decreases for each action taken (the passengers want to get there quickly). If the allotted time becomes zero before reaching the destination, the trip has failed.

Rewards and Goal
The smartcab will receive positive or negative rewards based on the action it has taken. As expected, the smartcab will receive a small positive reward when making a good action, and a varying amount of negative reward dependent on the severity of the traffic violation it would have committed.
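The reward scheme described above is exactly what a Q-learning agent consumes. A minimal sketch of the core update and an epsilon-greedy action choice; the state encoding, reward values, and hyperparameters here are illustrative stand-ins, not the project's actual ones:

```python
import random
from collections import defaultdict

# Q-table mapping (state, action) -> learned value; unseen pairs default to 0.
Q = defaultdict(float)
ACTIONS = [None, 'forward', 'left', 'right']  # idle, or drive one way
alpha, gamma, epsilon = 0.5, 0.9, 0.1         # learning rate, discount, exploration

def choose_action(state):
    # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def learn(state, action, reward, next_state):
    # Standard Q-learning update:
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
```

In the project, the state would be built from the inputs listed above (light color, oncoming traffic, next waypoint), and the environment supplies the rewards after each action.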
Based on the rewards and penalties the smartcab receives, the self-driving agent implementation should learn an optimal policy for driving on the city roads while obeying traffic rules, avoiding accidents, and reaching passengers' destinations in the allotted time.

Submitting the Project

Evaluation
Your project will be reviewed by a Udacity reviewer against the Train a Smartcab to Drive project rubric. Be sure to review this rubric thoroughly and self-evaluate your project before submission. All criteria found in the rubric must meet specifications for you to pass.

Submission Files
When you are ready to submit your project, collect the following files and compress them into a single archive for upload. Alternatively, you may supply the following files on your GitHub repo in a folder named smartcab for ease of access:
• The agent.py Python file with all code implemented as required in the instructed tasks.
• The /logs/ folder which should contain five log files that were produced from your simulation and used in the analysis.
• The smartcab.ipynb notebook file with all questions answered and all visualization cells executed and displaying results.
• An HTML export of the project notebook with the name report.html. This file must be present for your project to be evaluated.

Once you have collected these files and reviewed the project rubric, proceed to the project submission page.

Submission

Train a Smartcab to Drive
In the not-so-distant future, taxicab companies across the United States no longer employ human drivers to operate their fleet of vehicles. Instead, the taxicabs are operated by self-driving agents, known as smartcabs, to transport people from one location to another within the cities those companies operate. In major metropolitan areas, such as Chicago, New York City, and San Francisco, an increasing number of people have come to rely on smartcabs to get to where they need to go as safely and efficiently as possible.
Although smartcabs have become the transport of choice, concerns have arisen that a self-driving agent might not be as safe or efficient as human drivers, particularly when considering city traffic lights and other vehicles. To alleviate these concerns, your task as an employee for a national taxicab company is to use reinforcement learning techniques to construct a demonstration of a smartcab operating in real-time to prove that both safety and efficiency can be achieved.

Project Files
For this assignment, you can find the smartcab folder containing the necessary project files on the Machine Learning projects GitHub, under the projects folder. You may download all of the files for projects we'll use in this Nanodegree program directly from this repo. Please make sure that you use the most recent version of project files when completing a project!

Evaluation
Your project will be reviewed by a Udacity reviewer against the Train a Smartcab to Drive project rubric. Be sure to review this rubric thoroughly and self-evaluate your project before submission. All criteria found in the rubric must meet specifications for you to pass.

Submission Files
When you are ready to submit your project, collect the following files and compress them into a single archive for upload. Alternatively, you may supply the following files on your GitHub repo in a folder named smartcab for ease of access:
• The agent.py Python file with all code implemented as required in the instructed tasks.
• The /logs/ folder which should contain five log files that were produced from your simulation and used in the analysis.
• The smartcab.ipynb notebook file with all questions answered and all visualization cells executed and displaying results.
• An HTML export of the project notebook with the name report.html. This file must be present for your project to be evaluated.

I'm Ready!
When you're ready to submit your project, click on the Submit Project button at the bottom of this page.
If you are having any problems submitting your project or wish to check on the status of your submission, please email us at machine-support@udacity.com or visit us in the discussion forums.

What's Next?
You will get an email as soon as your reviewer has feedback for you. In the meantime, review your next project and feel free to get started on it or the courses supporting it!

Supporting Materials
Videos Zip File
View Submission
submission

Deep Learning
P5 Build a Digit Recognition Program

FROM MACHINE LEARNING TO DEEP LEARNING

SOFTWARE AND TOOLS

TensorFlow Download and Setup
Method 1: Pre-built Docker container with TensorFlow and all assignments
To get started with TensorFlow quickly and work on your assignments, follow the instructions in this README. Note: If you are on a Windows machine, Method 1 is your only option due to lack of native TensorFlow support.

(not needed) Check your GPU
Right-click Computer -> Properties -> Device Manager -> Display adapters. I use the CPU-only method.

(failed) First try, from a discussion at Udacity
• Install Docker Toolbox (you can get it here). I recommend installing every optional package. -> failed
• Create a virtual machine for your Udacity TensorFlow work: docker-machine create -d virtualbox --virtualbox-memory 2048 tensorflow
• In a cmd.exe prompt, run FOR /f "tokens=*" %i IN ('docker-machine env --shell cmd tensorflow') DO %i
• Next, run docker run -p 8888:8888 --name tensorflow-udacity -it b.gcr.io/tensorflow-udacity/assignments:0.5.0
• In a browser, go to http://192.168.99.100:8888/tree

(failed) Second try
I have 2 versions of Python, so I will not use this one.
(failed) Third try, from a discussion at Udacity
Windows + R:

ohe = preprocessing.OneHotEncoder()  # creating OneHotEncoder object
label_encoded_data = label_encoder.fit_transform(data['health'])
ohe.fit_transform(label_encoded_data.reshape(-1, 1))

After executing the above steps, I can use TensorFlow by selecting the following option in the Jupyter notebook: Kernel => Change kernel => python [conda env:py35]. Note: I used Python 2.7 and Jupyter notebook for the earlier assignments.

(useful) Fourth method
Follow this video and install Ubuntu in VirtualBox. Virtual hard disk file location: C:\Users\SSQ\VirtualBox VMs\Deep Learning Ubuntu\Deep Learning Ubuntu.vdi. Location of the shared folder: C:\Users\SSQ\virtualbox share. Follow this blog to copy files between the host OS and guest OS; for me I use sudo mount -t vboxsf virtualbox_share /mnt/. For TensorFlow for Mac OS X, follow this video. Register Mega. https://www.tensorflow.org/get_started/os_setup#pip_installation_on_windows

(success) Fifth try, with pip install
Follow this website. When I type pip install tensorflow in VirtualBox (OS: Linux), it always shows ReadTimeoutError: HTTPSConnectionPool(host='pypi.python.org', port=443): Read timed out., so I choose

sudo pip install --upgrade https://pypi.tuna.tsinghua.edu.cn/packages/7b/c5/a97ed48fcc878e36bb05a3ea700c077360853c0994473a8f6b0ab4c2ddd2/tensorflow-1.0.0-cp27-cp27mu-manylinux1_x86_64.whl#md5=a7483a4da4d70cc628e9e207238f77c0

to install TensorFlow.

Collecting numpy>=1.11.0 (from tensorflow==1.0.0)
Downloading numpy-1.12.0-cp27-cp27mu-manylinux1_x86_64.whl (16.5MB)

sudo pip install --upgrade https://pypi.python.org/packages/cb/47/19e96945ee6012459e85f87728633f05b1e8791677ae64370d16ac4c849e/numpy-1.12.0-cp27-cp27mu-manylinux1_x86_64.whl#md5=9f9bc53d2e281831e1a75be0c09a9548

From this mirror: sudo pip install --upgrade
https://mirrors.ustc.edu.cn/pypi/web/packages/cb/47/19e96945ee6012459e85f87728633f05b1e8791677ae64370d16ac4c849e/numpy-1.12.0-cp27-cp27mu-manylinux1_x86_64.whl#md5=9f9bc53d2e281831e1a75be0c09a9548

Try again; success:

pip install --index https://pypi.mirrors.ustc.edu.cn/simple/ tensorflow

Validate your installation:

$ python

import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

Hello, TensorFlow!

(success) Sixth try, with Anaconda install

For me it shows ReadTimeoutError.

So I decide to download it in my host OS and copy it to my shared folder C:\Users\SSQ\virtualbox share, and I can find it in /mnt on my Linux system.
type bash /mnt/Anaconda2-4.3.0-Linux-x86_64.sh
type yes
Anaconda2 will now be installed into this location:
/home/ssq/anaconda2

Press ENTER to confirm the location
Press CTRL-C to abort the installation
Or specify a different location below

Press Enter.

Do you wish the installer to prepend the Anaconda2 install location
to PATH in your /home/ssq/.bashrc ? [yes|no]

yes

Open a new terminal and type conda create -n tensorflow

CondaHTTPError: HTTP None None for url
Elapsed: None

An HTTP error occurred when trying to retrieve this URL.

Try again conda create -n tensorflow

source activate tensorflow

The prompt changes from ssq@ssq-VirtualBox:~$ to (tensorflow) ssq@ssq-VirtualBox:~$

Success
y

pip install --index https://pypi.mirrors.ustc.edu.cn/simple/ tensorflow

Run the container: docker run -p 8888:8888 -it --rm $USER/assignments. Now find your VM's IP using docker-machine ip default (say, 192.168.99.100) and open http://192.168.99.100:8888. You should be able to see a list of notebooks, one for each assignment. Click on the appropriate one to open it, and follow the inline instructions. And you're ready to start exploring! To get further help on each assignment, navigate to the appropriate node. If you want to learn more about iPython (or Jupyter) notebooks, visit jupyter.org.

Assignment 1: notMNIST
Preprocess notMNIST data and train a simple logistic regression model on it. (notMNIST dataset samples)

Starter Code
Open the iPython notebook for this assignment (1_notmnist.ipynb), and follow the instructions to implement and run each indicated step. Some of the early steps that preprocess the data have been implemented for you.

Evaluation
This is a self-evaluated assignment. As you go through the notebook, make sure you are able to solve each problem and answer the posed questions (save your responses as markdown in the notebook). In the end, you should have a model trained on the notMNIST dataset, which is able to recognize a subset of English letters in different fonts. How accurately does your model predict the correct labels on the test dataset?

Problem 2: Verify normalized images
Note how imshow() displays an image using a color map. You can change this using the cmap parameter. Check out more options in the API reference.

DEEP NEURAL NETWORKS

Deep Neural Networks

Assignment 2: Stochastic Gradient Descent
Train a fully-connected network using Gradient Descent and Stochastic Gradient Descent. Note: The assignments in this course build on each other, so please finish Assignment 1 before attempting this.

Starter Code
Open the iPython notebook for this assignment (2_fullyconnected.ipynb), and follow the instructions to implement and/or run each indicated step.
Some steps have been implemented for you.

Evaluation
This is a self-evaluated assignment. As you go through the notebook, make sure you are able to solve each problem and answer any posed questions (save your responses as markdown in the notebook). Your new model should perform better than the one you developed for Assignment 1. Also, the time required to train using Stochastic Gradient Descent (SGD) should be considerably less than with simple Gradient Descent (GD).

Errors
Error: ValueError: Only call softmax_cross_entropy_with_logits with named arguments (labels=…, logits=…, …)
Fix: loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=tf_train_labels))

Assignment 3: Regularization
Use regularization techniques to improve a deep learning model. Note: The assignments in this course build on each other, so please finish them in order.

Starter Code
Open the iPython notebook for this assignment (3_regularization.ipynb), and follow the instructions to implement and run each indicated step. Some steps have been implemented for you.

Evaluation
This is a self-evaluated assignment. As you go through the notebook, make sure you are able to solve each problem and answer any posed questions (save your responses as markdown in the notebook). Try to apply the different regularization techniques you have learnt, and compare their results. Which seems to work better? Is one clearly better than the others?

Error in VirtualBox
How to fix: Close all processes in the host OS and free up memory. Restart the VM.

CONVOLUTIONAL NEURAL NETWORKS

Readings
For a closer look at the arithmetic behind convolution, and how it is affected by your choice of padding scheme, stride and other parameters, please refer to this illustrated guide: V. Dumoulin and F. Visin, A guide to convolution arithmetic for deep learning.
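The convolution arithmetic that guide covers boils down to one formula per spatial dimension; a small helper to sanity-check layer shapes before building them (the 28x28 input size matches the notMNIST images used in the assignments):

```python
def conv_output_size(in_size, kernel, stride=1, padding=0):
    """Output size of a convolution along one spatial dimension:
    floor((in + 2*padding - kernel) / stride) + 1."""
    return (in_size + 2 * padding - kernel) // stride + 1

# 28x28 input, 5x5 kernel, stride 1:
# no padding ('VALID') shrinks the feature map, while padding of
# (kernel - 1) / 2 ('SAME'-style for odd kernels) preserves it.
valid = conv_output_size(28, kernel=5, stride=1, padding=0)  # 24
same = conv_output_size(28, kernel=5, stride=1, padding=2)   # 28
```

The same formula applies to pooling layers, with the pool size playing the role of the kernel.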
Assignment 4: Convolutional Models
Design and train a Convolutional Neural Network. Note: The assignments in this course build on each other, so please finish them in order.

Starter Code
Open the iPython notebook for this assignment (4_convolutions.ipynb), and follow the instructions to implement and run each indicated step. Some steps have been implemented for you.

Evaluation
This is a self-evaluated assignment. As you go through the notebook, make sure you are able to solve each problem and answer any posed questions (save your responses as markdown in the notebook). Improve the model by experimenting with its structure: how many layers, how they are connected, stride, pooling, etc. For more efficient training, try applying techniques such as dropout and learning rate decay. What does your final architecture look like?

DEEP MODELS FOR TEXT AND SEQUENCES

t-SNE
Laurens van der Maaten and Geoffrey Hinton. Visualizing Data using t-SNE. Journal of Machine Learning Research, 2008. Vol. 9, pp. 2579-2605.

Assignment 5: Word2Vec and CBOW
Train a skip-gram model on Text8 data and visualize the output. Note: The assignments in this course build on each other, so please finish them in order.

Starter Code
Open the iPython notebook for this assignment (5_word2vec.ipynb), and follow the instructions to implement and run each indicated step. The first model (Word2Vec) has been implemented for you. Using that as a reference, train a CBOW (Continuous Bag of Words) model.

Evaluation
This is a self-evaluated assignment. As you go through the notebook, make sure you are able to solve each problem and answer any posed questions (save your responses as markdown in the notebook). How does your CBOW model perform compared to the given Word2Vec model?

Open: sudo mount -t vboxsf virtualbox_share /mnt/, then jupyter notebook.
Run: TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type int32 of argument 'x'.
Method: tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, train_labels, embed, num_sampled, vocabulary_size)

Assignment 6: LSTMs
Train a Long Short-Term Memory network to predict character sequences. Note: The assignments in this course build on each other, so please finish them in order.

Starter Code
Open the iPython notebook for this assignment (6_lstm.ipynb), and follow the instructions to implement and run each indicated step. A basic LSTM model has been provided; improve it by solving the given problems.

Evaluation
This is a self-evaluated assignment. As you go through the notebook, make sure you are able to solve each problem and answer any posed questions (save your responses as markdown in the notebook). What changes did you make to use bigrams as input instead of individual characters? Were you able to implement the sequence-to-sequence LSTM? If so, what additional challenges did you have to solve?

Run
AttributeError: 'module' object has no attribute 'concat_v2'
ValueError: Only call softmax_cross_entropy_with_logits with named arguments (labels=…, logits=…, …)

Method
Have a try:

pip uninstall tensorflow
pip install --ignore-installed --upgrade https://mirrors.ustc.edu.cn/pypi/web/packages/01/c5/adefd2d5c83e6d8b4a8efa5dd00e44dc05de317b744fb58aef6d8366ce2b/tensorflow-0.12.0-cp27-cp27mu-manylinux1_x86_64.whl#md5=ebcd1b32ccf2279bfa688542cbdad5fb
sudo pip install --index https://mirrors.ustc.edu.cn/pypi/web/packages/01/c5/adefd2d5c83e6d8b4a8efa5dd00e44dc05de317b744fb58aef6d8366ce2b/tensorflow-0.12.0-cp27-cp27mu-manylinux1_x86_64.whl#md5=ebcd1b32ccf2279bfa688542cbdad5fb
sudo pip install --upgrade https://mirrors.ustc.edu.cn/pypi/web/packages/01/c5/adefd2d5c83e6d8b4a8efa5dd00e44dc05de317b744fb58aef6d8366ce2b/tensorflow-0.12.0-cp27-cp27mu-manylinux1_x86_64.whl#md5=ebcd1b32ccf2279bfa688542cbdad5fb
sudo pip install --index
https://mirrors.ustc.edu.cn/pypi/web/packages/01/c5/adefd2d5c83e6d8b4a8efa5dd00e44dc05de317b744fb58aef6d8366ce2b/tensorflow-0.12.0-cp27-cp27mu-manylinux1_x86_64.whl#md5=ebcd1b32ccf2279bfa688542cbdad5fb PROJECT (new) Deep Learning MACHINE LEARNING TO DEEP LEARNING Deep Learning Deep Learning Up to this point you’ve been introduced to a number of different learning schemes that take place in machine learning. You’ve seen supervised learning, where we try to extrapolate labels for new data given labelled data we already have. You’ve seen unsupervised learning, where we try to classify data into groups and extract new information hidden in the data. Lastly, you’ve seen reinforcement learning, where we try to create a model that learns the rules of an environment to best maximize its return or reward. In this lesson, you’ll learn about a relatively new branch of machine learning called deep learning, which attempts to model high-level abstractions about data using networks of graphs. Deep learning, much like the other branches of machine learning you’ve seen, is similarly focused on learning representations in data. Additionally, modeling high-level abstractions about data is very similar to artificial intelligence — the idea that knowledge can be represented and acted upon intelligently. What You’ll Watch and Learn For this lesson, you’ll want to learn about algorithms that help you to construct the deep network graphs necessary to model high-level abstractions about data. In addition, you’ll also want to learn how to construct deep models that can interpret and identify words and letters in text — just like how a human reads! To do that, you’ll work on Udacity’s Deep Learning course, co-authored by Google. Vincent Vanhoucke, Principle Scientist at Google Brain, will be your instructor for this lesson. With Vincent as your guide, you’ll learn the ins and outs of Deep Learning and TensorFlow, which is Google’s Deep Learning framework. 
Deep Learning

What You'll Do

In this lesson, you'll learn how to develop algorithms that can model high-level abstractions of data and create a type of "intelligence" that is able to use this abstraction to process new information. First, you'll learn about deep neural networks - artificial neural networks that have multiple hidden layers of information between their inputs and outputs. Next, you'll learn about convolutional neural networks - a different flavor of neural networks that are modeled after biological processes like visual and aural feedback. Finally, you'll learn about deep models for sequence learning - models that can "understand" written and spoken language and text.

The underlying lesson from these concepts is that, with enough data and time to learn, we can develop intelligent agents that think and act in many of the same ways we humans do. Modeling complex human behaviors and tasks - driving a car, processing spoken language, or even building a winning strategy for the game of Go - could not be done without the use of deep learning.

Software and Tools

TensorFlow

We will be using TensorFlow™, an open-source library developed by Google, to build deep learning models throughout the course. Coding will be in Python 2.7 using IPython notebooks, which you should be familiar with.

Download and Setup

Method 1: Pre-built Docker container with TensorFlow and all assignments. To get started with TensorFlow quickly and work on your assignments, follow the instructions in this README. Note: If you are on a Windows machine, Method 1 is your only option due to the lack of native TensorFlow support.

- OR -

Method 2: Install TensorFlow on your computer (Linux or Mac OS X only), then fetch the assignment code separately. Follow the instructions to download and set up TensorFlow. Choose one of the three ways to install:

Pip: Install TensorFlow directly on your computer.
You need to have Python 2.7 and pip installed; this may impact other Python packages that you already have.

Virtualenv: Install TensorFlow in an isolated (virtual) Python environment. You need to have Python 2.7 and virtualenv installed; this will not affect Python packages in any other environment.

Docker: Run TensorFlow in an isolated Docker container (virtual machine) on your computer. You need to have Vagrant, Docker, and virtualization software like VirtualBox installed; this will keep TensorFlow completely isolated from the rest of your computer, but may require more memory to run.

Links: Tutorials, How-Tos, Resources, Source code, Stack Overflow

INTRO TO TENSORFLOW

Intro to TensorFlow

What is Deep Learning
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/INt1nULYPak.mp4

Solving Problems - Big and Small
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/WHcRQMGSbqg.mp4

Let's Get Started
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/ySIDqaXLhHw.mp4

Installing TensorFlow

Throughout this lesson, you'll apply your knowledge of neural networks on real datasets using TensorFlow (link for China), an open source Deep Learning library created by Google.

You'll use TensorFlow to classify images from the notMNIST dataset - a dataset of images of English letters from A to J. You can see a few example images below. Your goal is to automatically detect the letter based on the image in the dataset. You'll be working on your own computer for this lab, so, first things first, install TensorFlow!

Install

As usual, we'll be using Conda to install TensorFlow. You might already have a TensorFlow environment, but check to make sure you have all the necessary packages.

OS X or Linux

Run the following commands to set up your environment:

```sh
conda create -n tensorflow python=3.5
source activate tensorflow
conda install pandas matplotlib jupyter notebook scipy scikit-learn
pip install tensorflow
```

Windows
In your console or Anaconda shell:

```sh
conda create -n tensorflow python=3.5
activate tensorflow
conda install pandas matplotlib jupyter notebook scipy scikit-learn
pip install tensorflow
```

Hello, world!

Try running the following code in your Python console to make sure you have TensorFlow properly installed. The console will print "Hello World!" if TensorFlow is installed. Don't worry about understanding what it does. You'll learn about it in the next section.

```python
import tensorflow as tf

# Create TensorFlow object called tensor
hello_constant = tf.constant('Hello World!')

with tf.Session() as sess:
    # Run the tf.constant operation in the session
    output = sess.run(hello_constant)
    print(output)
```

Note: on Windows you may need to open cmd as administrator. A sample session:

```
C:\windows\system32>conda create -n tensorflow python=3.5
Fetching package metadata ...........
Solving package specifications: .

Package plan for installation in environment C:\Program Files\Anaconda2\envs\tensorflow:

The following NEW packages will be INSTALLED:

    pip:            9.0.1-py35_1
    python:         3.5.3-0
    setuptools:     27.2.0-py35_1
    vs2015_runtime: 14.0.25123-0
    wheel:          0.29.0-py35_0

Proceed ([y]/n)? y

vs2015_runtime 100% |###############################| Time: 0:00:02 776.58 kB/s
python-3.5.3-0 100% |###############################| Time: 0:01:29 361.95 kB/s
setuptools-27. 100% |###############################| Time: 0:00:00   1.09 MB/s
wheel-0.29.0-p 100% |###############################| Time: 0:00:00   1.55 MB/s
pip-9.0.1-py35 100% |###############################| Time: 0:00:01 997.36 kB/s

#
# To activate this environment, use:
# > activate tensorflow
#
# To deactivate this environment, use:
# > deactivate tensorflow
#
# * for power-users using bash, you must source
#   activate tensorflow

(tensorflow) C:\windows\system32>conda install pandas matplotlib jupyter notebook scipy scikit-learn
Fetching package metadata ...........
Solving package specifications: .
```
```
Package plan for installation in environment C:\Program Files\Anaconda2\envs\tensorflow:

The following NEW packages will be INSTALLED:

    bleach:       1.5.0-py35_0
    colorama:     0.3.7-py35_0
    ...
    qt:           5.6.2-vc14_3 [vc14]
    scikit-learn: 0.18.1-np112py35_1
    scipy:        0.19.0-np112py35_0
    ...
    zlib:         1.2.8-vc14_3 [vc14]

Proceed ([y]/n)? y

... (package download progress output omitted) ...

ERROR conda.core.link:_execute_actions(330): An error occurred while installing
package 'defaults::qt-5.6.2-vc14_3'.
UnicodeDecodeError('utf8', '\xd2\xd1\xb8\xb4\xd6\xc6 1 \xb8\xf6\xce\xc4\xbc\xfe\xa1\xa3\r\n', 0, 1, 'invalid continuation byte')
Attempting to roll back.

(tensorflow) C:\windows\system32>pip install tensorflow
```

Hello, Tensor World!

Let's analyze the Hello World script you ran.
For reference, I've added the code below.

```python
import tensorflow as tf

# Create TensorFlow object called hello_constant
hello_constant = tf.constant('Hello World!')

with tf.Session() as sess:
    # Run the tf.constant operation in the session
    output = sess.run(hello_constant)
    print(output)
```

Tensor

In TensorFlow, data isn't stored as integers, floats, or strings. These values are encapsulated in an object called a tensor. In the case of hello_constant = tf.constant('Hello World!'), hello_constant is a 0-dimensional string tensor, but tensors come in a variety of sizes as shown below:

```python
# A is a 0-dimensional int32 tensor
A = tf.constant(1234)
# B is a 1-dimensional int32 tensor
B = tf.constant([123, 456, 789])
# C is a 2-dimensional int32 tensor
C = tf.constant([[123, 456, 789], [222, 333, 444]])
```

tf.constant() is one of many TensorFlow operations you will use in this lesson. The tensor returned by tf.constant() is called a constant tensor, because the value of the tensor never changes.

Session

TensorFlow's API is built around the idea of a computational graph, a way of visualizing a mathematical process which you learned about in the MiniFlow lesson. Let's take the TensorFlow code you ran and turn it into a graph:

A "TensorFlow Session", as shown above, is an environment for running a graph. The session is in charge of allocating the operations to GPU(s) and/or CPU(s), including remote machines. Let's see how you use it.

```python
with tf.Session() as sess:
    output = sess.run(hello_constant)
```

The code has already created the tensor, hello_constant, from the previous lines. The next step is to evaluate the tensor in a session. The code creates a session instance, sess, using tf.Session. The sess.run() function then evaluates the tensor and returns the result.

Quiz: TensorFlow Input

Input

In the last section, you passed a tensor into a session and it returned the result. What if you want to use a non-constant? This is where tf.placeholder() and feed_dict come into play.
In this section, you'll go over the basics of feeding data into TensorFlow.

tf.placeholder()

Sadly you can't just set x to your dataset and put it in TensorFlow, because over time you'll want your TensorFlow model to take in different datasets with different parameters. You need tf.placeholder()!

tf.placeholder() returns a tensor that gets its value from data passed to the tf.session.run() function, allowing you to set the input right before the session runs.

Session's feed_dict

```python
x = tf.placeholder(tf.string)

with tf.Session() as sess:
    output = sess.run(x, feed_dict={x: 'Hello World'})
```

Use the feed_dict parameter in tf.session.run() to set the placeholder tensor. The above example shows the tensor x being set to the string 'Hello World'. It's also possible to set more than one tensor using feed_dict, as shown below.

```python
x = tf.placeholder(tf.string)
y = tf.placeholder(tf.int32)
z = tf.placeholder(tf.float32)

with tf.Session() as sess:
    output = sess.run(x, feed_dict={x: 'Test String', y: 123, z: 45.67})
```

Note: If the data passed to the feed_dict doesn't match the tensor type and can't be cast into the tensor type, you'll get the error "ValueError: invalid literal for...".

Quiz

Let's see how well you understand tf.placeholder() and feed_dict. The code below throws an error, but I want you to make it return the number 123. Change line 11, so that the code returns the number 123.

Note: The quizzes are running TensorFlow version 0.12.1. However, all the code used in this course is compatible with version 1.0. We'll be upgrading our in-class quizzes to the newest version in the near future.

```python
# Solution is available in the other "solution.py" tab
import tensorflow as tf

def run():
    output = None
    x = tf.placeholder(tf.int32)

    with tf.Session() as sess:
        # TODO: Feed the x tensor 123
        output = sess.run(x, feed_dict={x: 123})

    return output
```

Quiz: TensorFlow Math

TensorFlow Math

Getting the input is great, but now you need to use it.
You're going to use basic math functions that everyone knows and loves - add, subtract, multiply, and divide - with tensors. (There are many more math functions you can check out in the documentation.)

Addition

```python
x = tf.add(5, 2)  # 7
```

You'll start with the add function. The tf.add() function does exactly what you expect it to do. It takes in two numbers, two tensors, or one of each, and returns their sum as a tensor.

Subtraction and Multiplication

Here's an example with subtraction and multiplication.

```python
x = tf.subtract(10, 4)  # 6
y = tf.multiply(2, 5)   # 10
```

The x tensor will evaluate to 6, because 10 - 4 = 6. The y tensor will evaluate to 10, because 2 * 5 = 10. That was easy!

Converting types

It may be necessary to convert between types to make certain operators work together. For example, if you tried the following, it would fail with an exception:

```python
tf.subtract(tf.constant(2.0), tf.constant(1))
# Fails with ValueError: Tensor conversion requested dtype float32 for Tensor with dtype int32
```

That's because the constant 1 is an integer but the constant 2.0 is a floating point value, and subtract expects them to match. In cases like these, you can either make sure your data is all of the same type, or you can cast a value to another type. In this case, converting the 2.0 to an integer before subtracting, like so, will give the correct result:

```python
tf.subtract(tf.cast(tf.constant(2.0), tf.int32), tf.constant(1))  # 1
```

Quiz

Let's apply what you learned to convert an algorithm to TensorFlow. The code below is a simple algorithm using division and subtraction. Convert the following algorithm in regular Python to TensorFlow and print the results of the session. You can use tf.constant() for the values 10, 2, and 1.
```python
# Solution is available in the other "solution.py" tab
import tensorflow as tf

# TODO: Convert the following to TensorFlow:
# x = 10
# y = 2
# z = x/y - 1
x = tf.constant(10)
y = tf.constant(2)
z = tf.subtract(tf.divide(x, y), tf.cast(tf.constant(1), tf.float64))

# TODO: Print z from a session
with tf.Session() as sess:
    output = sess.run(z)
    print(output)
```

Transition to Classification

Good job! You've accomplished a lot. In particular, you did the following:

Supervised Classification
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/XTGsutypAPE.mp4

Training Your Logistic Classifier
https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/WQsdr1EJgz8.mp4

Quiz: TensorFlow Linear Function

Linear functions in TensorFlow

The most common operation in neural networks is calculating the linear combination of inputs, weights, and biases. As a reminder, we can write the output of the linear operation as

y = xW + b

Here, W is a matrix of the weights connecting two layers. The output y, the input x, and the biases b are all vectors.

Weights and Bias in TensorFlow

The goal of training a neural network is to modify weights and biases to best predict the labels. In order to use weights and bias, you'll need a Tensor that can be modified. This leaves out tf.placeholder() and tf.constant(), since those Tensors can't be modified. This is where the tf.Variable class comes in.

tf.Variable()

```python
x = tf.Variable(5)
```

The tf.Variable class creates a tensor with an initial value that can be modified, much like a normal Python variable. This tensor stores its state in the session, so you must initialize the state of the tensor manually. You'll use the tf.global_variables_initializer() function to initialize the state of all the Variable tensors.

Initialization

```python
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
```

The tf.global_variables_initializer() call returns an operation that will initialize all TensorFlow variables from the graph.
You call the operation using a session to initialize all the variables, as shown above. Using the tf.Variable class allows us to change the weights and bias, but an initial value needs to be chosen.

Initializing the weights with random numbers from a normal distribution is good practice. Randomizing the weights helps prevent the model from becoming stuck in the same place every time you train it. You'll learn more about this in the next lesson, when you study gradient descent. Similarly, choosing weights from a normal distribution prevents any one weight from overwhelming other weights. You'll use the tf.truncated_normal() function to generate random numbers from a normal distribution.

tf.truncated_normal()

```python
n_features = 120
n_labels = 5
weights = tf.Variable(tf.truncated_normal((n_features, n_labels)))
```

The tf.truncated_normal() function returns a tensor with random values from a normal distribution whose magnitude is no more than 2 standard deviations from the mean.

Since the weights are already helping prevent the model from getting stuck, you don't need to randomize the bias. Let's use the simplest solution, setting the bias to 0.

tf.zeros()

```python
n_labels = 5
bias = tf.Variable(tf.zeros(n_labels))
```

The tf.zeros() function returns a tensor with all zeros.

Linear Classifier Quiz

You'll be classifying the handwritten numbers 0, 1, and 2 from the MNIST dataset using TensorFlow. The above is a small sample of the data you'll be training on. Notice how some of the 1s are written with a serif at the top and at different angles. The similarities and differences will play a part in shaping the weights of the model.

The images above are trained weights for each label (0, 1, and 2). The weights display the unique properties of each digit they have found. Complete this quiz to train your own weights using the MNIST dataset.

Instructions

1. Open quiz.py.
   1. Implement get_weights to return a tf.Variable of weights.
   2. Implement get_biases to return a tf.Variable of biases.
   3.
Implement xW + b in the linear function.
2. Open sandbox.py.
   1. Initialize all weights.

Since xW in xW + b is matrix multiplication, you have to use the tf.matmul() function instead of tf.multiply(). Don't forget that order matters in matrix multiplication, so tf.matmul(a, b) is not the same as tf.matmul(b, a).

quiz.py

```python
# Solution is available in the other "quiz_solution.py" tab
import tensorflow as tf


def get_weights(n_features, n_labels):
    """
    Return TensorFlow weights
    :param n_features: Number of features
    :param n_labels: Number of labels
    :return: TensorFlow weights
    """
    # TODO: Return weights
    return tf.Variable(tf.truncated_normal((n_features, n_labels)))


def get_biases(n_labels):
    """
    Return TensorFlow bias
    :param n_labels: Number of labels
    :return: TensorFlow bias
    """
    # TODO: Return biases
    return tf.Variable(tf.zeros(n_labels))


def linear(input, w, b):
    """
    Return linear function in TensorFlow
    :param input: TensorFlow input
    :param w: TensorFlow weights
    :param b: TensorFlow biases
    :return: TensorFlow linear function
    """
    # TODO: Linear Function (xW + b)
    return tf.add(tf.matmul(input, w), b)
```

sandbox.py

```python
# Solution is available in the other "sandbox_solution.py" tab
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
from quiz import get_weights, get_biases, linear


def mnist_features_labels(n_labels):
    """
    Gets the first <n> labels from the MNIST dataset
    :param n_labels: Number of labels to use
    :return: Tuple of feature list and label list
    """
    mnist_features = []
    mnist_labels = []
    mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot=True)

    # In order to make quizzes run faster, we're only looking at 10000 images
    for mnist_feature, mnist_label in zip(*mnist.train.next_batch(10000)):
        # Add features and labels if it's for the first <n>th labels
        if mnist_label[:n_labels].any():
            mnist_features.append(mnist_feature)
            mnist_labels.append(mnist_label[:n_labels])

    return mnist_features, mnist_labels


# Number of features (28*28 image is 784 features)
n_features = 784
# Number of labels
n_labels = 3

# Features and Labels
features = tf.placeholder(tf.float32)
labels = tf.placeholder(tf.float32)

# Weights and Biases
w = get_weights(n_features, n_labels)
b = get_biases(n_labels)

# Linear Function xW + b
logits = linear(features, w, b)

# Training data
train_features, train_labels = mnist_features_labels(n_labels)

with tf.Session() as session:
    # TODO: Initialize session variables
    session.run(tf.global_variables_initializer())

    # Softmax
    prediction = tf.nn.softmax(logits)

    # Cross entropy
    # This quantifies how far off the predictions were.
    # You'll learn more about this in future lessons.
    cross_entropy = -tf.reduce_sum(labels * tf.log(prediction), reduction_indices=1)

    # Training loss
    # You'll learn more about this in future lessons.
    loss = tf.reduce_mean(cross_entropy)

    # Rate at which the weights are changed
    # You'll learn more about this in future lessons.
    learning_rate = 0.08

    # Gradient Descent
    # This is the method used to train the model
    # You'll learn more about this in future lessons.
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

    # Run optimizer and get loss
    _, l = session.run(
        [optimizer, loss],
        feed_dict={features: train_features, labels: train_labels})

    # Print loss
    print('Loss: {}'.format(l))
```

Quiz: TensorFlow Softmax

TensorFlow Softmax

You might remember in the Intro to TFLearn lesson we used the softmax function to calculate class probabilities as output from the network. The softmax function squashes its inputs, typically called logits or logit scores, to be between 0 and 1 and also normalizes the outputs such that they all sum to 1. This means the output of the softmax function is equivalent to a categorical probability distribution. It's the perfect function to use as the output activation for a network predicting multiple classes.

TensorFlow Softmax

We're using TensorFlow to build neural networks and, appropriately, there's a function for calculating softmax.
```python
x = tf.nn.softmax([2.0, 1.0, 0.2])
```

Easy as that! tf.nn.softmax() implements the softmax function for you. It takes in logits and returns softmax activations.

Quiz

Use the softmax function in the quiz below to return the softmax of the logits.

quiz.py

```python
# Solution is available in the other "solution.py" tab
import tensorflow as tf


def run():
    output = None
    logit_data = [2.0, 1.0, 0.1]
    logits = tf.placeholder(tf.float32)

    # TODO: Calculate the softmax of the logits
    softmax = tf.nn.softmax(logits)

    with tf.Session() as sess:
        # TODO: Feed in the logit data
        output = sess.run(softmax, feed_dict={logits: logit_data})

    return output
```

One-Hot Encoding

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/phYsxqlilUk.mp4

One-Hot Encoding With Scikit-Learn

Transforming your labels into one-hot encoded vectors is pretty simple with scikit-learn using LabelBinarizer. Check it out below!

```python
import numpy as np
from sklearn import preprocessing

# Example labels
labels = np.array([1, 5, 3, 2, 1, 4, 2, 1, 3])

# Create the encoder
lb = preprocessing.LabelBinarizer()

# Here the encoder finds the classes and assigns one-hot vectors
lb.fit(labels)

# And finally, transform the labels into one-hot encoded vectors
lb.transform(labels)
>>> array([[1, 0, 0, 0, 0],
           [0, 0, 0, 0, 1],
           [0, 0, 1, 0, 0],
           [0, 1, 0, 0, 0],
           [1, 0, 0, 0, 0],
           [0, 0, 0, 1, 0],
           [0, 1, 0, 0, 0],
           [1, 0, 0, 0, 0],
           [0, 0, 1, 0, 0]])
```

Quiz: TensorFlow Cross Entropy

Cross Entropy in TensorFlow

In the Intro to TFLearn lesson we discussed using cross entropy as the cost function for classification with one-hot encoded labels. Again, TensorFlow has a function to do the cross entropy calculations for us. Let's take what you learned from the video and create a cross entropy function in TensorFlow.
To create a cross entropy function in TensorFlow, you'll need to use two new functions:

Reduce Sum

```python
x = tf.reduce_sum([1, 2, 3, 4, 5])  # 15
```

The tf.reduce_sum() function takes an array of numbers and sums them together.

Natural Log

```python
x = tf.log(100.0)  # 4.60517
```

This function does exactly what you would expect it to do. tf.log() takes the natural log of a number.

Quiz

Print the cross entropy using softmax_data and one_hot_data. (Alternative link for users in China.)

quiz.py

```python
# Solution is available in the other "solution.py" tab
import tensorflow as tf

softmax_data = [0.7, 0.2, 0.1]
one_hot_data = [1.0, 0.0, 0.0]

softmax = tf.placeholder(tf.float32)
one_hot = tf.placeholder(tf.float32)

# TODO: Print cross entropy from session
cross_entropy = -tf.reduce_sum(tf.multiply(one_hot, tf.log(softmax)))

with tf.Session() as sess:
    print(sess.run(cross_entropy, feed_dict={softmax: softmax_data, one_hot: one_hot_data}))
# 0.356675
```

Minimizing Cross Entropy

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/YrDMXFhvh9E.mp4

Transition into Practical Aspects of Learning

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/bKqkRFOOKoA.mp4

Quiz: Numerical Stability

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/_SbGcOS-jcQ.mp4

```python
a = 1000000000
for i in range(1000000):
    a = a + 1e-6
print(a - 1000000000)
# 0.953674316406
```

Normalized Inputs and Initial Weights
Measuring Performance
Optimizing a Logistic Classifier
Stochastic Gradient Descent
Momentum and Learning Rate Decay
Parameter Hyperspace

Quiz: Mini-batch

Mini-batching

In this section, you'll go over what mini-batching is and how to apply it in TensorFlow.

Mini-batching is a technique for training on subsets of the dataset instead of all the data at one time. This provides the ability to train a model, even if a computer lacks the memory to store the entire dataset. Mini-batching is computationally inefficient, since you can't calculate the loss simultaneously across all samples.
However, this is a small price to pay in order to be able to run the model at all. It's also quite useful combined with SGD. The idea is to randomly shuffle the data at the start of each epoch, then create the mini-batches. For each mini-batch, you train the network weights with gradient descent. Since these batches are random, you're performing SGD with each batch.

Let's look at the MNIST dataset with weights and a bias to see if your machine can handle it.

```python
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
import numpy as np

n_input = 784    # MNIST data input (img shape: 28*28)
n_classes = 10   # MNIST total classes (0-9 digits)

# Import MNIST data
mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot=True)

# The features are already scaled and the data is shuffled
train_features = mnist.train.images
test_features = mnist.test.images

train_labels = mnist.train.labels.astype(np.float32)
test_labels = mnist.test.labels.astype(np.float32)

# Weights & bias
weights = tf.Variable(tf.random_normal([n_input, n_classes]))
bias = tf.Variable(tf.random_normal([n_classes]))
```

Question 1

Calculate the memory size of train_features, train_labels, weights, and bias in bytes. Ignore memory for overhead, just calculate the memory required for the stored data. You may have to look up how much memory a float32 requires, using this link.

train_features Shape: (55000, 784) Type: float32
train_labels Shape: (55000, 10) Type: float32
weights Shape: (784, 10) Type: float32
bias Shape: (10,) Type: float32

How many bytes of memory does train_features need? 55000 * 784 * 4 = 172480000
How many bytes of memory does train_labels need? 55000 * 10 * 4 = 2200000
How many bytes of memory does weights need? 784 * 10 * 4 = 31360
How many bytes of memory does bias need? 10 * 4 = 40

The total memory space required for the inputs, weights and bias is around 174 megabytes, which isn't that much memory. You could train this whole dataset on most CPUs and GPUs.
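The byte counts above can be checked with a quick calculation (a float32 occupies 4 bytes); `tensor_bytes` is just a small helper written for this note, not a TensorFlow function:

```python
# Each float32 value occupies 4 bytes.
BYTES_PER_FLOAT32 = 4

def tensor_bytes(shape, bytes_per_value=BYTES_PER_FLOAT32):
    """Memory needed to store a dense tensor of the given shape."""
    n = 1
    for dim in shape:
        n *= dim
    return n * bytes_per_value

print(tensor_bytes((55000, 784)))  # train_features: 172480000
print(tensor_bytes((55000, 10)))   # train_labels:   2200000
print(tensor_bytes((784, 10)))     # weights:        31360
print(tensor_bytes((10,)))         # bias:           40
```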
But larger datasets that you’ll use in the future are measured in gigabytes or more. It’s possible to purchase more memory, but it’s expensive. A Titan X GPU with 12 GB of memory costs over $1,000.

Instead, in order to run large models on your machine, you’ll learn how to use mini-batching.

Let’s look at how you implement mini-batching in TensorFlow.

TensorFlow Mini-batching

In order to use mini-batching, you must first divide your data into batches.

Unfortunately, it’s sometimes impossible to divide the data into batches of exactly equal size. For example, imagine you’d like to create batches of 128 samples each from a dataset of 1000 samples. Since 128 does not evenly divide into 1000, you’d wind up with 7 batches of 128 samples, and 1 batch of 104 samples. (7 * 128 + 1 * 104 = 1000)
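You can verify those batch sizes with a few lines of plain Python (no TensorFlow required):

```python
# Slicing 1000 samples into chunks of 128
n_samples, batch_size = 1000, 128

# Each batch is either a full batch_size, or whatever remains at the end
batch_lengths = [min(batch_size, n_samples - start)
                 for start in range(0, n_samples, batch_size)]

print(batch_lengths)  # [128, 128, 128, 128, 128, 128, 128, 104]
```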

In that case, the size of the batches would vary, so you need to take advantage of TensorFlow’s tf.placeholder() function to receive the varying batch sizes.

Continuing the example, if each sample had n_input = 784 features and n_classes = 10 possible labels, the dimensions for features would be [None, n_input] and labels would be [None, n_classes].

# Features and Labels
features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32, [None, n_classes])


What does None do here?

The None dimension is a placeholder for the batch size. At runtime, TensorFlow will accept any batch size greater than 0.

Going back to our earlier example, this setup allows you to feed features and labels into the model as either the batches of 128 samples or the single batch of 104 samples.

Question 2

Using the parameters below, how many batches are there, and what is the last batch size?

features is (50000, 400)

labels is (50000, 10)

batch_size is 128

How many batches are there?
ceil(50000 / 128) = 391

What is the last batch size?
50000 % 128 = 80
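The arithmetic above can be checked with math.ceil and the modulo operator:

```python
import math

# Sizes taken from the quiz above
n_samples, batch_size = 50000, 128

n_batches = math.ceil(n_samples / batch_size)  # 390 full batches plus 1 partial
last_batch_size = n_samples % batch_size       # what remains after the full batches

print(n_batches)        # 391
print(last_batch_size)  # 80
```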

Now that you know the basics, let’s learn how to implement mini-batching.

Question 3

Implement the batches function to batch features and labels. The function should return each batch with a maximum size of batch_size. To help you with the quiz, look at the following example output of a working batches function.

# 4 Samples of features
example_features = [
    ['F11', 'F12', 'F13', 'F14'],
    ['F21', 'F22', 'F23', 'F24'],
    ['F31', 'F32', 'F33', 'F34'],
    ['F41', 'F42', 'F43', 'F44']]

# 4 Samples of labels
example_labels = [
    ['L11', 'L12'],
    ['L21', 'L22'],
    ['L31', 'L32'],
    ['L41', 'L42']]

example_batches = batches(3, example_features, example_labels)


The example_batches variable would be the following:

[
    # 2 batches:
    #   First is a batch of size 3.
    #   Second is a batch of size 1
    [
        # First Batch is size 3
        [
            # 3 samples of features.
            # There are 4 features per sample.
            ['F11', 'F12', 'F13', 'F14'],
            ['F21', 'F22', 'F23', 'F24'],
            ['F31', 'F32', 'F33', 'F34']
        ], [
            # 3 samples of labels.
            # There are 2 labels per sample.
            ['L11', 'L12'],
            ['L21', 'L22'],
            ['L31', 'L32']
        ]
    ], [
        # Second Batch is size 1.
        # Since batch size is 3, there is only one sample left from the 4 samples.
        [
            # 1 sample of features.
            ['F41', 'F42', 'F43', 'F44']
        ], [
            # 1 sample of labels.
            ['L41', 'L42']
        ]
    ]
]


Implement the batches function in the “quiz.py” file below.

“quiz.py”
import math


def batches(batch_size, features, labels):
    """
    Create batches of features and labels
    :param batch_size: The batch size
    :param features: List of features
    :param labels: List of labels
    :return: Batches of (Features, Labels)
    """
    assert len(features) == len(labels)
    # TODO: Implement batching
    output_batches = []

    sample_size = len(features)
    for start_i in range(0, sample_size, batch_size):
        end_i = start_i + batch_size
        batch = [features[start_i:end_i], labels[start_i:end_i]]
        output_batches.append(batch)

    return output_batches

“sandbox.py”
from quiz import batches
from pprint import pprint

# 4 Samples of features
example_features = [
    ['F11', 'F12', 'F13', 'F14'],
    ['F21', 'F22', 'F23', 'F24'],
    ['F31', 'F32', 'F33', 'F34'],
    ['F41', 'F42', 'F43', 'F44']]

# 4 Samples of labels
example_labels = [
    ['L11', 'L12'],
    ['L21', 'L22'],
    ['L31', 'L32'],
    ['L41', 'L42']]

# PPrint prints data structures like 2d arrays, so they are easier to read
pprint(batches(3, example_features, example_labels))


Let’s use mini-batching to feed batches of MNIST features and labels into a linear model.

Set the batch size and run the optimizer over all the batches with the batches function. The recommended batch size is 128. If you have memory restrictions, feel free to make it smaller.

“quiz.py”
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
import numpy as np
from helper import batches

learning_rate = 0.001
n_input = 784  # MNIST data input (img shape: 28*28)
n_classes = 10  # MNIST total classes (0-9 digits)

# Import MNIST data
mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot=True)

# The features are already scaled and the data is shuffled
train_features = mnist.train.images
test_features = mnist.test.images

train_labels = mnist.train.labels.astype(np.float32)
test_labels = mnist.test.labels.astype(np.float32)

# Features and Labels
features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32, [None, n_classes])

# Weights & bias
weights = tf.Variable(tf.random_normal([n_input, n_classes]))
bias = tf.Variable(tf.random_normal([n_classes]))

# Logits - xW + b
logits = tf.add(tf.matmul(features, weights), bias)

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

# Calculate accuracy
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# TODO: Set batch size
batch_size = 128
assert batch_size is not None, 'You must set the batch size'

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)

    # TODO: Train optimizer on all batches
    # for batch_features, batch_labels in ______
    for batch_features, batch_labels in batches(batch_size, train_features, train_labels):
        sess.run(optimizer, feed_dict={features: batch_features, labels: batch_labels})

    # Calculate accuracy for test dataset
    test_accuracy = sess.run(
        accuracy,
        feed_dict={features: test_features, labels: test_labels})

    print('Test Accuracy: {}'.format(test_accuracy))


The accuracy is low, but you probably know that you could train on the dataset more than once to improve it. You’ll go over this subject in the next section, where we talk about “epochs”.

Epochs

Epochs

An epoch is a single forward and backward pass of the whole dataset. This is used to increase the accuracy of the model without requiring more data. This section will cover epochs in TensorFlow and how to choose the right number of epochs.

The following TensorFlow code trains a model using 10 epochs.

from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
import numpy as np
from helper import batches  # Helper function created in Mini-batching section


def print_epoch_stats(epoch_i, sess, last_features, last_labels):
    """
    Print cost and validation accuracy of an epoch
    """
    current_cost = sess.run(
        cost,
        feed_dict={features: last_features, labels: last_labels})
    valid_accuracy = sess.run(
        accuracy,
        feed_dict={features: valid_features, labels: valid_labels})
    print('Epoch: {:<4} - Cost: {:<8.3} Valid Accuracy: {:<5.3}'.format(
        epoch_i,
        current_cost,
        valid_accuracy))

n_input = 784  # MNIST data input (img shape: 28*28)
n_classes = 10  # MNIST total classes (0-9 digits)

# Import MNIST data
mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot=True)

# The features are already scaled and the data is shuffled
train_features = mnist.train.images
valid_features = mnist.validation.images
test_features = mnist.test.images

train_labels = mnist.train.labels.astype(np.float32)
valid_labels = mnist.validation.labels.astype(np.float32)
test_labels = mnist.test.labels.astype(np.float32)

# Features and Labels
features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32, [None, n_classes])

# Weights & bias
weights = tf.Variable(tf.random_normal([n_input, n_classes]))
bias = tf.Variable(tf.random_normal([n_classes]))

# Logits - xW + b
logits = tf.add(tf.matmul(features, weights), bias)

# Define loss and optimizer
learning_rate = tf.placeholder(tf.float32)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

# Calculate accuracy
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

init = tf.global_variables_initializer()

batch_size = 128
epochs = 10
learn_rate = 0.001

train_batches = batches(batch_size, train_features, train_labels)

with tf.Session() as sess:
    sess.run(init)

    # Training cycle
    for epoch_i in range(epochs):

        # Loop over all batches
        for batch_features, batch_labels in train_batches:
            train_feed_dict = {
                features: batch_features,
                labels: batch_labels,
                learning_rate: learn_rate}
            sess.run(optimizer, feed_dict=train_feed_dict)

        # Print cost and validation accuracy of an epoch
        print_epoch_stats(epoch_i, sess, batch_features, batch_labels)

    # Calculate accuracy for test dataset
    test_accuracy = sess.run(
        accuracy,
        feed_dict={features: test_features, labels: test_labels})

    print('Test Accuracy: {}'.format(test_accuracy))


Running the code will output the following:

Epoch: 0    - Cost: 11.0     Valid Accuracy: 0.204
Epoch: 1    - Cost: 9.95     Valid Accuracy: 0.229
Epoch: 2    - Cost: 9.18     Valid Accuracy: 0.246
Epoch: 3    - Cost: 8.59     Valid Accuracy: 0.264
Epoch: 4    - Cost: 8.13     Valid Accuracy: 0.283
Epoch: 5    - Cost: 7.77     Valid Accuracy: 0.301
Epoch: 6    - Cost: 7.47     Valid Accuracy: 0.316
Epoch: 7    - Cost: 7.2      Valid Accuracy: 0.328
Epoch: 8    - Cost: 6.96     Valid Accuracy: 0.342
Epoch: 9    - Cost: 6.73     Valid Accuracy: 0.36
Test Accuracy: 0.3801000118255615


Each epoch attempts to move to a lower cost, leading to better accuracy.

This model continues to improve accuracy up to Epoch 9. Let’s increase the number of epochs to 100.

...
Epoch: 79   - Cost: 0.111    Valid Accuracy: 0.86
Epoch: 80   - Cost: 0.11     Valid Accuracy: 0.869
Epoch: 81   - Cost: 0.109    Valid Accuracy: 0.869
....
Epoch: 85   - Cost: 0.107    Valid Accuracy: 0.869
Epoch: 86   - Cost: 0.107    Valid Accuracy: 0.869
Epoch: 87   - Cost: 0.106    Valid Accuracy: 0.869
Epoch: 88   - Cost: 0.106    Valid Accuracy: 0.869
Epoch: 89   - Cost: 0.105    Valid Accuracy: 0.869
Epoch: 90   - Cost: 0.105    Valid Accuracy: 0.869
Epoch: 91   - Cost: 0.104    Valid Accuracy: 0.869
Epoch: 92   - Cost: 0.103    Valid Accuracy: 0.869
Epoch: 93   - Cost: 0.103    Valid Accuracy: 0.869
Epoch: 94   - Cost: 0.102    Valid Accuracy: 0.869
Epoch: 95   - Cost: 0.102    Valid Accuracy: 0.869
Epoch: 96   - Cost: 0.101    Valid Accuracy: 0.869
Epoch: 97   - Cost: 0.101    Valid Accuracy: 0.869
Epoch: 98   - Cost: 0.1      Valid Accuracy: 0.869
Epoch: 99   - Cost: 0.1      Valid Accuracy: 0.869
Test Accuracy: 0.8696000006198883


From looking at the output above, you can see the model doesn’t increase the validation accuracy after epoch 80. Let’s see what happens when we increase the learning rate.

learn_rate = 0.1

Epoch: 76   - Cost: 0.214    Valid Accuracy: 0.752
Epoch: 77   - Cost: 0.21     Valid Accuracy: 0.756
Epoch: 78   - Cost: 0.21     Valid Accuracy: 0.756
...
Epoch: 85   - Cost: 0.207    Valid Accuracy: 0.756
Epoch: 86   - Cost: 0.209    Valid Accuracy: 0.756
Epoch: 87   - Cost: 0.205    Valid Accuracy: 0.756
Epoch: 88   - Cost: 0.208    Valid Accuracy: 0.756
Epoch: 89   - Cost: 0.205    Valid Accuracy: 0.756
Epoch: 90   - Cost: 0.202    Valid Accuracy: 0.756
Epoch: 91   - Cost: 0.207    Valid Accuracy: 0.756
Epoch: 92   - Cost: 0.204    Valid Accuracy: 0.756
Epoch: 93   - Cost: 0.206    Valid Accuracy: 0.756
Epoch: 94   - Cost: 0.202    Valid Accuracy: 0.756
Epoch: 95   - Cost: 0.2974   Valid Accuracy: 0.756
Epoch: 96   - Cost: 0.202    Valid Accuracy: 0.756
Epoch: 97   - Cost: 0.2996   Valid Accuracy: 0.756
Epoch: 98   - Cost: 0.203    Valid Accuracy: 0.756
Epoch: 99   - Cost: 0.2987   Valid Accuracy: 0.756
Test Accuracy: 0.7556000053882599


Looks like the learning rate was increased too much. The final accuracy was lower, and it stopped improving earlier. Let’s stick with the previous learning rate, but change the number of epochs to 80.

Epoch: 65   - Cost: 0.122    Valid Accuracy: 0.868
Epoch: 66   - Cost: 0.121    Valid Accuracy: 0.868
Epoch: 67   - Cost: 0.12     Valid Accuracy: 0.868
Epoch: 68   - Cost: 0.119    Valid Accuracy: 0.868
Epoch: 69   - Cost: 0.118    Valid Accuracy: 0.868
Epoch: 70   - Cost: 0.118    Valid Accuracy: 0.868
Epoch: 71   - Cost: 0.117    Valid Accuracy: 0.868
Epoch: 72   - Cost: 0.116    Valid Accuracy: 0.868
Epoch: 73   - Cost: 0.115    Valid Accuracy: 0.868
Epoch: 74   - Cost: 0.115    Valid Accuracy: 0.868
Epoch: 75   - Cost: 0.114    Valid Accuracy: 0.868
Epoch: 76   - Cost: 0.113    Valid Accuracy: 0.868
Epoch: 77   - Cost: 0.113    Valid Accuracy: 0.868
Epoch: 78   - Cost: 0.112    Valid Accuracy: 0.868
Epoch: 79   - Cost: 0.111    Valid Accuracy: 0.868
Epoch: 80   - Cost: 0.111    Valid Accuracy: 0.869
Test Accuracy: 0.86909999418258667


The accuracy only reached 0.86, but that could be because the learning rate was too high. Lowering the learning rate would require more epochs, but could ultimately achieve better accuracy.

In the upcoming TensorFlow Lab, you’ll get the opportunity to choose your own learning rate, epoch count, and batch size to improve the model’s accuracy.

INTRO TO NEURAL NETWORKS

Intro to Neural Networks

Introducing Luis

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/nto-stLuN6M.mp4

Logistic Regression Quiz

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/kSs6O3R7JUI.mp4

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/1iNylA3fJDs.mp4

Neural Networks

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/Mqogpnp1lrU.mp4

Perceptron

Perceptron

Now you’ve seen how a simple neural network makes decisions: by taking in input data, processing that information, and finally, producing an output in the form of a decision! Let’s take a deeper dive into the university admission example and learn more about how this input data is processed.

Data, like test scores and grades, is fed into a network of interconnected nodes. These individual nodes are called perceptrons or neurons, and they are the basic unit of a neural network. Each one looks at input data and decides how to categorize that data. In the example above, the input either passes a threshold for grades and test scores or doesn’t, and so the two categories are: yes (passed the threshold) and no (didn’t pass the threshold). These categories then combine to form a decision – for example, if both nodes produce a “yes” output, then this student gains admission into the university.

Let’s zoom in even further and look at how a single perceptron processes input data.

The perceptron above is one of the two perceptrons from the video that help determine whether or not a student is accepted to a university. It decides whether a student’s grades are high enough to be accepted to the university. You might be wondering: “How does it know whether grades or test scores are more important in making this acceptance decision?” Well, when we initialize a neural network, we don’t know what information will be most important in making a decision. It’s up to the neural network to learn for itself which data is most important and adjust how it considers that data.

It does this with something called weights.

Weights

When input data comes into a perceptron, it gets multiplied by a weight value that is assigned to this particular input. For example, the perceptron above has two inputs, tests for test scores and grades, so it has two associated weights that can be adjusted individually. These weights start out as random values, and as the neural network learns more about what kind of input data leads to a student being accepted into a university, the network adjusts the weights based on any errors in categorization that the previous weights resulted in. This is called training the neural network.

A higher weight means the neural network considers that input more important than other inputs, and a lower weight means that the data is considered less important. An extreme example would be if test scores had no effect at all on university acceptance; then the weight of the test score input data would be zero and it would have no effect on the output of the perceptron.

Summing the Input Data

So, each input to a perceptron has an associated weight that represents its importance, and these weights are determined during the learning process of a neural network, called training. In the next step, the weighted input data is summed to produce a single value that will help determine the final output – whether a student is accepted to a university or not. Let’s see a concrete example of this.

When writing equations related to neural networks, the weights will always be represented by some type of the letter w. It will usually look like a W when it represents a matrix of weights or a w when it represents an individual weight, and it may include some additional information in the form of a subscript to specify which weights (you’ll see more on that next). But remember, when you see the letter w, think weights.

In this example, we’ll use $w_{grades}$ for the weight of grades and $w_{test}$ for the weight of test. For the image above, let’s say that the weights are: $w_{grades} = -1$, $w_{test} = -0.2$. You don’t have to be concerned with the actual values, but their relative values are important. $w_{grades}$ is 5 times larger than $w_{test}$, which means the neural network considers grades input 5 times more important than test in determining whether a student will be accepted into a university.

The perceptron applies these weights to the inputs and sums them in a process known as linear combination. In our case, this looks like $$w_{grades} \cdot x_{grades} + w_{test} \cdot x_{test} = -1 \cdot x_{grades} - 0.2 \cdot x_{test}$$.

Now, to make our equation less wordy, let’s replace the explicit names with numbers. Let’s use 1 for grades and 2 for tests. So now our equation becomes $$w_{1} \cdot x_{1} + w_{2} \cdot x_{2}$$.

In this example, we just have 2 simple inputs: grades and tests. Let’s imagine we instead had $m$ different inputs and we labeled them $x_{1}, x_{2}, \ldots, x_{m}$. Let’s also say that the weight corresponding to $x_{1}$ is $w_{1}$ and so on. In that case, we would express the linear combination succinctly as:

$$\sum_{i=1}^{m} w_{i} \cdot x_{i}$$

Here, the Greek letter Sigma $\sum$ is used to represent summation. It simply means to evaluate the equation to the right multiple times and add up the results. In this case, the equation it will sum is $w_{i} \cdot x_{i}$.

But where do we get $w_{i}$ and $x_{i}$?

$\sum_{i=1}^{m}$ means to iterate over all $i$ values, from 1 to $m$.

So to put it all together, $\sum_{i=1}^{m} w_{i} \cdot x_{i}$ means the following:

• Start at $i=1$
• Evaluate $w_{1} \cdot x_{1}$ and remember the result
• Move to $i=2$
• Evaluate $w_{2} \cdot x_{2}$ and add the result to $w_{1} \cdot x_{1}$
• Continue repeating that process until $i=m$, where $m$ is the number of inputs.

One last thing: you’ll see equations written many different ways, both here and when reading on your own. For example, you will often just see $\sum_{i}$ instead of $\sum_{i=1}^{m}$. The first is simply a shorter way of writing the second. That is, if you see a summation without a starting number or a defined end value, it just means perform the sum for all of them. And sometimes, if the value to iterate over can be inferred, you’ll see it as just $\sum$. Just remember they’re all the same thing: $\sum_{i=1}^{m} w_{i} \cdot x_{i} = \sum_{i} w_{i} \cdot x_{i} = \sum w_{i} \cdot x_{i}$.
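As a quick sanity check, the summation can be written in one line of Python. The weights below come from the grades/test example above; the input values are made up purely for illustration:

```python
# Weights from the example above; inputs are hypothetical scores
weights = [-1.0, -0.2]   # w_grades, w_test
inputs = [2.0, 5.0]      # x_grades, x_test

# sum_i w_i * x_i
linear_combination = sum(w * x for w, x in zip(weights, inputs))
print(linear_combination)  # -3.0
```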

Calculating the Output with an Activation Function

Finally, the result of the perceptron’s summation is turned into an output signal! This is done by feeding the linear combination into an activation function.

Activation functions are functions that decide, given the inputs into the node, what the node’s output should be. Because it’s the activation function that decides the actual output, we often refer to the outputs of a layer as its “activations”.

One of the simplest activation functions is the Heaviside step function. This function returns a 0 if the linear combination is less than 0. It returns a 1 if the linear combination is positive or equal to zero. The Heaviside step function is shown below, where h is the calculated linear combination:

In the university acceptance example above, we used the weights $w_{grades} = -1$, $w_{test} = -0.2$. Since $w_{grades}$ and $w_{test}$ are negative values, the activation function will only return a 1 if grades and test are 0! This is because the range of values from the linear combination using these weights and inputs is $(-\infty, 0]$ (i.e. negative infinity to 0, including 0 itself).

It’s easiest to see this with an example in two dimensions. In the following graph, imagine any points along the line or in the shaded area represent all the possible inputs to our node. Also imagine that the value along the y-axis is the result of performing the linear combination on these inputs and the appropriate weights. It’s this result that gets passed to the activation function.

Now remember that the step activation function returns 1 for any inputs greater than or equal to zero. As you can see in the image, only one point has a y-value greater than or equal to zero – the point right at the origin, (0,0):

Now, we certainly want more than one possible grade/test combination to result in acceptance, so we need to adjust the results passed to our activation function so it activates – that is, returns 1 – for more inputs. Specifically, we need to find a way so all the scores we’d like to consider acceptable for admissions produce values greater than or equal to zero when linearly combined with the weights into our node.

One way to get our function to return 1 for more inputs is to add a value to the results of our linear combination, called a bias.

A bias, represented in equations as b, lets us move values in one direction or another.

For example, the following diagram shows the previous hypothetical function with an added bias of +3. The blue shaded area shows all the values that now activate the function. But notice that these are produced with the same inputs as the values shown shaded in grey – just adjusted higher by adding the bias term:

Of course, with neural networks we won’t know in advance what values to pick for biases. That’s ok, because just like the weights, the bias can also be updated and changed by the neural network during training. So after adding a bias, we now have a complete perceptron formula:
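In symbols, with inputs $x_{1}, \ldots, x_{m}$, weights $w_{i}$, and bias $b$, the complete perceptron formula the text describes can be written as:

$$f(x_{1}, x_{2}, \ldots, x_{m}) = \begin{cases} 1 & \text{if } b + \sum_{i=1}^{m} w_{i} \cdot x_{i} \geq 0 \\ 0 & \text{if } b + \sum_{i=1}^{m} w_{i} \cdot x_{i} < 0 \end{cases}$$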

This formula returns 1 if the input $x_{1}, x_{2}, \ldots, x_{m}$ belongs to the accepted-to-university category or returns 0 if it doesn’t. The input is made up of one or more real numbers, each one represented by $x_{i}$, where $m$ is the number of inputs.

Then the neural network starts to learn! Initially, the weights $w_{i}$ and bias (b) are assigned a random value, and then they are updated using a learning algorithm like gradient descent. The weights and biases change so that the next training example is more accurately categorized, and patterns in data are “learned” by the neural network.

Now that you have a good understanding of perceptrons, let’s put that knowledge to use. In the next section, you’ll create the AND perceptron from the Neural Networks video by setting the values for weights and bias.

AND Perceptron Quiz

What are the weights and bias for the AND perceptron?

Set the weights (weight1, weight2) and bias (bias) to the correct values that calculate the AND operation as shown above.
In this case, there are two inputs as seen in the table above (let’s call the first column input1 and the second column input2), and based on the perceptron formula, we can calculate the output.

First, the linear combination will be the sum of the weighted inputs: linear_combination = weight1*input1 + weight2*input2 then we can put this value into the biased Heaviside step function, which will give us our output (0 or 1):

If you still need a hint, think of a concrete example like so:

Consider input1 and input2 both equal to 1. For an AND perceptron, we want the output to also equal 1! The output is determined by the weights and Heaviside step function such that

output = 1, if  weight1*input1 + weight2*input2 + bias >= 0
or
output = 0, if  weight1*input1 + weight2*input2 + bias < 0


So, how can you choose the values for weights and bias so that if both inputs = 1, the output = 1?
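If you want to test a candidate answer before submitting, here is a small sketch; the particular values below are just one valid choice among many:

```python
# One valid choice (hypothetical): each weight 1.0 and a bias of -2.0
weight1, weight2, bias = 1.0, 1.0, -2.0

def and_perceptron(input1, input2):
    # Linear combination with the bias folded in, then the Heaviside step
    linear_combination = weight1 * input1 + weight2 * input2 + bias
    return 1 if linear_combination >= 0 else 0

for input1, input2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(input1, input2, '->', and_perceptron(input1, input2))
# Only (1, 1) reaches a linear combination >= 0, so only that row outputs 1
```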

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/29PmNG7fuuM.mp4
Gradient is another term for rate of change or slope. If you need to brush up on this concept, check out Khan Academy’s great lectures on the topic.

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/7sxA5Ap8AWM.mp4

Multilayer Perceptrons

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/Rs9petvTBLk.mp4

Backpropagation

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/MZL97-2joxQ.mp4

Implementing Backpropagation

From Andrej Karpathy: Yes, you should understand backprop
Also from Andrej Karpathy, a lecture from Stanford’s CS231n course

DEEP NEURAL NETWORKS

Two-Layer Neural Network

Multilayer Neural Networks

In this lesson, you’ll learn how to build multilayer neural networks with TensorFlow. Adding a hidden layer to a network allows it to model more complex functions. Also, using a non-linear activation function on the hidden layer lets it model non-linear functions.

You’ll learn about the ReLU, or rectified linear unit, a non-linear activation function. The ReLU function is 0 for negative inputs and x for all inputs x > 0.
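That definition is short enough to write directly:

```python
# ReLU: 0 for negative inputs, x otherwise
def relu(x):
    return max(0.0, x)

print(relu(-3.5))  # 0.0
print(relu(0.0))   # 0.0
print(relu(2.0))   # 2.0
```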

Next, you’ll see how a ReLU hidden layer is implemented in TensorFlow.

Deep Neural Network in TensorFlow

Deep Neural Network in TensorFlow

You’ve seen how to build a logistic classifier using TensorFlow. Now you’re going to see how to use the logistic classifier to build a deep neural network.

Step by Step

In the following walkthrough, we’ll step through TensorFlow code written to classify the digits in the MNIST database. If you would like to run the network on your computer, the file is provided here. You can find this and many more examples of TensorFlow at Aymeric Damien’s GitHub repository.

Code

TensorFlow MNIST

from tensorflow.examples.tutorials.mnist import input_data


You’ll use the MNIST dataset provided by TensorFlow, which batches and One-Hot encodes the data for you.

Learning Parameters

import tensorflow as tf

# Parameters
learning_rate = 0.001
training_epochs = 20
batch_size = 128  # Decrease batch size if you don't have enough memory
display_step = 1

n_input = 784  # MNIST data input (img shape: 28*28)
n_classes = 10  # MNIST total classes (0-9 digits)


The focus here is on the architecture of multilayer neural networks, not parameter tuning, so here we’ll just give you the learning parameters.

Hidden Layer Parameters

n_hidden_layer = 256 # layer number of features


The variable n_hidden_layer determines the size of the hidden layer in the neural network. This is also known as the width of a layer.

Weights and Biases

# Store layers weight & bias
weights = {
    'hidden_layer': tf.Variable(tf.random_normal([n_input, n_hidden_layer])),
    'out': tf.Variable(tf.random_normal([n_hidden_layer, n_classes]))
}
biases = {
    'hidden_layer': tf.Variable(tf.random_normal([n_hidden_layer])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}


Deep neural networks use multiple layers, with each layer requiring its own weights and bias. The 'hidden_layer' weight and bias are for the hidden layer. The 'out' weight and bias are for the output layer. If the neural network were deeper, there would be weights and biases for each additional layer.

Input

# tf Graph input
x = tf.placeholder("float", [None, 28, 28, 1])
y = tf.placeholder("float", [None, n_classes])

x_flat = tf.reshape(x, [-1, n_input])


The MNIST data is made up of 28px by 28px images with a single channel. The tf.reshape() function above reshapes the 28px by 28px matrices in x into row vectors of 784 values.

Multilayer Perceptron

# Hidden layer with RELU activation
layer_1 = tf.add(tf.matmul(x_flat, weights['hidden_layer']),\
    biases['hidden_layer'])
layer_1 = tf.nn.relu(layer_1)
# Output layer with linear activation
logits = tf.add(tf.matmul(layer_1, weights['out']), biases['out'])

You’ve seen the linear function tf.add(tf.matmul(x_flat, weights['hidden_layer']), biases['hidden_layer']) before, also known as xW + b. Combining linear functions together using a ReLU will give you a two-layer network.

Optimizer

# Define loss and optimizer
cost = tf.reduce_mean(\
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\
    .minimize(cost)


This is the same optimization technique used in the Intro to TensorFlow lab.

Session

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for epoch in range(training_epochs):
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            # Run optimization op (backprop) and cost op (to get loss value)
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})


The MNIST library in TensorFlow provides the ability to receive the dataset in batches. Calling the mnist.train.next_batch() function returns a subset of the training data.

Deeper Neural Network

That’s it! Going from one layer to two is easy. Adding more layers to the network allows you to solve more complicated problems. In the next video, you’ll see how changing the number of layers can affect your network.

Training a Deep Learning Network

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/CsB7yUtMJyk.mp4

Save and Restore TensorFlow Models

Save and Restore TensorFlow Models

Training a model can take hours. But once you close your TensorFlow session, you lose all the trained weights and biases. If you were to reuse the model in the future, you would have to train it all over again!

Fortunately, TensorFlow gives you the ability to save your progress using a class called tf.train.Saver. This class provides the functionality to save any tf.Variable to your file system.

Saving Variables

Let’s start with a simple example of saving weights and bias Tensors. For the first example you’ll just save two variables. Later examples will save all the weights in a practical model.

import tensorflow as tf

# The file path to save the data
save_file = './model.ckpt'

# Two Tensor Variables: weights and bias
weights = tf.Variable(tf.truncated_normal([2, 3]))
bias = tf.Variable(tf.truncated_normal([3]))

# Class used to save and/or restore Tensor Variables
saver = tf.train.Saver()

with tf.Session() as sess:
    # Initialize all the Variables
    sess.run(tf.global_variables_initializer())

    # Show the values of weights and bias
    print('Weights:')
    print(sess.run(weights))
    print('Bias:')
    print(sess.run(bias))

    # Save the model
    saver.save(sess, save_file)


Weights:
[[-0.97990924  1.03016174  0.74119264]
 [-0.82581609 -0.07361362 -0.86653847]]
Bias:
[ 1.62978125 -0.37812829  0.64723819]

The Tensors weights and bias are set to random values using the tf.truncated_normal() function. The values are then saved to the save_file location, “model.ckpt”, using the tf.train.Saver.save() function. (The “.ckpt” extension stands for “checkpoint”.)

If you’re using TensorFlow 0.11.0RC1 or newer, a file called “model.ckpt.meta” will also be created. This file contains the TensorFlow graph.

Now that the Tensor Variables are saved, let’s load them back into a new model.

# Remove the previous weights and bias
tf.reset_default_graph()

# Two Variables: weights and bias
weights = tf.Variable(tf.truncated_normal([2, 3]))
bias = tf.Variable(tf.truncated_normal([3]))

# Class used to save and/or restore Tensor Variables
saver = tf.train.Saver()

with tf.Session() as sess:
    # Load the weights and bias
    saver.restore(sess, save_file)

    # Show the values of weights and bias
    print('Weights:')
    print(sess.run(weights))
    print('Bias:')
    print(sess.run(bias))


Weights:

[[-0.97990924 1.03016174 0.74119264]

[-0.82581609 -0.07361362 -0.86653847]]

Bias:

[ 1.62978125 -0.37812829 0.64723819]

You’ll notice you still need to create the weights and bias Tensors in Python. The tf.train.Saver.restore() function loads the saved data into weights and bias.

Since tf.train.Saver.restore() sets all the TensorFlow Variables, you don’t need to call tf.global_variables_initializer().

Save a Trained Model

Let’s see how to train a model and save its weights.

# Remove previous Tensors and Operations
tf.reset_default_graph()

from tensorflow.examples.tutorials.mnist import input_data
import numpy as np

learning_rate = 0.001
n_input = 784  # MNIST data input (img shape: 28*28)
n_classes = 10  # MNIST total classes (0-9 digits)

# Import MNIST data
mnist = input_data.read_data_sets('.', one_hot=True)

# Features and Labels
features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32, [None, n_classes])

# Weights & bias
weights = tf.Variable(tf.random_normal([n_input, n_classes]))
bias = tf.Variable(tf.random_normal([n_classes]))

# Logits - xW + b
logits = tf.add(tf.matmul(features, weights), bias)

# Define loss and optimizer
cost = tf.reduce_mean(\
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
optimizer = tf.train.GradientDescentOptimizer(learning_rate)\
    .minimize(cost)

# Calculate accuracy
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


Let’s train that model, then save the weights:

import math

save_file = './train_model.ckpt'
batch_size = 128
n_epochs = 100

saver = tf.train.Saver()

# Launch the graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Training cycle
    for epoch in range(n_epochs):
        total_batch = math.ceil(mnist.train.num_examples / batch_size)

        # Loop over all batches
        for i in range(total_batch):
            batch_features, batch_labels = mnist.train.next_batch(batch_size)
            sess.run(
                optimizer,
                feed_dict={features: batch_features, labels: batch_labels})

        # Print status for every 10 epochs
        if epoch % 10 == 0:
            valid_accuracy = sess.run(
                accuracy,
                feed_dict={
                    features: mnist.validation.images,
                    labels: mnist.validation.labels})
            print('Epoch {:<3} - Validation Accuracy: {}'.format(
                epoch,
                valid_accuracy))

    # Save the model
    saver.save(sess, save_file)
    print('Trained Model Saved.')


Epoch 0 - Validation Accuracy: 0.06859999895095825

Epoch 10 - Validation Accuracy: 0.20239999890327454

Epoch 20 - Validation Accuracy: 0.36980000138282776

Epoch 30 - Validation Accuracy: 0.48820000886917114

Epoch 40 - Validation Accuracy: 0.5601999759674072

Epoch 50 - Validation Accuracy: 0.6097999811172485

Epoch 60 - Validation Accuracy: 0.6425999999046326

Epoch 70 - Validation Accuracy: 0.6733999848365784

Epoch 80 - Validation Accuracy: 0.6916000247001648

Epoch 90 - Validation Accuracy: 0.7113999724388123

Trained Model Saved.

Let’s load the weights and bias from the checkpoint file, then check the test accuracy.

saver = tf.train.Saver()

# Launch the graph
with tf.Session() as sess:
    saver.restore(sess, save_file)

    test_accuracy = sess.run(
        accuracy,
        feed_dict={features: mnist.test.images, labels: mnist.test.labels})

print('Test Accuracy: {}'.format(test_accuracy))


Test Accuracy: 0.7229999899864197

That’s it! You now know how to save and load a trained model in TensorFlow. Let’s look at loading weights and biases into modified models in the next section.

Finetuning

Sometimes you might want to adjust, or “finetune” a model that you have already trained and saved.

However, loading saved Variables directly into a modified model can generate errors. Let’s go over how to avoid these problems.

Naming Error

TensorFlow uses a string identifier for Tensors and Operations called name. If a name is not given, TensorFlow will create one automatically: it gives the first node of a type the name <Type>, and subsequent nodes the name <Type>_<number>. Let’s see how this can affect loading a model with a different order of weights and bias:

import tensorflow as tf

# Remove the previous weights and bias
tf.reset_default_graph()

save_file = 'model.ckpt'

# Two Tensor Variables: weights and bias
weights = tf.Variable(tf.truncated_normal([2, 3]))
bias = tf.Variable(tf.truncated_normal([3]))

saver = tf.train.Saver()

# Print the name of Weights and Bias
print('Save Weights: {}'.format(weights.name))
print('Save Bias: {}'.format(bias.name))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, save_file)

# Remove the previous weights and bias
tf.reset_default_graph()

# Two Variables: weights and bias
bias = tf.Variable(tf.truncated_normal([3]))
weights = tf.Variable(tf.truncated_normal([2, 3]))

saver = tf.train.Saver()

# Print the name of Weights and Bias
print('Load Weights: {}'.format(weights.name))
print('Load Bias: {}'.format(bias.name))

with tf.Session() as sess:
    # Load the weights and bias - ERROR
    saver.restore(sess, save_file)


The code above prints out the following:

Save Weights: Variable:0

Save Bias: Variable_1:0

Load Weights: Variable_1:0

Load Bias: Variable:0

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match.

You’ll notice that the name properties for weights and bias are different than when you saved the model. This is why the code produces the “Assign requires shapes of both tensors to match” error. The code saver.restore(sess, save_file) is trying to load weight data into bias and bias data into weights.

Instead of letting TensorFlow set the name property, let’s set it manually:

import tensorflow as tf

tf.reset_default_graph()

save_file = 'model.ckpt'

# Two Tensor Variables: weights and bias
weights = tf.Variable(tf.truncated_normal([2, 3]), name='weights_0')
bias = tf.Variable(tf.truncated_normal([3]), name='bias_0')

saver = tf.train.Saver()

# Print the name of Weights and Bias
print('Save Weights: {}'.format(weights.name))
print('Save Bias: {}'.format(bias.name))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, save_file)

# Remove the previous weights and bias
tf.reset_default_graph()

# Two Variables: weights and bias
bias = tf.Variable(tf.truncated_normal([3]), name='bias_0')
weights = tf.Variable(tf.truncated_normal([2, 3]), name='weights_0')

saver = tf.train.Saver()

# Print the name of Weights and Bias
print('Load Weights: {}'.format(weights.name))
print('Load Bias: {}'.format(bias.name))

with tf.Session() as sess:
    # Load the weights and bias - No Error
    saver.restore(sess, save_file)


Save Weights: weights_0:0

Save Bias: bias_0:0

Load Weights: weights_0:0

Load Bias: bias_0:0

That worked! The Tensor names match and the data loaded correctly.

Regularization Intro

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/pECnr-5F3_Q.mp4

Regularization

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/QcJBhbuCl5g.mp4

Regularization Quiz

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/E0eEW6V0_sA.mp4

Dropout

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/6DcImJS8uV8.mp4

Dropout Pt. 2

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/8nG8zzJMbZw.mp4

Quiz: TensorFlow Dropout

TensorFlow Dropout

https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf
Dropout is a regularization technique for reducing overfitting. The technique temporarily drops units (artificial neurons) from the network, along with all of those units’ incoming and outgoing connections. Figure 1 illustrates how dropout works.

TensorFlow provides the tf.nn.dropout() function, which you can use to implement dropout.

Let’s look at an example of how to use tf.nn.dropout().

keep_prob = tf.placeholder(tf.float32) # probability to keep units

hidden_layer = tf.nn.relu(hidden_layer)
hidden_layer = tf.nn.dropout(hidden_layer, keep_prob)



The code above illustrates how to apply dropout to a neural network.

The tf.nn.dropout() function takes in two parameters:

1. hidden_layer: the tensor to which you would like to apply dropout
2. keep_prob: the probability of keeping (i.e. not dropping) any given unit

keep_prob allows you to adjust the number of units to drop. In order to compensate for dropped units, tf.nn.dropout() multiplies all units that are kept (i.e. not dropped) by 1/keep_prob.

During training, a good starting value for keep_prob is 0.5.

During testing, use a keep_prob value of 1.0 to keep all units and maximize the power of the model.
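The keep-and-rescale behavior is easy to see in plain NumPy. The sketch below only illustrates the idea (it is not how tf.nn.dropout() is implemented internally); the dropout helper and the example shapes are made up for this demo.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, keep_prob):
    # Keep each unit with probability keep_prob, then scale the survivors
    # by 1/keep_prob so the expected activation stays the same.
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

hidden = np.ones((4, 8))           # stand-in for ReLU outputs

train_out = dropout(hidden, 0.5)   # roughly half zeros, survivors become 2.0
test_out = dropout(hidden, 1.0)    # keep_prob=1.0 leaves every unit unchanged
```

Because every surviving unit is scaled up during training, evaluating with keep_prob of 1.0 needs no extra rescaling.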

Quiz 1

Take a look at the code snippet below. Do you see what’s wrong?

There’s nothing wrong with the syntax; however, the test accuracy is extremely low.

...

keep_prob = tf.placeholder(tf.float32) # probability to keep units

hidden_layer = tf.nn.relu(hidden_layer)
hidden_layer = tf.nn.dropout(hidden_layer, keep_prob)

...

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    for epoch_i in range(epochs):
        for batch_i in range(batches):
            ...

            sess.run(optimizer, feed_dict={
                features: batch_features,
                labels: batch_labels,
                keep_prob: 0.5})

    validation_accuracy = sess.run(accuracy, feed_dict={
        features: test_features,
        labels: test_labels,
        keep_prob: 0.5})


QUESTION 1 OF 2

What’s wrong with the above code?

Dropout doesn’t work with batching.

The keep_prob value of 0.5 is too low.

(correct)There shouldn’t be a value passed to keep_prob when testing for accuracy.
keep_prob should be set to 1.0 when evaluating validation accuracy.

Quiz 2

This quiz starts with the code from the ReLU Quiz and applies a dropout layer. Build a model with a ReLU layer and a dropout layer, using the keep_prob placeholder to pass in a probability of 0.5. Print the logits from the model.

Note: Output will be different every time the code is run. This is caused by dropout randomizing the units it drops.

“solution.py”

# Quiz Solution
# Note: You can't run code in this tab
import tensorflow as tf

hidden_layer_weights = [
    [0.1, 0.2, 0.4],
    [0.4, 0.6, 0.6],
    [0.5, 0.9, 0.1],
    [0.8, 0.2, 0.8]]
out_weights = [
    [0.1, 0.6],
    [0.2, 0.1],
    [0.7, 0.9]]

# Weights and biases
weights = [
    tf.Variable(hidden_layer_weights),
    tf.Variable(out_weights)]
biases = [
    tf.Variable(tf.zeros(3)),
    tf.Variable(tf.zeros(2))]

# Input
features = tf.Variable([[0.0, 2.0, 3.0, 4.0], [0.1, 0.2, 0.3, 0.4], [11.0, 12.0, 13.0, 14.0]])

# TODO: Create Model with Dropout
keep_prob = tf.placeholder(tf.float32)
hidden_layer = tf.add(tf.matmul(features, weights[0]), biases[0])
hidden_layer = tf.nn.relu(hidden_layer)
hidden_layer = tf.nn.dropout(hidden_layer, keep_prob)
logits = tf.add(tf.matmul(hidden_layer, weights[1]), biases[1])

# TODO: Print logits from a session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(logits, feed_dict={keep_prob: 0.5}))


[[ 1.10000002 6.60000038]
[ 0.30800003 0.7700001 ]
[ 9.56000042 4.78000021]]

CONVOLUTIONAL NEURAL NETWORKS

Intro To CNNs

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/B61jxZ4rkMs.mp4

Color

QUIZ QUESTION

What would be easier for your classifier to learn?

R, G, B
(correct)(R + G + B) / 3

Statistical Invariance

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/0Hr5YwUUhr0.mp4

Convolutional Networks

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/ISHGyvsT0QY.mp4

Intuition

Let’s develop better intuition for how Convolutional Neural Networks (CNN) work. We’ll examine how humans classify images, and then see how CNNs use similar approaches.

Let’s say we wanted to classify the following image of a dog as a Golden Retriever.

As humans, how do we do this?

One thing we do is identify certain parts of the dog, such as the nose, the eyes, and the fur. We essentially break up the image into smaller pieces, recognize the smaller pieces, and then combine those pieces to get an idea of the overall dog.

In this case, we might break down the image into a combination of the following:

• A nose
• Two eyes
• Golden fur

These pieces can be seen below:

Going One Step Further

But let’s take this one step further. How do we determine what exactly a nose is? A Golden Retriever nose can be seen as an oval with two black holes inside it. Thus, one way of classifying a Retriever’s nose is to break it up into smaller pieces and look for black holes (nostrils) and curves that define an oval, as shown below.

Broadly speaking, this is what a CNN learns to do. It learns to recognize basic lines and curves, then shapes and blobs, and then increasingly complex objects within the image. Finally, the CNN classifies the image by combining the larger, more complex objects.

In our case, the levels in the hierarchy are:

• Simple shapes, like ovals and dark circles
• Complex objects (combinations of simple shapes), like eyes, nose, and fur
• The dog as a whole (a combination of complex objects)

With deep learning, we don’t actually program the CNN to recognize these specific features. Rather, the CNN learns on its own to recognize such objects through forward propagation and backpropagation!

It’s amazing how well a CNN can learn to classify images, even though we never program the CNN with information about specific features to look for.

A CNN might have several layers, and each layer might capture a different level in the hierarchy of objects. The first layer is the lowest level in the hierarchy, where the CNN generally classifies small parts of the image into simple shapes like horizontal and vertical lines and simple blobs of colors. The subsequent layers tend to be higher levels in the hierarchy and generally classify more complex ideas like shapes (combinations of lines), and eventually full objects like dogs.

Once again, the CNN learns all of this on its own. We don’t ever have to tell the CNN to go looking for lines or curves or noses or fur. The CNN just learns from the training set and discovers which characteristics of a Golden Retriever are worth looking for.

That’s a good start! Hopefully you’ve developed some intuition about how CNNs work.

Next, let’s look at some implementation details.

Filters

Breaking up an Image

The first step for a CNN is to break up the image into smaller pieces. We do this by selecting a width and height that defines a filter.

The filter looks at small pieces, or patches, of the image. These patches are the same size as the filter.

We then simply slide this filter horizontally or vertically to focus on a different piece of the image.

The amount by which the filter slides is referred to as the ‘stride’. The stride is a hyperparameter which you, the engineer, can tune. Increasing the stride reduces the size of your model by reducing the number of total patches each layer observes. However, this usually comes with a reduction in accuracy.

Let’s look at an example. In this zoomed in image of the dog, we first start with the patch outlined in red. The width and height of our filter define the size of this square.

We then move the square over to the right by a given stride (2 in this case) to get another patch.

What’s important here is that we are grouping together adjacent pixels and treating them as a collective.

In a normal, non-convolutional neural network, we would have ignored this adjacency. In a normal network, we would have connected every pixel in the input image to a neuron in the next layer. In doing so, we would not have taken advantage of the fact that pixels in an image are close together for a reason and have special meaning.

By taking advantage of this local structure, our CNN learns to classify local patterns, like shapes and objects, in an image.

Filter Depth

It’s common to have more than one filter. Different filters pick up different qualities of a patch. For example, one filter might look for a particular color, while another might look for a kind of object of a specific shape. The number of filters in a convolutional layer is called the filter depth.

How many neurons does each patch connect to?

That’s dependent on our filter depth. If we have a depth of k, we connect each patch of pixels to k neurons in the next layer. This gives us a depth of k in the next layer, as shown below. In practice, k is a hyperparameter we tune, and most CNNs tend to pick the same starting values.

But why connect a single patch to multiple neurons in the next layer? Isn’t one neuron good enough?

Multiple neurons can be useful because a patch can have multiple interesting characteristics that we want to capture.

For example, one patch might include some white teeth, some blonde whiskers, and part of a red tongue. In that case, we might want a filter depth of at least three - one for each of teeth, whiskers, and tongue.

Having multiple neurons for a given patch ensures that our CNN can learn to capture whatever characteristics the CNN learns are important.

Remember that the CNN isn’t “programmed” to look for certain characteristics. Rather, it learns on its own which characteristics to notice.

Feature Map Sizes

Assume an input of shape 28x28x3 (HxWxD) and 8 filters of shape 3x3x3.

What are the width, height and depth for padding = ‘same’, stride = 1?

28,28,8

What are the width, height and depth for padding = ‘valid’, stride = 1?

26,26,8

What are the width, height and depth for padding = ‘valid’, stride = 2?

13,13,8

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/W4xtf8LTz1c.mp4


Convolutions continued

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/utOv-BKI_vo.mp4

Convolutions Cont.

Note, a “Fully Connected” layer is a standard, non convolutional layer, where all inputs are connected to all output neurons. This is also referred to as a “dense” layer, and is what we used in the previous two lessons.

Parameters

Parameter Sharing

The weights, w, are shared across patches for a given layer in a CNN, so that a cat is detected regardless of where in the image it is located.

When we are trying to classify a picture of a cat, we don’t care where in the image a cat is. If it’s in the top left or the bottom right, it’s still a cat in our eyes. We would like our CNNs to also possess this ability known as translation invariance. How can we achieve this?

As we saw earlier, the classification of a given patch in an image is determined by the weights and biases corresponding to that patch.

If we want a cat that’s in the top left patch to be classified in the same way as a cat in the bottom right patch, we need the weights and biases corresponding to those patches to be the same, so that they are classified the same way.

This is exactly what we do in CNNs. The weights and biases we learn for a given output layer are shared across all patches in a given input layer. Note that as we increase the depth of our filter, the number of weights and biases we have to learn still increases, as the weights aren’t shared across the output channels.

There’s an additional benefit to sharing our parameters. If we did not reuse the same weights across all patches, we would have to learn new parameters for every single patch and hidden layer neuron pair. This does not scale well, especially for higher fidelity images. Thus, sharing parameters not only helps us with translation invariance, but also gives us a smaller, more scalable model.

A 5x5 grid with a 3x3 filter. Source: Andrej Karpathy.

Let’s say we have a 5x5 grid (as shown above) and a filter of size 3x3 with a stride of 1. What’s the width and height of the next layer? We see that we can fit at most three patches in each direction, giving us a dimension of 3x3 in our next layer. As we can see, the width and height of each subsequent layer decreases in such a scheme.

In an ideal world, we’d be able to maintain the same width and height across layers so that we can continue to add layers without worrying about the dimensionality shrinking and so that we have consistency. How might we achieve this? One way is to simply add a border of 0s to our original 5x5 image. You can see what this looks like in the below image.

The same grid with 0 padding. Source: Andrej Karpathy.

This would expand our original image to a 7x7. With this, we now see how our next layer’s size is again a 5x5, keeping our dimensionality consistent.

Dimensionality

From what we’ve learned so far, how can we calculate the number of neurons of each layer in our CNN?

Given:

• our input layer has a width of W and a height of H
• our convolutional layer has a filter size F
• we have a stride of S
• a padding of P
• and a filter depth of K,

the following formula gives us the width of the next layer: W_out = (W−F+2P)/S+1.

The output height would be H_out = (H-F+2P)/S + 1.

And the output depth would be equal to the filter depth D_out = K.

The output volume would be W_out * H_out * D_out.

Knowing the dimensionality of each additional layer helps us understand how large our model is and how our decisions around filter size and stride affect the size of our network.
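As a sanity check, the formula can be wrapped in a few lines of Python. This is a sketch; the function name and the use of integer division (assuming the stride divides evenly) are my own choices.

```python
def conv_output_shape(W, H, F, S, P, K):
    # W_out = (W - F + 2P)/S + 1, H_out = (H - F + 2P)/S + 1, D_out = K
    W_out = (W - F + 2 * P) // S + 1
    H_out = (H - F + 2 * P) // S + 1
    return H_out, W_out, K

# The 5x5 grid with a 3x3 filter, stride 1, no padding -> 3x3
print(conv_output_shape(W=5, H=5, F=3, S=1, P=0, K=1))   # (3, 3, 1)
# The same grid with a border of 0s (P=1) keeps the 5x5 size
print(conv_output_shape(W=5, H=5, F=3, S=1, P=1, K=1))   # (5, 5, 1)
```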

Quiz: Convolution Output Shape

Introduction

For the next few quizzes we’ll test your understanding of the dimensions in CNNs. Understanding dimensions will help you make accurate tradeoffs between model size and performance. As you’ll see, some parameters have a much bigger impact on model size than others.

Setup

H = height, W = width, D = depth

• We have an input of shape 32x32x3 (HxWxD)
• 20 filters of shape 8x8x3 (HxWxD)
• A stride of 2 for both the height and width (S)
• Valid padding of size 1 (P)

Recall the formula for calculating the new height or width:

new_height = (input_height - filter_height + 2 * P)/S + 1
new_width = (input_width - filter_width + 2 * P)/S + 1


Convolutional Layer Output Shape
What’s the shape of the output?

The answer format is HxWxD, so if you think the new height is 9, new width is 9, and new depth is 5, then type 9x9x5.

14x14x20

Solution: Convolution Output

Solution

We can get the new height and width with the formula resulting in:

(32 - 8 + 2 * 1)/2 + 1 = 14
(32 - 8 + 2 * 1)/2 + 1 = 14


The new depth is equal to the number of filters, which is 20.
This would correspond to the following code:

input = tf.placeholder(tf.float32, (None, 32, 32, 3))
filter_weights = tf.Variable(tf.truncated_normal((8, 8, 3, 20))) # (height, width, input_depth, output_depth)
filter_bias = tf.Variable(tf.zeros(20))
strides = [1, 2, 2, 1] # (batch, height, width, depth)
conv = tf.nn.conv2d(input, filter_weights, strides, padding) + filter_bias


Note the output shape of conv will be [1, 13, 13, 20]. It’s 4D to account for batch size, but more importantly, it’s not [1, 14, 14, 20]. This is because the padding algorithm TensorFlow uses is not exactly the same as the one above. An alternative is to switch padding from 'VALID' to 'SAME', which would result in an output shape of [1, 16, 16, 20]. If you’re curious how padding works in TensorFlow, read this document.

Quiz: Number of Parameters

We’re now going to calculate the number of parameters of the convolutional layer. The answer from the last quiz will come into play here!

Being able to calculate the number of parameters in a neural network is useful since we want to have control over how much memory a neural network uses.

Setup

H = height, W = width, D = depth

• We have an input of shape 32x32x3 (HxWxD)
• 20 filters of shape 8x8x3 (HxWxD)
• A stride of 2 for both the height and width (S)
• Valid padding of size 1 (P)

Output Layer

• 14x14x20 (HxWxD)

Hint

Without parameter sharing, each neuron in the output layer must connect to each neuron in the filter. In addition, each neuron in the output layer must also connect to a single bias neuron.

Solution: Number of Parameters

Solution

There are 756560 total parameters. That’s a HUGE amount! Here’s how we calculate it:

(8 * 8 * 3 + 1) * (14 * 14 * 20) = 756560

8 * 8 * 3 is the number of weights per filter; we add 1 for the bias. Remember, without parameter sharing each of the 14 * 14 * 20 output neurons gets its own copy of these parameters. So we multiply the two numbers together and get the final answer.

Quiz: Parameter Sharing

Now we’d like you to calculate the number of parameters in the convolutional layer, if every neuron in the output layer shares its parameters with every other neuron in its same channel.

This is the number of parameters actually used in a convolution layer (tf.nn.conv2d()).

Setup

H = height, W = width, D = depth

• We have an input of shape 32x32x3 (HxWxD)
• 20 filters of shape 8x8x3 (HxWxD)
• A stride of 2 for both the height and width (S)
• Zero padding of size 1 (P)

Output Layer

• 14x14x20 (HxWxD)

Hint

With parameter sharing, each neuron in an output channel shares its weights with every other neuron in that channel. So the number of parameters is equal to the number of neurons in the filter, plus a bias neuron, all multiplied by the number of channels in the output layer.

Convolution Layer Parameters 2
How many parameters does the convolution layer have (with parameter sharing)?
3860

Solution: Parameter Sharing

Solution

There are 3860 total parameters. That’s 196 times fewer parameters! Here’s how the answer is calculated:

(8 * 8 * 3 + 1) * 20 = 3840 + 20 = 3860

That’s 3840 weights and 20 biases. This should look similar to the answer from the previous quiz. The difference being it’s just 20 instead of (14 * 14 * 20). Remember, with weight sharing we use the same filter for an entire depth slice. Because of this we can get rid of 14 * 14 and be left with only 20.
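Both quiz answers are just arithmetic, so they can be checked in a couple of lines of Python (the variable names here are mine):

```python
filter_h, filter_w, input_depth = 8, 8, 3
out_h, out_w, out_depth = 14, 14, 20

params_per_filter = filter_h * filter_w * input_depth + 1  # +1 for the bias

# Without sharing: every output neuron gets its own weights and bias
no_sharing = params_per_filter * (out_h * out_w * out_depth)

# With sharing: one set of weights and one bias per output channel
with_sharing = params_per_filter * out_depth

print(no_sharing)                  # 756560
print(with_sharing)                # 3860
print(no_sharing // with_sharing)  # 196 times fewer parameters
```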

Visualizing CNNs

Let’s look at an example CNN to see how it works in action.

The CNN we will look at is trained on ImageNet as described in this paper by Zeiler and Fergus. In the images below (from the same paper), we’ll see what each layer in this network detects and see how each layer detects more and more complex ideas.

Layer 1

Example patterns that cause activations in the first layer of the network. These range from simple diagonal lines (top left) to green blobs (bottom middle).

The images above are from Matthew Zeiler and Rob Fergus’ deep visualization toolbox, which lets us visualize what each layer in a CNN focuses on.

Each image in the above grid represents a pattern that causes the neurons in the first layer to activate - in other words, they are patterns that the first layer recognizes. The top left image shows a -45 degree line, while the middle top square shows a +45 degree line. These squares are shown below again for reference.

As visualized here, the first layer of the CNN can recognize -45 degree lines.

The first layer of the CNN is also able to recognize +45 degree lines, like the one above.

Let’s now see some example images that cause such activations. The below grid of images all activated the -45 degree line. Notice how they are all selected despite the fact that they have different colors, gradients, and patterns.

Example patches that activate the -45 degree line detector in the first layer.

So, the first layer of our CNN clearly picks out very simple shapes and patterns like lines and blobs.

Layer 2

A visualization of the second layer in the CNN. Notice how we are picking up more complex ideas like circles and stripes. The gray grid on the left represents how this layer of the CNN activates (or “what it sees”) based on the corresponding images from the grid on the right.

The second layer of the CNN captures complex ideas.

As you see in the image above, the second layer of the CNN recognizes circles (second row, second column), stripes (first row, second column), and rectangles (bottom right).

The CNN learns to do this on its own. There is no special instruction for the CNN to focus on more complex objects in deeper layers. That’s just how it normally works out when you feed training data into a CNN.

Layer 3

A visualization of the third layer in the CNN. The gray grid on the left represents how this layer of the CNN activates (or “what it sees”) based on the corresponding images from the grid on the right.

The third layer picks out complex combinations of features from the second layer. These include things like grids, and honeycombs (top left), wheels (second row, second column), and even faces (third row, third column).

Layer 5

A visualization of the fifth and final layer of the CNN. The gray grid on the left represents how this layer of the CNN activates (or “what it sees”) based on the corresponding images from the grid on the right.

We’ll skip layer 4, which continues this progression, and jump right to the fifth and final layer of this CNN.

The last layer picks out the highest order ideas that we care about for classification, like dog faces, bird faces, and bicycles.

On to TensorFlow

This concludes our high-level discussion of Convolutional Neural Networks.

Next you’ll practice actually building these networks in TensorFlow.

TensorFlow Convolution Layer

Let’s examine how to implement a CNN in TensorFlow.

TensorFlow provides the tf.nn.conv2d() and tf.nn.bias_add() functions to create your own convolutional layers.

# Output depth
k_output = 64

# Image Properties
image_width = 10
image_height = 10
color_channels = 3

# Convolution filter
filter_size_width = 5
filter_size_height = 5

# Input/Image
input = tf.placeholder(
    tf.float32,
    shape=[None, image_height, image_width, color_channels])

# Weight and bias
weight = tf.Variable(tf.truncated_normal(
    [filter_size_height, filter_size_width, color_channels, k_output]))
bias = tf.Variable(tf.zeros(k_output))

# Apply Convolution
conv_layer = tf.nn.conv2d(input, weight, strides=[1, 2, 2, 1], padding='SAME')
# Add bias
conv_layer = tf.nn.bias_add(conv_layer, bias)
# Apply activation function
conv_layer = tf.nn.relu(conv_layer)


The code above uses the tf.nn.conv2d() function to compute the convolution with weight as the filter and [1, 2, 2, 1] for the strides. TensorFlow uses a stride for each input dimension: [batch, input_height, input_width, input_channels]. We generally always set the stride for batch and input_channels (i.e. the first and fourth elements of the strides array) to 1.

You’ll focus on changing input_height and input_width while setting batch and input_channels to 1. The input_height and input_width strides are for striding the filter over input. This example code uses a stride of 2 with a 5x5 filter over input.

The tf.nn.bias_add() function adds a 1-d bias to the last dimension in a matrix.
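The effect of adding a 1-d bias to the last dimension is ordinary broadcasting, which can be previewed with NumPy. This is a sketch with made-up shapes, not the TensorFlow implementation:

```python
import numpy as np

# Simulated conv output: (batch, height, width, output_depth)
conv_out = np.zeros((1, 10, 10, 64))
bias = np.arange(64, dtype=np.float32)

# Broadcasting adds bias[d] to every (batch, row, col) position of
# channel d, which is what tf.nn.bias_add(conv_out, bias) computes
biased = conv_out + bias

print(biased.shape)        # (1, 10, 10, 64)
print(biased[0, 0, 0, 3])  # 3.0
```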

Explore The Design Space

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/FG7M9tWH2nQ.mp4

TensorFlow Max Pooling

By Aphex34 (Own work) [CC BY-SA 4.0 (http://creativecommons.org/licenses/by-sa/4.0)], via Wikimedia Commons

The image above is an example of max pooling with a 2x2 filter and stride of 2. The four 2x2 colors represent each time the filter was applied to find the maximum value.

For example, [[1, 0], [4, 6]] becomes 6, because 6 is the maximum value in this set. Similarly, [[2, 3], [6, 8]] becomes 8.

Conceptually, the benefit of the max pooling operation is to reduce the size of the input, and allow the neural network to focus on only the most important elements. Max pooling does this by only retaining the maximum value for each filtered area, and removing the remaining values.
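A minimal NumPy version of 2x2, stride-2 max pooling makes the operation concrete. The helper below is my own sketch (it assumes even height and width, and is not tf.nn.max_pool); the input is chosen to match the 2x2 blocks from the example above.

```python
import numpy as np

def max_pool_2x2(x):
    # Split a (H, W) depth slice into non-overlapping 2x2 blocks,
    # then keep only the maximum of each block.
    H, W = x.shape
    return x.reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

patch = np.array([[1, 0, 2, 3],
                  [4, 6, 6, 8],
                  [3, 1, 1, 0],
                  [1, 2, 2, 4]])

print(max_pool_2x2(patch))
# [[6 8]
#  [3 4]]
```

The top-left block [[1, 0], [4, 6]] reduces to 6 and the top-right block [[2, 3], [6, 8]] reduces to 8, as described in the text.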

TensorFlow provides the tf.nn.max_pool() function to apply max pooling to your convolutional layers.

...
conv_layer = tf.nn.conv2d(input, weight, strides=[1, 2, 2, 1], padding='SAME')
conv_layer = tf.nn.relu(conv_layer)
# Apply Max Pooling
conv_layer = tf.nn.max_pool(
    conv_layer,
    ksize=[1, 2, 2, 1],
    strides=[1, 2, 2, 1],
    padding='SAME')

The tf.nn.max_pool() function performs max pooling with the ksize parameter as the size of the filter and the strides parameter as the length of the stride. 2x2 filters with a stride of 2x2 are common in practice.

The ksize and strides parameters are structured as 4-element lists, with each element corresponding to a dimension of the input tensor ([batch, height, width, channels]). For both ksize and strides, the batch and channel dimensions are typically set to 1.

Quiz: Pooling Intuition

The next few quizzes will test your understanding of pooling layers.

QUIZ QUESTION

A pooling layer is generally used to …

Increase the size of the output
(correct)Decrease the size of the output
(correct)Prevent overfitting

Gain information

Solution: Pooling Intuition

Solution

The correct answer is decrease the size of the output and prevent overfitting. Preventing overfitting is a consequence of reducing the output size, which in turn, reduces the number of parameters in future layers.
Recently, pooling layers have fallen out of favor. Some reasons are:

• Recent datasets are so big and complex we’re more concerned about underfitting.
• Dropout is a much better regularizer.
• Pooling results in a loss of information. Think about the max pooling operation as an example. We only keep the largest of n numbers, thereby disregarding n-1 numbers completely.

Quiz: Pooling Mechanics

Setup

H = height, W = width, D = depth

We have an input of shape 4x4x5 (HxWxD)
Filter of shape 2x2 (HxW)
A stride of 2 for both the height and width (S)


Recall the formula for calculating the new height or width:

new_height = (input_height - filter_height)/S + 1
new_width = (input_width - filter_width)/S + 1


NOTE: For a pooling layer the output depth is the same as the input depth. Additionally, the pooling operation is applied individually for each depth slice.

The image below gives an example of how a max pooling layer works. In this case, the max pooling filter has a shape of 2x2. As the max pooling filter slides across the input layer, the filter will output the maximum value of the 2x2 square.

Pooling Layer Output Shape

What’s the shape of the output? Format is HxWxD.
2x2x5

Solution: Pooling Mechanics

Solution

The answer is 2x2x5. Here’s how it’s calculated using the formula:

(4 - 2)/2 + 1 = 2
(4 - 2)/2 + 1 = 2


The depth stays the same.
Here’s the corresponding code:

input = tf.placeholder(tf.float32, (None, 4, 4, 5))
filter_shape = [1, 2, 2, 1]
strides = [1, 2, 2, 1]
padding = 'VALID'
pool = tf.nn.max_pool(input, filter_shape, strides, padding)


The output shape of pool will be [1, 2, 2, 5], even if padding is changed to 'SAME'.
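The same arithmetic can be checked with a small plain-Python helper (a sketch; the function name is ours):

```python
def pool_output_shape(in_h, in_w, in_d, f_h, f_w, s):
    """Output shape of a pooling layer, per the formula above."""
    new_h = (in_h - f_h) // s + 1
    new_w = (in_w - f_w) // s + 1
    return (new_h, new_w, in_d)  # depth is unchanged by pooling

print(pool_output_shape(4, 4, 5, 2, 2, 2))  # (2, 2, 5)
```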

Quiz: Pooling Practice

Great, now let’s practice doing some pooling operations manually.

Max Pooling
What’s the result of a max pooling operation on the input:

[[[0, 1, 0.5, 10],
[2, 2.5, 1, -8],
[4, 0, 5, 6],
[15, 1, 2, 3]]]
Assume the filter is 2x2 and the stride is 2 for both height and width. The output shape is 2x2x1.

The answering format will be 4 numbers, each separated by a comma, such as: 1,2,3,4.

Work from the top left to the bottom right

SUBMIT

NEXT

Solution: Pooling Practice

Solution

The correct answer is 2.5,10,15,6. We start with the four numbers in the top left corner. Then we work left-to-right and top-to-bottom, moving 2 units each time.

max(0, 1, 2, 2.5) = 2.5
max(0.5, 10, 1, -8) = 10
max(4, 0, 15, 1) = 15
max(5, 6, 2, 3) = 6


Quiz: Average Pooling

Mean Pooling
What’s the result of an average (or mean) pooling operation on the input:

[[[0, 1, 0.5, 10],
[2, 2.5, 1, -8],
[4, 0, 5, 6],
[15, 1, 2, 3]]]
Assume the filter is 2x2 and the stride is 2 for both height and width. The output shape is 2x2x1.

The answering format will be 4 numbers, each separated by a comma, such as: 1,2,3,4.

Answer to 3 decimal places. Work from the top left to the bottom right

Solution: Average Pooling

Solution

The correct answer is 1.375,0.875,5,4. We start with the four numbers in the top left corner. Then we work left-to-right and top-to-bottom, moving 2 units each time.

mean(0, 1, 2, 2.5) = 1.375
mean(0.5, 10, 1, -8) = 0.875
mean(4, 0, 15, 1) = 5
mean(5, 6, 2, 3) = 4
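Both quiz answers can be reproduced with a tiny pure-Python helper that applies any reduction over each 2x2 window (a sketch; the names are ours):

```python
def pool_2x2(matrix, op):
    """Apply reduction `op` over each 2x2 window with a stride of 2."""
    return [[op([matrix[i][j], matrix[i][j + 1],
                 matrix[i + 1][j], matrix[i + 1][j + 1]])
             for j in range(0, len(matrix[0]) - 1, 2)]
            for i in range(0, len(matrix) - 1, 2)]

quiz_input = [[0, 1, 0.5, 10],
              [2, 2.5, 1, -8],
              [4, 0, 5, 6],
              [15, 1, 2, 3]]

mean = lambda window: sum(window) / len(window)
print(pool_2x2(quiz_input, max))   # [[2.5, 10], [15, 6]]
print(pool_2x2(quiz_input, mean))  # [[1.375, 0.875], [5.0, 4.0]]
```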


1x1 Convolutions

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/Zmzgerm6SjA.mp4

Inception Module

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/SlTm03bEOxA.mp4

Convolutional Network in TensorFlow

Convolutional Network in TensorFlow

It’s time to walk through an example Convolutional Neural Network (CNN) in TensorFlow.

The structure of this network follows the classic structure of CNNs, which is a mix of convolutional layers and max pooling, followed by fully-connected layers.

The code you’ll be looking at is similar to what you saw in the segment on Deep Neural Network in TensorFlow, except we restructured the architecture of this network as a CNN.

Just like in that segment, here you’ll study the line-by-line breakdown of the code. If you want, you can even download the code and run it yourself.

Thanks to Aymeric Damien for providing the original TensorFlow model on which this segment is based.

Time to dive in!

Dataset

You’ve seen this section of code from previous lessons. Here we’re importing the MNIST dataset and using a convenient TensorFlow function to batch, scale, and One-Hot encode the data.

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('.', one_hot=True, reshape=False)

import tensorflow as tf

# Parameters
learning_rate = 0.00001
epochs = 10
batch_size = 128

# Number of samples to calculate validation and accuracy
# Decrease this if you're running out of memory to calculate accuracy
test_valid_size = 256

# Network Parameters
n_classes = 10  # MNIST total classes (0-9 digits)
dropout = 0.75  # Dropout, probability to keep units


Weights and Biases

# Store layers weight & bias
weights = {
    'wc1': tf.Variable(tf.random_normal([5, 5, 1, 32])),
    'wc2': tf.Variable(tf.random_normal([5, 5, 32, 64])),
    'wd1': tf.Variable(tf.random_normal([7*7*64, 1024])),
    'out': tf.Variable(tf.random_normal([1024, n_classes]))}

biases = {
    'bc1': tf.Variable(tf.random_normal([32])),
    'bc2': tf.Variable(tf.random_normal([64])),
    'bd1': tf.Variable(tf.random_normal([1024])),
    'out': tf.Variable(tf.random_normal([n_classes]))}


Convolutions

Convolution with 3×3 Filter. Source: http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution

The above is an example of a convolution with a 3x3 filter and a stride of 1 being applied to data with a range of 0 to 1. The convolution for each 3x3 section is calculated against the weight, [[1, 0, 1], [0, 1, 0], [1, 0, 1]], then a bias is added to create the convolved feature on the right. In this case, the bias is zero. In TensorFlow, this is all done using tf.nn.conv2d() and tf.nn.bias_add().
def conv2d(x, W, b, strides=1):
    x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
    x = tf.nn.bias_add(x, b)
    return tf.nn.relu(x)
The tf.nn.conv2d() function computes the convolution against weight W as shown above.

In TensorFlow, strides is an array of 4 elements; the first element indicates the stride for the batch dimension and the last element indicates the stride for features. It’s good practice to remove the batches or features you want to skip from the data set rather than use a stride to skip them. You can always set the first and last element to 1 in strides in order to use all batches and features.

The middle two elements are the strides for height and width respectively. I’ve mentioned stride as one number because you usually have a square stride where height = width. When someone says they are using a stride of 3, they usually mean tf.nn.conv2d(x, W, strides=[1, 3, 3, 1]).
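In other words, a "stride of s" conventionally expands to a 4-element list, which a trivial helper of our own makes explicit:

```python
def square_strides(s):
    """strides for tf.nn.conv2d: [batch, height, width, channels]."""
    # Batch and channel strides stay 1 so no samples or features are skipped
    return [1, s, s, 1]

print(square_strides(3))  # [1, 3, 3, 1]
```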

To make life easier, the code is using tf.nn.bias_add() to add the bias. Using tf.add() doesn’t work when the tensors aren’t the same shape.

Max Pooling

Max Pooling with 2x2 filter and stride of 2. Source: http://cs231n.github.io/convolutional-networks/

The above is an example of max pooling with a 2x2 filter and stride of 2. The left square is the input and the right square is the output. The four 2x2 colors in the input represent each time the filter was applied to create the max on the right side. For example, [[1, 1], [5, 6]] becomes 6 and [[3, 2], [1, 2]] becomes 3.
def maxpool2d(x, k=2):
    return tf.nn.max_pool(
        x,
        ksize=[1, k, k, 1],
        strides=[1, k, k, 1],
        padding='SAME')

The tf.nn.max_pool() function does exactly what you would expect: it performs max pooling with the ksize parameter as the size of the filter.

Model

Image from Explore The Design Space video

In the code below, we’re creating two layers that alternate between convolution and max pooling, followed by a fully connected layer and an output layer. The transformation of each layer to new dimensions is shown in the comments. For example, the first layer shapes the images from 28x28x1 to 28x28x32 in the convolution step. Then the next step applies max pooling, turning each sample into 14x14x32. All the layers are applied from conv1 to output, producing 10 class predictions.
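The shape bookkeeping above can be double-checked with a quick sketch; for 'SAME' padding, TensorFlow computes each spatial output dimension as ceil(input / stride):

```python
import math

def same_out(size, stride):
    """Spatial output size under 'SAME' padding."""
    return math.ceil(size / stride)

size = 28
size = same_out(size, 1)  # conv1, stride 1   -> 28x28x32
size = same_out(size, 2)  # maxpool, stride 2 -> 14x14x32
size = same_out(size, 1)  # conv2, stride 1   -> 14x14x64
size = same_out(size, 2)  # maxpool, stride 2 -> 7x7x64
print(size)  # 7
```

This is why the fully connected weight 'wd1' expects a flattened input of 7*7*64.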

def conv_net(x, weights, biases, dropout):
    # Layer 1 - 28*28*1 to 14*14*32
    conv1 = conv2d(x, weights['wc1'], biases['bc1'])
    conv1 = maxpool2d(conv1, k=2)

    # Layer 2 - 14*14*32 to 7*7*64
    conv2 = conv2d(conv1, weights['wc2'], biases['bc2'])
    conv2 = maxpool2d(conv2, k=2)

    # Fully connected layer - 7*7*64 to 1024
    fc1 = tf.reshape(conv2, [-1, weights['wd1'].get_shape().as_list()[0]])
    fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
    fc1 = tf.nn.relu(fc1)
    fc1 = tf.nn.dropout(fc1, dropout)

    # Output Layer - class prediction - 1024 to 10
    out = tf.add(tf.matmul(fc1, weights['out']), biases['out'])
    return out


Session

Now let’s run it!

# tf Graph input
x = tf.placeholder(tf.float32, [None, 28, 28, 1])
y = tf.placeholder(tf.float32, [None, n_classes])
keep_prob = tf.placeholder(tf.float32)

# Model
logits = conv_net(x, weights, biases, keep_prob)

# Define loss and optimizer
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\
    .minimize(cost)

# Accuracy
correct_pred = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    for epoch in range(epochs):
        for batch in range(mnist.train.num_examples//batch_size):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            sess.run(optimizer, feed_dict={
                x: batch_x,
                y: batch_y,
                keep_prob: dropout})

            # Calculate batch loss and accuracy
            loss = sess.run(cost, feed_dict={
                x: batch_x,
                y: batch_y,
                keep_prob: 1.})
            valid_acc = sess.run(accuracy, feed_dict={
                x: mnist.validation.images[:test_valid_size],
                y: mnist.validation.labels[:test_valid_size],
                keep_prob: 1.})

            print('Epoch {:>2}, Batch {:>3} - '
                  'Loss: {:>10.4f} Validation Accuracy: {:.6f}'.format(
                      epoch + 1,
                      batch + 1,
                      loss,
                      valid_acc))

    # Calculate Test Accuracy
    test_acc = sess.run(accuracy, feed_dict={
        x: mnist.test.images[:test_valid_size],
        y: mnist.test.labels[:test_valid_size],
        keep_prob: 1.})
    print('Testing Accuracy: {}'.format(test_acc))


That’s it! That is a CNN in TensorFlow. Now that you’ve seen a CNN in TensorFlow, let’s see if you can apply it on your own!

TensorFlow Convolution Layer

Using Convolution Layers in TensorFlow

Let’s now apply what we’ve learned to build real CNNs in TensorFlow. In the below exercise, you’ll be asked to set up the dimensions of the convolution filters, the weights, and the biases. This is in many ways the trickiest part of using CNNs in TensorFlow. Once you have a sense of how to set up the dimensions of these attributes, applying CNNs will be far more straightforward.

Review

You should go over the TensorFlow documentation for 2D convolutions. Most of the documentation is straightforward, except perhaps the padding argument. The padding might differ depending on whether you pass 'VALID' or 'SAME'.

Here are a few more things worth reviewing:

• Introduction to TensorFlow -> TensorFlow Variables.
• How to determine the dimensions of the output based on the input size and the filter size (shown below). You’ll use this to determine what the size of your filter should be.
new_height = (input_height - filter_height + 2*P)/S + 1
new_width = (input_width - filter_width + 2*P)/S + 1
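As a quick sanity check, the formula can be evaluated in plain Python (the function name and example values are ours):

```python
def conv_out_size(in_size, filter_size, P, S):
    """Output size from the formula above (P = padding, S = stride)."""
    return (in_size - filter_size + 2 * P) // S + 1

# A 5x5 filter with padding 2 and stride 1 preserves a 28-pixel dimension:
print(conv_out_size(28, 5, 2, 1))  # 28
```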

Instructions

1. Finish off each TODO in the conv2d function.
2. Setup the strides, padding and filter weight/bias (F_w and F_b) such that the output shape is (1, 2, 2, 3). Note that all of these except strides should be TensorFlow variables.

Solution: TensorFlow Convolution Layer

Solution

Here’s how I did it. NOTE: there’s more than 1 way to get the correct output shape. Your answer might differ from mine.
def conv2d(input):
    # Filter (weights and bias)
    F_W = tf.Variable(tf.truncated_normal((2, 2, 1, 3)))
    F_b = tf.Variable(tf.zeros(3))
    strides = [1, 2, 2, 1]
    padding = 'VALID'
    return tf.nn.conv2d(input, F_W, strides, padding) + F_b


I want to transform the input shape (1, 4, 4, 1) to (1, 2, 2, 3). I choose ‘VALID’ for the padding algorithm. I find it simpler to understand and it achieves the result I’m looking for.

out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
out_width  = ceil(float(in_width - filter_width + 1) / float(strides[2]))


Plugging in the values:

out_height = ceil(float(4 - 2 + 1) / float(2)) = ceil(1.5) = 2
out_width  = ceil(float(4 - 2 + 1) / float(2)) = ceil(1.5) = 2


In order to change the depth from 1 to 3, I have to set the output depth of my filter appropriately:

F_W = tf.Variable(tf.truncated_normal((2, 2, 1, 3))) # (height, width, input_depth, output_depth)
F_b = tf.Variable(tf.zeros(3)) # (output_depth)


The input has a depth of 1, so I set that as the input_depth of the filter.
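Putting the pieces together, a small sketch (shapes and names are ours) shows how the filter dimensions determine the output shape under 'VALID' padding:

```python
import math

def conv_output_shape(input_shape, filter_shape, stride):
    """(batch, h, w, d) through a conv layer with 'VALID' padding."""
    batch, in_h, in_w, in_d = input_shape
    f_h, f_w, f_in_d, f_out_d = filter_shape
    assert in_d == f_in_d  # the filter's input_depth must match the input
    out_h = math.ceil((in_h - f_h + 1) / stride)
    out_w = math.ceil((in_w - f_w + 1) / stride)
    return (batch, out_h, out_w, f_out_d)

print(conv_output_shape((1, 4, 4, 1), (2, 2, 1, 3), 2))  # (1, 2, 2, 3)
```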

TensorFlow Pooling Layer

Using Pooling Layers in TensorFlow

In the below exercise, you’ll be asked to set up the dimensions of the pooling filters, strides, as well as the appropriate padding. You should go over the TensorFlow documentation for tf.nn.max_pool(). Padding works the same as it does for a convolution.

Instructions

1. Finish off each TODO in the maxpool function.
2. Setup the strides, padding and ksize such that the output shape after pooling is (1, 2, 2, 1).

Solution: TensorFlow Pooling Layer

Solution

Here’s how I did it. NOTE: there’s more than 1 way to get the correct output shape. Your answer might differ from mine.
def maxpool(input):
    ksize = [1, 2, 2, 1]
    strides = [1, 2, 2, 1]
    padding = 'VALID'
    return tf.nn.max_pool(input, ksize, strides, padding)

I want to transform the input shape (1, 4, 4, 1) to (1, 2, 2, 1). I choose 'VALID' for the padding algorithm. I find it simpler to understand and it achieves the result I’m looking for.

out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
out_width  = ceil(float(in_width - filter_width + 1) / float(strides[2]))


Plugging in the values:

out_height = ceil(float(4 - 2 + 1) / float(2)) = ceil(1.5) = 2
out_width  = ceil(float(4 - 2 + 1) / float(2)) = ceil(1.5) = 2


The depth doesn’t change during a pooling operation so I don’t have to worry about that.

There are many wonderful free resources that allow you to go into more depth around Convolutional Neural Networks. In this course, our goal is to give you just enough intuition to start applying this concept on real world problems so you have enough of an exposure to explore more on your own. We strongly encourage you to explore some of these resources more to reinforce your intuition and explore different ideas.

These are the resources we recommend in particular:

• Andrej Karpathy’s CS231n Stanford course on Convolutional Neural Networks.
• Michael Nielsen’s free book on Deep Learning.
• Goodfellow, Bengio, and Courville’s more advanced free book on Deep Learning.

DEEP LEARNING PROJECT

Project Details

Introduction to the Project

https://s3.cn-north-1.amazonaws.com.cn/u-vid-hd/awEYy2Df3hg.mp4

Starting the Project

Starting the Project

For this assignment, you can find the image_classification folder containing the necessary project files on the Machine Learning projects GitHub, under the projects folder. You may download all of the files for projects we’ll use in this Nanodegree program directly from this repo. Please make sure that you use the most recent version of project files when completing a project!

This project contains 3 files:

• image_classification.ipynb: This is the main file where you will be performing your work on the project.
• Two helper files, problem_unittests.py and helper.py

Submitting the Project

Evaluation

Your project will be reviewed by a Udacity reviewer against the Object Classification Program project rubric. Be sure to review this rubric thoroughly and self-evaluate your project before submission. All criteria found in the rubric must meet specifications for you to pass.

Submission Files

When you are ready to submit your project, collect the following files and compress them into a single archive for upload. Alternatively, you may supply the following files on your GitHub Repo in a folder named image_recognition for ease of access:

• The image_classification.ipynb notebook file with all questions answered and all code cells executed and displaying output along with the .html version of the notebook.
• All helper files.

Once you have collected these files and reviewed the project rubric, proceed to the project submission page.

PROJECT

Implement this project

2. open virtualbox
3. copy downloaded file to my shared file C:\Users\SSQ\virtualbox share
4. type sudo mount -t vboxsf virtualbox_share /mnt/ in ubuntu terminal
5. type jupyter notebook image_classification.ipynb in the right directory
error
ImportError: No module named request

(failed)try with anaconda3 in ubuntu

2. type ./Anaconda3-4.3.1-Linux-x86_64.sh in terminal to run sh file
3. Anaconda3 will now be installed into this location:/home/ssq/anaconda3

installation finished.
Do you wish the installer to prepend the Anaconda3 install location
to PATH in your /home/ssq/.bashrc ? [yes|no]
[no] >>>
You may wish to edit your .bashrc or prepend the Anaconda3 install location:

$export PATH=/home/ssq/anaconda3/bin:$PATH

Thank you for installing Anaconda3!

Share your notebooks and packages on Anaconda Cloud!

4. export PATH=/home/ssq/anaconda3/bin:$PATH in your .ipynb location
5. conda create -n tensorflow
6. error: ModuleNotFoundError: No module named 'tqdm'
   method: conda install -c conda-forge tqdm
   Package plan for installation in environment /home/ssq/anaconda3:
   The following NEW packages will be INSTALLED:
       tqdm: 4.11.2-py36_0 conda-forge
   The following packages will be SUPERCEDED by a higher-priority channel:
       conda: 4.3.14-py36_0 --> 4.2.13-py36_0 conda-forge
       conda-env: 2.6.0-0 --> 2.6.0-0 conda-forge
   Proceed ([y]/n)? y
7. Anaconda installation
8. export PATH=/home/ssq/anaconda3/bin:$PATH
9. source activate tensorflow
10. conda install -c conda-forge tensorflow

11. pip3 install --upgrade pip

(failed)pip3 install tensorflow in ubuntu

1. from this web to create a new vb
2. Install the Guest Additions and enable the bidirectional clipboard
3. set the shared file from this blog and type sudo mount -t vboxsf virtualbox_share /mnt/
4. type sudo apt install python3-pip in terminal to install python3
5. pip3 install tensorflow
6. python3 and test

(success)anaconda3 install in Win7 tensorflow

2. (tensorflow) C:\Users\SSQ>cd C:\Users\SSQ\virtualbox share\image-classification

3. (tensorflow) C:\Users\SSQ\virtualbox share\image-classification>jupyter notebook image_classification.ipynb
4. ModuleNotFoundError: No module named 'tqdm'
method:
(tensorflow) C:\Users\SSQ\virtualbox share\image-classification>conda install tqdm

anaconda3 install in win7 tensorflow-gpu

2. cuda_8.0.61_windows in win7
3. cudnn-8.0-windows7-x64-v6.0

Submission

Image Classification

Project Submission

Image Classification

Introduction

In this project, you’ll classify images from the CIFAR-10 dataset. The dataset consists of airplanes, dogs, cats, and other objects. You’ll preprocess the dataset, then train a convolutional neural network on all the samples. You’ll normalize the images, one-hot encode the labels, and build convolutional, max pooling, and fully connected layers. At the end, you’ll see the network’s predictions on the sample images.

Getting the project files

The project files can be found in our public GitHub repo, in the image-classification folder. You can download the files from there, but it’s better to clone the repository to your computer.

This way you can stay up to date with any changes we make by pulling the changes to your local repository with git pull.

Submission
1. Ensure you’ve passed all the unit tests in the notebook.
2. Ensure you pass all points on the rubric.
3. When you’re done with the project, please save the notebook as an HTML file. You can do this by going to the File menu in the notebook and choosing “Download as” > HTML. Ensure you submit both the Jupyter Notebook and its HTML version together.
4. Package the “dlnd_image_classification.ipynb”, “helper.py”, “problem_unittests.py”, and the HTML file into a zip archive, or push the files from your GitHub repo.
5. Hit Submit Project below!

Capstone Proposal

PROJECT

Writing up a Capstone proposal

Overview

Capstone Proposal Overview

Please note that once your Capstone Proposal has been submitted and you have passed the evaluation, you have to submit your Capstone project using the same proposal that you submitted. We do not allow the Capstone Proposal and the Capstone project to differ in terms of dataset and approach.

In this capstone project proposal, prior to completing the following Capstone Project, you will leverage what you’ve learned throughout the Nanodegree program to author a proposal for solving a problem of your choice by applying machine learning algorithms and techniques. A project proposal encompasses seven key points:

• The project’s domain background — the field of research from which the project is derived;
• A problem statement — a problem being investigated for which a solution will be defined;
• The datasets and inputs — data or inputs being used for the problem;
• A solution statement — the solution proposed for the given problem;
• A benchmark model — some simple or historical model or result to compare the defined solution to;
• A set of evaluation metrics — functional representations for how the solution can be measured;
• An outline of the project design — how the solution will be developed and results obtained.

Capstone Proposal Highlights

The capstone project proposal is designed to introduce you to writing proposals for major projects. Typically, before you begin working on a solution to a problem, a proposal is written to your peers, advisor, manager, etc., to outline the details of the problem, your research, and your approach to a solution.

Things you will learn by completing this project proposal:

• How to research a real-world problem of interest.
• How to author a technical proposal document.
• How to organize a proposed workflow for designing a solution.

Description

Capstone Proposal Description

Think about a technical field or domain that you are passionate about, such as robotics, virtual reality, finance, natural language processing, or even artificial intelligence (the possibilities are endless!). Then, choose an existing problem within that domain that you are interested in which you could solve by applying machine learning algorithms and techniques. Be sure that you have collected all of the resources needed (such as datasets, inputs, and research) to complete this project, and make the appropriate citations wherever necessary in your proposal. Below are a few suggested problem areas you could explore if you are unsure what your passion is:

In addition, you may find a technical domain (along with the problem and dataset) as competitions on platforms such as Kaggle, or Devpost. This can be helpful for discovering a particular problem you may be interested in solving as an alternative to the suggested problem areas above. In many cases, some of the requirements for the capstone proposal are already defined for you when choosing from these platforms.

To determine whether your project and the problem you want to solve fits Udacity’s vision of a Machine Learning Capstone Project , please refer to the capstone proposal rubric and the capstone project rubric and make a note of each rubric criteria you will be evaluated on. A satisfactory project will have a proposal that clearly satisfies these requirements.

Software and Data Requirements

Software Requirements

Your proposed project must be written in Python 2.7. Given the free-form nature of the machine learning capstone, the software and libraries you will need to successfully complete your work will vary depending on the chosen application area and problem definition. Because of this, it is imperative that all necessary software and libraries you consider using in your capstone project are accessible and clearly documented. Please note that proprietary software, software that requires private licenses, or software behind a paywall or login account should be avoided.

Data Requirements

Every machine learning capstone project will most certainly require some form of dataset or input data structure (input text files, images, etc.). Similar to the software requirements above, the data you are considering must either be publicly accessible or provided by you during the submission process, and private or proprietary data should not be used without expressed permission. Please take into consideration the file size of your data — while there is no strict upper limit, input files that are excessively large may require reviewers longer than an acceptable amount of time to acquire all of your project files. This can take away from the reviewer’s time that could be put towards evaluating your proposal. If the data you are considering fits the criteria of being too large, consider whether you could work with a subset of the data instead, or provide a representative sample of the data.

Ethics

Udacity’s A/B Testing course, as part of the Data Analyst Nanodegree, has a segment that discusses the sensitivity of data and the expectation of privacy from those whose information has been collected. While most data you find available to the public will not have any ethical complications, it is extremely important that you are considering where the data you are using came from, and whether that data contains any sensitive information. For example, if you worked for a bank and wanted to use customers’ bank statements as part of your project, this would most likely be an unethical choice of data and should be avoided.

If you have any questions regarding the nature of a dataset or software you intend to use for the capstone project, please send an email to machine-support@udacity.com with the subject “Capstone Project Dataset/Software Inquiry”.

Proposal Guidelines

Report Guidelines

Your project submission will be evaluated on the written proposal that is submitted. Additionally, depending on the project you are proposing, other materials such as the data being used will be evaluated. It is expected that the proposal contains enough detail, documentation, analysis, and discussion to adequately reflect the work you intend to complete for the project. Because of this, it is extremely important that the proposal is written in a professional, standardized way, so those who review your project’s proposal are able to clearly identify each component of your project in the report. Without a properly written proposal, your project cannot be sufficiently evaluated. A project proposal template is provided for you to understand how a project proposal should be structured. We strongly encourage students to have a proposal that is approximately two to three pages in length.

The Machine Learning Capstone Project proposal should be treated no differently than a written research paper for academics. Your goal is to ultimately present the research you’ve discovered into the respective problem domain you’ve chosen, and then clearly articulate your intended project to your peers. The narrative found in the project proposal template provides a “proposal checklist” that will aid you in fully completing a documented proposal. Please make use of this resource!

Submitting the Project

Evaluation

Your project will be reviewed by a Udacity reviewer against the Capstone Project Proposal rubric. Be sure to review this rubric thoroughly and self-evaluate your project before submission. All criteria found in the rubric must meet specifications for you to pass.

Submission Files

At minimum, your submission will be required to have the following files listed below. If your submission method of choice is uploading an archive (*.zip), please take into consideration the total file size. You will need to include:

• A project proposal, in PDF format only, with the name proposal.pdf, addressing each of the seven key points of a proposal. The recommended page length for a proposal is approximately two to three pages.
• Any additional supporting material such as datasets, images, or input files that are necessary for your project and proposal. If these files are too large and you are uploading your submission, instead provide appropriate means of acquiring the necessary files in an included README.md file.
Once you have collected these files and reviewed the project rubric, proceed to the project submission page.

Submission

Capstone Proposal

Project Submission

In this capstone project proposal, prior to completing the following Capstone Project, you will leverage what you’ve learned throughout the Nanodegree program to author a proposal for solving a problem of your choice by applying machine learning algorithms and techniques. A project proposal encompasses seven key points:

• The project’s domain background — the field of research from which the project is derived;
• A problem statement — a problem being investigated for which a solution will be defined;
• The datasets and inputs — data or inputs being used for the problem;
• A solution statement — the solution proposed for the given problem;
• A benchmark model — some simple or historical model or result to compare the defined solution to;
• A set of evaluation metrics — functional representations for how the solution can be measured;
• An outline of the project design — how the solution will be developed and results obtained.

Think about a technical field or domain that you are passionate about, such as robotics, virtual reality, finance, natural language processing, or even artificial intelligence (the possibilities are endless!). Then, choose an existing problem within that domain that you are interested in which you could solve by applying machine learning algorithms and techniques. Be sure that you have collected all of the resources needed (such as datasets, inputs, and research) to complete this project, and make the appropriate citations wherever necessary in your proposal. Below are a few suggested problem areas you could explore if you are unsure what your passion is:

In addition, you may find a technical domain (along with the problem and dataset) as competitions on platforms such as Kaggle, or Devpost. This can be helpful for discovering a particular problem you may be interested in solving as an alternative to the suggested problem areas above. In many cases, some of the requirements for the capstone proposal are already defined for you when choosing from these platforms.

Evaluation

Your project will be reviewed by a Udacity reviewer against the Capstone Project Proposal rubric. Be sure to review this rubric thoroughly and self-evaluate your project before submission. All criteria found in the rubric must meet specifications for you to pass.

Submission Files

At minimum, your submission will be required to have the following files listed below. If your submission method of choice is uploading an archive (*.zip), please take into consideration the total file size. You will need to include:

• A project proposal, in PDF format only, with the name proposal.pdf, addressing each of the seven key points of a proposal. The recommended page length for a proposal is approximately two to three pages.
• Any additional supporting material such as datasets, images, or input files that are necessary for your project and proposal. If these files are too large and you are uploading your submission, instead provide appropriate means of acquiring the necessary files in an included README.md file.
Once you have collected these files and reviewed the project rubric, proceed to the project submission page.

When you’re ready to submit your project, click on the Submit Project button at the bottom of the page.

If you are having any problems submitting your project or wish to check on the status of your submission, please email us at machine-support@udacity.com or visit us in the discussion forums.

What’s Next?

You will get an email as soon as your reviewer has feedback for you. In the meantime, review your next project and feel free to get started on it or the courses supporting it!

submit

Capstone Project

PROJECT

Machine Learning Capstone Project

Overview

Capstone Project Overview

In this capstone project, you will leverage what you’ve learned throughout the Nanodegree program to solve a problem of your choice by applying machine learning algorithms and techniques. You will first define the problem you want to solve and investigate potential solutions and performance metrics. Next, you will analyze the problem through visualizations and data exploration to have a better understanding of what algorithms and features are appropriate for solving it. You will then implement your algorithms and metrics of choice, documenting the preprocessing, refinement, and postprocessing steps along the way. Afterwards, you will collect results about the performance of the models used, visualize significant quantities, and validate/justify these values. Finally, you will construct conclusions about your results, and discuss whether your implementation adequately solves the problem.

Capstone Project Highlights

This project is designed to prepare you for delivering a polished, end-to-end solution report of a real-world problem in a field of interest. When developing new technology, or deriving adaptations of previous technology, properly documenting your process is critical for both validating and replicating your results.

Things you will learn by completing this project:

• How to research and investigate a real-world problem of interest.
• How to accurately apply specific machine learning algorithms and techniques.
• How to properly analyze and visualize your data and results for validity.
• How to document and write a report of your work.

Description

Capstone Description

Think about a technical field or domain that you are passionate about, such as robotics, virtual reality, finance, natural language processing, or even artificial intelligence (the possibilities are endless!). Then, choose an existing problem within that domain that you are interested in which you could solve by applying machine learning algorithms and techniques. Be sure that you have collected all of the resources needed (such as data sets) to complete this project, and make the appropriate citations wherever necessary in your report. Below are a few suggested problem areas you could explore if you are unsure what your passion is:

In addition, you may find a technical domain (along with the problem and dataset) through competitions on platforms such as Kaggle or Devpost. This can be helpful for discovering a particular problem you may be interested in solving, as an alternative to the suggested problem areas above. In many cases, some of the requirements for the capstone proposal are already defined for you when choosing from these platforms.

Note: For students who enrolled before October 17th, we strongly encourage you to look at the Capstone Proposal project, available as an elective, before this project. If you have an idea for your capstone project but aren’t ready to begin working on the implementation, or even if you want feedback on how you will approach a solution to your problem, you can use the Capstone Proposal project to receive a peer review from one of our Capstone Project reviewers!

For whichever application area or problem you ultimately investigate, there are five major stages to this capstone project which you will move through and subsequently document. Each stage plays a significant role in the development life cycle, from defining a problem to delivering a polished, working solution. As you make your way through developing your project, be sure that you are also working on a rough draft of your project report, as it is the most important aspect of your submission!

To determine whether your project and the problem you want to solve fit Udacity’s vision of a Machine Learning Capstone Project, please refer to the capstone project rubric and make a note of each criterion on which you will be evaluated. A satisfactory project will have a report that encompasses each stage and component of the rubric.

Software and Data Requirements

Software Requirements

Your project must be written in Python 2.7. Given the free-form nature of the machine learning capstone, the software and libraries you will need to successfully complete your work will vary depending on the chosen application area and problem definition. Because of this, it is imperative that all necessary software and libraries used in your capstone project are accessible to the reviewer and clearly documented. Information regarding the software and libraries your project makes use of should be included in the README along with your submission. Please note that proprietary software, software that requires private licenses, or software behind a paywall or login account should be avoided.
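One low-effort way to document your software environment for the README is to print the Python and library versions you actually used. The snippet below is only a sketch: the library list is an example, not a course requirement, and should be replaced with your project’s real dependencies.

```python
import importlib
import platform

# Example dependency list -- replace with your project's actual libraries.
libraries = ["numpy", "pandas", "sklearn", "matplotlib"]

lines = ["Python %s" % platform.python_version()]
for name in libraries:
    try:
        module = importlib.import_module(name)
        lines.append("%s %s" % (name, getattr(module, "__version__", "unknown")))
    except ImportError:
        lines.append("%s (not installed)" % name)

# Paste this output into the README's software/libraries section.
print("\n".join(lines))
```

Recording exact versions this way makes it far easier for a reviewer to reproduce your environment and results.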

Data Requirements

Every machine learning capstone project will almost certainly require some form of dataset or input data structure (input text files, images, etc.). As with the software requirements above, the data you use must either be publicly accessible or provided by you during the submission process, and private or proprietary data should not be used without express permission. Please take the file size of your data into consideration: while there is no strict upper limit, excessively large input files may require an unacceptable amount of time for reviewers to acquire all of your project files and/or execute the provided development code, taking away time that could be put towards evaluating your submission. If your data is too large, consider whether you can work with a subset of it instead, or provide a representative sample of the data which the reviewer may use to verify the solution explored in the project.
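As a sketch of the "representative sample" suggestion above, the helper below draws a class-stratified subset in pure Python, so each label keeps roughly its original share of the data. The function name and the toy records are invented for illustration; adapt the `label_of` accessor to your own data format.

```python
import random
from collections import defaultdict

def representative_sample(records, label_of, fraction, seed=0):
    """Draw a class-stratified subset so that each label keeps
    roughly its original share of the data."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for record in records:
        by_label[label_of(record)].append(record)
    sample = []
    for group in by_label.values():
        # Keep the requested fraction of each class, at least one record.
        k = max(1, int(round(fraction * len(group))))
        sample.extend(rng.sample(group, k))
    rng.shuffle(sample)
    return sample

# Usage: keep 10% of a labeled toy dataset for the submission archive.
records = [(i, "spam" if i % 4 == 0 else "ham") for i in range(1000)]
subset = representative_sample(records, label_of=lambda r: r[1], fraction=0.10)
print(len(subset))  # -> 100, with the original 1:3 spam/ham ratio preserved
```

Fixing the random seed, as above, also lets the reviewer reproduce exactly the subset you analyzed.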

Ethics

Udacity’s A/B Testing course, as part of the Data Analyst Nanodegree, has a segment that discusses the sensitivity of data and the expectation of privacy from those whose information has been collected. While most data you find available to the public will not have any ethical complications, it is extremely important that you are considering where the data you are using came from, and whether that data contains any sensitive information. For example, if you worked for a bank and wanted to use customers’ bank statements as part of your project, this would most likely be an unethical choice of data and should be avoided.

Report Guidelines

Report Guidelines

Your project submission will be evaluated primarily on the report that is submitted. It is expected that the project report contains enough detail, documentation, analysis, and discussion to adequately reflect the work you completed for your project. Because of this, it is extremely important that the report is written in a professional, standardized way, so those who review your project submission are able to clearly identify each component of your project in the report. Without a properly written report, your project cannot be sufficiently evaluated. A project report template is provided for you to understand how a project report should be structured. We strongly encourage students to have a report that is approximately nine to fifteen pages in length.

The Machine Learning Capstone Project report should be treated no differently than a written academic research paper. Your goal is to present the research you’ve conducted in your chosen problem domain, and then to discuss each stage of the project as it is completed. The project report template provides a “report checklist” that will aid you in staying on track for both your project and the documentation in your report. Each stage appears as a section that will guide you through each component of the project development life cycle. Please make use of this resource!

Example Reports

Example Machine Learning Capstone Reports

Included in the project files for the Capstone are three example reports written by students just like you. Because your project is evaluated through its written report, it is absolutely critical that you produce a clear, detailed, well-written report that adequately reflects the work you’ve completed for your Capstone. Following along with the Capstone Guidelines will be very helpful as you begin writing your report.

Our first example report comes from graduate Martin Bede, whose project design in the field of computer vision, named “Second Sight”, was to create an Android application that would extract text from the device’s camera and read it aloud. Martin’s project cites the growing concern of vision loss as motivation for developing software that can aid those unable to see or read certain print.

Our second example report comes from an anonymous graduate whose project design in the field of image recognition was to implement a Convolutional Neural Network (CNN), train it on the CIFAR-10 dataset, and successfully identify different objects in new images. This student describes in thorough detail how a CNN can be used quite effectively as a descriptor-learning image recognition algorithm.

Our third example report comes from graduate Naoki Shibuya, who took advantage of the pre-curated robot motion planning “Plot and Navigate a Virtual Maze” project. Pay special attention to the emphasis Naoki places on discussing the methodology and results: projects relying on technical implementations require valuable observations and visualizations of how the solution performs under various circumstances and constraints.

Each example report has many of the desirable qualities we expect from students completing the Machine Learning Capstone project. Once you begin writing your project report for whichever problem domain you choose, be sure to reference these examples whenever necessary!

Submitting the Project

Evaluation

Your project will be reviewed by a Udacity reviewer against the Machine Learning Capstone project rubric. Be sure to review this rubric thoroughly and self-evaluate your project before submission. All criteria found in the rubric must meet specifications for you to pass.

Submission Files

At minimum, your submission must include the files listed below. If your submission method of choice is uploading an archive (*.zip), please take the total file size into consideration. You will need to include:

• Your capstone proposal document as proposal.pdf if you have completed the pre-requisite Capstone Proposal project. Please also include your review link in the student submission notes.
• A project report (in PDF format only) addressing the five major project development stages. The recommended page length for a project report is approximately nine to fifteen pages. Please do not export an iPython Notebook as PDF for your project report.
• All development Python code used for your project that is required to reproduce your implemented solution and result. Your code should be in a neat and well-documented format. Using iPython Notebooks is strongly encouraged for development.
• A README documentation file which briefly describes the software and libraries used in your project, including any necessary references to supporting material. If your project requires setup/startup, ensure that your README includes the necessary instructions.
• Any additional supporting material such as datasets, images, or input files that are necessary for your project’s development and implementation. If these files are too large to upload with your submission, instead provide appropriate means of acquiring the necessary files in your included README.
Once you have collected these files and reviewed the project rubric, proceed to the project submission page.

Submission

Capstone Project

Project Submission

In this capstone project, you will leverage what you’ve learned throughout the Nanodegree program to solve a problem of your choice by applying machine learning algorithms and techniques. You will first define the problem you want to solve and investigate potential solutions and performance metrics. Next, you will analyze the problem through visualizations and data exploration to have a better understanding of what algorithms and features are appropriate for solving it. You will then implement your algorithms and metrics of choice, documenting the preprocessing, refinement, and postprocessing steps along the way. Afterwards, you will collect results about the performance of the models used, visualize significant quantities, and validate/justify these values. Finally, you will construct conclusions about your results, and discuss whether your implementation adequately solves the problem.

Think about a technical field or domain that you are passionate about, such as robotics, virtual reality, finance, natural language processing, or even artificial intelligence (the possibilities are endless!). Then, choose an existing problem within that domain that you are interested in which you could solve by applying machine learning algorithms and techniques. Be sure that you have collected all of the resources needed (such as datasets, inputs, and research) to complete this project, and make the appropriate citations wherever necessary in your proposal. Below are a few suggested problem areas you could explore if you are unsure what your passion is:

In addition, you may find a technical domain (along with the problem and dataset) through competitions on platforms such as Kaggle or Devpost. This can be helpful for discovering a particular problem you may be interested in solving, as an alternative to the suggested problem areas above. In many cases, some of the requirements for the capstone proposal are already defined for you when choosing from these platforms.

Note: For students who enrolled before October 17th, we strongly encourage you to look at the Capstone Proposal project, available as an elective, before this project. If you have an idea for your capstone project but aren’t ready to begin working on the implementation, or even if you want feedback on how you will approach a solution to your problem, you can use the Capstone Proposal project to receive a peer review from one of our Capstone Project reviewers!

For whichever application area or problem you ultimately investigate, there are five major stages to this capstone project which you will move through and subsequently document. Each stage plays a significant role in the development life cycle, from defining a problem to delivering a polished, working solution. As you make your way through developing your project, be sure that you are also working on a rough draft of your project report, as it is the most important aspect of your submission!

Evaluation

Your project will be reviewed by a Udacity reviewer against the Machine Learning Capstone project rubric. Be sure to review this rubric thoroughly and self-evaluate your project before submission. All criteria found in the rubric must meet specifications for you to pass.

Submission Files

At minimum, your submission must include the files listed below. If your submission method of choice is uploading an archive (*.zip), please take the total file size into consideration. You will need to include:

• Your capstone proposal document as proposal.pdf if you have completed the pre-requisite Capstone Proposal project. Please also include your review link in the student submission notes.
• A project report (in PDF format only) addressing the five major project development stages. The recommended page length for a project report is approximately nine to fifteen pages. Please do not export an iPython Notebook as PDF for your project report.
• All development Python code used for your project that is required to reproduce your implemented solution and result. Your code should be in a neat and well-documented format. Using iPython Notebooks is strongly encouraged for development.
• A README documentation file which briefly describes the software and libraries used in your project, including any necessary references to supporting material. If your project requires setup/startup, ensure that your README includes the necessary instructions.
• Any additional supporting material such as datasets, images, or input files that are necessary for your project’s development and implementation. If these files are too large to upload with your submission, instead provide appropriate means of acquiring the necessary files in your included README.

When you’re ready to submit your project, click on the Submit Project button at the bottom of the page.

If you are having any problems submitting your project or wish to check on the status of your submission, please email us at machine-support@udacity.com or visit us in the discussion forums.

What’s Next?

You will get an email as soon as your reviewer has feedback for you. In the meantime, review your next project and feel free to get started on it or the courses supporting it!

Supporting Materials
Parameters: (2,2), 25, 500, 128, 200, success
Testing Accuracy: 0.8081980186934564

First result: 1 convnet, 1 fully connected layer (model saved as capstone_model.meta)
Testing Accuracy: 0.808136261212445

Second result: 3 convnets, 1 fully connected layer
Testing Accuracy: 0.8427798201640447
With the full dataset: Testing Accuracy: 0.8851721937559089