Writing this blog for quick reference, and for resume and interview preparation.
Shortest Paths Revisited, NP-Complete Problems and What To Do About Them.
stay tuned…
Courses can be found here
Summary can be found here
Algorithms are the heart of computer science, and the subject has countless practical applications as well as intellectual depth. This specialization is an introduction to algorithms for learners with at least a little programming experience. The specialization is rigorous but emphasizes the big picture and conceptual understanding over low-level implementation and mathematical details. After completing this specialization, you will be well-positioned to ace your technical interviews and speak fluently about algorithms with other programmers and computer scientists.
About the instructor: Tim Roughgarden has been a professor in the Computer Science Department at Stanford University since 2004. He has taught and published extensively on the subject of algorithms and their applications.
Divide and Conquer, Sorting and Searching, and Randomized Algorithms
Course can be found here
Lecture slides can be found here
Summary can be found in my Github
About this course: The primary topics in this part of the specialization are: asymptotic (“Big-oh”) notation, sorting and searching, divide and conquer (master method, integer and matrix multiplication, closest pair), and randomized algorithms (QuickSort, contraction algorithm for min cuts).
Who is this class for: Learners with at least a little bit of programming experience who want to learn the essentials of algorithms. In a University computer science curriculum, this course is typically taken in the third year.
Week 1
Introduction;
“big-oh” notation and asymptotic analysis.
I. INTRODUCTION
Welcome and Week 1 Overview 10 min
WELCOME: Welcome to the Algorithms Specialization! Here’s an overview of the first week of material.
INTRODUCTION: The first set of lectures for this week is meant to give you the flavor of the course, and hopefully get you excited about it. We begin by discussing algorithms in general and why they’re so important, and then use the problem of multiplying two integers to illustrate how algorithmic ingenuity can often improve over more straightforward or naive solutions. We discuss the Merge Sort algorithm in detail, for several reasons: it’s a practical and famous algorithm that you should all know; it’s a good warmup to get you ready for more intricate algorithms; and it’s the canonical introduction to the “divide and conquer” algorithm design paradigm. These lectures conclude by describing several guiding principles for how we’ll analyze algorithms in this course.
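To make the Merge Sort discussion concrete, here is a minimal Python sketch of the algorithm (my own code, not the course's official pseudocode):

```python
def merge_sort(a):
    # Base case: arrays of length 0 or 1 are already sorted.
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])    # recursively sort the first half
    right = merge_sort(a[mid:])   # recursively sort the second half
    # Merge step: repeatedly take the smaller front element.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out
```

Each level of recursion does O(n) merge work across O(log n) levels, which is where the O(n log n) bound in the analysis lecture comes from.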
ASYMPTOTIC ANALYSIS: The second set of lectures for this week is an introduction to big-oh notation and its relatives, which belongs in the vocabulary of every serious programmer and computer scientist. The goal is to identify a “sweet spot” of granularity for reasoning about algorithms — we want to suppress second-order details like constant factors and lower-order terms, and focus on how the running time of an algorithm scales as the input size grows large.
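For reference, one standard way to state the definition behind this "sweet spot" (my own phrasing, consistent with the lectures):

```latex
% Big-oh: T(n) is eventually bounded above by a constant multiple of f(n).
T(n) = O(f(n)) \iff \exists\, c, n_0 > 0 \ \text{such that}\ T(n) \le c \cdot f(n) \ \text{for all } n \ge n_0
```

The constant c absorbs constant factors, and the threshold n₀ absorbs lower-order terms that only matter for small inputs.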
PREREQUISITES: This course is not an introduction to programming, and it assumes that you have basic programming skills in a language such as Python, Java, or C. There are several outstanding free online courses that teach basic programming. We also use mathematical analysis as needed to understand how and why algorithms and data structures really work. If you need a refresher on the basics of proofs (induction, contradiction, etc.), I recommend the lecture notes “Mathematics for Computer Science” by Lehman and Leighton (see separate Resources pages).
DISCUSSION FORUMS: The discussion forums play a crucial role in massive online courses like this one. If you have trouble understanding a lecture or completing an assignment, you should turn to the forums for help. After you’ve mastered the lectures and assignments for a given week, I hope you’ll contribute to the forums and help out your fellow students. While I won’t have time to carefully monitor the discussion forums, I’ll check in and answer questions whenever I find the time.
VIDEOS AND SLIDES: Videos can be streamed or downloaded and watched offline (recommended for commutes, etc.). We are also providing PDF lecture slides (typed versions of what’s written in the lecture videos), as well as subtitle files (in English and in some cases other languages as well). And if you find yourself wishing that I spoke more quickly or more slowly, note that you can adjust the video speed to accommodate your preferred pace.
HOMEWORK #1: The first problem set consists of 5 multiple choice problems, mostly about Merge Sort and asymptotic notation. The first programming assignment asks you to implement one or more of the integer multiplication algorithms covered in lecture.
SUGGESTED READINGS FOR WEEK 1: Abbreviations in suggested readings refer to the following textbooks:
- CLRS: Cormen, Leiserson, Rivest, and Stein, Introduction to Algorithms (3rd edition)
- DPV: Dasgupta, Papadimitriou, and Vazirani, Algorithms
- KT: Kleinberg and Tardos, Algorithm Design
- SW: Sedgewick and Wayne, Algorithms (4th edition)
- CLRS: Chapters 2 and 3
- DPV: Sections 0.3, 2.1, 2.3
- KT: Sections 2.1, 2.2, 2.4, and 5.1
- SW: Sections 1.4 and 2.2
Overview, Resources, and Policies 10 min
Course overview
Algorithms are the heart of computer science, and the subject has countless practical applications as well as intellectual depth. This specialization is an introduction to algorithms for learners with at least a little programming experience. The specialization is rigorous but emphasizes the big picture and conceptual understanding over low-level implementation and mathematical details. After completing this specialization, you will have a greater mastery of algorithms than almost anyone without a graduate degree in the subject. Specific topics in Part 1 of the specialization include: “Big-oh” notation, sorting and searching, divide and conquer (master method, integer and matrix multiplication, closest pair), and randomized algorithms (QuickSort, contraction algorithm for min cuts).
Resources
 Mathematics for Computer Science (by Eric Lehman and Tom Leighton):
book
Course Policies
Video Lectures
 Lectures will be made available weekly, for a total of four weeks. In a given week, there will be roughly three hours of material. Some weeks include extra optional material (some review, some advanced topics).
 Below each lecture video there is a PDF of typed versions of the slides.
 You can download the videos of the lectures for offline viewing.
 The video player supports speeding up and slowing down the pace.
Weekly Programming Assignments and Problem Sets
 Every week there will be a new problem set and a new programming assignment.
 For each problem set you are allowed a maximum of two attempts in a 12-hour period (we’ll use the best score).
 For each programming assignment you’re allowed a maximum of 10 attempts in a 12-hour period (we’ll use the best score).
 For the final exam, you’re allowed one attempt per 24 hours.
Grading
 To pass a problem set, you must get at least 4 of the 5 questions correct (80%).
 To pass a programming assignment, you must get all of the answers correct (100%).
 To pass the final exam, you must get at least 70% of the total points (7 out of 10).
 To pass the course, you must pass all of the problem sets, all of the programming assignments, and the final exam.
Theory Problems
 These are totally optional theory questions (no deadlines or credit given).
 We encourage you to attempt these questions and discuss them in the forums to develop a deeper understanding of the design and analysis of algorithms.
Lecture slides 10 min
See the zip file below (which includes slides for both Part 1 and Part 2 of the specialization). Also available from the video lectures on a lecture-by-lecture basis.
Why Study Algorithms? 4 min
Integer Multiplication 8 min
Karatsuba Multiplication 12 min
About the Course 17 min
Merge Sort: Motivation and Example 8 min
Merge Sort: Pseudocode 12 min
Merge Sort: Analysis 9 min
https://www.coursera.org/learn/algorithms-divide-conquer/lecture/wW9On/merge-sort-analysis
Guiding Principles for Analysis of Algorithms 15 min
II. ASYMPTOTIC ANALYSIS
The Gist 14 min
https://www.coursera.org/learn/algorithms-divide-conquer/lecture/o2NpH/the-gist
Big-Oh Notation 4 min
https://www.coursera.org/learn/algorithms-divide-conquer/lecture/KGtUp/big-oh-notation
Basic Examples 7 min
https://www.coursera.org/learn/algorithms-divide-conquer/lecture/mb8bV/basic-examples
Big Omega and Theta 7 min
https://www.coursera.org/learn/algorithms-divide-conquer/lecture/SxSch/big-omega-and-theta
Additional Examples [Review - Optional] 7 min
Problem Set #1
Quiz: Problem Set #1 5 questions
Programming Assignment #1
Quiz: Programming Assignment #1 1 question
In this programming assignment you will implement one or more of the integer multiplication algorithms described in lecture.
To get the most out of this assignment, your program should restrict itself to multiplying only pairs of single-digit numbers. You can implement the grade-school algorithm if you want, but to get the most out of the assignment you’ll want to implement recursive integer multiplication and/or Karatsuba’s algorithm.
So: what’s the product of the following two 64digit numbers?
3141592653589793238462643383279502884197169399375105820974944592
2718281828459045235360287471352662497757247093699959574966967627
[TIP: before submitting, first test the correctness of your program on some small test cases of your own devising. Then post your best test cases to the discussion forums to help your fellow students!]
[Food for thought: the number of digits in each input number is a power of 2. Does this make your life easier? Does it depend on which algorithm you’re implementing?]
The numeric answer should be typed in the space below. So if your answer is 1198233847, then just type 1198233847 in the space provided without any space / commas / any other punctuation marks.
(We do not require you to submit your code, so feel free to use any programming language you want — just type the final numeric answer in the following space.)
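A hedged Python sketch of Karatsuba's algorithm (my own implementation, not official course code); it uses Gauss's trick to get by with three recursive multiplications instead of four:

```python
def karatsuba(x, y):
    # Base case: single-digit factors are multiplied directly.
    if x < 10 or y < 10:
        return x * y
    n = max(len(str(x)), len(str(y)))
    half = n // 2
    p = 10 ** half
    a, b = divmod(x, p)  # x = a * 10^half + b
    c, d = divmod(y, p)  # y = c * 10^half + d
    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    # Gauss's trick: (a+b)(c+d) - ac - bd = ad + bc, saving one recursive call.
    ad_plus_bc = karatsuba(a + b, c + d) - ac - bd
    return ac * p * p + ad_plus_bc * p + bd
```

The recurrence is T(n) = 3T(n/2) + O(n), which the Week 2 master method resolves to O(n^log₂3) ≈ O(n^1.585). (The test cases below are my own; the 64-digit product asked for above is left for you to compute.)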
Week 2
Divide-and-conquer basics;
the master method for analyzing divide and conquer algorithms.
III. DIVIDE & CONQUER ALGORITHMS
Week 2 Overview 10 min
DIVIDE AND CONQUER ALGORITHMS: The next set of lectures discusses three nontrivial examples of the divide and conquer algorithm design paradigm. The first is for counting the number of inversions in an array. This problem is related to measuring similarity between two ranked lists, which in turn is relevant for making good recommendations to someone based on your knowledge of their and others’ preferences (“collaborative filtering”). The second algorithm is Strassen’s mindblowing recursive algorithm for matrix multiplication, which improves over the obvious iterative method. The third algorithm, which is more advanced and is optional material, is for computing the closest pair of points in the plane.
THE MASTER METHOD: These lectures cover a “black-box” method for solving recurrences. You can then immediately determine the running time of most of the divide-and-conquer algorithms that you’ll ever see! (Including Karatsuba’s integer multiplication algorithm from Week 1.) The proof is a nice generalization of the recursion tree method that we used to analyze MergeSort. Ever wonder about the mysterious three cases of the Master Method? Watch these videos and hopefully all will become clear.
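The black-box flavor of the method can be captured in a few lines of Python (my own sketch, for recurrences of the form T(n) = a·T(n/b) + O(n^d); the three branches correspond to the three cases from lecture):

```python
from math import log

def master_method(a, b, d):
    # Asymptotic solution of T(n) = a*T(n/b) + O(n^d):
    # a = number of recursive calls, b = input shrink factor,
    # d = exponent of the combine step's running time.
    if a == b ** d:
        return f"O(n^{d} log n)"        # Case 1: equal work at every level
    elif a < b ** d:
        return f"O(n^{d})"              # Case 2: the root dominates
    else:
        return f"O(n^{log(a, b):.3f})"  # Case 3: the leaves dominate
```

For example, MergeSort (a=2, b=2, d=1) lands in Case 1, giving O(n log n), while Karatsuba (a=3, b=2, d=1) lands in Case 3, giving O(n^log₂3).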
HOMEWORK: Problem Set #2 has five questions that should give you practice with divide-and-conquer algorithms and the Master Method. Programming assignment #2 asks you to implement the counting inversions algorithm (from Part III) in whatever programming language you please, run it on a quite large input, and enter the answer.
SUGGESTED READINGS FOR WEEK 2:
- CLRS: Chapter 4 (except Section 4.3), and Sections 28.1 and 33.4
- DPV: Sections 2.2 and 2.5
- KT: Sections 5.2-5.5
O(n log n) Algorithm for Counting Inversions I 12 min
O(n log n) Algorithm for Counting Inversions II 16 min
Strassen’s Subcubic Matrix Multiplication Algorithm 22 min
O(n log n) Algorithm for Closest Pair I [Advanced - Optional] 31 min
O(n log n) Algorithm for Closest Pair II [Advanced - Optional] 18 min
IV. THE MASTER METHOD
Motivation 7 min
Formal Statement 10 min
Examples 13 min
Proof I 9 min
Interpretation of the 3 Cases 10 min
Proof II 16 min
Problem Set #2
Quiz: Problem Set #2 5 questions
Optional Theory Problems (Batch #1) 10 min
The following problems are for those of you looking to challenge yourself beyond the required problem sets and programming questions. Most of these have been given in Stanford’s CS161 course, Design and Analysis of Algorithms, at some point. They are completely optional and will not be graded. While they vary in level, many are pretty challenging, and we strongly encourage you to discuss ideas and approaches with your fellow students on the “Theory Problems” discussion forum.
You are given as input an unsorted array of n distinct numbers, where n is a power of 2. Give an algorithm that identifies the second-largest number in the array, and that uses at most n + log2(n) − 2 comparisons.
You are given a unimodal array of n distinct elements, meaning that its entries are in increasing order up until its maximum element, after which its elements are in decreasing order. Give an algorithm to compute the maximum element that runs in O(log n) time.
You are given a sorted (from smallest to largest) array A of n distinct integers which can be positive, negative, or zero. You want to decide whether or not there is an index i such that A[i] = i. Design the fastest algorithm that you can for solving this problem.
You are given an n by n grid of distinct numbers. A number is a local minimum if it is smaller than all of its neighbors. (A neighbor of a number is one immediately above, below, to the left, or the right. Most numbers have four neighbors; numbers on the side have three; the four corners have two.) Use the divide-and-conquer algorithm design paradigm to compute a local minimum with only O(n) comparisons between pairs of numbers. (Note: since there are n^2 numbers in the input, you cannot afford to look at all of them. Hint: Think about what types of recurrences would give you the desired upper bound.)
Programming Assignment #2
Quiz: Programming Assignment #2 1 question
Download the following text file:
IntegerArray.txt
This file contains all of the 100,000 integers between 1 and 100,000 (inclusive) in some order, with no integer repeated.
Your task is to compute the number of inversions in the file given, where the ith row of the file indicates the ith entry of an array.
Because of the large size of this array, you should implement the fast divide-and-conquer algorithm covered in the video lectures.
The numeric answer for the given input file should be typed in the space below.
So if your answer is 1198233847, then just type 1198233847 in the space provided without any space / commas / any other punctuation marks. You can make up to 5 attempts, and we’ll use the best one for grading.
(We do not require you to submit your code, so feel free to use any programming language you want — just type the final numeric answer in the following space.)
[TIP: before submitting, first test the correctness of your program on some small test cases of your own devising. Then post your best test cases to the discussion forums to help your fellow students!]
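One way to sketch the fast algorithm in Python (my own code): piggyback the inversion count on Merge Sort, counting "split inversions" during the merge step:

```python
def count_inversions(a):
    # Returns (sorted copy of a, number of inversions in a).
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, inv_left = count_inversions(a[:mid])
    right, inv_right = count_inversions(a[mid:])
    merged, i, j, split = [], 0, 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
            # Every element still remaining in `left` forms an
            # inversion with the right element just copied out.
            split += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inv_left + inv_right + split
```

Like Merge Sort itself, this runs in O(n log n), comfortably fast for the 100,000-element input (the naive double loop would do roughly 5 billion comparisons).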
Week 3
The QuickSort algorithm and its analysis;
probability review.
V. QUICKSORT - ALGORITHM
Week 3 Overview 10 min
QUICKSORT - THE ALGORITHM (Part V). One of the greatest algorithms ever, and our first example of a randomized algorithm. These lectures go over the pseudocode — the high-level approach, how to partition an array around a pivot element in linear time with minimal extra storage, and the ramifications of different pivot choices — and explain how the algorithm works.
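The partitioning subroutine and the overall recursion can be sketched in Python as follows (my own code; in-place, linear-time partitioning with constant extra storage per call, and random pivot choices):

```python
import random

def partition(a, lo, hi):
    # Partition a[lo..hi] around a randomly chosen pivot;
    # returns the pivot's final (sorted) position.
    pivot_idx = random.randint(lo, hi)
    a[lo], a[pivot_idx] = a[pivot_idx], a[lo]  # move pivot to the front
    pivot = a[lo]
    i = lo + 1  # invariant: a[lo+1..i-1] < pivot <= a[i..j-1]
    for j in range(lo + 1, hi + 1):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[lo], a[i - 1] = a[i - 1], a[lo]  # place pivot between the two parts
    return i - 1

def quicksort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = partition(a, lo, hi)
        quicksort(a, lo, p - 1)   # pivot at index p is already in place
        quicksort(a, p + 1, hi)
```

Unlike Merge Sort, the work happens before the recursive calls, and no merge step is needed.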
QUICKSORT - THE ANALYSIS (Part VI). These lectures prove that randomized QuickSort (i.e., with random pivot choices) runs in O(n log n) time on average. The analysis is as elegant as the algorithm itself, and is based on a “decomposition principle” that is often useful in the analysis of randomized algorithms. Note that there are some accompanying lecture notes for this part (available for download underneath each video). Also, it may be helpful to watch the first probability review video (below) before watching this sequence.
PROBABILITY REVIEW (Part VII). The first of these videos reviews the concepts from discrete probability that are necessary for the QuickSort analysis — sample spaces, events, random variables, expectation, and linearity of expectation. The second video covers just two topics, although quite tricky ones! (Namely, conditional probability and independence.) You need to review these two topics (via this video or some other source, as you wish) before studying the analysis of the randomized contraction algorithm in Week 4.
HOMEWORK: This week’s quiz will help you understand QuickSort and probability more deeply. Programming Assignment #3 asks you to implement QuickSort and compute the number of comparisons that it makes for three different pivot rules.
SUGGESTED READINGS FOR WEEK 3:
 CLRS Chapter 7
 KT Section 13.5
 SW Section 2.3
Quicksort: Overview 12 min
Partitioning Around a Pivot 24 min
Correctness of Quicksort [Review - Optional] 10 min
Choosing a Good Pivot 22 min
VI. QUICKSORT - ANALYSIS
Analysis I: A Decomposition Principle 21 min
Analysis II: The Key Insight 11 min
Analysis III: Final Calculations 8 min
VII. PROBABILITY REVIEW
Probability Review I 25 min
Probability Review II 17 min
Problem Set #3
Quiz: Problem Set #3 5 questions
Programming Assignment #3
Quiz: Programming Assignment #3 3 questions
Week 4
Linear-time selection;
graphs, cuts, and the contraction algorithm.
VIII. LINEAR-TIME SELECTION
Week 4 Overview 10 min
Part VIII — LINEAR-TIME SELECTION (Required). These lectures study the problem of computing the ith smallest element of an input array (e.g., the median). It’s easy to solve this problem in O(n log n) time using sorting, but we can do better! The required material goes over a super-practical randomized algorithm, very much in the spirit of QuickSort, that has linear expected running time. Don’t forget that it takes linear time just to read the input! The analysis is somewhat different from what we studied for QuickSort, but is equally slick. Basically, there’s a cool way to think about the progress the algorithm makes in terms of a simple coin-flipping experiment. Linearity of expectation (yes, it’s back…) seals the deal.
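The randomized selection idea can be sketched in Python like this (my own code; for brevity it builds sublists rather than partitioning in place the way the lectures do, so it uses extra space):

```python
import random

def rselect(a, i):
    # Return the i-th smallest element of list a (i is 1-indexed),
    # in expected linear time.
    if len(a) == 1:
        return a[0]
    pivot = random.choice(a)
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    if i <= len(less):
        return rselect(less, i)                 # answer lies left of the pivot
    elif i <= len(less) + len(equal):
        return pivot                            # the pivot itself is the answer
    else:
        return rselect(greater, i - len(less) - len(equal))
```

Unlike QuickSort, only one side of the partition is ever recursed on, which is why the expected work forms a geometric-style sum that totals O(n).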
Part VIII — LINEAR-TIME SELECTION (Optional). I couldn’t resist covering some advanced related material. The first is an algorithm that has more Turing Award-winning authors than any other algorithm that I know of. It is a deterministic (i.e., no randomization allowed) linear-time algorithm for the Selection problem, based on an ingenious “median-of-medians” idea for guaranteeing good pivot choices. (There are some accompanying lecture notes for this part, available for download underneath each video.) The second optional topic answers the question “can we do better?” for sorting, unfortunately in the negative. That is, a counting argument shows that there is no “comparison-based” sorting algorithm (like MergeSort, QuickSort, or HeapSort) with worst-case running time better than n log n.
Part IX — GRAPHS AND THE CONTRACTION ALGORITHM. The second set of lectures for this week is a segue from randomized algorithms to graph algorithms. We first review graphs and the most standard ways of representing them (most commonly, by adjacency lists). We then consider the random contraction algorithm, discovered by Karger “only” 20ish years ago (while a PhD student here at Stanford). This algorithm solves the minimum cut problem — given an undirected graph, separate the vertices into two nonempty groups to minimize the number of “crossing edges”. Such problems come up when reasoning about, for example, physical networks, social networks, and images. This algorithm was perhaps the first strong evidence that graph problems could be added to the long list of “killer applications” of random sampling. Don’t tune out before the final plot twist — a simple but useful trick for transforming an algorithm that almost always fails into one that almost always succeeds.
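A rough Python sketch of the random contraction algorithm, including the repeated-trials trick mentioned at the end (my own code; representing contracted super-vertices with union-find-style labels is just one of several reasonable implementations):

```python
import random

def karger_min_cut(edges, trials):
    # Estimate the min cut of a connected undirected multigraph, given as a
    # list of (u, v) edges, by running random contraction `trials` times and
    # keeping the smallest cut found.
    best = float("inf")
    for _ in range(trials):
        # Map each vertex to the super-vertex it has been merged into.
        parent = {v: v for e in edges for v in e}

        def find(v):
            while parent[v] != v:
                parent[v] = parent[parent[v]]  # path halving
                v = parent[v]
            return v

        remaining = len(parent)
        while remaining > 2:
            u, v = random.choice(edges)        # pick a uniformly random edge
            ru, rv = find(u), find(v)
            if ru != rv:                       # contract: merge its endpoints
                parent[ru] = rv
                remaining -= 1
        # Crossing edges: endpoints that ended up in different groups.
        cut = sum(1 for u, v in edges if find(u) != find(v))
        best = min(best, cut)
    return best
```

A single run succeeds with probability only about 2/n², but repeating it enough times turns an algorithm that almost always fails into one that almost always succeeds — exactly the plot twist the lectures build up to.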
HOMEWORK: Problem Set #4 has five questions about the randomized selection algorithm, cuts in graphs, and the contraction algorithm. Programming Assignment #4 asks you to implement the contraction algorithm and use it to compute the min cut of the graph that we provide.
SUGGESTED READINGS FOR WEEK 4:
- CLRS: Chapter 9; Chapter 22 (only 22.1)
- DPV: Chapter 3 (only 3.1)
- KT: Chapter 13, Sections 13.2, 13.5
- SW: Chapter 4, Section 4.1
Randomized Selection - Algorithm 21 min
Randomized Selection - Analysis 20 min
Deterministic Selection - Algorithm [Advanced - Optional] 16 min
Deterministic Selection - Analysis I [Advanced - Optional] 22 min
Deterministic Selection - Analysis II [Advanced - Optional] 12 min
Omega(n log n) Lower Bound for Comparison-Based Sorting [Advanced - Optional] 13 min
IX. GRAPHS AND THE CONTRACTION ALGORITHM
Graphs and Minimum Cuts 15 min
Graph Representations 14 min
Random Contraction Algorithm 8 min
Analysis of Contraction Algorithm 30 min
Counting Minimum Cuts 7 min
Problem Set #4
Quiz: Problem Set #4 5 questions
Optional Theory Problems (Batch #2) 10 min
Programming Assignment #4
Quiz: Programming Assignment #4 1 question
Final Exam (1 attempt per 24 hours)
Info and FAQ for final exam 10 min
Quiz: Final Exam 10 questions
Graph Search, Shortest Paths, and Data Structures
Course can be found here
Lecture slides can be found here
Summary can be found in my Github
About this course: The primary topics in this part of the specialization are: data structures (heaps, balanced search trees, hash tables, bloom filters), graph primitives (applications of breadth-first and depth-first search, connectivity, shortest paths), and their applications (ranging from deduplication to social network analysis).
Who is this class for: Learners with at least a little bit of programming experience who want to learn the essentials of algorithms. In a University computer science curriculum, this course is typically taken in the third year.
Week 1
Breadth-first and depth-first search;
computing strong components;
applications.
X. GRAPH SEARCH AND CONNECTIVITY (Week 1)
Week 1 Overview 10 min
WELCOME: Welcome to Part 2 of the Algorithms Specialization: Graph Search, Shortest Paths, and Data Structures! Like the previous part, the course will have four weeks of lectures and assignments, followed by a final exam. Here are the highlights of the course’s first week.
SUMMARY: Week 1 is all about graph search and its applications (section X). We’ll cover a selection of fundamental primitives for reasoning about graphs. One very cool aspect of this material is that all of the algorithms that we’ll cover are insanely fast (linear time with small constants), and yet it can be quite subtle to understand why they work! The culmination of these lectures — computing the strongly connected components of a directed graph with just two passes of depth-first search — vividly illustrates the point that fast algorithms often require deep insight into the structure of the problem that you’re solving. There are also lecture notes for this last topic (available for download underneath the videos). (If you’re feeling REALLY motivated, you might read up on Tarjan’s earlier algorithm for computing SCCs that needs only a single DFS pass!)
THE VIDEOS: We begin with an overview video, which gives some reasons why you should care about graph search, a general strategy for searching a graph without doing any redundant work, and a high-level introduction to the two most important search strategies, breadth-first search (BFS) and depth-first search (DFS). The second video discusses BFS in more detail, including applications to computing shortest paths and the connected components of an undirected graph. The third video drills down on DFS, and shows how to use it to compute a topological ordering of a directed acyclic graph (i.e., to sequence tasks in a way that respects precedence constraints). The fourth and fifth videos cover the description and analysis, respectively, of the aforementioned algorithm for computing SCCs. A final optional video — hopefully one of the more fun ones in the course — discusses the structure of the Web, and lightly touches on topics like Google’s PageRank algorithm and the “six degrees of separation” phenomenon in social networks.
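The two-pass SCC algorithm can be sketched in Python roughly as follows (my own iterative implementation; the lectures describe recursive DFS, which tends to blow the call stack on graphs as large as the programming assignment's):

```python
from collections import defaultdict

def kosaraju_scc(edges):
    # Strongly connected components of a directed graph given as (u, v) edges.
    graph, rev = defaultdict(list), defaultdict(list)
    nodes = set()
    for u, v in edges:
        graph[u].append(v)
        rev[v].append(u)
        nodes.update((u, v))

    # Pass 1: DFS the reversed graph, recording vertices in finishing order.
    visited, order = set(), []
    for s in nodes:
        if s in visited:
            continue
        stack = [(s, iter(rev[s]))]
        visited.add(s)
        while stack:
            v, it = stack[-1]
            for w in it:
                if w not in visited:
                    visited.add(w)
                    stack.append((w, iter(rev[w])))
                    break
            else:
                order.append(v)  # v is finished
                stack.pop()

    # Pass 2: DFS the original graph in decreasing finishing order;
    # each DFS tree discovered is exactly one SCC.
    visited, sccs = set(), []
    for s in reversed(order):
        if s in visited:
            continue
        comp, stack = [], [s]
        visited.add(s)
        while stack:
            v = stack.pop()
            comp.append(v)
            for w in graph[v]:
                if w not in visited:
                    visited.add(w)
                    stack.append(w)
        sccs.append(comp)
    return sccs
```

Both passes are linear time, so the whole computation is O(m + n).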
THE HOMEWORK: Problem Set #1 should help you solidify your understanding of graph representations and graph search. The programming assignment asks you to implement the SCC algorithm from the lectures, and report your findings about the SCCs of a large graph. Programming Assignment #1 is the most difficult one of the course (and one of the more difficult ones in the entire specialization); as always, I encourage you to use the discussion forums to exchange ideas, tips, and test cases. If you can get through the first week of the course, it should be all downhill from there!
SUGGESTED READINGS FOR WEEK 1: Abbreviations in suggested readings refer to the following textbooks:
- CLRS: Cormen, Leiserson, Rivest, and Stein, Introduction to Algorithms (3rd edition)
- DPV: Dasgupta, Papadimitriou, and Vazirani, Algorithms
- KT: Kleinberg and Tardos, Algorithm Design
- SW: Sedgewick and Wayne, Algorithms (4th edition)
- CLRS: Chapter 22
- DPV: Chapter 3
- KT: Chapter 3, Sections 3.5, 3.6
- SW: Chapter 4, Sections 4.1, 4.2
Overview, Resources, and Policies 10 min
Resources
Mathematics for Computer Science (by Eric Lehman and Tom Leighton): https://www.cs.princeton.edu/courses/archive/fall06/cos341/handouts/mathcs.pdf
Lecture slides 10 min
Graph Search - Overview 23 min
Breadth-First Search (BFS): The Basics 14 min
BFS and Shortest Paths 7 min
BFS and Undirected Connectivity 13 min
Depth-First Search (DFS): The Basics 7 min
Topological Sort 21 min
Computing Strong Components: The Algorithm 29 min
Computing Strong Components: The Analysis 26 min
Structure of the Web [Optional] 18 min
Problem Set #1
Quiz: Problem Set #1 5 questions
Optional Theory Problems (Week 1) 10 min
In the 2-SAT problem, you are given a set of clauses, where each clause is the disjunction of two literals (a literal is a Boolean variable or the negation of a Boolean variable). You are looking for a way to assign a value “true” or “false” to each of the variables so that all clauses are satisfied — that is, there is at least one true literal in each clause. For this problem, design an algorithm that determines whether or not a given 2-SAT instance has a satisfying assignment. (Your algorithm does not need to exhibit a satisfying assignment, just decide whether or not one exists.) Your algorithm should run in O(m+n) time, where m and n are the number of clauses and variables, respectively. [Hint: strongly connected components.]
Programming Assignment #1
Quiz: Programming Assignment #1 1 question
Week 2
Dijkstra’s shortest-path algorithm.
XI. DIJKSTRA’S SHORTEST-PATH ALGORITHM (Week 2)
Week 2 Overview 10 min
SUMMARY: This week is the climax of all our work on graph search — Dijkstra’s shortest-path algorithm, surely one of the greatest hits of algorithms (Part XI). It works in any directed graph with nonnegative edge lengths, and it computes the shortest paths from a source vertex to all other vertices. Particularly nice is the blazingly fast implementation that uses a heap data structure (more on heaps next week).
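A heap-based Python sketch of the algorithm (my own code; it uses the common "lazy deletion" variant, re-pushing vertices and skipping stale heap entries, instead of the decrease-key bookkeeping described in lecture):

```python
import heapq

def dijkstra(graph, source):
    # Shortest-path distances from `source` in a directed graph given as
    # {vertex: [(neighbor, length), ...]} with non-negative edge lengths.
    dist = {source: 0}
    heap = [(0, source)]  # (tentative distance, vertex)
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue  # stale entry; a shorter path to v was already found
        for w, length in graph.get(v, []):
            nd = d + length
            if nd < dist.get(w, float("inf")):
                dist[w] = nd  # Dijkstra's greedy score improved
                heapq.heappush(heap, (nd, w))
    return dist
```

With the heap, each edge causes at most one push, giving the O(m log n) running time discussed in the implementation video.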
THE HOMEWORK: Problem Set #2 should solidify your understanding of Dijkstra’s shortest-path algorithm. In the programming assignment you’ll implement Dijkstra’s algorithm. You can just implement the basic version, but those of you who want a bigger challenge are encouraged to devise a heap-based implementation.
SUGGESTED READINGS FOR WEEK 2:
- CLRS: Chapter 24 (Sections 24.3, 24.4)
- DPV: Section 4.4
- KT: Section 4.4
- SW: Section 4.4
Dijkstra’s Shortest-Path Algorithm 20 min
Dijkstra’s Algorithm: Examples 12 min
Correctness of Dijkstra’s Algorithm 19 min
Dijkstra’s Algorithm: Implementation and Running Time 26 min
Problem Set #2
Quiz: Problem Set #2 5 questions
Optional Theory Problems (Week 2) 10 min
In lecture we define the length of a path to be the sum of the lengths of its edges. Define the bottleneck of a path to be the maximum length of one of its edges. A minimum-bottleneck path between two vertices s and t is a path with bottleneck no larger than that of any other s-t path. Show how to modify Dijkstra’s algorithm to compute a minimum-bottleneck path between two given vertices. The running time should be O(m log n), as in lecture.
We can do better. Suppose now that the graph is undirected. Give a lineartime (O(m)) algorithm to compute a minimumbottleneck path between two given vertices.
What if the graph is directed? Can you compute a minimum-bottleneck path between two given vertices faster than O(m log n)?
Programming Assignment #2
Quiz: Programming Assignment #2 1 question
Week 3
Heaps;
balanced binary search trees.
XII. HEAPS (Week 3)
Week 3 Overview 10 min
SUMMARY: This week we start our discussion of data structures, a huge (and hugely useful) topic. This week covers heaps and balanced binary search trees. My primary goals here are to teach you the operations that these data structures support (along with their running times), and to develop your intuition about which data structures are useful for which sorts of problems.
THE VIDEOS: Part XII begins with a short overview video in which I explain my approach to teaching data structures in this course. The next two videos discuss heaps and some of their applications (required), and some details about how they are typically implemented (optional, recommended for hardcore programmer/computer science types). Part XIII discusses balanced binary search trees — the supported operations and canonical uses (required) and a glimpse at what’s “under the hood” (optional).
THE HOMEWORK: Problem Set #3 should solidify your understanding of heaps and search trees. In the programming assignment you’ll implement a slick data structure application covered in the lectures (median maintenance).
SUGGESTED READINGS FOR WEEK 3:
- CLRS: Chapters 6, 11, 12, 13
- DPV: Section 4.5
- SW: Sections 3.3, 3.4
Data Structures: Overview 4 min
Heaps: Operations and Applications 18 min
Heaps: Implementation Details [Advanced - Optional] 20 min
XIII. BALANCED BINARY SEARCH TREES (Week 3)
Balanced Search Trees: Operations and Applications 10 min
Binary Search Tree Basics, Part I 13 min
Binary Search Tree Basics, Part II 30 min
Red-Black Trees 21 min
Rotations [Advanced - Optional] 7 min
Insertion in a Red-Black Tree [Advanced] 14 min
Problem Set #3
Quiz: Problem Set #3 5 questions
Programming Assignment #3
Quiz: Programming Assignment #3 1 question
Week 4
Hashing;
bloom filters.
XIV. HASHING: THE BASICS (Week 4)
Week 4 Overview 10 min
SUMMARY: This week wraps up our discussion of data structures. We’ll finish with a bang by covering hash tables — certainly one of the most important data structures — in detail. First (Part XIV) we discuss hash table basics — the supported operations, the types of problems they’re useful for, and an overview of implementation approaches. Part XV has one required video about “pathological data sets” for hash functions, and three optional videos that cover great stuff — how randomization dodges pathological data sets, a construction of simple hash functions that are guaranteed to (in expectation over the choice of a random hash function) spread every data set out evenly, and performance analyses of hashing with both chaining and open addressing. Part XVI contains two required videos on the implementation and performance of bloom filters, which are like hash tables except that they are insanely space-efficient and occasionally suffer from false positives.
THE HOMEWORK: Problem Set #4 should solidify your understanding of hash tables and bloom filters. In the programming assignment you’ll implement another slick data structure application covered in the lectures (solving the 2-SUM problem using hash tables).
SUGGESTED READINGS FOR WEEK 4:
CLRS Chapter 11
KT Chapter 13 (Section 13.6)
SW Section 3.5
Hash Tables: Operations and Applications 19 min
Hash Tables: Implementation Details, Part I 18 min
Hash Tables: Implementation Details, Part II 22 min
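The 2-SUM application from this week's programming assignment can be sketched with a hash set, which gives O(1) expected-time lookups. The function name and signature are mine; for the assignment's huge target range a sorted-array approach is much faster, but this matches the hash-table idea from lecture:

```python
def two_sum_targets(nums, lo, hi):
    """Count targets t in [lo, hi] for which some pair of DISTINCT
    values x, y in nums satisfies x + y = t."""
    values = set(nums)          # hash set: O(1) expected membership tests
    hit = set()
    for t in range(lo, hi + 1):
        for x in values:
            y = t - x
            if y in values and y != x:   # the pair must use distinct values
                hit.add(t)
                break
    return len(hit)
```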
XV. UNIVERSAL HASHING (Week 4)
Pathological Data Sets and Universal Hashing Motivation 21 min
Universal Hashing: Definition and Example [Advanced - Optional] 25 min
Universal Hashing: Analysis of Chaining [Advanced - Optional] 18 min
Hash Table Performance with Open Addressing [Advanced - Optional] 15 min
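A minimal sketch of the classic universal family h(x) = ((a·x + b) mod p) mod n covered in the optional videos, assuming integer keys smaller than the prime p; the helper name is mine:

```python
import random

def make_universal_hash(p, n, seed=None):
    """Draw one function from the family h(x) = ((a*x + b) mod p) mod n.

    p    : a prime larger than every key in the universe
    n    : number of buckets
    a, b : chosen uniformly at random (a nonzero), which is what makes
           the collision probability of any fixed pair at most ~1/n
    """
    rng = random.Random(seed)
    a = rng.randrange(1, p)     # a in {1, ..., p-1}
    b = rng.randrange(0, p)     # b in {0, ..., p-1}
    return lambda x: ((a * x + b) % p) % n
```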
XVI. BLOOM FILTERS (Week 4)
Bloom Filters: The Basics 15 min
Bloom Filters: Heuristic Analysis 13 min
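A toy bloom filter sketch, assuming the k hash functions are derived by salting SHA-256 (a real implementation would use a packed bit array and much cheaper hashes). It supports insert and lookup only, with possible false positives but never false negatives:

```python
import hashlib

class BloomFilter:
    """Toy bloom filter: k salted SHA-256 hashes over num_bits bits."""

    def __init__(self, num_bits, num_hashes):
        self.n = num_bits
        self.k = num_hashes
        self.bits = [False] * num_bits

    def _indices(self, item):
        # k pseudo-independent hash values, one per salt i.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.n

    def add(self, item):
        for idx in self._indices(item):
            self.bits[idx] = True

    def __contains__(self, item):
        # All k bits set => "probably present"; any bit unset => definitely absent.
        return all(self.bits[idx] for idx in self._indices(item))
```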
Problem Set #4
Quiz: Problem Set #4 5 questions
Optional Theory Problems (Week 4) 10 min
Recall that a set H of hash functions (mapping the elements of a universe U to the buckets {0,1,2,…,n−1}) is universal if for every distinct x,y∈U, the probability Prob[h(x)=h(y)] that x and y collide, assuming that the hash function h is chosen uniformly at random from H, is at most 1/n. In this problem you will prove that a collision probability of 1/n is essentially the best possible. Precisely, suppose that H is a family of hash functions mapping U to {0,1,2,…,n−1}, as above. Show that there must be a pair x,y∈U of distinct elements such that, if h is chosen uniformly at random from H, then Prob[h(x)=h(y)] ≥ 1/n − 1/|U|.
Programming Assignment #4
Quiz: Programming Assignment #4 1 question
Final Exam (1 attempt per 24 hours)
Info and FAQ for final exam 10 min
 Unlike the problem sets, you have only ONE attempt per 24 hours. Thus DO NOT START THE EXAM until you are ready.
 There are 10 questions, and each is worth 1 point. Many are of the “select all that apply” type, rather than the strict multiple choice format that is more common on the problem sets. You will receive partial credit for each option that you leave correctly checked or correctly unchecked. (So if you mark 4 out of the 5 options for a problem correctly, you’ll receive 0.8 out of the 1 point.) You need 7 points total to pass the exam. All questions are about material covered in the required (i.e., nonoptional) videos.
 Roughly half of the questions follow fairly directly from the lecture material (assuming that you understand it thoroughly). Roughly a quarter of the questions are variations on problem set questions. Roughly a quarter of the questions demand more thought, for example by asking you to consider whether certain results from lecture do or do not extend to more general situations.
 Good luck!
Quiz: Final Exam 10 questions
Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming
Course can be found here
Lecture slides and solutions can be found here
About this course: The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).
Who is this class for: Learners with at least a little bit of programming experience who want to learn the essentials of algorithms. In a University computer science curriculum, this course is typically taken in the third year.
Week 1
Two motivating applications;
selected review;
introduction to greedy algorithms;
a scheduling application;
Prim’s MST algorithm.
XVII. TWO MOTIVATING APPLICATIONS (Week 1)
Week 1 Overview 10 min
WELCOME: Welcome to Part 3 of the Algorithms Specialization: Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming! Like the previous two parts, the course will have four weeks of lectures and assignments, followed by a final exam. Here are the highlights of the course’s first week.
TWO MOTIVATING APPLICATIONS: We begin with a fairly nontechnical discussion of two motivating applications — distributed shortestpath routing in the Internet, and sequence alignment — to build excitement for the tools that you’ll acquire later in this course (and also in Part 4 of the specialization).
INTRODUCTION TO GREEDY ALGORITHMS: The focus of this week and the next is the greedy algorithm design paradigm. These two nontechnical videos discuss the pros and cons of this paradigm and describe a cool application to the optimal management of the contents of a cache.
A SCHEDULING APPLICATION: Scheduling problems come up all the time (e.g., how should a shared resource be allocated?) and greedy algorithms are often useful for them. We’ll discuss a specific success story — minimizing the weighted sum of completion times of a bunch of tasks — in detail. The correctness proof furnishes a particularly clean example of an “exchange argument.”
PRIM’S MST ALGORITHM: The minimum spanning tree (MST) problem, in addition to enjoying several applications, is a uniquely great problem for the study of greedy algorithms. Unusually, several different greedy algorithms always compute an optimal solution. We begin here with the Dijkstraesque Prim’s algorithm. The correctness proof requires understanding the subtly beautiful structure of cuts in graphs, while its blazingly fast implementation relies on a deft application of the heap data structure.
VIDEOS AND SLIDES: A reminder that videos can be streamed or downloaded and watched offline (recommended for commutes, etc.). We are also providing PDF lecture slides (typed versions of what’s written in the lecture videos), as well as subtitle files (in English and in some cases other languages as well). And if you find yourself wishing that I spoke more quickly or more slowly, note that you can adjust the video speed to accommodate your preferred pace.
HOMEWORK #1: The first problem set consists of 5 problems, about greedy scheduling algorithms and minimum spanning trees. The first programming assignment asks you to implement some of the algorithms that we’ve covered, run them on large inputs, and enter the answer. For the seasoned programmers out there looking for an additional challenge, try doing the programming assignments in a programming language that you don’t already know!
DISCUSSION FORUMS: Discussion forums play an absolutely crucial role in massive online courses. If you have trouble understanding a lecture or completing an assignment, you should turn to the forums for help. After you’ve mastered the lectures and assignments for a given week, I hope you’ll contribute to the forums and help out your fellow learners. While I won’t have time to carefully monitor the discussion forums, I’ll check in and answer questions whenever I find the time.
SUGGESTED READINGS FOR WEEK 1: Abbreviations in suggested readings refer to the following textbooks:
 CLRS - Cormen, Leiserson, Rivest, and Stein, Introduction to Algorithms (3rd edition)
 DPV - Dasgupta, Papadimitriou, and Vazirani, Algorithms
 KT - Kleinberg and Tardos, Algorithm Design
 SW - Sedgewick and Wayne, Algorithms (4th edition)
 CLRS: Chapter 16 (Sections 1 and 2) and Chapter 23
 DPV: Sections 5.1.1, 5.1.2, and 5.1.5
 KT: Sections 4.1, 4.2, 4.3, and 4.5
 SW: Section 4.3
Overview, Resources, and Policies 10 min
Course overview
Algorithms are the heart of computer science, and the subject has countless practical applications as well as intellectual depth. This specialization is an introduction to algorithms for learners with at least a little programming experience. The specialization is rigorous but emphasizes the big picture and conceptual understanding over lowlevel implementation and mathematical details. After completing this specialization, you will have a greater mastery of algorithms than almost anyone without a graduate degree in the subject. Specific topics in Part 3 of the specialization include: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).
Resources
Mathematics for Computer Science (by Eric Lehman and Tom Leighton): https://www.cs.princeton.edu/courses/archive/fall06/cos341/handouts/mathcs.pdf
Course Policies
Video Lectures
 Lectures will be made available weekly, for a total of four weeks. In a given week, there will generally be two hours of required material, plus additional optional material (some review, some advanced topics).
 Below each lecture video there is a PDF of typed versions of the slides.
 You can download the videos of the lectures for offline viewing.
 The video player supports speeding up and slowing down the pace.
Weekly Programming Assignments and Problem Sets
 Each week there will be a new problem set and a new programming assignment.
 For each problem set you are allowed a maximum of two attempts in a 12-hour period (we’ll use the best score).
 For each programming assignment you’re allowed a maximum of 10 attempts in a 12-hour period (we’ll use the best score).
 For the final exam, you’re allowed one attempt per 24 hours.
Grading
 To pass a problem set, you must get at least 4 of the 5 questions correct (80%).
 To pass a programming assignment, you must get all of the answers correct (100%).
 To pass the final exam, you must get at least 70% of the total points (14 out of 20).
 To pass the course, you must pass all of the problem sets, all of the programming assignments, and the final exam.
Theory Problems
 These are totally optional theory questions (no deadlines or credit given).
 We encourage you to attempt these questions and discuss them in the forums to develop a deeper understanding of the design and analysis of algorithms.
Lecture slides 10 min
See the zip file below (which includes slides for both Part 3 and Part 4 of the specialization). Also available from the video lectures on a lecturebylecture basis.
Application: Internet Routing 10 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/0VcrE/applicationinternetrouting
Dijkstra’s algorithm
Bellman-Ford algorithm
Application: Sequence Alignment 8 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/ekVkk/applicationsequencealignment
Brute-force search: try all possible alignments and remember the best one.
XVIII. INTRODUCTION TO GREEDY ALGORITHMS (Week 1)
Introduction to Greedy Algorithms 12 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/WHe2b/introductiontogreedyalgorithms
DANGER: Most greedy algorithms are NOT correct. (Even if your intuition says otherwise!)
Application: Optimal Caching 10 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/VMnNW/applicationoptimalcaching
The “furthest-in-future” algorithm is optimal (i.e., minimizes the number of cache misses).
Least Recently Used (LRU)
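The furthest-in-future policy can be sketched as a miss counter (the function name is mine; the quadratic next-use scan is for clarity, not efficiency). Unlike LRU, it looks forward in the request sequence, which is why it is only an offline benchmark:

```python
def furthest_in_future_misses(requests, cache_size):
    """Count cache misses under the furthest-in-future eviction rule."""
    cache = set()
    misses = 0
    for i, item in enumerate(requests):
        if item in cache:
            continue                      # hit: nothing to do
        misses += 1
        if len(cache) >= cache_size:
            # Evict the cached item whose next request is furthest away
            # (items never requested again rank as infinitely far).
            def next_use(x):
                for j in range(i + 1, len(requests)):
                    if requests[j] == x:
                        return j
                return float('inf')
            cache.remove(max(cache, key=next_use))
        cache.add(item)
    return misses
```

On the sequence a, b, a, c, b with a cache of size 2, furthest-in-future evicts a (next needed never) rather than b (needed soon) when c arrives, so it incurs 3 misses where LRU would incur 4.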
XIX. A SCHEDULING APPLICATION (Week 1)
Problem Definition 5 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/ZvZGZ/problemdefinition
Goal: Minimize the weighted sum of completion times: $\sum_{j=1}^{n} w_j C_j$.
A Greedy Algorithm 12 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/Jo6gK/agreedyalgorithm
Guess (1): order jobs by decreasing value of $w_j − l_j$.
Guess (2): order jobs by decreasing ratio $w_j/l_j$.
Algorithm #1 is not (always) correct.
Algorithm #2 (order by decreasing ratios $w_j/l_j$) is always correct. Not obvious! Proof coming up next.
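The objective and the winning greedy rule can be sketched in a few lines (function names are mine). The ratio rule below is the one proved optimal in the next videos:

```python
def weighted_completion_time(jobs):
    """jobs: list of (weight, length) pairs, in schedule order.
    Returns sum_j w_j * C_j, where C_j is the completion time of job j."""
    total, finished = 0, 0
    for w, l in jobs:
        finished += l          # C_j = sum of lengths of jobs up through j
        total += w * finished
    return total

def schedule_by_ratio(jobs):
    """Greedy rule (2): schedule jobs in decreasing order of w/l."""
    return sorted(jobs, key=lambda wl: wl[0] / wl[1], reverse=True)
```

For jobs (w, l) = (1, 2) and (3, 5), the ratio rule puts (3, 5) first, giving cost 3·5 + 1·7 = 22, versus 23 for the other order.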
Correctness Proof  Part I 6 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/rFQ5P/correctnessproofparti
Proof: By an Exchange Argument.
Correctness Proof  Part II 4 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/mrXwO/correctnessproofpartii
Cost-Benefit Analysis.
Upshot:
 Cost of exchange is $w_i l_j$. [$C_i$ goes up by $l_j$]
 Benefit of exchange is $w_j l_i$. [$C_j$ goes down by $l_i$]
Note: i > j $\rightarrow$ $w_i/l_i < w_j/l_j$ $\rightarrow$ $w_i l_j < w_j l_i$ $\rightarrow$ COST < BENEFIT $\rightarrow$ the swap improves $\sigma^{\star}$, contradicting the optimality of $\sigma^{\star}$.
Handling Ties [Advanced  Optional] 7 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/YuoAV/handlingtiesadvancedoptional
XX. PRIM’S MINIMUM SPANNING TREE ALGORITHM (Week 1)
MST Problem Definition 11 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/9D5ML/mstproblemdefinition
spanning tree
Prim’s MST Algorithm 7 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/tQ6gK/primsmstalgorithm
Prim’s MST Algorithm
Correctness Proof I 15 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/15UXn/correctnessproofi
Proof that Prim’s algorithm outputs a spanning tree:
(1) The algorithm maintains the invariant that T spans X.
(2) It can’t get stuck with X $\neq$ V.
(3) No cycles ever get created in T.
Correctness Proof II 8 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/hYzal/correctnessproofii
Proof that Prim’s algorithm always outputs a minimum-cost spanning tree.
The Cut Property
Proof of Cut Property [Advanced  Optional] 11 min
Fast Implementation I 14 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/bYMq1/fastimplementationi
Why O(m) time per iteration? Because the algorithm does not keep track of which vertices are in X versus V−X, it must iterate through all m edges to find those with one endpoint v in X and the other w in V−X.
Check: can initialize the heap with O(m + n log n) = O(m log n) preprocessing.
 m: iterate through the m edges once to compute each vertex’s key.
 n log n: the cost of n heap Insert operations, at O(log n) each.
Fast Implementation II 9 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/qzdR8/fastimplementationii
O(m log n) time:
 (n − 1) Inserts during preprocessing
 (n − 1) Extract-Mins (one per iteration of the while loop)
O((n − 1) + (n − 1) log n) = O(m + m log n) = O(m log n), using m ≥ n − 1 for a connected graph.
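A sketch of heap-based Prim’s in the lazy-deletion style (stale heap entries are simply skipped on extraction) rather than the Delete/re-Insert bookkeeping described in lecture; it is still O(m log n) since each edge contributes at most two heap entries. The function name and edge-list format are mine:

```python
import heapq

def prim_mst_cost(n, edges):
    """Heap-based Prim's algorithm with lazy deletion.

    n     : number of vertices, labeled 0..n-1
    edges : iterable of (u, v, cost) triples (undirected)
    Assumes the graph is connected; returns the total MST cost.
    """
    adj = [[] for _ in range(n)]
    for u, v, c in edges:
        adj[u].append((c, v))
        adj[v].append((c, u))
    in_tree = [False] * n
    heap = [(0, 0)]                 # (attachment cost, vertex); start at 0
    total = 0
    while heap:
        c, v = heapq.heappop(heap)
        if in_tree[v]:
            continue                # stale entry: v already absorbed into X
        in_tree[v] = True
        total += c
        for cost, w in adj[v]:
            if not in_tree[w]:
                heapq.heappush(heap, (cost, w))
    return total
```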
Problem Set #1
Quiz: Problem Set #1 5 questions
QUIZ
Problem Set #1
5 questions
To pass: 80% or higher
Attempts: 2 every 12 hours
Deadline
July 23, 11:59 PM PDT
1 point
1.We are given as input a set of $n$ requests (e.g., for the use of an auditorium), with a known start time $s_i$ and finish time $t_i$ for each request $i$. Assume that all start and finish times are distinct. Two requests conflict if they overlap in time — if one of them starts between the start and finish times of the other. Our goal is to select a maximum-cardinality subset of the given requests that contains no conflicts. (For example, given three requests consuming the intervals [0,3], [2,5], and [4,7], we want to return the first and third requests.) We aim to design a greedy algorithm for this problem with the following form: At each iteration we select a new request i, including it in the solution-so-far and deleting from future consideration all requests that conflict with i.
Which of the following greedy rules is guaranteed to always compute an optimal solution?
This should not be selected
Exercise: find a counterexample that uses only three requests.
This should not be selected
There are counterexamples, though finding one requires a bit of work!
Correct
Let Rj denote the requests with the j earliest finish times. Prove by induction on j that this greedy algorithm selects the maximum number of non-conflicting requests from Rj.
This should not be selected
What if there is a request with start time 0 and an extremely large finish time?
1 point
2.We are given as input a set of $n$ jobs, where job $j$ has a processing time $p_j$ and a deadline $d_j$. Recall the definition of completion times $C_j$ from the video lectures. Given a schedule (i.e., an ordering of the jobs), we define the lateness $l_j$ of job $j$ as the amount of time $C_j − d_j$ after its deadline that the job completes, or as 0 if $C_j \le d_j$. Our goal is to minimize the maximum lateness, $\max_j l_j$.
Which of the following greedy rules produces an ordering that minimizes the maximum lateness? You can assume that all processing times and deadlines are distinct.
This should not be selected
One of the proposed greedy rules works. Try an exchange argument.
Correct
Proof by an exchange argument, analogous to minimizing the weighted sum of completion times.
1 point
3.In this problem you are given as input a graph T=(V,E) that is a tree (that is, T is undirected, connected, and acyclic). A perfect matching of T is a subset F$\subset$E of edges such that every vertex v∈V is the endpoint of exactly one edge of F. Equivalently, F matches each vertex of T with exactly one other vertex of T. For example, a path graph has a perfect matching if and only if it has an even number of vertices.
Consider the following two algorithms that attempt to decide whether or not a given tree has a perfect matching. The degree of a vertex in a graph is the number of edges incident to it. (The two algorithms differ only in the choice of v in line 5.)
Algorithm A:


Algorithm B:


Is either algorithm correct?
This should not be selected
Consider a three-hop path.
This should not be selected
Try to prove correctness by induction.
This should not be selected
Consider a three-hop path.
Correct
Algorithm A can fail, for example, on a three-hop path. Correctness of algorithm B can be proved by induction on the number of vertices in T. Note that the tree property is used to argue that there must be a vertex with degree 1; if there is a perfect matching, it must include the edge incident to this vertex.
1 point
4.Consider an undirected graph G=(V,E) where every edge e∈E has a given cost ce. Assume that all edge costs are positive and distinct. Let T be a minimum spanning tree of G and P a shortest path from the vertex s to the vertex t. Now suppose that the cost of every edge e of G is increased by 1 and becomes ce+1. Call this new graph G′. Which of the following is true about G′?
Correct
The positive statement has many proofs (e.g., via the Cut Property). For the negative statement, think about two different paths from s to t that contain a different number of edges.
This should not be selected
Think about two different paths from s to t that contain a different number of edges.
This should not be selected
What does the Cut Property tell you about G′ vs. G?
1 point
5.Suppose T is a minimum spanning tree of the connected graph G. Let H be a connected induced subgraph of G. (I.e., H is obtained from G by taking some subset S$\subseteq$V of vertices, and taking all edges of E that have both endpoints in S. Also, assume H is connected.) Which of the following is true about the edges of T that lie in H? You can assume that edge costs are distinct, if you wish. [Choose the strongest true statement.]
This should not be selected
Suppose G is a triangle and H is an edge.
This should not be selected
How do you know that they form a connected subgraph of H?
Correct
Proof via the Cut Property (cuts in G correspond to cuts in H with only fewer crossing edges).
This should not be selected
Think about the Cut Property.
Optional Theory Problems (Week 1) 10 min
The following problems are for those of you looking to challenge yourself beyond the required problem sets and programming questions. They are completely optional and will not be graded. While they vary in level, many are pretty challenging, and we strongly encourage you to discuss ideas and approaches with your fellow students on the “Theory Problems” discussion forum.
 Consider a connected undirected graph G with not necessarily distinct edge costs. Consider two different minimum-cost spanning trees of G, T and T′. Is there necessarily a sequence of minimum-cost spanning trees T=T0,T1,T2,…,Tr=T′ with the property that each consecutive pair Ti,Ti+1 of MSTs differ by only a single edge swap? Prove the statement or exhibit a counterexample.
 Consider the following algorithm. The input is a connected undirected graph with edge costs (distinct, if you prefer). The algorithm proceeds in iterations. If the current graph is a spanning tree, then the algorithm halts. Otherwise, it picks an arbitrary cycle of the current graph and deletes the most expensive edge on the cycle. Is this algorithm guaranteed to compute a minimum-cost spanning tree? Prove it or exhibit a counterexample.
 Consider the following algorithm. The input is a connected undirected graph with edge costs (distinct, if you prefer). The algorithm proceeds in phases. Each phase adds some edges to a tree-so-far and reduces the number of vertices in the graph (when there is only 1 vertex left, the MST is just the empty set). In a phase, we identify the cheapest edge ev incident on each vertex v of the current graph. Let F={ev} be the collection of all such edges in the current phase. Obtain a new (smaller) graph by contracting all of the edges in F — so that each connected component of F becomes a single vertex in the new graph — discarding any self-loops that result. Let T denote the union of all edges that ever get contracted in a phase of this algorithm. Is T guaranteed to be a minimum-cost spanning tree? Prove it or exhibit a counterexample.
Programming Assignment #1
Quiz: Programming Assignment #1 3 questions
QUIZ
Programming Assignment #1
3 questions
To pass: 100%
Attempts: 10 every 12 hours
Deadline
July 23, 11:59 PM PDT
code can be found here
1 point
1.In this programming problem and the next you’ll code up the greedy algorithms from lecture for minimizing the weighted sum of completion times.
Download the text file below.
jobs.txt
This file describes a set of jobs with positive and integral weights and lengths. It has the format
[number_of_jobs]
[job_1_weight] [job_1_length]
[job_2_weight] [job_2_length]
…
For example, the third line of the file is “74 59”, indicating that the second job has weight 74 and length 59.
You should NOT assume that job weights or lengths are distinct.
Your task in this problem is to run the greedy algorithm that schedules jobs in decreasing order of the difference (weight − length). Recall from lecture that this algorithm is not always optimal. IMPORTANT: if two jobs have equal difference (weight − length), you should schedule the job with higher weight first. Beware: if you break ties in a different way, you are likely to get the wrong answer. You should report the sum of weighted completion times of the resulting schedule — a positive integer — in the box below.
ADVICE: If you get the wrong answer, try out some small test cases to debug your algorithm (and post your test cases to the discussion forum).
1 point
2.For this problem, use the same data set as in the previous problem.
Your task now is to run the greedy algorithm that schedules jobs (optimally) in decreasing order of the ratio (weight/length). In this algorithm, it does not matter how you break ties. You should report the sum of weighted completion times of the resulting schedule — a positive integer — in the box below.
1 point
3.In this programming problem you’ll code up Prim’s minimum spanning tree algorithm.
Download the text file below.
edges.txt
This file describes an undirected graph with integer edge costs. It has the format
[number_of_nodes] [number_of_edges]
[one_node_of_edge_1] [other_node_of_edge_1] [edge_1_cost]
[one_node_of_edge_2] [other_node_of_edge_2] [edge_2_cost]
…
For example, the third line of the file is “2 3 8874”, indicating that there is an edge connecting vertex #2 and vertex #3 that has cost 8874.
You should NOT assume that edge costs are positive, nor should you assume that they are distinct.
Your task is to run Prim’s minimum spanning tree algorithm on this graph. You should report the overall cost of a minimum spanning tree — an integer, which may or may not be negative — in the box below.
IMPLEMENTATION NOTES: This graph is small enough that the straightforward O(mn) time implementation of Prim’s algorithm should work fine. OPTIONAL: For those of you seeking an additional challenge, try implementing a heapbased version. The simpler approach, which should already give you a healthy speedup, is to maintain relevant edges in a heap (with keys = edge costs). The superior approach stores the unprocessed vertices in the heap, as described in lecture. Note this requires a heap that supports deletions, and you’ll probably need to maintain some kind of mapping between vertices and their positions in the heap.
Week 2
Kruskal’s MST algorithm and applications to clustering;
advanced union-find (optional).
XXI. KRUSKAL’S MINIMUM SPANNING TREE ALGORITHM (Week 2)
Week 2 Overview 10 min
KRUSKAL’S MST ALGORITHM: Last week we covered Prim’s MST algorithm and a blazingly fast implementation of it. There are several reasons for studying a second greedy MST algorithm, due to Kruskal. First, it’s a beautiful algorithm, worthy of a greatest hits compilation. Second, to obtain a super-fast implementation of it, we’ll need to learn about the simple but fundamental “Union-Find” data structure. The third reason is covered in the next section…
CLUSTERING: Clustering is an important form of unsupervised learning (i.e., extracting patterns from unlabeled data). These two videos discuss how Kruskal’s MST algorithm suggests flexible and useful greedy approaches to clustering problems.
UNION-FIND LECTURES: This is purely optional material about advanced implementations and analysis of the union-find data structure. For those of you looking for some seriously next-level (but beautiful) material, check it out when you get a chance.
HOMEWORK #2: The second problem set is all about MSTs. The second programming assignment asks you to implement the greedy clustering algorithm from lecture. For part (a), a straightforward implementation should suffice. Part (b) involves a graph that is likely too big to fit in your computer’s memory, so answering this part might take some ingenuity.
SUGGESTED READINGS FOR WEEK 2:
CLRS Chapter 21 and Chapter 23 (Section 2)
DPV Sections 5.1.3 and 5.1.4
KT Sections 4.5-4.7
SW Sections 1.5 and 4.3
Kruskal’s MST Algorithm 7 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/PLdBf/kruskalsmstalgorithm
Kruskal’s MST Algorithm
Correctness of Kruskal’s Algorithm 9 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/U3ukN/correctnessofkruskalsalgorithm
Implementing Kruskal’s Algorithm via Union-Find I 9 min
FIND(X): Return the name of the group that X belongs to.
UNION(Ci, Cj): Fuse groups Ci and Cj into a single one.
Implementing Kruskal’s Algorithm via Union-Find II 13 min
O(m log n) total (matching Prim’s algorithm)
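Kruskal's algorithm plus a union-find sketch (union-by-rank with path compression, as in the optional videos); the function name and edge-list format are mine. Sorting dominates, so the total is O(m log n):

```python
def kruskal_mst_cost(n, edges):
    """Kruskal's algorithm via union-find.

    n     : number of vertices, labeled 0..n-1
    edges : iterable of (u, v, cost) triples (undirected)
    Assumes the graph is connected; returns the total MST cost.
    """
    parent = list(range(n))
    rank = [0] * n

    def find(x):
        # Path compression (halving): point nodes at their grandparents.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        # Union-by-rank: hang the shallower tree under the deeper one.
        ra, rb = find(a), find(b)
        if ra == rb:
            return False            # same group: edge would create a cycle
        if rank[ra] < rank[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        if rank[ra] == rank[rb]:
            rank[ra] += 1
        return True

    total = 0
    for u, v, c in sorted(edges, key=lambda e: e[2]):
        if union(u, v):             # keep the edge iff it joins two groups
            total += c
    return total
```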
MSTs: State-of-the-Art and Open Questions [Advanced - Optional] 9 min
XXII. CLUSTERING (Week 2)
Application to Clustering 11 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/QWubN/applicationtoclustering
Call points p and q separated if they’re assigned to different clusters.
 Initially, each point is in a separate cluster.
 Repeat until only k clusters remain:
  Let p, q = the closest pair of separated points (this pair determines the current spacing).
  Merge the clusters containing p and q into a single cluster.
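The pseudocode above is exactly Kruskal's algorithm stopped early, which is easy to sketch with union-find (the function name is mine; `edges` holds the pairwise distances):

```python
def max_spacing(n, edges, k):
    """Single-link clustering: run Kruskal until k clusters remain.

    n     : number of points, labeled 0..n-1
    edges : iterable of (p, q, distance) triples
    k     : desired number of clusters (1 < k <= n)
    Returns the maximum spacing: the smallest distance between two
    points assigned to different clusters.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    clusters = n
    edges = sorted(edges, key=lambda e: e[2])
    i = 0
    while clusters > k:                     # merge closest separated pair
        p, q, d = edges[i]
        i += 1
        rp, rq = find(p), find(q)
        if rp != rq:
            parent[rq] = rp
            clusters -= 1
    # Spacing = cheapest remaining edge whose endpoints are separated.
    for p, q, d in edges[i:]:
        if find(p) != find(q):
            return d
```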
Correctness of Clustering Algorithm 9 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/7lWTf/correctnessofclusteringalgorithm
XXIII. ADVANCED UNION-FIND (Week 2)
Lazy Unions [Advanced - Optional] 10 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/WX3tg/lazyunionsadvancedoptional
Union-by-Rank [Advanced - Optional] 12 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/RF7aH/unionbyrankadvancedoptional
Analysis of Union-by-Rank [Advanced - Optional] 14 min
Path Compression [Advanced - Optional] 14 min
Path Compression: The Hopcroft-Ullman Analysis I [Advanced - Optional] 9 min
Path Compression: The Hopcroft-Ullman Analysis II [Advanced - Optional] 11 min
The Ackermann Function [Advanced - Optional] 16 min
Path Compression: Tarjan’s Analysis I [Advanced - Optional] 14 min
Path Compression: Tarjan’s Analysis II [Advanced - Optional] 13 min
Problem Set #2
Quiz: Problem Set #2 5 questions
QUIZ
Problem Set #2
5 questions
To pass: 80% or higher
Attempts: 2 every 12 hours
Deadline
July 30, 11:59 PM PDT
1 point
1.Suppose we are given a directed graph G=(V,E) in which every edge has a distinct positive edge weight. A directed graph is acyclic if it has no directed cycle. Suppose that we want to compute the maximumweight acyclic subgraph of G (where the weight of a subgraph is the sum of its edges’ weights). Assume that G is weakly connected, meaning that there is no cut with no edges crossing it in either direction.
Here is an analog of Prim’s algorithm for directed graphs. Start from an arbitrary vertex s, initialize S={s} and F=∅. While S≠V, find the maximumweight edge (u,v) with one endpoint in S and one endpoint in V−S. Add this edge to F, and add the appropriate endpoint to S.
Here is an analog of Kruskal’s algorithm. Sort the edges from highest to lowest weight. Initialize F=∅. Scan through the edges; at each iteration, add the current edge i to F if and only if it does not create a directed cycle.
Which of the following is true?
This should not be selected
Four vertices suffice for a counterexample.
Correct
Indeed. Any ideas for a correct algorithm?
1 point
2.Consider a connected undirected graph G with edge costs that are not necessarily distinct. Suppose we replace each edge cost $c_e$ by $−c_e$; call this new graph G′. Consider running either Kruskal’s or Prim’s minimum spanning tree algorithm on G′, with ties between edge costs broken arbitrarily, and possibly differently, in each algorithm. Which of the following is true?
Correct
Different tiebreaking rules generally yield different spanning trees.
This should not be selected
Does the spanning tree produced by either algorithm depend on how ties between edge costs are broken?
1 point
3.Consider the following algorithm that attempts to compute a minimum spanning tree of a connected undirected graph G with distinct edge costs. First, sort the edges in decreasing cost order (i.e., the opposite of Kruskal’s algorithm). Initialize T to be all edges of G. Scan through the edges (in the sorted order), and remove the current edge from T if and only if it lies on a cycle of T.
Which of the following statements is true?
Correct
During the iteration in which an edge is removed, it was on a cycle C of T. By the sorted ordering, it must be the maximumcost edge of C. By an exchange argument, it cannot be a member of any minimum spanning tree. Since every edge deleted by the algorithm belongs to no MST, and its output is a spanning tree (no cycles by construction, connected by the Lonely Cut Corollary), its output must be the (unique) MST.
1 point
4.Consider an undirected graph G=(V,E) where edge e∈E has cost $c_e$. A minimum bottleneck spanning tree T is a spanning tree that minimizes the maximum edge cost $\max_{e \in T} c_e$. Which of the following statements is true? Assume that the edge costs are distinct.
Correct
For the positive statement, recall the following (from correctness of Prim’s algorithm): for every edge e of the MST, there is a cut (A,B) for which e is the cheapest one crossing it. This implies that every other spanning tree has maximum edge cost at least as large. For the negative statement, use a triangle with one extra high-cost edge attached.
This should not be selected
Is the minimum bottleneck spanning tree unique (even with distinct edge costs)?
1 point
5.You are given a connected undirected graph G with distinct edge costs, in adjacency list representation. You are also given the edges of a minimum spanning tree T of G. This question asks how quickly you can recompute the MST if we change the cost of a single edge. Which of the following are true? [RECALL: It is not known how to deterministically compute an MST from scratch in O(m) time, where m is the number of edges of G.] [Check all that apply.]
Correct
Let A,B be the two connected components of T−{e}. Edge e no longer belongs to the new MST if and only if it is no longer the cheapest edge crossing the cut (A,B) (this can be checked in O(m) time). If f is the new cheapest edge crossing (A,B), then the new MST is T−{e}∪{f}.
Correct
The MST does not change (by the Cut Property), so no recomputation is needed.
Correct
Let C be the cycle of T∪{e}. Edge e belongs to the new MST if and only if it is no longer the most expensive edge of C (this can be checked in O(n) time). If f is the new most expensive edge of C, then the new MST is T∪{e}−{f}.
Correct
The MST does not change (by the Cycle Property of the previous problem), so no recomputation is needed.
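The O(n)-time cycle check described in these answers is easy to sketch in code. Below is a minimal illustration (the helper names `tree_path_edges` and `new_mst_edges` are mine, not from the course): given the MST's edges and a non-tree edge whose cost has dropped, find the tree path between its endpoints, and swap the edge in only if it beats the most expensive edge on that cycle.

```python
def tree_path_edges(tree_adj, a, b):
    """Edges (u, v, cost) on the unique a-b path of a tree."""
    stack, parent = [a], {a: None}
    while stack:                          # DFS from a, recording parents
        x = stack.pop()
        for y, c in tree_adj.get(x, []):
            if y not in parent:
                parent[y] = (x, c)
                stack.append(y)
    path = []
    while parent[b] is not None:          # walk back from b to a
        x, c = parent[b]
        path.append((x, b, c))
        b = x
    return path

def new_mst_edges(mst_edges, e, new_cost):
    """MST after non-tree edge e = (a, b) has its cost lowered to new_cost.

    The cycle of T + e is the a-b tree path plus e itself; e enters the
    new MST iff it is not the most expensive edge of that cycle, evicting
    the current maximum (the O(n) check from the quiz answer above).
    """
    tree_adj = {}
    for u, v, c in mst_edges:
        tree_adj.setdefault(u, []).append((v, c))
        tree_adj.setdefault(v, []).append((u, c))
    a, b = e
    wu, wv, wc = max(tree_path_edges(tree_adj, a, b), key=lambda t: t[2])
    if wc <= new_cost:
        return mst_edges                  # e is still the cycle's max: no change
    return [t for t in mst_edges if {t[0], t[1]} != {wu, wv}] + [(a, b, new_cost)]
```

For example, on the triangle 1-2-3 with MST edges (1,2) of cost 1 and (2,3) of cost 2, dropping the cost of (1,3) to 1.5 evicts (2,3).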
Optional Theory Problems (Week 2) 10 min
The following problems are for those of you looking to challenge yourself beyond the required problem sets and programming questions. They are completely optional and will not be graded. While they vary in level, many are pretty challenging, and we strongly encourage you to discuss ideas and approaches with your fellow students on the “Theory Problems” discussion forum.
 Consider a connected undirected graph G with edge costs, which need not be distinct. Prove the following statement or provide a counterexample: for every MST T of G, there exists a way to sort G’s edges in nondecreasing order of cost so that Kruskal’s algorithm outputs the tree T.
 Recall the definition of a minimum bottleneck spanning tree from Problem Set #2. Give a linear-time (i.e., O(m)) algorithm for computing a minimum bottleneck spanning tree of a connected undirected graph. [Hint: make use of a nontrivial linear-time algorithm discussed in Part 1 of the specialization.]
 Consider a connected undirected graph G with distinct edge costs that are positive integers between 1 and $n^3$, where n is the number of vertices of G. How fast can you compute the MST of G?
 Read about matroids and the greedy algorithm that correctly computes a maximum-weight basis of a matroid. For the matroid of spanning trees of a graph, this algorithm becomes Kruskal’s algorithm. Can you formulate an analog of Prim’s MST algorithm for matroids?
 Prove that our analysis of union-find with lazy unions and union by rank (but without path compression) is asymptotically optimal (i.e., there are sequences of operations where you do Θ(log n) work on most of the operations).
 Prove that in our union-find data structure with lazy unions, union by rank, and path compression, some operations might require Θ(log n) time.
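The last two problems concern the union-find data structure from lecture. For reference, here is a minimal sketch (my own illustration, not course code) with lazy unions, union by rank, and path compression:

```python
class UnionFind:
    """Union-find with lazy unions, union by rank, and path compression."""

    def __init__(self, n):
        self.parent = list(range(n))   # each element starts as its own root
        self.rank = [0] * n

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:  # path compression: repoint to the root
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        """Merge the groups of x and y; return False if already merged."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        if self.rank[rx] < self.rank[ry]:  # union by rank: shallower under deeper
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True
```

With both optimizations, a sequence of m operations takes O(m α(n)) time; the problems above ask you to show the Θ(log n) bounds that hold without (and occasionally even with) compression.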
Programming Assignment #2
Quiz: Programming Assignment #2 2 questions Due in 6 days
QUIZ
Programming Assignment #2
2 questions
To Pass: 100% or higher
Attempts: 10 every 12 hours
Deadline
July 30, 11:59 PM PDT
1 point
1.In this programming problem and the next you’ll code up the clustering algorithm from lecture for computing a max-spacing k-clustering.
Download the text file below.
This file describes a distance function (equivalently, a complete graph with edge costs). It has the following format:
[number_of_nodes]
[edge 1 node 1] [edge 1 node 2] [edge 1 cost]
[edge 2 node 1] [edge 2 node 2] [edge 2 cost]
…
There is one edge (i,j) for each choice of 1≤i<j≤n, where n is the number of nodes.
For example, the third line of the file is “1 3 5250”, indicating that the distance between nodes 1 and 3 (equivalently, the cost of the edge (1,3)) is 5250. You can assume that distances are positive, but you should NOT assume that they are distinct.
Your task in this problem is to run the clustering algorithm from lecture on this data set, where the target number k of clusters is set to 4. What is the maximum spacing of a 4-clustering?
ADVICE: If you’re not getting the correct answer, try debugging your algorithm using some small test cases. And then post them to the discussion forum!
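A sketch of the greedy clustering algorithm from lecture, under assumed input conventions (nodes numbered 1..n, edges as `(cost, u, v)` tuples): run Kruskal-style union-find merges on edges in increasing cost order until k clusters remain, then report the cheapest edge that still crosses two clusters.

```python
def max_spacing(n, edges, k):
    """Max-spacing of a k-clustering (single-link / Kruskal-style greedy)."""
    parent = list(range(n + 1))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    edges = sorted(edges)                  # increasing cost
    clusters, i = n, 0
    while clusters > k:                    # merge closest pairs first
        cost, u, v = edges[i]
        i += 1
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            clusters -= 1
    # spacing = cheapest remaining edge with endpoints in different clusters
    for cost, u, v in edges[i:]:
        if find(u) != find(v):
            return cost
```

Ties in edge cost (which this data set allows) are broken arbitrarily here; the spacing value is unaffected.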
1 point
2.In this question your task is again to run the clustering algorithm from lecture, but on a MUCH bigger graph. So big, in fact, that the distances (i.e., edge costs) are only defined implicitly, rather than being provided as an explicit list.
The data set is below.
The format is:
[# of nodes] [# of bits for each node’s label]
[first bit of node 1] … [last bit of node 1]
[first bit of node 2] … [last bit of node 2]
…
For example, the third line of the file “0 1 1 0 0 1 1 0 0 1 0 1 1 1 1 1 1 0 1 0 1 1 0 1” denotes the 24 bits associated with node #2.
The distance between two nodes u and v in this problem is defined as the Hamming distance (the number of differing bits) between the two nodes’ labels. For example, the Hamming distance between the 24-bit label of node #2 above and the label “0 1 0 0 0 1 0 0 0 1 0 1 1 1 1 1 1 0 1 0 0 1 0 1” is 3 (since they differ in the 3rd, 7th, and 21st bits).
The question is: what is the largest value of k such that there is a k-clustering with spacing at least 3? That is, how many clusters are needed to ensure that no pair of nodes with all but 2 bits in common get split into different clusters?
NOTE: The graph implicitly defined by the data file is so big that you probably can’t write it out explicitly, let alone sort the edges by cost. So you will have to be a little creative to complete this part of the question. For example, is there some way you can identify the smallest distances without explicitly looking at every pair of nodes?
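One such approach (a sketch; the function name and conventions are mine): read each label as an integer, index nodes by label in a dict, and for each node probe only the labels reachable by flipping at most 2 of its bits, unioning any matches. That is about n·(1 + b + b(b−1)/2) dictionary lookups instead of examining all ~n²/2 pairs.

```python
from itertools import combinations

def count_clusters(labels, num_bits):
    """Clusters remaining after merging all pairs at Hamming distance <= 2,
    i.e. the largest k with spacing >= 3. labels: list of ints."""
    parent = list(range(len(labels)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    index = {}                              # label value -> node indices
    for i, lab in enumerate(labels):
        index.setdefault(lab, []).append(i)

    # All bit-flip masks of Hamming weight 0, 1, or 2.
    masks = ([0]
             + [1 << b for b in range(num_bits)]
             + [(1 << b1) | (1 << b2)
                for b1, b2 in combinations(range(num_bits), 2)])
    for i, lab in enumerate(labels):
        for m in masks:
            for j in index.get(lab ^ m, []):   # nodes within distance <= 2
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj
    return len({find(i) for i in range(len(labels))})
```

Duplicate labels (distance 0) are handled automatically, since they share a bucket in the index.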
https://github.com/dbarabanov/coursera/blob/master/algorithms_2/assignment_2/question_1.py
Week 3
Huffman codes; introduction to dynamic programming.
XXIV. HUFFMAN CODES (Week 3)
Week 3 Overview 10 min
HUFFMAN CODES: Everybody loves compression. It means more songs on your smartphone, faster downloads, and so on. Huffman coding is a fundamental type of lossless compression, used for example in the MP3 standard. In these videos we’ll learn about the optimality of Huffman codes, and a blazingly fast greedy algorithm for computing them.
DYNAMIC PROGRAMMING: This week also introduces the dynamic programming design paradigm; next week we’ll see a selection of killer applications. For now, we content ourselves with the development of a linear-time algorithm for a relatively simple problem, computing a maximum-weight independent set of a path graph. We conclude the week by zooming out and identifying the key principles of dynamic programming.
HOMEWORK #3: The third problem set and programming assignment reinforce the concepts and algorithms studied in the lectures, and the programming assignment asks you to implement the greedy algorithm for Huffman coding and the dynamic programming algorithm for finding a maximum-weight independent set of a path.
SUGGESTED READINGS FOR WEEK 3:
 CLRS Chapter 16 (Section 3) and Chapter 15 (Section 3)
 DPV Sections 5.2 and 6.7
 KT Sections 4.8, 6.1, 6.2
 SW Section 5.5
Introduction and Motivation 9 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/plgXS/introductionandmotivation
fixed-length
variable-length (prefix-free)
Problem Definition 10 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/IWxVe/problemdefinition
A binary tree T minimizing the average encoding length L.
A Greedy Algorithm 16 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/ZIJwv/agreedyalgorithm
Huffman’s Algorithm
A More Complex Example 4 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/rTB4s/amorecomplexexample
Correctness Proof I 10 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/eSz8f/correctnessproofi
Correctness Proof II 12 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/l3Ss5/correctnessproofii
XXV. INTRODUCTION TO DYNAMIC PROGRAMMING (Week 3)
Introduction: Weighted Independent Sets in Path Graphs 7 min
max-weight independent set
WIS in Path Graphs: Optimal Substructure 9 min
Upshot: A max-weight IS must be either
(i) a max-weight IS of G′, or
(ii) $v_n$ plus a max-weight IS of G′′
WIS in Path Graphs: A LinearTime Algorithm 9 min
A[0] = 0; A[1] = w_1
For i = 2, 3, …, n:
A[i] = max{ A[i - 1], A[i - 2] + w_i }
WIS in Path Graphs: A Reconstruction Algorithm 6 min
 Let S = ∅ and i = n
 While i >= 2 [scan through array from right to left]
 If A[i - 1] >= A[i - 2] + w_i [i.e., case 1 wins]
 Decrease i by 1
 Else [i.e., case 2 wins]
 Add v_i to S, decrease i by 2
 If i = 1, add v_1 to S
 Return S
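Putting the forward DP pass and the right-to-left reconstruction together, a minimal sketch in Python (1-based vertex numbering as in lecture):

```python
def max_weight_is(weights):
    """Linear-time max-weight independent set of a path graph.

    Returns (optimal value, set of chosen 1-based vertices)."""
    n = len(weights)
    A = [0] * (n + 1)                     # A[i] = opt value of first i vertices
    if n >= 1:
        A[1] = weights[0]
    for i in range(2, n + 1):
        A[i] = max(A[i - 1], A[i - 2] + weights[i - 1])

    S, i = set(), n                       # reconstruction: scan right to left
    while i >= 2:
        if A[i - 1] >= A[i - 2] + weights[i - 1]:   # case 1 wins
            i -= 1
        else:                                       # case 2: vertex i is in
            S.add(i)
            i -= 2
    if i == 1:                            # lone leftmost vertex remains
        S.add(1)
    return A[n], S
```

On the path with weights 1, 4, 5, 4, this picks vertices 2 and 4 for a total of 8.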
Principles of Dynamic Programming 7 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/VEc7L/principlesofdynamicprogramming
Problem Set #3
Quiz: Problem Set #3 5 questions
QUIZ
Problem Set #3
5 questions
To Pass: 80% or higher
Attempts: 2 every 12 hours
Deadline
August 6, 11:59 PM PDT
1 point
1.Consider an alphabet with five letters, {a,b,c,d,e}, and suppose we know the frequencies f_a=0.32, f_b=0.25, f_c=0.2, f_d=0.18, and f_e=0.05. What is the expected number of bits used by Huffman’s coding scheme to encode a 1000-letter document?
merge d and e: a:0.32, b:0.25, c:0.2, de:0.23
merge c and de: a:0.32, b:0.25, cde:0.43
merge a and b: ab:0.57, cde:0.43
final merge: abcde:1.0
codewords: a:00, b:01, c:10, d:110, e:111
expected bits per letter: 0.32×2 + 0.25×2 + 0.2×2 + 0.18×3 + 0.05×3 = 2.23
total: 2.23 × 1000 = 2230 bits
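The hand computation above can be cross-checked mechanically: every merge of total weight w adds one bit to each symbol beneath it, so the expected codeword length is simply the sum of the merged weights over all n−1 merges.

```python
import heapq

def huffman_expected_length(freqs):
    """Expected codeword length of a Huffman code for the given frequencies.

    Each merge of weight w lengthens every codeword under it by 1 bit,
    so the answer is the sum of all merged weights."""
    heap = list(freqs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a, b = heapq.heappop(heap), heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, a + b)
    return total

print(huffman_expected_length([0.32, 0.25, 0.2, 0.18, 0.05]))  # ~2.23
```

Multiplying by 1000 letters reproduces the 2230-bit answer.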
1 point
2.Under a Huffman encoding of n symbols, how long (in terms of number of bits) can a codeword possibly be?
This should not be selected
This is the bestcase scenario for the length of the longest codeword; in general the longest codeword can be longer than this.
Correct
For the lower bound, take frequencies proportional to powers of 2. For the upper bound, note that the total number of merges is exactly n−1.
1 point
3.Which of the following statements holds for Huffman’s coding scheme?
Such a letter will endure a merge in at least two iterations: the last one (which involves all letters), and at least one previous iteration. In the penultimate iteration, if the letter has not yet endured a merge, at least one of the two other remaining subtrees has cumulative frequency at least (1−.33)/2>.33, so the letter will get merged in this iteration.
This should not be selected
Counterexample: frequencies 40%, 40%, and 20%.
This should not be selected
Counterexample: frequencies 40%, 30%, and 30%.
1 point
4.Which of the following is true for our dynamic programming algorithm for computing a maximumweight independent set of a path graph? (Assume there are no ties.)
Correct
By induction, since the optimal solution to a subproblem depends only on the solutions of the previous two subproblems.
1 point
5.Recall our dynamic programming algorithm for computing the maximum-weight independent set of a path graph. Consider the following proposed extension to more general graphs. Consider an undirected graph with positive vertex weights. For a vertex v, obtain the graph G′(v) by deleting v and its incident edges from G, and obtain the graph G′′(v) from G by deleting v, its neighbors, and all of the corresponding incident edges from G. Let OPT(H) denote the value of a maximum-weight independent set of a graph H. Consider the formula OPT(G) = max{OPT(G′(v)), w_v + OPT(G′′(v))}, where v is an arbitrary vertex of G of weight w_v. Which of the following statements is true?
This should not be selected
Do weighted independent sets in trees have nice optimal substructure?
Correct
Indeed. What running time can you get?
Programming Assignment #3
Quiz: Programming Assignment #3 3 questions
QUIZ
Programming Assignment #3
3 questions
To Pass: 100% or higher
Attempts: 10 every 12 hours
Deadline
August 6, 11:59 PM PDT
1 point
1.In this programming problem and the next you’ll code up the greedy algorithm from the lectures on Huffman coding.
Download the text file below.
huffman.txt
This file describes an instance of the problem. It has the following format:
[number_of_symbols]
[weight of symbol #1]
[weight of symbol #2]
…
For example, the third line of the file is “6852892,” indicating that the weight of the second symbol of the alphabet is 6852892. (We’re using weights instead of frequencies, like in the “A More Complex Example” video.)
Your task in this problem is to run the Huffman coding algorithm from lecture on this data set. What is the maximum length of a codeword in the resulting Huffman code?
ADVICE: If you’re not getting the correct answer, try debugging your algorithm using some small test cases. And then post them to the discussion forum!
1 point
2.Continuing the previous problem, what is the minimum length of a codeword in your Huffman code?
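For both of these questions, one way to avoid building the tree explicitly is to carry each subtree's deepest and shallowest leaf depths through the heap merges (a sketch of mine, not the official solution; when weights tie, the arbitrary tie-break can change individual codeword lengths, though not the expected length):

```python
import heapq

def huffman_depth_range(weights):
    """(max, min) codeword length of a Huffman code for the given weights."""
    # Heap items: (weight, tiebreak, max_leaf_depth, min_leaf_depth);
    # the unique tiebreak keeps tuple comparison away from the depths.
    heap = [(w, i, 0, 0) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    tiebreak = len(weights)
    while len(heap) > 1:
        w1, _, mx1, mn1 = heapq.heappop(heap)
        w2, _, mx2, mn2 = heapq.heappop(heap)
        # Merging hangs both subtrees one level deeper under a new root.
        heapq.heappush(heap, (w1 + w2, tiebreak,
                              max(mx1, mx2) + 1, min(mn1, mn2) + 1))
        tiebreak += 1
    _, _, mx, mn = heap[0]
    return mx, mn
```

For weights roughly proportional to powers of two, e.g. 1, 1, 2, 4, 8, the code degenerates into a path and the longest codeword has length n−1.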
1 point
3.In this programming problem you’ll code up the dynamic programming algorithm for computing a maximumweight independent set of a path graph.
Download the text file below.
mwis.txt
This file describes the weights of the vertices in a path graph (with the weights listed in the order in which vertices appear in the path). It has the following format:
[number_of_vertices]
[weight of first vertex]
[weight of second vertex]
…
For example, the third line of the file is “6395702,” indicating that the weight of the second vertex of the graph is 6395702.
Your task in this problem is to run the dynamic programming algorithm (and the reconstruction procedure) from lecture on this data set. The question is: of the vertices 1, 2, 3, 4, 17, 117, 517, and 997, which ones belong to the maximum-weight independent set? (By “vertex 1” we mean the first vertex of the graph; there is no vertex 0.) In the box below, enter an 8-bit string, where the i-th bit should be 1 if the i-th of these 8 vertices is in the maximum-weight independent set, and 0 otherwise. For example, if you think that the vertices 1, 4, 17, and 517 are in the maximum-weight independent set and the other four vertices are not, then you should enter the string 10011010 in the box below.
Week 4
Advanced dynamic programming:
the knapsack problem,
sequence alignment,
and optimal binary search trees.
XXVI. THE KNAPSACK PROBLEM (Week 4)
Week 4 Overview 10 min
DYNAMIC PROGRAMMING BOOT CAMP: This week is devoted to the dynamic programming design paradigm and a selection of killer applications: the famous Knapsack problem in Part XXVI, the sequence alignment problem in Part XXVII, and computing optimal binary search trees in Part XXVIII.
HOMEWORK #4: The fourth problem set and programming assignment reinforce the concepts and algorithms studied in the lectures, and the programming assignment asks you to implement a dynamic programming algorithm for the Knapsack problem.
SUGGESTED READINGS FOR WEEK 4:
 CLRS Chapter 15
 DPV Chapter 6
 KT Sections 6.3-6.6
The Knapsack Problem 9 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/LIgLJ/theknapsackproblem
Developing a Dynamic Programming Algorithm
Step 1: Formulate a recurrence [optimal solution as a function of solutions to smaller subproblems] based on the structure of an optimal solution.
Cases 1 and 2, with proofs.
A Dynamic Programming Algorithm 9 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/0n68L/adynamicprogrammingalgorithm
knapsack algorithm
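A compact version of the lecture's table-filling knapsack algorithm. This sketch collapses the two-dimensional table A[i][x] into a single rolling array (a standard space optimization of what the lecture presents); iterating capacities downward ensures each item is counted at most once.

```python
def knapsack(values, weights, capacity):
    """0/1 knapsack, O(n * W) time, O(W) space.

    A[x] holds the best value achievable with the items seen so far
    and residual capacity x."""
    A = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        # Descending capacities: A[x - w] still refers to the previous
        # item's row, so each item is used at most once.
        for x in range(capacity, w - 1, -1):
            A[x] = max(A[x], A[x - w] + v)
    return A[capacity]
```

Weights and the capacity must be integers (as the programming assignment guarantees); values may be arbitrary.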
Example [Review  Optional] 12 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/LADQc/examplereviewoptional
XXVII. SEQUENCE ALIGNMENT (Week 4)
Optimal Substructure 13 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/QJkyp/optimalsubstructure
A Dynamic Programming Algorithm 12 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/tNmae/adynamicprogrammingalgorithm
XXVIII. OPTIMAL BINARY SEARCH TREES (Week 4)
Problem Definition 12 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/GKCeN/problemdefinition
Optimal Substructure 9 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/rUDLu/optimalsubstructure
Proof of Optimal Substructure 6 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/0qjbs/proofofoptimalsubstructure
A Dynamic Programming Algorithm I 9 min
https://www.coursera.org/learn/algorithmsgreedy/lecture/3wrTN/adynamicprogrammingalgorithmi
A Dynamic Programming Algorithm II 9 min
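The optimal BST recurrence from these lectures — for keys i..j, C(i,j) = (sum of the frequencies of keys i..j) + min over roots r of C(i,r−1) + C(r+1,j) — in a short O(n³) sketch (my own illustration):

```python
def optimal_bst_cost(freqs):
    """Minimum average search time over all BSTs on keys 1..n.

    freqs[k-1] is the search frequency of key k; C[i][j] is the optimal
    cost for the contiguous key range i..j (0 when the range is empty)."""
    n = len(freqs)
    C = [[0.0] * (n + 2) for _ in range(n + 2)]
    for s in range(n):                     # s = j - i, smallest ranges first
        for i in range(1, n - s + 1):
            j = i + s
            total = sum(freqs[i - 1:j])    # every key in range costs 1 visit
            C[i][j] = total + min(C[i][r - 1] + C[r + 1][j]
                                  for r in range(i, j + 1))
    return C[1][n]
```

For instance, with three keys of frequencies 0.2, 0.5, 0.3, the optimal tree roots the middle key, for average search time 0.5·1 + 0.2·2 + 0.3·2 = 1.5. Precomputing the range sums improves this to O(n³) with O(1) work per (i, j, r) triple; the optional problem below asks for O(n²) overall.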
Problem Set #4
Quiz: Problem Set #4 5 questions
QUIZ
Problem Set #4
5 questions
To Pass: 80% or higher
Attempts: 2 every 12 hours
Deadline
August 13, 11:59 PM PDT
1 point
1.Consider a variation of the Knapsack problem where we have two knapsacks, with integer capacities W1 and W2. As usual, we are given n items with positive values and positive integer weights. We want to pick subsets S1, S2 with maximum total value (i.e., ∑_{i∈S1} v_i + ∑_{i∈S2} v_i) such that the total weights of S1 and S2 are at most W1 and W2, respectively. Assume that every item fits in either knapsack (i.e., w_i ≤ min{W1, W2} for every item i). Consider the following two algorithmic approaches. (1) Use the algorithm from lecture to pick a max-value feasible solution S1 for the first knapsack, and then run it again on the remaining items to pick a max-value feasible solution S2 for the second knapsack. (2) Use the algorithm from lecture to pick a max-value feasible solution for a knapsack with capacity W1+W2, and then split the chosen items into two sets S1, S2 that have size at most W1 and W2, respectively. Which of the following statements is true?
Correct
Indeed. Can you devise from scratch a dynamic programming algorithm that correctly solves the problem?
This should not be selected
There’s a counterexample with 4 items.
1 point
2.Recall the dynamic programming algorithms from lecture for the Knapsack and sequence alignment problems. Both fill in a twodimensional table using a doublefor loop. Suppose we reverse the order of the two for loops. (I.e., cut and paste the second for loop in front of the first for loop, without otherwise changing the text in any way.) Are the resulting algorithms still well defined and correct?
Correct
The necessary subproblem solutions are still available for constanttime lookup.
1 point
3.Consider an instance of the optimal binary search tree problem with 7 keys (say 1,2,3,4,5,6,7 in sorted order) and frequencies w_1=.05, w_2=.4, w_3=.08, w_4=.04, w_5=.1, w_6=.1, w_7=.23. What is the minimum-possible average search time of a binary search tree with these keys?
Correct
1 point
4.The following problems all take as input two strings X and Y, of length m and n, over some alphabet Σ. Which of them can be solved in O(mn) time? [Check all that apply.]
Correct
Similar dynamic programming to sequence alignment, with one subproblem for each X_i and Y_j. Alternatively, this reduces to sequence alignment by setting the gap penalty to 1 and making the penalty of matching two different characters very large.
Correct
This problem can be solved in O(n) time, without dynamic programming. Just count the frequency of each symbol in each string. The permutation f exists if and only if every symbol occurs exactly the same number of times in each string.
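That counting argument fits in two lines with Python's `Counter` (a trivial but concrete illustration):

```python
from collections import Counter

def is_permutation(x, y):
    """O(n) check, no DP needed: a symbol-permuting map f exists iff
    every symbol occurs the same number of times in both strings."""
    return Counter(x) == Counter(y)
```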
Correct
Variation on the original sequence alignment dynamic program. With each subproblem, you need to keep track of what gaps you insert, since the costs you incur in the current position depend on whether or not the previous subproblems inserted gaps. Blows up the number of subproblems and running time by a constant factor.
Correct
Similar dynamic programming to sequence alignment, with one subproblem for each X_i and Y_j.
1 point
5.Recall our dynamic programming algorithms for maximum-weight independent set, sequence alignment, and optimal binary search trees. The space requirements of these algorithms are proportional to the number of subproblems that get solved: Θ(n) (where n is the number of vertices), Θ(mn) (where m, n are the lengths of the two strings), and Θ(n²) (where n is the number of keys), respectively.
Suppose we only want to compute the value of an optimal solution (the final answer of the first, forward pass) and don’t care about actually reconstructing an optimal solution (i.e., we skip the second, reverse pass over the table). How much space do you then really need to run each of the three algorithms?
This should not be selected
By throwing away old subproblems that you won’t ever need again, you can save space.
This should not be selected
How do you avoid keeping track of Θ(n²) subproblem solutions when computing an optimal binary search tree?
Optional Theory Problems (Week 4) 10 min
Give a dynamic programming algorithm that computes an optimal binary search tree and runs in O(n²) time.
Programming Assignment #4
Quiz: Programming Assignment #4 2 questions
QUIZ
Programming Assignment #4
2 questions
To Pass: 100% or higher
Attempts: 10 every 12 hours
Deadline
August 13, 11:59 PM PDT
1 point
1.In this programming problem and the next you’ll code up the knapsack algorithm from lecture.
Let’s start with a warmup. Download the text file below.
knapsack1.txt
This file describes a knapsack instance, and it has the following format:
[knapsack_size] [number_of_items]
[value_1] [weight_1]
[value_2] [weight_2]
…
For example, the third line of the file is “50074 659”, indicating that the second item has value 50074 and size 659, respectively.
You can assume that all numbers are positive. You should assume that item weights and the knapsack capacity are integers.
In the box below, type in the value of the optimal solution.
ADVICE: If you’re not getting the correct answer, try debugging your algorithm using some small test cases. And then post them to the discussion forum!
1 point
2.This problem also asks you to solve a knapsack instance, but a much bigger one.
Download the text file below.
knapsack_big.txt
This file describes a knapsack instance, and it has the following format:
[knapsack_size] [number_of_items]
[value_1] [weight_1]
[value_2] [weight_2]
…
For example, the third line of the file is “50074 834558”, indicating that the second item has value 50074 and size 834558, respectively. As before, you should assume that item weights and the knapsack capacity are integers.
This instance is so big that the straightforward iterative implementation uses an infeasible amount of time and space. So you will have to be creative to compute an optimal solution. One idea is to go back to a recursive implementation, solving subproblems — and, of course, caching the results to avoid redundant work — only on an “as needed” basis. Also, be sure to think about appropriate data structures for storing and looking up solutions to subproblems.
In the box below, type in the value of the optimal solution.
ADVICE: If you’re not getting the correct answer, try debugging your algorithm using some small test cases. And then post them to the discussion forum!
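One concrete way to realize the "as needed" idea above (a sketch, not the official solution): a top-down recursion memoized with `lru_cache`, whose backing dict only ever stores the subproblems actually reached. For the real instance you would raise the recursion limit further or convert the recursion to an explicit stack.

```python
import sys
from functools import lru_cache

def knapsack_memoized(values, weights, capacity):
    """Top-down 0/1 knapsack: solve subproblems only on demand,
    caching results so no subproblem is solved twice."""
    sys.setrecursionlimit(10000)   # recursion depth is at most n + a little

    @lru_cache(maxsize=None)       # the cache dict is the lookup structure
    def best(i, cap):
        """Optimal value using only the first i items with capacity cap."""
        if i == 0:
            return 0
        skip = best(i - 1, cap)
        if weights[i - 1] > cap:   # item i doesn't fit
            return skip
        return max(skip, best(i - 1, cap - weights[i - 1]) + values[i - 1])

    return best(len(values), capacity)
```

Compared with the full table, this touches only the capacity values reachable by some subset of item weights, which is what makes the big instance tractable.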
Final Exam (1 attempt per 24 hours)
Info and FAQ for final exam 10 min
Unlike the problem sets, you have only ONE attempt per 24 hours. Thus DO NOT START THE EXAM until you are ready.
There are 10 questions, and each is worth 1 point. Many are of the “select all that apply” type, rather than the strict multiple choice format that is more common on the problem sets. You will receive partial credit for each option that you leave correctly checked or correctly unchecked. (So if you mark 4 out of the 5 options for a problem correctly, you’ll receive 0.8 out of the 1 point.) You need 7 points total to pass the exam. All questions are about material covered in the required (i.e., nonoptional) videos.
Roughly half of the questions follow fairly directly from the lecture material (assuming that you understand it thoroughly). Roughly a quarter of the questions are variations on problem set questions. Roughly a quarter of the questions demand more thought, for example by asking you to consider whether certain results from lecture do or do not extend to more general situations.
Good luck!
Quiz: Final Exam 10 questions
QUIZ
Final Exam
10 questions
To Pass: 70% or higher
Attempts: 1 every 1 day
Deadline
August 13, 11:59 PM PDT
1 point
1.Consider a connected undirected graph with distinct edge costs. Which of the following are true? [Check all that apply.]
correct
1 point
2.You are given a connected undirected graph G with distinct edge costs, in adjacency list representation. You are also given the edges of a minimum spanning tree T of G. This question asks how quickly you can recompute the MST if we change the cost of a single edge. Which of the following are true? [RECALL: It is not known how to deterministically compute an MST from scratch in O(m) time, where m is the number of edges of G.] [Check all that apply.]
correct
1 point
3.Which of the following graph algorithms can be sped up using the heap data structure?
correct
1 point
4.Which of the following problems reduce, in a straightforward way, to the minimum spanning tree problem? [Check all that apply.]
correct
1 point
5.Recall the greedy clustering algorithm from lecture and the maxspacing objective function. Which of the following are true? [Check all that apply.]
correct
1 point
6.We are given as input a set of n jobs, where job j has a processing time p_j and a deadline d_j. Recall the definition of completion times C_j from the video lectures. Given a schedule (i.e., an ordering of the jobs), we define the lateness l_j of job j as the amount of time C_j − d_j by which job j completes after its deadline, or as 0 if C_j ≤ d_j.
Our goal is to minimize the total lateness,
∑_j l_j.
Which of the following greedy rules produces an ordering that minimizes the total lateness?
You can assume that all processing times and deadlines are distinct.
WARNING: This is similar to but not identical to a problem from Problem Set #1 (the objective function is different).
correct
1 point
7.Consider an alphabet with five letters, {a,b,c,d,e}, and suppose we know the frequencies f_a=0.28, f_b=0.27, f_c=0.2, f_d=0.15, and f_e=0.1. What is the expected number of bits used by Huffman’s coding scheme to encode a 1000-letter document?
correct
1 point
8.Which of the following extensions of the Knapsack problem can be solved in time polynomial in n, the number of items, and M, the largest number that appears in the input? [Check all that apply.]
correct
1 point
9.The following problems all take as input two strings X and Y, of length m and n, over some alphabet Σ. Which of them can be solved in O(mn) time? [Check all that apply.]
correct
1 point
10.Consider an instance of the optimal binary search tree problem with 7 keys (say 1,2,3,4,5,6,7 in sorted order) and frequencies w_1=.2, w_2=.05, w_3=.17, w_4=.1, w_5=.2, w_6=.03, w_7=.25. What is the minimum-possible average search time of a binary search tree with these keys?
correct