Course Director: David K. A. Mordecai

Co-instructor: Michael O’Neil

Teaching assistants: Kentaro Hanaki and Rahel Jhirad

Lecture time: Tuesday 5:10pm – 7:00pm, 5 Washington Place, Room 101

Recitation time: Thursday 6:15pm – 7:05pm, 5 Washington Place, Room 101

Email the capstone course team at

Office hours

M. O’Neil: Tues 4pm-5pm & Fri 11am-12pm, 1122 WWH (and by appointment)

R. Jhirad: Tues 4pm-5pm, 726 Broadway, 7th floor (and by appointment)

K. Hanaki: Fri 3pm-5pm, 726 Broadway, 7th floor (and by appointment)

D. K. A. Mordecai: By appointment

Important Announcements

  • Check out the Machine Learning seminar at Courant
  • Project proposals due: October 7
  • Midterm presentations: TBD
  • Final presentations: TBD

Course Description

The purpose of the Capstone Project is for students to apply the theoretical knowledge acquired during the Data Science program to a project involving actual data in a realistic setting. During the project, students engage in the entire process of solving a real-world data science problem, from collecting and processing actual data to applying appropriate analytic methods. Both the problem statements for the project assignments and the datasets originate from real-world domains similar to those that students might typically encounter within industry, government, non-governmental organizations (NGOs), or academic research.

Depending on the project’s complexity, students will work individually or in small teams on a problem statement, typically specified by a faculty, industry, or governmental sponsor. The sponsor will usually be responsible for supplying the relevant data set. Research groups (both within and external to NYU) may propose projects. A list of possible projects will be posted early in the semester, so students can align themselves with problem statements corresponding to their individual interests. Pending approval by the Course Director, students are free to design their own problem statement and construct their own data set. As the project and problem statements warrant, students may be permitted to organize into teams of two to three participants. Teams larger than three will be considered for approval on a case-by-case basis. Each project team will be supervised by the Course Director (in some cases with a relevant faculty advisor) and advised by a Project Coach assigned from the academic, governmental, NGO, or industry sponsor. The final problem statements and the composition of the teams will be approved by the Course Director.

Download a copy of the syllabus.


Prerequisites

Successful completion of the following courses within the Data Science Master’s program curriculum:

  • Introduction to Data Science

  • Statistical and Mathematical Methods

  • Machine Learning and Computational Statistics

  • Big Data

Explicit permission of the course director is also sufficient, assuming that the student has previously completed similar course work or gained experience in hands-on projects.

Tentative project sponsors

At the beginning of the semester, possible projects will be proposed by several NYU faculty members as well as several members of industry. For the Fall 2014 semester, students can look forward to hearing about possible data science projects from the following affiliates:

NYU affiliates: Vasant Dhar, David Hogg, Panos Ipeirotis, Roy Lowrance, Dennis Shasha, David Sontag, Josh Tucker, and others

Industry affiliates: Bayes Impact, Button, Datacoup, Keen Home, LMRKTS, pymetrics, Ufora, and others

IBM Watson option

During this semester, students will have the opportunity to work with an instance of IBM’s Watson computing system. This is not a required element of the course, but projects involving natural language processing or question-answering interfaces may benefit greatly. For more information, speak with the instructors or read more here.

Many aspects of the IBM Watson and natural language processing (NLP) elements of the course will be handled by Shekar Pradhan, an expert in NLP and adjunct faculty member in computer science at the NYU School of Engineering.