The Center for Data Science is excited to announce a new PhD Program in Data Science. The program will prepare students to advance the state of the art in data science research and prepare them for outstanding careers in academia or industry. Admitted students are guaranteed financial support in the form of tuition and a stipend in the fall and spring semesters for up to five years. This support allows students to focus on their research goals independently of any grants that may become available through their research advisors.
The Committee welcomes applications from candidates with relevant undergraduate/master’s degrees and candidates with work or research experience in data science. Relevant degrees include: mathematics, statistics, computer science, engineering, and other scientific disciplines that develop skills in drawing inferences or making predictions using data. Coursework or equivalent experience in calculus, probability, statistics, and programming is required.
Visit the GSAS admissions website for more information on application requirements and to apply. Following is the admissions timeline for Fall 2017:
- January 4, 2017 – Application deadline.
- February 2017 – On-campus visit for invited students.
- End of February 2017 – The PhD Admissions Committee will announce decisions.
- April 15, 2017 – Decisions deadline.
To be awarded the PhD in Data Science, students must, within 10 years of first enrolling:
- Complete 72 credit hours while maintaining a cumulative grade point average of 3.0 (out of 4.0).
- Pass a Comprehensive Exam.
- Pass the Depth Qualifying Exam (DQE) by May 15 of their fourth semester.
- Complete all the steps for approval of their PhD dissertation.
Credit Hour Requirements
Students must pass courses with the indicated number of credit hours in each of these categories:
- 5 required courses, with 15 credit hours in total.
- An additional 57 credit hours of elective courses.
Five required courses
Students must successfully complete the following five courses by the end of their third semester, or show evidence that they have taken equivalent coursework elsewhere.
- DSGA-1001 – Introduction to Data Science. This course introduces students to the fundamental principles of data science that underlie data science algorithms, processes, methods, and data-analytic thinking. It introduces students to algorithms and tools based on these principles and to frameworks to support problem-focused data-analytic thinking. It is offered in the fall semester.
- DSGA-1002 – Probability and Statistics for Data Science. This course introduces basic probabilistic and statistical methods needed in the practice of data science. It is offered in the fall semester.
- DSGA-1003 – Machine Learning and Computational Statistics. This course covers a wide variety of topics in machine learning, pattern recognition, statistical modeling, and neural computation. It covers the mathematical methods and theoretical aspects, but primarily focuses on algorithmic and practical issues. It is offered in the spring semester.
- DSGA-1004 – Big Data. This course studies the state of the art in big data management: algorithms, techniques, and tools. It is offered in the spring semester.
- DSGA-1005 – Inference and Representation. This course covers graphical models, causal inference, and advanced topics in statistical machine learning. It is offered in the fall semester.
57 credit hours of elective courses
Students must successfully complete 57 credit hours of elective courses. Faculty at the Center for Data Science are experts in a broad range of data science topics, and the Center’s course offerings reflect that diversity. For example, students will be able to take courses in Deep Learning, Optimization, and Natural Language Processing.
Some of the pre-approved courses are:
- Deep Learning (DSGA-1008). The course covers a wide variety of topics in deep learning, feature learning and neural computation. It covers the mathematical methods and theoretical aspects as well as algorithmic and practical issues. Deep Learning is at the core of many recent advances in AI, particularly in audio, image, video, and language analysis and understanding.
- Optimization-based Data Analysis (MATHGA-2840). This course covers data-analysis methods that exploit low-dimensional structure, captured by sparse or low-rank models, to extract information from data using optimization.
- Mathematics of Data Science (MATHGA-2830). A course designed for PhD students with an interest in doing research in theoretical aspects of algorithms that aim to extract information from data.
- Natural Language Understanding with Distributed Representations (DSGA-3001). This course examines some of the modern computational approaches, mainly using deep learning, to understanding, processing and using natural languages.
- Research Rotation Courses (DSGA-2001-10, multiple sections). A research rotation is a semester-long guided research experience in which the student will have an opportunity to design and carry out original research in a collaborative setting. The idea is to help students identify research interests. Students undertaking research rotations should sign up for a section of the course DSGA-1010 Research Rotation in Data Science, a three-credit course. PhD students normally take this elective 6 times.
- Preparation for Teaching Data Science (DSGA 2001-11). In this class, students learn effective teaching skills for teaching data science topics to university students. They will help prepare and deliver an assigned course.
- Practical Training for Data Science (DSGA-1009). Practical Training offers course credit for academically relevant internship experience. This is an integral part of the PhD Program curriculum and facilitates students’ academic and professional development. The course allows students to apply their academic and research knowledge to real-world problems.
Students can take courses that are not on the pre-approved course list with permission from the Director of Graduate Studies (DGS).
Typically, a student will follow a schedule like the one outlined here:
- First year, fall: 2 required courses and 1 elective/research rotation course
  - Intro to Data Science
  - Probability and Statistics
  - Pre-approved Elective
- First year, spring: 2 required courses and 1 elective/research rotation course
  - Big Data
  - Machine Learning and Computational Statistics
  - Pre-approved Elective
- Second year, fall: 1 required course and 2 electives/research rotation courses
  - Inference and Representation
- Second year, spring: 3 electives/research rotation courses, pass the Depth Qualifying Exam
The competency exam is designed to determine whether the candidate displays the requisite data science knowledge in the areas of machine learning and big data. The exam consists of the final exams from the courses DSGA-1003 Machine Learning and Computational Statistics and DSGA-1004 Big Data. The passing grade is A.
PhD students may sit for the final exam in these courses without registering for the courses.
Depth Qualifying Exam (DQE)
No later than the end of the third semester, each student must:
- Agree on a research advisor. The student is responsible for finding a research advisor, obtaining an agreement to advise the student, and informing the Director of Graduate Studies (DGS) of the agreement. Students must reach agreement with the DGS and the Program Administrator if they wish to change research advisors. If a research advisor determines that they no longer wish to advise a student, the research advisor informs the DGS, who will begin working with the student to find another research advisor.
- Agree with their research advisor on a research project, an exam topic, and a Depth Qualifying Exam (DQE) committee.
- Obtain the approval of the DGS on the research project, exam topic, and DQE committee, as well as the date of the DQE.
No later than the end of their fourth semester, the student must pass the Depth Qualifying Exam (DQE). The exam may be taken no more than twice. The content of the exam is defined by the student’s DQE Committee, which must present a syllabus to the student at least 2 months before the date of the exam.
The exam itself consists of two parts. The first part is a written or oral examination of the topics in the syllabus. The goal is to confirm the student’s knowledge of a research area that is distinct from the student’s own research area. The second part is a presentation by the student on original research carried out independently or in collaboration with faculty, research staff, or other students.
Dissertation Proposal Approval
No later than May 15 of their third year, students must have their thesis proposal approved. The student works with their research advisor to select a thesis approval committee, obtains approval of this committee from the DGS, submits a written thesis proposal to the committee, and obtains the approval of the committee. The committee consists of at least three members, who may include individuals with similar standing outside of CDS. At least one member must be a CDS Faculty member or CDS Affiliated Faculty member.
Each student’s dissertation must be approved by all of the readers on the student’s defense committee. The PhD committee must have at least four members, including the advisor, three of whom must be core CDS faculty or affiliated faculty. The membership of the defense committee is proposed by the student and approved by the DGS.
Approval of each reader is required. Their approvals are indicated by their signatures on a form provided by the Program Administrator. Their signatures are solicited by the student after the defense of their dissertation. The defense is a presentation and question-answering session in which the student presents their work. The NYU public is invited, as are the members of the defense committee. The student works with the Program Administrator to arrange a date for the defense and to publicize the defense.
In addition, students must comply with all of the procedures of NYU’s Graduate School of Arts and Science related to submission of their dissertation.