Tracks (For students starting the program in Fall 2017 and later)
Data Science Track
In the Data Science track, students take six required courses and six electives chosen from a diverse list of courses.
Data Science Big Data Track
The Data Science Big Data track focuses on the methods and techniques required to acquire, manage, analyze and visualize large volumes of data. Students will acquire a deep understanding of algorithms and their complexity and gain hands-on experience building end-to-end solutions to computational problems.
Ordinarily, students pursuing this track take 3 of the following courses –
- DS-GA 1012: Natural Language Understanding and Computational Semantics
- CS-GY 6313: Information Visualization
- CS-GY 6323 Large-Scale Visual Analytics
- One of the following:
  - CS-GY 6083 Principles of Database Systems (Engineering School) or CSCI-GA 2433 Database Systems (Courant Computer Science)
  - CS-GY 6093: Advanced Database Systems (Engineering School) or CSCI-GA 2434 Advanced Database Systems (Courant Computer Science)
Data Science Mathematics and Data Track
The Data Science Mathematics and Data track provides the mathematical background to understand and analyze modern data-analysis methods in areas such as deep learning, compressed sensing, high-dimensional statistics and graph signal processing. In addition, the track will provide exposure to fundamental research problems inspired by newly developed data-science techniques.
Ordinarily, students pursuing this track take 2 of the following courses –
- DS-GA 1013: Mathematical Tools for Data Science
- DS-GA 1005: Inference and Representation
- Spring 2018 DS-GA 3001.010/.011 Special Topics in Data Science – Mathematics of Data Science: Graphs and Networks
- CSCI-GA 2945/MATH-GA 2012 Convex and Nonsmooth Optimization
- CSCI-GA 3033/DS-GA 3001 Special Topics: Mathematics of Deep Learning
Data Science Natural Language Processing Track
The Data Science Natural Language Processing Track will give students the skills to build machine learning models that can understand, manipulate, or produce data expressed in natural language text.
Ordinarily, students pursuing this track take the following 2 courses in order:
- DS-GA 1011 Natural Language Processing with Representation Learning
- DS-GA 1012 Natural Language Understanding and Computational Semantics
It is acceptable to substitute CSCI-GA 3033 Statistical NLP for DS-GA 1011.
- DS-GA 1005 Inference and Representation
- DS-GA 1008 Deep Learning
- DS-GA 3001 Text as Data
- CSCI-GA 2590 Natural Language Processing
- Advanced linguistics courses with consent of instructor. Contact Sam Bowman (firstname.lastname@example.org) for advice.
Data Science Physics Track
The Data Science Physics Track provides the same solid foundation in data science and further develops modeling and inference skills in the context of compelling, data-intensive physics research topics. This track is ideal for applicants who have some physics background, are interested in transitioning into a career in data science, and wish to leverage those skills for a competitive advantage.
Ordinarily, students pursuing this track take the following courses –
- DS-GA 1005 Inference and Representation
- Physics Research: Select 2 of the following –
  - PHYS-GA 2091 Experimental Physics Research
  - PHYS-GA 2093 Theoretical Physics Research
  - PHYS-GA 2095 Research Reading
- Physics Electives: Select 2 of the following –
  - PHYS-GA 2000 Computational Physics
  - PHYS-GA 2002 Statistical Physics
  - PHYS-GA 2022 Biophysics
  - PHYS-GA 2059 Special Topics
  - PHYS-GA 2061 Special Topics
  - PHYS-GA 2053 Special Topics in Astrophysics
  - PHYS-GA 2054 Special Topics in Astrophysics
  - PHYS-GA 2017 Phase Transitions and Critical Phenomena
Data Science Biology Track (For students starting Fall 2018 and later)
Large datasets are revolutionizing our understanding of basic biology as well as of human health and disease. The Data Science Biology track is for students who want to further develop their computational skills and apply them to the biomedical sciences.
The capstone project for this track will be biology-based and completed with a biology or biotech mentor. Ordinarily, students pursuing this track take the following courses –
- 5 Biology electives
  - Year 1 Fall – Either BIOL-GA 1001 Biocore I: Molecular Systems or BIOL-GA 1128 Systems Biology
  - Year 1 Spring – Either BIOL-GA 1002 Biocore II: Cellular Systems or BIOL-GA 1130 Applied Genomics
  - Year 2 Fall – Either BIOL-GA 1128 Systems Biology or BIOL-GA 1040 Genomic Innovation (effective Spring 2019: BIOL-GA 1009 Biological Databases and Datamining is no longer a Biology track elective)
  - Year 2 Spring
    - Either BIOL-GA 3304 Research or BIOL-GA 2005 Current Topics in Biology
    - Either BIOL-GA 1127 Bioinformatics and Genomes or BIOL-GA 1131 Biophysical Modeling or BIOL-GA 3304 Research in Biology
Data Science – Biomedical Informatics (Medical School) Track (For students starting Fall 2019 and later)
The Data Science – Biomedical Informatics (Medical School) track is for students who are interested in the rapidly growing field of biomedical informatics, which has influenced many recent healthcare developments, including new opportunities for personalized medicine. These innovations, along with recent growth in high-throughput genomics technologies, have created high demand for skilled biomedical informatics professionals.
The capstone project for this track will be biomedicine-based and completed with a biomedicine mentor. Ordinarily, in addition to the 6 required courses, students pursuing this track take the following courses –
- Biomedicine Research: Select 2 of the following –
  - BMIN-GA 0003 Introduction to Biomedicine
  - BMIN-GA 1003 Introduction to Health Informatics
  - BMIN-GA 3001 Topics in Bioinformatics
- Biomedicine Electives: Select 2 of the following –
  - BMIN-GA 3009 Advanced Integrative Omics
  - BMIN-GA 3002 Clinical Decision Support Systems
  - BMIN-GA 3007 Deep Learning for Biomedical Data
  - BMIN-GA 3008 Evaluation Methods in Health IT
  - BMIN-GA 3010 Microbiome Informatics
  - BMIN-GA 3004 Next Generation Sequencing
  - BMIN-GA 3003 Proteomics Informatics