ABOUT

The course will teach programming for applications in data science. Students will study the Python language and packages including tools for array operations, table manipulations, visualization, and data extraction. Through a focus on examples, students will learn about query languages, version control systems, and web frameworks. Experience with debugging, testing and documenting programs will enable students to code in integrated development environments and command line interfaces.

The instructor is Christopher Policastro. Lecture will be held on Thursdays 3:30-5:10pm at 60 5th Avenue, Room 150. The section leader is Jason Phang. Lab will be held on Mondays 7:00-7:50pm at 60 5th Avenue, Room 150.

While the course does not have prerequisites, students are expected to have experience programming in at least one language, such as R or Java. For questions about registration, please contact Tim Baker.

Throughout the semester, students will learn practices that will enable them to collaborate on projects. Implementing code will prepare them for courses in the data science program such as machine learning, natural language processing, and database engineering. Practical experience will empower students to apply their studies to a career in data science.

Working Collaboratively

Contributing to projects requires working in groups. The members of the group must develop code that integrates into the system. While the code might run outside of the system, incompatible versions might prevent the code from running inside the system. Different contributions must merge together to incorporate modifications without disrupting other developments. We will learn…

Developing Designs

We will learn about documenting code. Through documentation we can explain the components of a program. These comments could include information about input/output, exceptions, usage and dependencies. While documentation will allow us to share information about a program, the program might be part of a system with many components. The relationship between the components determines…
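As a minimal sketch of the kind of documentation described above, the following docstring covers input/output, exceptions, usage, and dependencies. The function `mean` and the section labels inside the docstring are hypothetical choices made for illustration, not a convention prescribed by the course.

```python
def mean(values):
    """Return the arithmetic mean of a sequence of numbers.

    Input/output:
        values: a non-empty sequence of ints or floats.
        Returns a float.

    Exceptions:
        ValueError: raised if values is empty.

    Usage:
        >>> mean([1, 2, 3])
        2.0

    Dependencies:
        None beyond the Python standard library.
    """
    # Fail early with a clear message instead of a ZeroDivisionError.
    if len(values) == 0:
        raise ValueError("values must be non-empty")
    return sum(values) / len(values)
```

Calling `help(mean)` in an interpreter prints this docstring, which is one way a collaborator could read it without opening the source file.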

Manipulating Data

The acquisition of data for a project requires pulling together different sources in a consistent way. We will learn about tools for extracting, transforming, and loading data. By storing data in tabular formats, we can apply operations for reading and writing through transactions. The operations can include filtering, grouping, sorting, and joining. Experience with query…
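The operations named above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the tables `customers` and `orders`, their rows, and the function `summarize_orders` are all hypothetical, and the course may use different tools.

```python
import sqlite3


def summarize_orders(min_amount=10.0):
    """Return (name, order_count, total) rows, largest total first."""
    conn = sqlite3.connect(":memory:")  # hypothetical in-memory database
    try:
        # "with conn" groups the writes into a transaction:
        # commit on success, roll back if an exception is raised.
        with conn:
            conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
            conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
            conn.executemany(
                "INSERT INTO customers VALUES (?, ?)",
                [(1, "Ada"), (2, "Grace"), (3, "Alan")],
            )
            conn.executemany(
                "INSERT INTO orders VALUES (?, ?)",
                [(1, 25.0), (1, 5.0), (2, 40.0), (2, 15.0), (3, 8.0)],
            )
        query = """
            SELECT c.name, COUNT(*) AS n_orders, SUM(o.amount) AS total
            FROM orders AS o
            JOIN customers AS c ON c.id = o.customer_id  -- joining
            WHERE o.amount >= ?                          -- filtering
            GROUP BY c.name                              -- grouping
            ORDER BY total DESC                          -- sorting
        """
        return conn.execute(query, (min_amount,)).fetchall()
    finally:
        conn.close()
```

With the sample rows above, `summarize_orders()` drops orders under 10.0, so only Grace and Ada remain in the result, with Grace listed first because her total is larger.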