“Data Science is the study of the generalizable extraction of knowledge from data.”

“It is an interdisciplinary field about processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured, which is a continuation of some of the data analysis fields such as statistics, data mining, and predictive analytics, similar to Knowledge Discovery in Databases (KDD).” – Vasant Dhar

Wikipedia uses this definition, citing Vasant Dhar, an affiliated faculty member of the Center for Data Science.

Advancing the field of data science

New sources of data and new techniques for analysis and discovery are transforming science, medicine, and business. The Center for Data Science pursues research that extracts scientific knowledge from rich data through the development and use of new methodologies. The interdisciplinary interactions between the methodological domains of statistics, machine learning, computer science, and applied mathematics and the domains of natural science and social science are creating new discoveries, measurements, and insights.

The Center for Data Science supports the education and work of scientists and engineers who work with data sets that are complex, heterogeneous, large, streaming (real-time), or noisy. The Center is active both in the methods themselves and in the new areas of natural science, medicine, social science, business, and the humanities that are emerging from unprecedented data sets and new data-analysis methods. In the natural sciences, these include new branches of biology, chemistry, astronomy, physics, and neuroscience. In the social sciences and humanities, these include law, economics, sociology, political science, and history.