Foster Provost is a Professor of Information Systems, NEC Faculty Fellow and Paduano Fellow in Business Ethics (Emeritus) in the Department of Information, Operations and Management Sciences at the NYU Stern School of Business, where he has taught graduate-level data science courses for 15 years. He is also an Associate Faculty Member at the NYU Center for Data Science, and recently retired as Editor-in-Chief of the journal Machine Learning (2004-2010). Previously, Professor Provost was employed by what is now Verizon, winning a President’s Award for his work there. Throughout his career, Professor Provost has won numerous awards related to data science, including the INFORMS Design Science Award in 2009 for his work on Social Network-based Marketing Systems.
In addition to teaching, Professor Provost advises companies interested in extracting useful knowledge from data. He has co-founded several successful startup companies based on data science for advertising.
In July of this year, Professor Provost, along with Tom Fawcett, authored the book “Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking,” published by O’Reilly to enormous acclaim.
How the book “Data Science for Business” came to be
If necessity is the mother of invention, Foster Provost and Tom Fawcett are two top-flight inventors. After years of struggling to sync up a number of textbooks with what he wanted to teach MBA students, Masters of Science in Information Systems students, and undergraduates in his data mining classes at Stern School of Business, Foster simply stopped using them. “My goal wasn’t to write a book at all,” he says. “I just thought this was stuff that needed to be taught to this particular audience. I tried many different things, like making course packs of articles, and finally gave up because it was too much work and the feedback was mixed. So I decided to write notes for the class myself. The book grew out of these notes, plus some slides which my colleagues and PhD students donated or helped me with.”
Foster’s co-author on the book, Tom Fawcett, worked side by side with him at Verizon for five years. “We wrote a lot together back then, including some scientific papers that are very highly cited, very well-known,” Foster says. “We complement each other well, both in skills and personalities. He lives in Mountain View, CA, and we never saw each other once during the entire writing of the book,” he adds.
Multiple publishers were interested in their manuscript, but what Foster and Tom had in mind was a version of the O’Reilly animal books for business people. “That’s really what we were writing,” Foster explains. “A serious book for people who really want to learn broadly about data science. I didn’t want this to be a textbook, although it could be used as a textbook. In fact, I myself use it as a textbook, and 20 universities have adopted it.”
O’Reilly did indeed publish their book, and although it has been on the market for less than two months, the reception has been extremely positive. “We got some nice quotes from important people, a lot of great praise,” Foster says. “It has great reviews on Amazon and is already a bestseller in different categories.”
A large part of the book’s value is that it appeals to a wide range of stakeholders in data-oriented projects
According to Foster, there are three different audiences who need to understand the fundamentals of data science, all of whom are currently represented in his introductory Data Science class this fall at NYU’s Center for Data Science: CDS students, Stern MBAs, and MS in Information Systems students. “In the Data Science program, students should get this type of broad-based learning first,” he says. “They’re going to get a lot of deep education in math, in algorithms and so on. But in order to be a well-rounded person who’s going to succeed when going into business or scientific situations, and actually solve the problem, thinking broadly is an important place to start.”
More versions of “Data Science for Business” possibly on the horizon
For those who have finished the 384-page “Data Science for Business” and want more, take heart! There may be more to come. “When Tom and I had our initial plan, this was one third of the book,” Foster says. “I think to really do this right, there needs to be a Volume 2 and a Volume 3. There is more to say. I’ve had fantastic feedback from my students on the book and what they’re learning from it.” So, data science students and practitioners, stay tuned.
Deal-breaker: putting out a book students want to buy at a price they can afford
Originally, Foster wanted to self-publish the book and make it available for around $25, because he vehemently disagrees with requiring students to pay $200 for a textbook. “I don’t think professors should give their stuff away but I think there should be a fair price,” he emphasizes. “I wanted to make this book inexpensive.” When interviewing publishers, he made it clear there was no point talking further unless they were willing to sell the book for less than $35—a price “where the publisher could make some money, but it wouldn’t be an issue for students. On Amazon, the price is even less,” he says. “And if another outlet is selling it for less than Amazon’s price, Amazon apparently will lower their price to match it.”
The future of the Center for Data Science? For Foster, very exciting.
When asked about the strategic thinking behind the founding of the Center for Data Science, Foster is adamant in his praise. “I think NYU did the right thing to make it a university-level thing rather than something within Courant. Stern, particularly the former statistics department that’s now a group within our big department, is very excited about the fact that this has happened. From the research side, we have very, very strong people across the university in data science, so it’s nice to have a nexus where we can hang out together and have Ph.D. students.”
Foster also underscores the necessity of specifically training people in data science, particularly in today’s increasingly data-dependent business climate. “Right now, there are people coming out of various different programs and working as data scientists who may not know the fundamentals of data science: the concepts, ideas, methods and principles,” he says. “I’m hoping that by establishing a precise academic program, the Center for Data Science can actually help define what data science really is, give it some grounding.”
By ML Ball