Assistant Professor of Mathematics Carlos Fernandez-Granda has recently joined the faculty of the Courant Institute of Mathematical Sciences and the Center for Data Science and is currently teaching “Statistical and Mathematical Methods,” a required course for the MS in Data Science. A native of Madrid, Fernandez-Granda earned his Ph.D. in Electrical Engineering from Stanford University, his M.S. in Applied Mathematics (Machine Learning and Computer Vision) from École Normale Supérieure de Cachan (Paris), and engineering degrees from Universidad Politécnica de Madrid and École des Mines (Paris).
This fall, you’re teaching an introductory course for the statistical and mathematical methods needed for data science. Are you enjoying it?
Yes, I’m enjoying teaching very much. The point of the course is to give students a background in probability, statistics, linear algebra and optimization so they can understand more advanced algorithms in machine learning and data science. Most of the students are going for their master’s so even though in the beginning the material is not that complicated, I try to treat it at a more rigorous level than what they have probably seen in undergrad. I give them a practical viewpoint of the subjects but also tell them about the more theoretical aspects so they will know both.
I got really lucky when I was assigned this course because it combines some of the subjects that I like the most — specifically, probability and linear algebra. Also, a lot of my research is in optimization so it’s a very good fit. I’m writing my own notes because there isn’t really a book that comprises all of the basic math you need for data science. I’m really enjoying teaching this subject which is basic but can be quite tricky. I think it’s important for students to get it right in order to understand more advanced algorithms.
What interests you about probability, statistics, linear algebra and optimization?
They are all crucial to data science. Probability, I find, is a very useful and principled way of dealing with uncertainty and partial knowledge. Linear algebra allows us to build linear models with intuitive geometric interpretations. Statistics enables us to extract conclusions from data and examine the accuracy of those conclusions. Regarding optimization, maybe ten to twenty years ago, not a lot of people doing a master’s in statistics would learn very much optimization, but now it’s becoming increasingly important, as it allows us to use more flexible and complex models. Students need to have strong foundations in each of these four areas in order to understand the different approaches to data science which they will be exposed to in the program.
How did your background prepare you for your current work?
I grew up in Madrid; my father was an electrical engineer and my mother was a doctor. In high school, I was mainly interested in literature, but I ended up doing my undergrad in electrical engineering. In Spain, it’s common for good students to go into electrical engineering because it’s prestigious academically, like physics in Germany. Fortunately, I really liked the math and programming courses so it turned out alright. It also allowed me to do the last year and a half in Paris at l’École des Mines, which was a great opportunity.
École des Mines was mainly geared toward management or finance. Since I wanted to learn about machine learning, artificial intelligence and computer vision, I did a master’s at École Normale at the same time. That’s when I discovered that I really like these areas. Then I did my master’s thesis at Philips in Germany on magnetic resonance imaging and enjoyed that very much. I liked the application but realized I didn’t know as much as I wanted to about the theoretical framework that underlies some of the algorithms, which is why I decided to go for a Ph.D.
How much do math and data science methodology vary from country to country?
In France, students tend to have a very mathematical background. Spain is less theoretical, probably due to a lack of tradition. I can’t comment very much on Germany because when I was there, I was at a company doing research.
Regarding American universities, I have met many researchers who are very applied but still have an interest in theory. At the same time, more theoretical researchers often have a good knowledge of applications. I find this very appealing.
What was the subject of your Ph.D. at Stanford?
Before getting my Ph.D., I developed algorithms for magnetic resonance imaging based on optimization. I was very interested in these methods and, as I mentioned earlier, I wanted to understand the underlying theory better. At Stanford, my Ph.D. was under Emmanuel Candès, who has been very influential in optimization-based methods for inverse problems. I studied how these algorithms can be applied to problems such as super-resolution in fluorescence microscopy and derived theoretical guarantees that show under what conditions we can expect them to work.
What is the focus of your current research?
On the one hand, I am still interested in the theoretical analysis of data-analysis methods based on optimization. On the other, I have become interested in more practical applications, such as spike sorting in neuroscience. This problem entails processing data from sensors that pick up signals from different neurons. You want to un-mix these signals in order to decide which one is coming out of each neuron. There are many interesting challenges, such as the sheer amount of data. Because there are so many measurements, it is crucial to make sure you’re applying the algorithms to the parts of the data where there is activity; otherwise things become too computationally expensive.
In data science, where do you see the next big discoveries coming from?
Lately, there have been some amazing advances in machine learning applied to problems in computer vision and speech recognition. These advances have been made possible by the availability of huge datasets with annotated examples. I think a crucial challenge in the coming years is to be able to apply these techniques to other areas where we don’t have so much curated data. Some of the ideas should definitely transfer but you’re going to have to use data in a subtler way because it’s not going to be as well categorized. For example, medical diagnostics, the social sciences, economics, policy — you have a huge amount of data in these areas but it’s usually very heterogeneous, so you cannot form problems that are as clean as “Let’s identify some person’s face.”
Right now, we have exciting big data techniques that are working very well for things like computer vision but we still have to work out how to apply them to many other areas where they could have an impact. It’s not obvious how to pull the data together, identify the parts that are interesting and then combine them in sophisticated ways, which is what we are able to do when we have huge curated datasets, as in computer vision. This is what I think is fascinating.
Do living in New York and working at Courant and CDS suit you?
I was very excited to come to New York. I think Courant is a great place to be. I’m more mathematically oriented so I’m happier here than in an engineering department. But at the same time, I feel there is a strong connection to applications. I’m now becoming interested in neuroscience and Courant/CDS is an ideal place to do those kinds of things.
Interview by ML Ball