Eero Simoncelli is a faculty member at NYU’s Center for Data Science and Center for Neural Science, and a fellow of the Institute of Electrical and Electronics Engineers. His research spans several fields, including medicine, neural sciences, mathematics, and psychology, and this past fall, he received an Engineering Emmy award for his work in computational vision.
What did you study in school? How did you get to what you study now?
I had this notion in 8th grade that I wanted to understand how the brain worked, and by the time I finished high school, I was pretty set on going in this direction. As a Physics undergraduate, I worked in a “psycho-biology” lab that was looking for reward pathways in the rat brain, and then in a bio-engineering lab that studied locomotion and neural response dynamics. And as a graduate student, while most of my courses were Electrical Engineering, I was also learning about visual perception and physiology. When I started my faculty job at the University of Pennsylvania, I was working on computer and biological vision problems in parallel, looking for common principles that would serve as a framework for both.
When did you start to incorporate data science into your research?
My research has always involved data, whether from humans, animals, real-world signals (like images or sounds), or computer simulations of models, so I guess this means it’s always been a form of data science, in a way.
Could you tell me about some of the projects you’re working on?
I’m pretty excited about some recent work aimed at hierarchical processing of visual inputs. This is a bit like the deep networks that have become so popular in recent years, but we’re not trying to learn them from labelled data. Rather, they are engineered using constraints from the statistical properties of natural images, as well as our knowledge of response properties of neurons in the human visual system.
You’re associated with a number of centers in different fields, including medicine, neural sciences, mathematics, and psychology. Is there a thread or similarity that runs throughout your work in all these different fields?
Yes – the labels of traditional fields don’t really capture what we do in my group. The “glue” that holds it all together is that we are interested in common principles underlying the representation of visual information, whether in biological or machine systems.
How has the way in which you use data science in neural science changed over the years?
In my early work, I tested principles and predictions of theories by running simulations, and comparing these to existing data. But more and more, our work involves flipping this around: We make a prediction based on a particular theory or model, we run experiments (either ourselves, or in collaboration with experimental colleagues) to gather data that will constrain such a model, and then develop methods to fit models to those data and interpret the outcome. So there’s more emphasis on development of proper constraints and parameterization that will allow optimization.
What do you think is the single biggest effect that data science has had on neural science, or how we approach the field?
The importance of data science in neural science is growing rapidly, largely in response to huge changes in the technologies by which we measure brain activity. Fifty years ago, this was mostly limited to recordings of single cells, using single electrodes, and EEG measured through the scalp. Over the past 25 years, functional MRI, high-density multi-electrodes, and various kinds of specialized imaging and microscopy have all provided dramatic new measurement capabilities, rapidly outstripping our expertise and abilities to analyze and interpret the data. This is partly a computational issue (there’s too much data, and it is very high-dimensional), but it’s also conceptual: we need more sophisticated hypotheses and models for large-scale neural activity. I think this theme can be found in many data-rich fields.
Are there any areas of neural science that aren’t being impacted by data science that should be?
I’d say that it’s crucial that a curriculum for training researchers in Data Science include not just methods/algorithms for handling and analyzing large amounts of data, but also tools for developing models, testing and comparing them, and interpreting and reasoning about the hypotheses from which they arise.
Can you talk about the Primetime Emmy Engineering award? How does your work in neural sciences fit into the work you did for the award?
About 12 years ago, I had an extremely talented postdoctoral fellow, Zhou Wang, in my group who was working on a method of assessing the quality of a photographic image. The goal was to do this in a way that matched what a human would report, which is quite different from the “mean squared error” that underlies most engineering. This problem of photographic image quality had been studied for more than a decade, but comparative tests showed that various methods that had been proposed did not offer significant improvements. Zhou had started working on this problem as a graduate student at U.T. Austin with his advisor Al Bovik. When Zhou arrived at NYU, we set about improving it, incorporating some ideas about how the human visual system analyzes and represents visual images, as well as some new methods of experimentally testing how well our quality measure captured human perception. We wrote up a thorough account in a paper published in 2004, documenting a substantial improvement over mean squared error, and the method took off. More than 10 years later, it’s still the most widely used method for assessing image quality (although we and others have developed solutions that are better!). A few years ago, the television industry started using it to test the quality of broadcast video, and it turned out to be quite useful… I had no idea it was even being used in this context, and so was quite surprised to hear we had won the award!
What drew you to the CDS program at NYU?
I’ve always felt the scientific world needs a better infrastructure for analyzing, testing, and reasoning about data. This comes traditionally from the field of statistics. For example, many science departments require their students to take a course or two in statistics for analysis and presentation of data. But I think it’s abundantly clear that this traditional curriculum needs to be updated to prepare students to handle larger and more complex data sets, using more sophisticated mathematical and computational methods. I’ve incorporated some of these things into my courses in Neural Science, but I think the issue cuts across many departments. I never knew what to call this new unified collection of topics, but I suppose “data science” is as good a name as any, and I’m happy to be participating in building it at NYU!