Juan Bello is an Associate Professor of Music Technology in New York University’s Steinhardt Program, and an Affiliated Faculty member at the Center for Data Science. He is the co-founder of the Music and Audio Research Lab (MARL) and his research focuses on digital signal processing, machine listening and music information retrieval.
What did you study in school?
I played music from a young age, but always had an interest in mathematics and technology. My dream career was to be an audio engineer, even a producer, but those types of college courses were not offered in my home country of Venezuela, so I enrolled in the closest thing I could find: electronic engineering. I discovered an affinity for signal processing and programming, which led me to undergraduate thesis work in sound synthesis.
How did you get to what you study now?
I did my graduate studies in audio signal processing at Queen Mary, University of London. The lab I joined specialized in audio coding and digital effects, but was becoming more interested in music pattern analysis. I was fortunate enough to be around for the genesis of Music Information Retrieval as a field, and as a first-year PhD student, I attended the first conference ever on the subject. I haven’t looked back since.
What drew you to Music Information Retrieval?
The combination of engineering, computer science and music was hard to resist. And at the time, it was a totally wide-open field. People had worked on speech processing, and there had been decades of work in computer music, but using computers for music analysis was rare. There was ample space for young doctoral students to cut their teeth and explore.
Could you tell me about some of the projects you’re currently working on?
My lab is focused on extracting high-level information from music audio. Recent efforts include techniques for automatic melody extraction, downbeat tracking, chord estimation, instrument classification and music structure analysis. Some of this work is focused on machine learning, such as how to design a data augmentation framework that is specific to audio and music data. All of this work is informed by domain expertise in music and sound.
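To make the idea concrete, here is a minimal sketch of what audio-specific augmentation can look like, using librosa (an audio analysis library that originated at MARL); the file name and parameter choices are illustrative assumptions, not the lab’s actual framework. The key point is that transformations such as pitch shifting and time stretching are musically meaningful, so any annotations (chord labels, beat times) must be deformed consistently with the audio.

```python
# Illustrative sketch: audio-specific augmentation of one training clip.
import librosa

y, sr = librosa.load("clip.wav", sr=22050)  # hypothetical training example

augmented = []
for n_steps in (-2, -1, 1, 2):  # transpose by +/- 1-2 semitones
    augmented.append(librosa.effects.pitch_shift(y, sr=sr, n_steps=n_steps))

for rate in (0.9, 1.1):  # slow down / speed up by 10%
    augmented.append(librosa.effects.time_stretch(y, rate=rate))

# Each deformed signal keeps its label semantics only if the annotations
# are deformed too, e.g., chord roots transposed by the same n_steps.
```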
What are some of the real world implications of the research you’re doing?
We’re starting to pursue work in environmental sound analysis. One project, SONYC, seeks to combine sensor networks, machine listening and data science to monitor and identify patterns of noise pollution in New York City. Noise pollution has proven effects on health, education, and the economy, and is obviously a huge issue in most urban areas.
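As a toy illustration of the sensing side (not the SONYC system itself; the file name and threshold are assumptions), a few lines of signal processing suffice to flag loud events in a street recording:

```python
# Illustrative sketch: flag frames of a street recording above a level threshold.
import numpy as np
import librosa

y, sr = librosa.load("street.wav", sr=None)  # hypothetical sensor recording
rms = librosa.feature.rms(y=y, frame_length=2048, hop_length=512)[0]
level_db = 20 * np.log10(np.maximum(rms, 1e-10))  # frame level in dBFS

loud = np.flatnonzero(level_db > -20.0)  # assumed threshold, relative to full scale
times = librosa.frames_to_time(loud, sr=sr, hop_length=512)
print(f"{len(loud)} loud frames, first at {times[0]:.2f}s" if len(loud) else "quiet")
```

The hard part, of course, is not detecting that something is loud but identifying what the source is amid heavy background noise.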
How has SONYC been received?
Cities like New York have noise codes on the books, but what is surprising is the extent to which they are limited in their ability to monitor compliance and enforce those regulations. Our project offers an alternative, scalable solution based on smart sensing technologies that can empower the work of city agencies and other stakeholders. In fact, we are already building partnerships with various departments in the NYC government. This is joint work with researchers across NYU, including Steinhardt, CDS, Tandon and CUSP, as well as at Ohio State University.
Can you talk about the differences between working with environmental and musical sounds?
Music is often recorded in very controlled, clean conditions, with little interfering noise. Speech signals in most commercial applications rely on a high signal-to-noise ratio. But environmental recordings are often characterized by heavy background noise and the relative weakness of the sources of interest.
How does your analysis of the two types of audio differ?
Environmental sound analysis is a window into real-world phenomena: birds migrating, traffic, construction, etc., that we aim to understand in order to gain scientific knowledge or solve a problem. Music, on the other hand, is a heavily designed sonic phenomenon, the result of a carefully concocted composition of sounds resulting in structures nowhere to be found in the natural acoustic world. In the first case, sound is a manifestation of the phenomenon but not necessarily the object of study per se; in the second, music is the end result, the object of study itself.
What aspects of audio analysis have the largest untapped potential for work in the world of data science and machine learning?
We’ve only scratched the surface thus far, but I think environmental sound analysis is the next big frontier. The space of possibilities is huge: sports analytics, bioacoustics, intelligent homes, security and surveillance, failure detection in machinery, all the way to understanding food ingestion habits. Sound is a rich source of information about the world that can be sensed cheaply, constantly and without “field-of-vision” limitations, and the widespread availability of mobile and remote sensing devices will only increase the amount of data available.
What drew you to the CDS program at NYU?
The key motivation was becoming part of a heterogeneous community of scholars with shared interests and complementary expertise. This in turn creates unique opportunities to learn both from the state of the art in core areas of data science such as data visualization, machine learning, and citizen science, and from new and exciting applications across disciplines as varied as finance, politics, cosmology and medicine. This process is already helping me build new collaborations, better understand what is common and can be transferred across disciplines, what is unique about my field, and how we can contribute back to the core and beyond. In the end, I find that diversity is perhaps the most precious asset of the CDS community. Plus, I want musicians to be involved.
On a chalkboard at CDS we have the question: “What does it mean to be a data scientist?” Could you answer this question?
To me, a data scientist is defined by the pursuit of unlocking meaning from vast quantities of data, as opposed to the specific tools or path one uses to reach that end.