August 3, 2015

CDS Professor of the Year and YP Mobile Labs Chief Scientist David Rosenberg Pushes the Boundaries of Machine Learning and Predictive Analytics

Recently named NYU Center for Data Science Professor of the Year, CDS Adjunct Associate Professor David Rosenberg is currently Chief Scientist of YP Mobile Labs at YP and a former Chief Scientist at Sense Networks (acquired by YP in 2014). Dr. Rosenberg specializes in machine learning, artificial intelligence, predictive analytics and statistical modeling.

Dr. Rosenberg earned his Bachelor of Science degree in mathematics from Yale University, his Master of Science degree in applied math (computer science focus) from Harvard University, and his Ph.D. in statistics from the University of California, Berkeley. During this year’s spring semester, he taught the CDS course “Machine Learning and Computational Statistics,” which he will again offer in Spring 2016.

The holder of four U.S. patents, Dr. Rosenberg is also a certified PCC strength training instructor and former ballroom dance instructor.

An early career marked by statistics, Hungarian and ballroom dancing

A member of both the varsity fencing and ballroom dancing teams while an undergraduate at Yale, Rosenberg took a year off between his junior and senior years to attend the Budapest Semesters in Mathematics Program in Hungary for an immersion in math and Hungarian. “It was a great way to focus on math while enjoying the study-abroad experience,” he recalls. Classes were conducted in English, although Rosenberg did take a probability theory class in Hungarian during the second semester. He concedes: “I had an English translation of the textbook which was extremely helpful.”

After returning to Yale and completing his last year of undergrad, Rosenberg relocated to Boston where he worked at The MITRE Corporation on problems in cryptanalysis (code breaking) and pattern recognition, while simultaneously pursuing his master’s degree from Harvard. During the six months before heading west to attend Berkeley for his doctoral degree, he joined the Tufts University ballroom dance team. “Then I hit the ground running with ballroom dancing when I got to Berkeley, but eventually shelved it to focus on my research,” he says.

The start-up Sense Networks and the lure of New York brought Rosenberg back East

The day after sending a completed draft of his dissertation to his Ph.D. advisor, Rosenberg joined the startup Sense Networks as a research scientist, rising over the next four years to Lead Scientist and then Chief Scientist.

“Sense Networks was exciting because its charter goal was to have the leading expertise in location data analytics,” states Rosenberg. “Almost seven years ago when I started there, location data was not that common. Now almost every smartphone and mobile device knows its location, and that information is often shared with companies through apps. There are companies that get location data from over 100 million devices.”

To Rosenberg, what was most interesting about working at Sense Networks was that the statistical study of location data at this scale had never been done before. The reason? Data at that scale had not existed.

He explains: “When people did spatio-temporal data analysis in traditional statistics, they were looking at tracking fifty whales in their migration patterns or tagging twenty elk and seeing how they move in space and time. Now you have 100 million people that you’re tracking, not in the same level of detail but still, it’s a totally different type of dataset. So that was exciting to me — a chance to get at this new data first. Plus, I wanted to live in New York, where Sense Networks was based.”

From the time he was a young teenager, Rosenberg has liked the idea of artificial intelligence and robotics

When Rosenberg was thirteen, he took a weeklong robotics summer course and was hooked. Later, at Yale, he enrolled in a college-level robotics class and realized that he wanted to approach problems with more mathematical rigor than his skills then allowed. This gradually steered him toward a math major. “That’s pretty common,” he says. “People who end up doing machine learning research often begin their studies in mathematics, or at least learn a lot of mathematics along the way.”

This early interest in robotics gradually evolved into a focus on machine learning and statistics. As his present work shows, it is the general problem of automating processes that drives his search for new solutions in machine learning. “I love the idea of having a machine do something that otherwise I or another person would have to do, which is basically the whole notion of machine learning: providing a training set and then turning it over to a machine,” he says.

“If I can solve a problem, I like the process of figuring out how to get a computer to solve the same problem.”

Rosenberg gives an example: “Say you have a whole list of files of photographs and you want to identify all the ones that have a picture of a person in them. You would go through the photographs and put all the ones with a person in them in one folder and all the ones without a person in another folder. With machine learning, you take those folders that you’ve separated and feed them into a learning algorithm, which figures out how to generalize the phenomena that you are identifying. Now when you feed the algorithm a new picture, it can sort it itself, saving you tons of time.”
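The sorting workflow Rosenberg describes can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not his method: each photograph is reduced to two made-up numeric features, the "folders" are a small hand-labeled list, and the learning algorithm is a simple nearest-centroid classifier chosen for brevity. The names train, predict and training_set are hypothetical.

```python
from statistics import fmean


def train(labeled_examples):
    """Learn one average feature vector (centroid) per label."""
    by_label = {}
    for features, label in labeled_examples:
        by_label.setdefault(label, []).append(features)
    # Average each feature column across the examples that share a label.
    return {
        label: [fmean(column) for column in zip(*rows)]
        for label, rows in by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the new example."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical stand-in for the two hand-sorted folders of photographs.
# Real photos would need a feature-extraction step that is omitted here.
training_set = [
    ([0.9, 0.8], "person"),
    ([0.8, 0.7], "person"),
    ([0.1, 0.2], "no_person"),
    ([0.2, 0.1], "no_person"),
]

centroids = train(training_set)

# A new, unlabeled photo is sorted by the learned model instead of by hand.
new_photo = [0.85, 0.9]
label = predict(centroids, new_photo)
print(label)
```

The hand-sorting happens once, when the training set is built; after that, every new example is labeled by the model. A practical image classifier would use far richer features and a stronger algorithm, but the division of labor is the same.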

Teaching the CDS course “Machine Learning and Computational Statistics” brought an award and lots of good questions

Rosenberg’s connection with CDS began at Berkeley. When David Sontag, currently an assistant professor of Computer Science and Data Science at NYU, was an undergraduate at Berkeley, he took the course Statistical Learning Theory, for which Rosenberg was a teaching assistant. When Sontag joined the NYU faculty in 2013, he asked Rosenberg and a few others from Berkeley if they would agree to be project advisors for his class, Machine Learning and Computational Statistics. Last summer, Sontag again contacted that same group, asking if any of them knew somebody who might take over his class while he went on sabbatical. “I was interested so I applied and got it. I had never been a head instructor so that was pretty exciting,” Rosenberg says.

Rosenberg received CDS’s Professor of the Year award, voted on by the students themselves. He will teach the course again in Spring 2016.

Asked what his debut in the teaching field was like, Rosenberg says, “Students asked a ton of great questions — many of them asking for more rigorous justification for the rules of thumb, conventional wisdom and standard ‘best practices’ that abound in practical machine learning. And these are exactly the right questions to ask, because conventional wisdom about ‘good approaches’ can change quickly as new techniques are developed. If you understand the reasoning behind the claims, you’ll be in a much better position to adapt as the field changes over time.”

Rosenberg’s advice to today’s students? “To learn well the fundamental principles and mathematics of machine learning and statistics, which will be relevant much longer than the particulars of any of today’s cutting-edge methods.”

Rosenberg’s prediction: The future of data science is toward ease of use, enabling people to think beyond the problem at hand

Today when someone invents an innovative new solution to a challenging data science problem, “you can read about it in a research paper and it usually takes a high level of technical sophistication to understand and implement the method,” Rosenberg explains. “But as time goes on, the most successful of these methods are implemented in machine learning and statistics libraries, which makes them easy to use, even without much specialized training.”

After this commoditization process, what was previously a sophisticated solution to a challenging problem becomes just one building block in a potentially much more complicated solution. According to Rosenberg, the enormous benefit is that “this frees you to think beyond just solving the problem in front of you, enabling you to get further with the same effort because the initial steps have now become relatively easy.”

If this is indeed the case, there is no telling how far data science can go.

By ML Ball