Spring 2021
NYU Center for Data Science
Bulletin
Director’s Corner
A Note From Julia Kempe
Another semester under the sign of COVID, the third – something nobody would have imagined in early 2020. This spring, we continued to open our doors to our community, once again in a hybrid format due to the ongoing COVID-19 pandemic. Our students, staff, faculty, and researchers continued to demonstrate their resiliency, launched new efforts, welcomed newcomers, strengthened partnerships, and most notably, performed rigorous research despite unimaginable obstacles.
Our faculty continues to shine: Kyunghyun Cho received the prestigious Samsung Ho-Am Prize in Engineering; Pascal Wallisch was awarded an NYU 2021 Teaching Innovation Award; and Julia Stoyanovich, Co-Founder & Director of the Center for Responsible AI (R/AI), helped cement CDS’ data ethics efforts by bridging R/AI to CDS’ ongoing initiatives and was designated Institute Associate Professor of the Tandon School of Engineering. We also launched a Guest Research Editorial series for our blog, in which CDS community members take to our platform to share new ideas, ask thought-provoking questions, and highlight exciting research insights.
Our academic programs thrived despite the difficulties posed by the pandemic. The undergraduate data science program, which launched in Fall 2019, saw incredible growth this year: 294 undergraduate students declared data science as a major or minor, bringing the total number of undergraduate majors and minors to 349.
This year, 128 master’s students graduated despite the immense obstacles that remote and hybrid education posed throughout the majority of their time at CDS. Our faculty went above and beyond to deliver high-quality education and mentoring in this unusual hybrid setting. We also celebrated the graduation of our very first undergraduate cohort and our first two PhD students in May. The culmination of years of our community’s growth and progress is a momentous cause for celebration.
CDS continued to cultivate relationships with our external partners. DeepMind, one of our earliest partners, worked closely with our external affairs team this year to ensure scholarship funding for underrepresented graduate students. Capital One supported a new and important initiative, the CDS Undergraduate Research Program (CURP), which came to fruition in the Spring 2021 semester. It provided research internships for 19 talented, diverse undergraduate students from all over the country, who benefited from the expertise of nine participating CDS faculty. We are fortunate to enter into a new partnership with Moody’s, which will provide support for CURP in the coming year.
As we remain hopeful that life may soon begin to return to normal, I am grateful to you all for your determination, engagement and persistence, and proud to be part of this inspiring community.
My very best wishes to you all.
Julia Kempe
Director, Center for Data Science
Professor of Computer Science and Mathematics
Spring 2021 CDS Highlights
Kyunghyun Cho awarded Samsung Ho-Am Prize in Engineering
CDS hosted “Genius Makers”: A conversation with NY Times reporter Cade Metz & Yann LeCun
CDS members launch NYU AI School, a week-long online program on artificial intelligence and machine learning
Inaugural CDS Undergraduate Research Program (CURP) spring 2021 cohort, sponsored by Capital One
Elena Sizikova named 2020 Rising Star in Engineering in Health by Columbia University
First two CDS PhD students graduated
Graduated first undergrad cohort of 31 majors and minors
Joint faculty member Julia Stoyanovich launched the Center for Responsible AI@NYU
294 newly declared undergrad majors, joint majors, and minors
1,809 MS applicants
521 PhD applicants
Research Feature
Check out some of our recent ground-breaking research achievements and activities!
CDS Professor Co-Authors Paper on What Makes a Narcissist
CDS Professor Pascal Wallisch co-authored a paper in Spring 2021 titled “Narcissism through the lens of performative self-elevation.” Wallisch’s work sets out to refine our understanding of narcissism by examining its links to insecurity and grandiosity. In psychology, behaviors cannot be taken at face value. People might exhibit the same behavior (for instance, name-dropping) out of genuine grandiosity or out of insecurity. To differentiate these possibilities, the authors created a measurement instrument, which they termed FLEX (perFormative seLf-Elevation indeX), to capture self-elevating tendencies. The reliability and validity of this instrument were established with an omniverse analysis. By using these methods in a sufficiently large sample of participants, the authors were able to show that FLEX correlates highly with narcissism and insecurity, but not with grandiosity, and concluded that narcissism is driven not by an abundance of self-love but by self-loathing… Read more of the “CDS Professor Co-Authors Paper on What Makes a Narcissist” blog post.
CDS Team Members Lead AI Initiative to Enhance Climate Change Projections
CDS affiliated professor of atmosphere/ocean science and mathematics Laure Zanna, along with CDS professors Carlos Fernandez-Granda and Joan Bruna, is leading a project supported by Schmidt Futures that aims to enhance climate-change projections by improving climate simulations with artificial intelligence (AI). Because atmosphere, ocean, and ice systems are so complex, scientists rely on computer simulations (or climate models) to describe how they evolve. These climate models divide the climate system into a series of grid boxes (or grid cells) to imitate how the ocean, atmosphere, and ice change and interact with each other. The problem is that the number of grid boxes is limited by computing power. Currently, climate models for multi-decade projections use grid boxes measuring approximately 50 km to 100 km (roughly 30 to 60 miles) across. As a result, processes that occur on scales smaller than a grid cell (clouds, turbulence, and ocean mixing) are not well captured. The initiative will use machine learning to capture physical ocean, ice, and atmosphere processes more holistically and so reduce the imprecision of existing models. Ultimately, machine learning will lead the development of interpretable, physics-guided representations of these complex processes directly from data for use in global climate simulations… Read more of the “CDS Team Members Lead AI Initiative to Enhance Climate Change Projections” blog post.
CDS Members Present at Academic Data Science Coast to Coast Seminar Series
CDS faculty fellow Angela Radulescu and CDS affiliated professor Laure Zanna presented their research this spring at the Data Science Coast to Coast (DS C2C) seminar series. Angela’s research focuses on how we learn to organize our experiences into internal representations that enable flexible behavior, varying from simple decision-making to goal-centric action in structured settings. Angela’s research combines computational approaches — in particular reinforcement learning and Bayesian inference — with behavioral experiments, eye-tracking, and neuroimaging. Her current focus is on developing methods to study goal-directed learning in virtual reality (VR). Laure’s research meanwhile focuses on the dynamics of the climate system. Her ultimate objective is to study the influence of the ocean on climate change on both local and global scales. Laure and her team’s research paves the way to the discovery of new physics from data as well as the improvement of numerical simulations of oceanic and atmospheric flows… Read more of the “CDS Members Present at Academic Data Science Coast to Coast Seminar Series” blog post.
CDS Student, Aditya Singhal, Awarded Dean’s Undergraduate Research Fund Grant
CDS undergraduate student Aditya Singhal (founder of the NYU Data Science Club) was recently awarded a Dean’s Undergraduate Research Fund grant to study machine learning and emotion acquisition. “In the natural language processing/chatbot creation world you can fine-tune existing models that can generate human-like speech on certain data sets which have a predominant emotion. So basically what we’re trying to do is take one of these existing models that was already trained on being empathetic and fine-tune it on this new data set to become more trustworthy to humans,” said Singhal. “The aim is to try to see if it’s possible and to try to see if the chatbot is quantifiably more empathetic. The implications would be, for example, in Telehealth. It’s an emerging field that’s extremely important and would be much more efficient with more empathetic chatbots. If patients don’t trust chatbots, if they don’t feel comfortable enough to share private information with them, even though everything is confidential, then things don’t work. But being able to teach machines how to interact like human beings, there’s like infinite applications to it. We’re just a tiny drop in the ocean of NLP, trying to make our chatbot more empathetic.” Read more of the “CDS Student, Aditya Singhal, Awarded Dean’s Undergraduate Research Fund Grant” blog post.
CDS PhD Student Co-Authors Paper on Deep Multispecies Network-Based Protein Function Prediction
CDS PhD student Meet Barot recently co-authored “NetQuilt: deep multispecies network-based protein function prediction using homology-informed network similarity,” along with CDS professors Kyunghyun Cho and Richard Bonneau. The paper details how sequences have been the central source of information for protein function prediction, primarily because of their abundance and the ease with which many models can incorporate large amounts of sequence data. The authors go on to explain, however, that in function prediction, sequence information does not provide the context of a protein in an organism, and this context can be immensely relevant in determining the protein’s function. The team introduces their method, NetQuilt, which allows for the integration of sequences and networks, in turn allowing the limited knowledge of the homology between proteins to be augmented by knowledge of the network topology. NetQuilt also creates “protein features that are not tied to a single species and that include evolutionary and functional information.” Most importantly, the method “enables network-based function prediction even for species for which knowledge of their protein interaction networks is limited.”… Read more of the “CDS PhD Student Co-Authors Paper on Deep Multispecies Network-Based Protein Function Prediction” blog post.
CDS PhD Student Presents on Transfer Learning in NLP
Phu Mon Htut, CDS PhD student, gave a talk in January on transfer learning in NLP at the Hamburg Natural Language Processing Meetup. Htut described how to maximize the chance of success when using transfer learning, and pointed to useful resources for transfer learning such as libraries and open source tools. Her presentation clarified how transfer learning has led to improvements in NLP and why we should use it. As an example, she referenced the SuperGLUE benchmark. There are several different transfer learning methods, but Htut focused specifically on sequential transfer learning, in which a model is first trained on one task and that pretrained model is then trained on another task. Additionally, she discussed helpful resources such as jiant, a software toolkit built on PyTorch for NLP research… Read more of the “CDS PhD Student Presents on Transfer Learning in NLP” blog post.
Joint Faculty, Julia Stoyanovich
The Launch of Responsible AI@NYU
What began as a manifesto inspired a class and has now prompted the launch of a new center at NYU. The NYU Center for Responsible AI (R/AI) formally launched last year and was co-founded by CDS Assistant Professor Julia Stoyanovich. A prominent data ethicist, Julia aims, as Director of R/AI and in partnership with CDS, to make “AI” synonymous with “responsible AI” and, furthermore, to make the data ethics field as accessible as possible.
One of Julia’s groundbreaking initiatives with R/AI is to produce a comic series, Mirror, Mirror, aimed at teaching data responsibility to broad audiences in a fun, easy-to-digest way. “The whole conversation about ethics and responsibility in AI is so joyless!” Stoyanovich remarked on the CDS blog earlier this spring, “To learn about a topic, we have to first feel that it’s within reach. And nothing is more helpful for this than a healthy dose of humor. The comic is helping us bring the necessary lightness into the conversation.”
Julia’s efforts have quickly gained attention and support across NYU and beyond. In addition to partnering with CDS and other NYU departments, in February Stoyanovich was invited to present a kickoff webinar for NYU’s MLK Week — a series of events commemorating the legacy of the Civil Rights leader — with her talk, Courageous Conversations in AI. R/AI’s efforts to make responsible AI more accessible have also been highlighted this spring by Le Monde, Toronto Star, Wired, and other publications. Meanwhile Julia and her team at R/AI are busy cementing the Center’s programming and new initiatives — including the exciting AI for Good Startup Incubator, which aims to provide support for “opportunities to apply artificial intelligence to societal problems that are otherwise overlooked in pursuit of broad capital market opportunities,” according to the R/AI webpage. Julia’s efforts in the creation of R/AI, Mirror, Mirror, and other educational data ethics efforts, will help ensure that data responsibility is not just incidental to the conversation, but integral to it.
Welcome Our New Faculty Affiliates
Medical Track PhD Students
Boyang Yu
Before joining CDS, Boyang Yu majored in probability and statistics at the School of Mathematical Science, University of Science and Technology of China (USTC). She is broadly interested in the intersection of machine learning and healthcare.
She has studied the association between body composition and mortality in lung cancer patients with Professor Junwei Lu at the Harvard T. H. Chan School of Public Health. She has also worked on proteomics big data at Guomics, Westlake University. During a previous internship at ByteDance, she gained experience with learning-to-rank models and graph databases. In her spare time, she enjoys marathons, rock climbing, and cooking.
Weicheng Zhu
Weicheng Zhu is a first-year Ph.D. student at the NYU Center for Data Science, co-advised by Professor Narges Razavian and Professor Carlos Fernandez-Granda. His research interests lie in machine learning for healthcare and model interpretability. His recent work includes learning graph representations from electronic health records and deep learning for medical images. Prior to his Ph.D. studies, he received a B.S. in Honors Mathematics from NYU Shanghai and an M.S. in Data Science from NYU CDS.
Taro Makino
Taro Makino is a first-year PhD student at the NYU Center for Data Science advised by Kyunghyun Cho and Krzysztof Geras. He is interested in robustness and explainability in deep learning, and is researching these topics in the domain of medical images. He holds an M.S. in Artificial Intelligence from the University of Edinburgh, where he worked with Amos Storkey, and a BA in Mathematics from Northwestern University.
Daniel Im
Daniel Im is a first-year PhD student under the supervision of Kyunghyun Cho. Daniel was a research scientist at the Howard Hughes Medical Institute’s Janelia Research Campus, where he was advised by Dr. Kristin Branson. Previously, he was a researcher at Mila, the machine learning institute at the University of Montreal, where he was advised by Prof. Yoshua Bengio and Dr. Roland Memisevic. He also received his master’s degree from the University of Guelph’s School of Engineering, where he was advised by Graham W. Taylor.
His research interests at Mila and the University of Guelph were postulating deep generative models and analyzing unsupervised representation learning models from a dynamical-systems or probabilistic perspective. He received his undergraduate degree from the University of Toronto in 2012, where he completed a specialist program in Computer Science with a specialization in Artificial Intelligence and pursued the Mathematics and Its Applications specialist program. He was also CEO and founder of AIFounded.inc, and CTO and co-founder of Coinscious.inc.
CDS Welcomes CURP Cohort
The CDS Undergraduate Research Program launched this year with the inaugural Spring 2021 Capital One cohort
The CDS Undergraduate Research Program (CURP) formally launched in Fall 2020 in partnership with the National Society of Black Physicists (NSBP) and our long-time partner, Capital One. CURP is a research mentorship program that welcomes a diverse group of undergraduate students who have completed at least two years of university-level courses and would like to conduct research in data science. CURP has two distinct goals: to expose participating students to the research CDS faculty members conduct, and to give them an opportunity to develop the skills and knowledge needed to participate in successful research collaborations. For the Spring 2021 Cohort, we welcomed 19 CURP students from across the country, from Boston to Puerto Rico to Santa Cruz.
Throughout the program, CDS offered a series of events that enriched the students’ research experience, provided them with information about careers in academia and industry, and gave them vital opportunities to build networks that will serve them as they graduate from their undergraduate programs.
One such event was a faculty discussion panel where the faculty mentors shared their academic and research journeys with the CURP students. CURP sponsor Capital One also offered an information session to discuss what it is like to work and conduct research at Capital One. At the end of the program, the students showcased their research in a series of lightning talks presented to the 2020 Cohort and faculty mentors.
The value of a diverse community in an intellectually challenging and inclusive educational environment is paramount. The CDS Undergraduate Research Program will continue to be a critical initiative in our commitment to creating a more diverse community not only within our walls but also in the data science field itself.
CDS CURP Students
Get to know some of our amazing Capital One CURP Spring cohort members!
Evanjelin Mahmoodi
Evanjelin Mahmoodi is a senior undergraduate student studying computer science and mathematics at UC Santa Cruz. Her project as part of NYU CURP involves using machine learning to evaluate generated medical images. Evanjelin hopes to apply to graduate programs and continue research at the intersection of computer science and healthcare.
Teanna Barrett
Teanna Barrett is a second-year undergraduate student at Howard University, where she is pursuing a B.S. in Computer Science with a minor in Philosophy. Over the course of the Spring 2021 semester, she is conducting a research project on online performative activism, advised by CDS Moore-Sloan Faculty Fellow Dr. Sarah Shugars. The project uses historical Twitter data to further define and understand the mechanisms of performative content related to Black Lives Matter in 2020. Her research topic is part of a larger effort with fellow CURP participants Junia Janvier and Khadija Jallow to analyze the ways in which social media is used for Black activism and advocacy in the United States. As she continues her CURP project, Teanna is excited to keep strengthening her data science skills and her understanding of social computing research.
Thank you CDS Partners
Congrats to the Grads!
PhD Graduates
- Leslie Huang
- Vladimir Kobzar
MS Graduates
- Anjali Agrawal
- Ross Bernstein
- Apurva Bhargava
- Annika Brundyn
- Sujeong Cha
- Chengwei Chen
- Chuan Chen
- Weilong Chen
- Tianshu Chu
- Xiangyun Chu
- Aidan D Claffey
- Elizabeth Anne Combs
- Timothy Flynn Connor
- Aren George Dakessian
- Lauren Harring D’Arinzo
- Yadi Deng
- Zane David Dennis
- Yuan Ding
- Christina Therese Dominguez
- Alexander Weng Dong
- Ziyue Dong
- Steven Dornberg
- Yihang Du
- William Victor Egan
- Nyla Ennels
- Evaristus Chibuike Ezekwem
- Rong Feng
- Mufeng Gao
- Shuang Gao
- Jiyang Ge
- Anu-Ujin Gerelt-Od
- Hong Gong
- Nathan Levi Griffin
- Haotian Guan
- Yashowardhan Gupta
- Jeewon Ha
- Hongxu Hao
- Yuxuan He
- Alec Brendon Hon
- Wangrui Hou
- Gabriella Hurtado
- Karmen Alexis Hutchinson
- Mrinal Jain
- Khasi-Marc Jamieson
- Meenakshi Gaurishankar Jhalani
- Denglin Jiang
- Duo Jiang
- Zian Jiang
- Man Jin
- Hyun Jung
- Andrei Kapustin
- Noah Kasmanoff
- Lee Tze Kho
- Kamolchanok Kiatkamolwong
- Bichen Kou
- Amanda Nicolette Kuznecov
- Anthony Christopher Lanzisera
- Samantha K Lee
- Armanda Lewis
- Xinmeng Li
- Haoxue Li
- Xiaocheng Li
- Xintong Li
- Yan Li
- Yi Li
- Mu Li
- Jiayao Liu
- Kuan-Lin Liu
- Diwen Lu
- Yue Ma
- Lakshmi S. Menon
- Ray Mohabir
- Dhara A Mungra
- Anhthy Ngo
- Tinatin Nikvashvili
- Praxal Suresh Patel
- Anusha Ravi Patil
- Adrian Pearl
- Jonas Peeters
- Guido Petri
- My Tra Phung
- Aajan Quail-Dehesh
- Sumedha Rai
- Haresh Rengaraj Rajamohan
- Siddesh Ramesh
- Alene Kellogg Rhea
- Stephen Roy
- Yafu Ruan
- Abha Sahay
- Param Roshan Shah
- Parthvi Sanjay Shah
- Shuwen Shen
- Zhuoyang Shen
- Peter Simone
- Michael Hamer Stanley
- Nikhil Supekar
- Jesse Swanson
- Camille Taltas
- Angelamarie Teng
- Viraj Thakkar
- Sophia Maria Tsilerides
- Jeffrey Tumminia
- Daniel G Turkel
- Kannan Venkataramanan
- Eelis Virtanen
- Kanshuai Wang
- Xingyu Wang
- Yunya Wang
- Lizhong Wang
- Yanbing Wang
- Yuwei Wang
- Ilana Gail Weinstein
- Kevin Samuel Wilson
- Gaomin Wu
- Yucheng Xin
- Yanqi Xu
- Yi Xu
- Bolin Yang
- Zichang Ye
- Andrew Yeh
- Cheng Zeng
- Junrong Zha
- Anqi Zhang
- Yihang Zhang
- Mingdi Zheng
- Fanghao Zhong
- Yichen Zhou
- Yuyue Zhou
DS Majors
- Duy Nguyen
- David Shimshoni
- Jiahong Zhou
DS Joint Majors
- Anand Tyagi
- Patricia Luk
DS Minors
- Reema Amhaz
- Bennett Berlin
- Alexandru Bordanca
- Fangning (Tina) Cao
- Shuoxin Dai
- Richard Doherty
- Jack Feeko
- Shuting Feng
- Yuqi Guo
- Chunbo Jie
- Chanin Kitjatanapan
- Kyungmin Lee
- Aishwarya Manojkumar
- Aaditya Mehta
- Timothy Ng
- Natasha Nusantoro
- Pooja Patel
- Dharaa Rathi
- Daniel Richardson
- Gerald Steven
- Stefani Wiloejo
- Ziwei Xu
- Eva Xue
- Camilla Zhang
- Jiatong Zou
Center for Data Science
60 5th Avenue
New York, NY 10011
datascience-group@nyu.edu
cds.nyu.edu