Moore Sloan Poster Archives - NYU Center for Data Science

Prior-knowledge-based Mammalian Gene Regulatory Network Inference

During metazoan development, changes in chromatin states give rise to diverse patterns of gene expression that direct the differentiation of progenitor cells into various tissues and organs. We use information from transcription factor (TF) binding motifs and chromatin accessibility data to infer cell-type-specific TF occupancy, and incorporate inferred TF-target interactions as prior knowledge to learn …

Discovering Hidden Variables in Noisy-Or Networks using Quartet Tests

We give a polynomial-time algorithm for provably learning the structure and parameters of bipartite noisy-or Bayesian networks of binary variables where the top layer is completely hidden. Unsupervised learning of these models is a form of discrete factor analysis, enabling the discovery of hidden variables and their causal relationships with observed data. We obtain an …

Massive Multi-Species Function Prediction

The rate of new protein discovery has, in recent years, outpaced our ability to annotate and characterize new proteins and proteomes. To combat this functional annotation deficit, many groups have successfully turned to computational techniques, attempting to predict the function of proteins to guide experimental verification. The most prolific methods come from …

Interactive Visualization of Density Estimation Using Adaptive Bandwidths

This work presents a novel technique for real-time density estimation using adaptive bandwidths and GPUs. Density estimates and heatmaps are among the most commonly used types of visualization for geo-referenced data; they allow the user to gain insights from the data in a simple and straightforward way, by visualizing the density of a …

Provably and Practically Learning Topic Models

Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model learning have been based on a maximum likelihood objective. Efficient algorithms exist that attempt to approximate this objective, but they have no provable guarantees. Recently, algorithms have been introduced that provide provable bounds …

A New Scoring Function for Bayesian Network Structure Learning Extended to Arbitrary Discrete Variables

Bayesian networks are graphs that represent probabilistic relationships between variables. Learning the graph structure that best fits the data is often framed as an optimization problem, where the search space is the set of all possible graph structures, and the objective function has some measure of nearness to the data, as well as some regularization …

Visual Inter-Comparison of Multifaceted Climate Models

Inter-comparison and similarity analysis to gauge consensus among multiple simulation models is a critical problem for understanding climate change patterns. Climate models represent time- and space-variable ecosystem processes, such as simulations of photosynthesis and respiration, using algorithms and driving variables such as climate and land use. It is widely accepted that effective use of visualization …

Visual Exploration of Big Spatio-temporal Urban Data: A Study of New York City Taxi Trips

As increasing volumes of urban data are captured and become available, new opportunities arise for data-driven analysis that can lead to improvements in the lives of citizens through evidence-based decision making and policies. In this work, we focus on a particularly important urban data set: taxi trips. Taxis are valuable sensors, and information associated with …

Using Topological Analysis to Support Event-Guided Exploration in Urban Data

The explosion in the volume of data about urban environments has opened up opportunities to inform both policy and administration and thereby help governments improve the lives of their citizens, increase the efficiency of public services, and reduce the environmental harms of development. However, cities are complex systems and exploring the data they generate is challenging. The interaction between the …

Two Machine-Learning Models of Object Recognition Exhibit Key Features of Human Performance

We have implemented two machine-learning models of object recognition by human observers. Both models capture two hallmarks of human performance: (1) spatial frequency channels and (2) effects of font complexity. One model is a Convolutional Neural Network (ConvNet), and the other is a texture statistics model followed by a simple classifier. With appropriate training and …