Collaborating Authors

NIPS 2016: A survey of tutorials, papers, and workshops Two Sigma


Since its launch in 1987, the annual Conference on Neural Information Processing Systems (NIPS) has brought together researchers working on neural networks and related fields; it later diversified to become one of the largest conferences in machine learning. In recent years, the trend towards deep learning has brought the conference closer to its roots. The 2016 program spanned six days (Dec 5 to 10) and included tutorials, oral and poster presentations, workshops, and invited talks on a broad range of research topics. Following their previous Insights post on ICML 2016, Two Sigma researchers Vinod Valsalam and Firdaus Janoos discuss below the notable advances in deep learning, optimization algorithms, Bayesian techniques, and time-series analysis presented at NIPS 2016. The conference had 550 accepted papers and 50 workshops, and attendance more than doubled in two years (from more than 2,500 in 2014 to over 5,000 in 2016), demonstrating rapidly growing interest in machine learning and artificial intelligence. Industry participation was strong (Two Sigma was among the more than 60 sponsors), both for recruiting talent and for presenting advances in the field. Several interesting invited talks were given by researchers who are established in both academia and industry.

Translating biomarkers between multi-way time-series experiments Machine Learning

Translating potential disease biomarkers between multi-species 'omics' experiments is a new direction in biomedical research. The existing methods are limited to simple experimental setups such as basic healthy-diseased comparisons. Most of these methods also require an a priori matching of the variables (e.g., genes or metabolites) between the species. However, many experiments have a complicated multi-way experimental design, often involving irregularly sampled time-series measurements, and metabolites, for instance, do not always have known matchings between organisms. We introduce a Bayesian modelling framework for translating the results of 'omics' experiments with a complex multi-way, time-series experimental design between multiple species. The underlying assumption is that the unknown matching can be inferred from the response of the variables to multiple covariates, including time.

Capturing Structure Implicitly from Time-Series having Limited Data Machine Learning

Scientific fields such as insider-threat detection and highway-safety planning often lack sufficient amounts of time-series data to estimate statistical models for the purpose of scientific discovery. Moreover, the limited data that are available are quite noisy. This presents a major challenge when estimating time-series models that are robust to overfitting and have well-calibrated uncertainty estimates. Most of the current literature in these fields involves visualizing the time series to find noticeable structure and hard-coding that structure into pre-specified parametric functions. This approach has two limitations. First, given that such trends may not be easily noticeable in small data, it is difficult to explicitly incorporate expressive structure into the models during formulation. Second, it is difficult to know a priori the most appropriate functional form to use. To address these limitations, a nonparametric Bayesian approach is proposed to implicitly capture hidden structure from time series having limited data. The proposed model, a Gaussian process with a spectral mixture kernel, removes the need to pre-specify a functional form and hard-code trends, is robust to overfitting, and has well-calibrated uncertainty estimates.
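To make the model named in the abstract above more concrete, here is a minimal NumPy sketch of Gaussian process regression with a one-dimensional spectral mixture kernel, k(tau) = sum_q w_q exp(-2 pi^2 tau^2 v_q) cos(2 pi tau mu_q). It is an illustration only, not the authors' implementation: the function names, the synthetic data, and the fixed hyperparameters (weights, means, variances, noise_var) are all assumptions, and in practice the hyperparameters would be learned from the data rather than set by hand.

```python
import numpy as np

def spectral_mixture_kernel(x1, x2, weights, means, variances):
    """1-D spectral mixture kernel evaluated on all pairs of x1 and x2."""
    tau = x1[:, None] - x2[None, :]          # pairwise input differences
    k = np.zeros_like(tau, dtype=float)
    for w, mu, v in zip(weights, means, variances):
        # Each component is a Gaussian envelope times a cosine at frequency mu.
        k += w * np.exp(-2.0 * np.pi**2 * tau**2 * v) * np.cos(2.0 * np.pi * tau * mu)
    return k

def gp_posterior(x_train, y_train, x_test, noise_var, **hyper):
    """Posterior mean and covariance of a zero-mean GP at x_test."""
    K = spectral_mixture_kernel(x_train, x_train, **hyper)
    K += noise_var * np.eye(len(x_train))    # observation noise on the diagonal
    K_s = spectral_mixture_kernel(x_test, x_train, **hyper)
    K_ss = spectral_mixture_kernel(x_test, x_test, **hyper)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    cov = K_ss - v.T @ v
    return mean, cov

# Hypothetical small, noisy data set: 25 samples of a 0.5 Hz sinusoid.
rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0.0, 10.0, 25))
y_train = np.sin(2.0 * np.pi * 0.5 * x_train) + 0.1 * rng.standard_normal(25)
x_test = np.linspace(0.0, 12.0, 50)

# Hand-picked hyperparameters (assumed, not fitted).
hyper = dict(weights=[1.0], means=[0.5], variances=[0.01])
mean, cov = gp_posterior(x_train, y_train, x_test, noise_var=0.01, **hyper)
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))  # pointwise predictive std
```

The predictive standard deviation in std is what the abstract's "uncertainty estimates" refers to; whether those estimates are well calibrated depends on how the hyperparameters are chosen, which this sketch does not address.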

Forecasting Sleep Apnea with Dynamic Network Models Artificial Intelligence

Dynamic network models (DNMs) are belief networks for temporal reasoning. The DNM methodology combines techniques from time series analysis and probabilistic reasoning to provide (1) a knowledge representation that integrates noncontemporaneous and contemporaneous dependencies and (2) methods for iteratively refining these dependencies in response to the effects of exogenous influences. We use belief-network inference algorithms to perform forecasting, control, and discrete event simulation on DNMs. The belief network formulation allows us to move beyond the traditional assumptions of linearity in the relationships among time-dependent variables and of normality in their probability distributions. We demonstrate the DNM methodology on an important forecasting problem in medicine. We conclude with a discussion of how the methodology addresses several limitations found in traditional time series analyses.
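The combination of noncontemporaneous and contemporaneous dependencies can be illustrated with a very small temporal belief network. The sketch below is not the DNM methodology itself: it is a two-node dynamic Bayesian network (equivalent to a hidden Markov model) with made-up probability tables, where a hidden binary state depends on the previous time step and an observation depends on the current state. The names transition, emission, filter_step, and forecast are hypothetical, and the iterative refinement of dependencies described above is not covered.

```python
import numpy as np

# Noncontemporaneous dependency: P(state_t | state_{t-1}).
# Rows index the previous state, columns the next state (values are made up).
transition = np.array([[0.9, 0.1],
                       [0.3, 0.7]])

# Contemporaneous dependency: P(obs_t | state_t).
# Rows index the state, columns the observation (values are made up).
emission = np.array([[0.80, 0.20],
                     [0.25, 0.75]])

def filter_step(belief, observation):
    """Advance the belief one time step, then condition on an observation."""
    predicted = belief @ transition
    updated = predicted * emission[:, observation]
    return updated / updated.sum()

def forecast(belief, horizon):
    """Predictive distributions over observations for the next `horizon` steps."""
    predictions = []
    for _ in range(horizon):
        belief = belief @ transition         # propagate the state forward in time
        predictions.append(belief @ emission)
    return np.array(predictions)

# Start from a uniform belief, absorb three observations, then forecast.
belief = np.array([0.5, 0.5])
for obs in [1, 1, 0]:
    belief = filter_step(belief, obs)
obs_forecast = forecast(belief, horizon=5)
```

Because the tables are arbitrary discrete distributions, nothing here assumes linear relationships or normally distributed variables, which is the flexibility the abstract attributes to the belief-network formulation.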

Director, Machine Learning & Data Science


Design and build personalization engines and learning systems using advanced machine learning and statistical techniques.
Help the company identify tools and components and build the infrastructure for AI/ML.
Research and brainstorm with internal partners to identify advanced analytics opportunities to advance automation, help with knowledge discovery, support decision-making, gain insights from data, streamline business processes, and enable new capabilities.
Perform hands-on data exploration and modeling work on massive data sets.
Perform feature engineering, train the algorithms, back-test models, compare model performance, and communicate the results.
Work with senior leaders from all functions to explore opportunities for using advanced analytics.
Provide technical leadership and mentoring to talented data scientists and analytics professionals.
Guide data scientists and engineers in the use of advanced statistical, machine learning, and artificial intelligence methodologies.
Provide thought leadership by researching best practices, extending and building new machine learning and statistical methodologies, conducting experiments, and collaborating with cross-functional teams.
Develop efficient end-to-end model solutions that drive measurable outcomes.
Technical skills include, but are not limited to, regression techniques, neural networks, decision trees, clustering, pattern recognition, probability theory, stochastic systems, Bayesian inference, statistical techniques, deep learning, supervised learning, and unsupervised learning.
Solid understanding of, and hands-on experience with, big data and the related ecosystem, both relational and unstructured.
Executing complex projects: extracting, cleansing, and manipulating large, diverse structured and unstructured data sets on relational (SQL) and NoSQL databases.
Working in an agile environment with iterative development and business feedback.
Providing insights to support strategic decisions, including offering and delivering insights and recommendations.
Experience in statistics and analytical modeling, time-series data analysis, forecast modeling, machine learning algorithms, and deep learning approaches and frameworks.
Deliver robust, scalable, high-quality data analytics applications in a cloud environment.