Liess, Stefan
A Fast-Optimal Guaranteed Algorithm For Learning Sub-Interval Relationships in Time Series
Agrawal, Saurabh, Verma, Saurabh, Karpatne, Anuj, Liess, Stefan, Chatterjee, Snigdhansu, Kumar, Vipin
Traditional approaches focus on finding relationships between two entire time series; however, many interesting relationships exist only in small sub-intervals of time and remain feeble during other sub-intervals. We define the notion of a sub-interval relationship (SIR) to capture such interactions that are prominent only in certain sub-intervals of time. To that end, we propose a fast-optimal guaranteed algorithm to find the most interesting SIR in a pair of time series. Lastly, we demonstrate the utility of our method on a real-world dataset from the climate science domain, examine its scalability, and obtain useful domain insights.
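To make the idea of a sub-interval relationship concrete, the sketch below exhaustively scans every window of at least `min_len` points and keeps the one with the largest absolute Pearson correlation. This is a naive baseline for illustration only, not the fast-optimal algorithm of the paper; the function names, the `min_len` parameter, and the choice of absolute Pearson correlation as the "interestingness" score are all assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def best_subinterval(x, y, min_len):
    """Brute-force search over all windows [start, end) with end - start >= min_len.

    Returns (start, end, r) for the window with the largest |r|.
    Hypothetical baseline: it recomputes the correlation for every window,
    so it is far slower than a purpose-built algorithm.
    """
    best = None
    for start in range(len(x) - min_len + 1):
        for end in range(start + min_len, len(x) + 1):
            r = pearson(x[start:end], y[start:end])
            if best is None or abs(r) > abs(best[2]):
                best = (start, end, r)
    return best


# Two series that agree only in the middle stretch (indices 5 to 10).
x = [3, 1, 4, 1, 5, 1, 2, 3, 4, 5, 6, 2, 7, 1, 8]
y = [2, 7, 1, 8, 2, 1, 2, 3, 4, 5, 6, 9, 0, 4, 5]
start, end, r = best_subinterval(x, y, min_len=4)
print(start, end, round(r, 3))
```

The minimum length matters: without it, any two-point window is trivially perfectly correlated, so some length constraint (or a score that accounts for window size) is needed for the search to be meaningful.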
Mining Novel Multivariate Relationships in Time Series Data Using Correlation Networks
Agrawal, Saurabh, Steinbach, Michael, Boley, Daniel, Chatterjee, Snigdhansu, Atluri, Gowtham, Dang, Anh The, Liess, Stefan, Kumar, Vipin
In many domains, there is significant interest in capturing novel relationships between time series that represent activities recorded at different nodes of a highly complex system. In this paper, we introduce multipoles, a novel class of linear relationships between more than two time series. A multipole is a set of time series that have strong linear dependence among themselves, with the requirement that each time series makes a significant contribution to the linear dependence. We demonstrate that the most interesting multipoles can be identified as cliques of negative correlations in a correlation network. Such cliques are typically rare in a real-world correlation network, which allows us to find almost all multipoles efficiently using a clique-enumeration approach. Using our proposed framework, we demonstrate the utility of multipoles in discovering new physical phenomena in two scientific domains: climate science and neuroscience. In particular, we discovered several multipole relationships that are reproducible in multiple other independent datasets and lead to novel domain insights.
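The clique-enumeration step can be sketched as follows: treat each time series as a node, connect two nodes when their correlation is at or below a negative threshold, and list the maximal cliques with the Bron-Kerbosch algorithm. This is a minimal sketch, not the authors' framework; the function name, the `threshold` and `min_size` parameters, and their default values are assumptions, and the sketch does not check the requirement that every series contributes significantly to the linear dependence.

```python
def negative_cliques(corr, threshold=-0.3, min_size=3):
    """Enumerate maximal cliques in the graph of strongly negative correlations.

    corr      : square correlation matrix as a list of lists
    threshold : hypothetical cutoff; an edge joins i and j if corr[i][j] <= threshold
    min_size  : smallest clique reported
    """
    n = len(corr)
    neighbors = {
        i: {j for j in range(n) if j != i and corr[i][j] <= threshold}
        for i in range(n)
    }
    cliques = []

    def expand(current, candidates, excluded):
        # Basic Bron-Kerbosch recursion (no pivoting).
        if not candidates and not excluded:
            if len(current) >= min_size:
                cliques.append(sorted(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & neighbors[v], excluded & neighbors[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(range(n)), set())
    return sorted(cliques)


# Series 0, 1 and 2 are mutually negatively correlated; series 3 is not.
corr = [
    [1.0, -0.5, -0.5, 0.2],
    [-0.5, 1.0, -0.5, 0.1],
    [-0.5, -0.5, 1.0, 0.3],
    [0.2, 0.1, 0.3, 1.0],
]
print(negative_cliques(corr))
```

Each reported clique is only a candidate: a further test of the joint linear dependence would still be needed before calling it a multipole.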
Understanding Dominant Factors for Precipitation over the Great Lakes Region
Chatterjee, Soumyadeep (University of Minnesota, Twin Cities) | Liess, Stefan (University of Minnesota, Twin Cities) | Banerjee, Arindam (University of Minnesota, Twin Cities) | Kumar, Vipin (University of Minnesota, Twin Cities)
Statistical modeling of local precipitation involves understanding the local, regional, and global factors that are informative of precipitation variability in a region. Modern machine learning methods for feature selection can be used to identify statistically significant features from a pool of potential predictors of precipitation. In this work, we consider sparse regression, which simultaneously performs feature selection and regression, followed by random permutation tests for selecting dominant factors. We consider average winter precipitation over the Great Lakes region in order to identify its dominant influencing factors. Experiments show that global climate indices, computed at different temporal lags, offer predictive information for winter precipitation. Further, among the dominant factors identified using randomized permutation tests, multiple climate indices indicate the influence of geopotential height patterns on winter precipitation. Using composite analysis, we illustrate that certain patterns are indeed typical in high and low precipitation years, and offer plausible scientific reasons for variations in precipitation. Thus, feature selection methods can be useful in identifying influential climate processes and variables, and thereby provide useful hypotheses about the physical mechanisms affecting local precipitation.
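The two-stage idea (sparse regression, then permutation tests) can be sketched as below. The abstract does not specify the exact estimator or test statistic, so this sketch assumes an L1-penalized least-squares fit (Lasso) solved by coordinate descent and uses the absolute coefficient as the statistic; all function names, the `alpha` penalty, and the toy data are hypothetical.

```python
import random


def soft_threshold(value, alpha):
    """Shrink value toward zero by alpha (the Lasso proximal step)."""
    if value > alpha:
        return value - alpha
    if value < -alpha:
        return value + alpha
    return 0.0


def lasso_fit(X, y, alpha, n_iter=50):
    """Coordinate descent for (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1, no intercept."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # feature j against the residual that ignores feature j
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return w


def permutation_pvalues(X, y, alpha, n_perm=200, seed=0):
    """Fraction of shuffled-target fits whose |coefficient| reaches the observed one."""
    rng = random.Random(seed)
    observed = [abs(c) for c in lasso_fit(X, y, alpha)]
    exceed = [0] * len(observed)
    shuffled = list(y)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real link between predictors and target
        permuted = [abs(c) for c in lasso_fit(X, shuffled, alpha)]
        for j, value in enumerate(permuted):
            if value >= observed[j]:
                exceed[j] += 1
    return [(1 + e) / (1 + n_perm) for e in exceed]


# Toy data: the target depends on the first predictor only.
x0 = [-2, -1, 0, 1, 2, -2, -1, 0, 1, 2]
x1 = [1, 1, 1, 1, 1, -1, -1, -1, -1, -1]
X = [[a, b] for a, b in zip(x0, x1)]
y = [2 * a for a in x0]

weights = lasso_fit(X, y, alpha=0.1)
pvalues = permutation_pvalues(X, y, alpha=0.1)
print(weights, pvalues)
```

In a real analysis the predictors would be standardized climate indices at several lags, and the penalty would be chosen by cross-validation rather than fixed by hand.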
A Novel and Scalable Spatio-Temporal Technique for Ocean Eddy Monitoring
Faghmous, James H. (The University of Minnesota) | Chamber, Yashu (The University of Minnesota) | Boriah, Shyam (The University of Minnesota) | Vikebø, Frode (Institute of Marine Research) | Liess, Stefan (The University of Minnesota) | Mesquita, Michel dos Santos (Bjerknes Centre for Climate Research) | Kumar, Vipin (The University of Minnesota)
Swirls of ocean currents known as ocean eddies are a crucial component of the ocean's dynamics. In addition to dominating the ocean's kinetic energy, eddies play a significant role in the transport of water, salt, heat, and nutrients. Therefore, understanding current and future eddy patterns is a central climate challenge for the future sustainability of marine ecosystems. The emergence of sea surface height observations from satellite radar altimeters has recently enabled researchers to track eddies at a global scale. The majority of studies that identify eddies from observational data employ highly parametrized connected component algorithms on expert-filtered data, which makes reproducibility and scalability challenging. In this paper, we frame the challenge of monitoring ocean eddies as an unsupervised learning problem. We present a novel change detection algorithm that automatically identifies and monitors eddies in sea surface height data based on heuristics derived from basic eddy properties. Our method is accurate, efficient, and scalable. To demonstrate its performance, we analyze eddy activity in the Nordic Sea (60-80°N and 20°W-20°E), an area that has received limited attention and has proven to be difficult to analyze using other methods.
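For context, the connected component family of methods that the abstract contrasts itself with can be sketched in a few lines: threshold a sea surface height anomaly grid and label each 4-connected region above the threshold as a candidate eddy. This illustrates the baseline approach only, not the change detection algorithm of the paper; the function name, the threshold value, and the toy grid are assumptions.

```python
from collections import deque


def label_components(grid, threshold):
    """Label 4-connected regions whose values exceed threshold.

    Returns (labels, count), where labels has the same shape as grid and
    holds 0 for background and 1..count for each region.
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] > threshold and labels[r][c] == 0:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill of one region
                    i, j = queue.popleft()
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        a, b = i + di, j + dj
                        if (0 <= a < rows and 0 <= b < cols
                                and grid[a][b] > threshold
                                and labels[a][b] == 0):
                            labels[a][b] = count
                            queue.append((a, b))
    return labels, count


# Toy sea surface height anomalies (arbitrary units) with two raised patches.
ssh = [
    [0.0, 0.3, 0.3, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.0, 0.2],
    [0.0, 0.0, 0.0, 0.2, 0.2],
]
labels, count = label_components(ssh, threshold=0.1)
print(count)
```

The dependence on a hand-picked threshold is visible even in this toy case: raising it to 0.25 would drop the second patch entirely, which is the kind of parameter sensitivity the abstract points to.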