AITopics

2110.14521

Country:

South America > Chile (0.04)
North America > United States > Rocky Mountains (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Singh, Meghna, Goel, Saksham, Mohan, Abhiraj, Kazaglis, Louis, Srivastava, Jaideep

PARIS: Personalized Activity Recommendation for Improving Sleep Quality

arXiv.org Artificial IntelligenceOct-26-2021

The quality of sleep has a deep impact on people's physical and mental health. People with insufficient sleep are more likely to report physical and mental distress, activity limitation, anxiety, and pain. Moreover, in the past few years, there has been an explosion of applications and devices for activity monitoring and health tracking. Signals collected from these wearable devices can be used to study and improve sleep quality. In this paper, we utilize the relationship between physical activity and sleep quality to find ways of assisting people improve their sleep using machine learning techniques. People usually have several behavior modes that their bio-functions can be divided into. Performing time series clustering on activity data, we find cluster centers that would correlate to the most evident behavior modes for a specific subject. Activity recipes are then generated for good sleep quality for each behavior mode within each cluster. These activity recipes are supplied to an activity recommendation engine for suggesting a mix of relaxed to intense activities to subjects during their daily routines. The recommendations are further personalized based on the subjects' lifestyle constraints, i.e. their age, gender, body mass index (BMI), resting heart rate, etc, with the objective of the recommendation being the improvement of that night's quality of sleep. This would in turn serve a longer-term health objective, like lowering heart rate, improving the overall quality of sleep, etc.

artificial intelligence, behavior mode, machine learning, (14 more...)

2110.13745

Country:

North America > United States > Minnesota (0.04)
North America > United States > North Carolina (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > Alabama > Jefferson County > Birmingham (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Sleep (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

Santos, Yuri, Tyska, Jônata, Bogorny, Vania

Mining frequency-based sequential trajectory co-clusters

arXiv.org Artificial IntelligenceOct-26-2021

Co-clustering is a specific type of clustering that addresses the problem of finding groups of objects without necessarily considering all attributes. This technique has shown to have more consistent results in high-dimensional sparse data than traditional clustering. In trajectory co-clustering, the methods found in the literature have two main limitations: first, the space and time dimensions have to be constrained by user-defined thresholds; second, elements (trajectory points) are clustered ignoring the trajectory sequence, assuming that the points are independent among them. To address the limitations above, we propose a new trajectory co-clustering method for mining semantic trajectory co-clusters. It simultaneously clusters the trajectories and their elements taking into account the order in which they appear. This new method uses the element frequency to identify candidate co-clusters. Besides, it uses an objective cost function that automatically drives the co-clustering process, avoiding the need for constraining dimensions. We evaluate the proposed approach using real-world a publicly available dataset. The experimental results show that our proposal finds frequent and meaningful contiguous sequences revealing mobility patterns, thereby the most relevant elements.

candidate co-cluster, sequence, trajectory, (15 more...)

2110.1411

Country:

South America > Brazil > Santa Catarina > Florianópolis (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > India (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Paul, Debolina, Chakraborty, Saptarshi, Das, Swagatam, Xu, Jason

Uniform Concentration Bounds toward a Unified Framework for Robust Clustering

arXiv.org Machine LearningOct-26-2021

Recent advances in center-based clustering continue to improve upon the drawbacks of Lloyd's celebrated $k$-means algorithm over $60$ years after its introduction. Various methods seek to address poor local minima, sensitivity to outliers, and data that are not well-suited to Euclidean measures of fit, but many are supported largely empirically. Moreover, combining such approaches in a piecemeal manner can result in ad hoc methods, and the limited theoretical results supporting each individual contribution may no longer hold. Toward addressing these issues in a principled way, this paper proposes a cohesive robust framework for center-based clustering under a general class of dissimilarity measures. In particular, we present a rigorous theoretical treatment within a Median-of-Means (MoM) estimation framework, showing that it subsumes several popular $k$-means variants. In addition to unifying existing methods, we derive uniform concentration bounds that complete their analyses, and bridge these results to the MoM framework via Dudley's chaining arguments. Importantly, we neither require any assumptions on the distribution of the outlying observations nor on the relative number of observations $n$ to features $p$. We establish strong consistency and an error rate of $O(n^{-1/2})$ under mild conditions, surpassing the best-known results in the literature. The methods are empirically validated thoroughly on real and synthetic datasets.

algorithm, outlier, theorem 3, (14 more...)

2110.14148

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > Alameda County > Oakland (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Pathak, Reese, Sen, Rajat, Rao, Nikhil, Erichson, N. Benjamin, Jordan, Michael I., Dhillon, Inderjit S.

Cluster-and-Conquer: A Framework For Time-Series Forecasting

arXiv.org Machine LearningOct-26-2021

We propose a three-stage framework for forecasting high-dimensional time-series data. Our method first estimates parameters for each univariate time series. Next, we use these parameters to cluster the time series. These clusters can be viewed as multivariate time series, for which we then compute parameters. The forecasted values of a single time series can depend on the history of other time series in the same cluster, accounting for intra-cluster similarity while minimizing potential noise in predictions by ignoring inter-cluster effects. Our framework -- which we refer to as "cluster-and-conquer" -- is highly general, allowing for any time-series forecasting and clustering method to be used in each step. It is computationally efficient and embarrassingly parallel. We motivate our framework with a theoretical analysis in an idealized mixed linear regression setting, where we provide guarantees on the quality of the estimates. We accompany these guarantees with experimental results that demonstrate the advantages of our framework: when instantiated with simple linear autoregressive models, we are able to achieve state-of-the-art results on several benchmark datasets, sometimes outperforming deep-learning-based approaches.

dataset, forecasting, time sery, (14 more...)

2110.14011

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Minnesota (0.04)
(10 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Schratz, Patrick, Becker, Marc, Lang, Michel, Brenning, Alexander

Mlr3spatiotempcv: Spatiotemporal resampling methods for machine learning in R

arXiv.org Machine LearningOct-25-2021

Spatial and spatiotemporal prediction tasks are common in applications ranging from environmental sciences to archaeology and epidemiology. While sophisticated mathematical frameworks have long been developed in spatial statistics to characterize predictive uncertainties under well-defined mathematical assumptions such as intrinsic stationarity (e.g., Cressie 1993), computational estimation procedures have only been proposed more recently to assess predictive performances of spatial and spatiotemporal prediction models (Brenning 2005, 2012; Pohjankukka, Pahikkala, Nevalainen, and Heikkonen 2017; Roberts, Bahn, Ciuti, Boyce, Elith, Guillera-Arroita, Hauenstein, Lahoz-Monfort, Schröder, Thuiller, Warton, Wintle, Hartig, and Dormann 2017). Although alternatives such as the bootstrap exist since some decades (Efron and Gong 1983; Hand 1997), cross-validation (CV) is a particularly well-established, easy-to-implement algorithm for model assessment of supervised machine-learning models (Efron and Gong 1983, and next section) and model selection (Arlot and Celisse 2010). In its basic form, CV is based on resampling the data without paying attention to any possible dependence structure, which may arise from, e.g., grouped or structured data, or underlying environmental processes inducing some sort of spatial coherence at the landscape scale. In treating dependent observations as independent, or ignoring autocorrelation, CV test samples may in fact be heavily correlated with, or even pseudo-replicates of, the data used for training the model, which introduces a potentially severe bias in assessing the transferability of flexible machine-learning (ML) models.

model assessment, partition, spatiotemporal, (14 more...)

2110.12674

Country:

North America > United States > New York (0.04)
Europe > Spain > Aragón (0.04)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
(4 more...)

Genre: Research Report (0.50)

Industry:

Energy (0.46)
Food & Agriculture > Agriculture (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Mozaffari, M. Hamed, Tay, Li-Lin

Spectral unmixing of Raman microscopic images of single human cells using Independent Component Analysis

arXiv.org Artificial IntelligenceOct-25-2021

Application of independent component analysis (ICA) as an unmixing and image clustering technique for high spatial resolution Raman maps is reported. A hyperspectral map of a fixed human cell was collected by a Raman micro spectrometer in a raster pattern on a 0.5-µm grid. Unlike previously used unsupervised machine learning techniques such as principal component analysis, ICA is based on non-Gaussianity and statistical independence of data which is the case for mixture Raman spectra. Hence, ICA is a great candidate for assembling pseudo-colour maps from the spectral hypercube of Raman spectra. Our experimental results revealed that ICA is capable of reconstructing false colour maps of Raman hyperspectral data of human cells, showing the nuclear region constituents as well as subcellular organelle in the cytoplasm and distribution of mitochondria in the perinuclear region.

component analysis, raman spectra, spectra, (13 more...)

2110.13189

Country: North America > Canada > Ontario > National Capital Region > Ottawa (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.89)

Chehreghani, Morteza Haghir

Shift of Pairwise Similarities for Data Clustering

arXiv.org Artificial IntelligenceOct-25-2021

Several clustering methods (e.g., Normalized Cut and Ratio Cut) divide the Min Cut cost function by a cluster-dependent factor (e.g., the size or the degree of the clusters), in order to yield a more balanced partitioning. We, instead, investigate adding such regularizations to the original cost function. We first consider the case where the regularization term is the sum of the squared size of the clusters, and then generalize it to adaptive regularization of the pairwise similarities. This leads to shifting (adaptively) the pairwise similarities which might make some of them negative. We then study the connection of this method to Correlation Clustering and then propose an efficient local search optimization algorithm with fast theoretical convergence rate to solve the new clustering problem. In the following, we investigate the shift of pairwise similarities on some common clustering methods, and finally, we demonstrate the superior performance of the method by extensive experiments on different datasets.

clustering, cost function, pairwise similarity, (15 more...)

2110.13103

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Kansas (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

arXiv.org Machine LearningOct-25-2021

Integrative Clustering of Multi-View Data by Nonnegative Matrix Factorization

Liu, Shuo Shuo, Lin, Lin

Learning multi-view data is an emerging problem in machine learning research, and nonnegative matrix factorization (NMF) is a popular dimensionality-reduction method for integrating information from multiple views. These views often provide not only consensus but also diverse information. However, most multi-view NMF algorithms assign equal weight to each view or tune the weight via line search empirically, which can be computationally expensive or infeasible without any prior knowledge of the views. In this paper, we propose a weighted multi-view NMF (WM-NMF) algorithm. In particular, we aim to address the critical technical gap, which is to learn both view-specific and observation-specific weights to quantify each view's information content. The introduced weighting scheme can alleviate unnecessary views' adverse effects and enlarge the positive effects of the important views by assigning smaller and larger weights, respectively. In addition, we provide theoretical investigations about the convergence, perturbation analysis, and generalization error of the WM-NMF algorithm. Experimental results confirm the effectiveness and advantages of the proposed algorithm in terms of achieving better clustering performance and dealing with the corrupted data compared to the existing algorithms.

algorithm, dataset, diag 2, (12 more...)

2110.1324

Country:

North America > United States > Pennsylvania > Centre County > University Park (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.70)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

arXiv.org Artificial IntelligenceOct-22-2021

Multi-view Contrastive Graph Clustering

Pan, Erlin, Kang, Zhao

With the explosive growth of information technology, multi-view graph data have become increasingly prevalent and valuable. Most existing multi-view clustering techniques either focus on the scenario of multiple graphs or multi-view attributes. In this paper, we propose a generic framework to cluster multi-view attributed graph data. Specifically, inspired by the success of contrastive learning, we propose multi-view contrastive graph clustering (MCGC) method to learn a consensus graph since the original graph could be noisy or incomplete and is not directly applicable. Our method composes of two key steps: we first filter out the undesirable high-frequency noise while preserving the graph geometric features via graph filtering and obtain a smooth representation of nodes; we then learn a consensus graph regularized by graph contrastive loss. Results on several benchmark datasets show the superiority of our method with respect to state-of-the-art approaches. In particular, our simple approach outperforms existing deep learning-based methods.

graph, information, representation, (15 more...)

2110.11842

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Asia > China > Sichuan Province > Chengdu (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)