AITopics

2305.04359

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
(12 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial IntelligenceMay-6-2023

Transformer-Based Hierarchical Clustering for Brain Network Analysis

Dai, Wei, Cui, Hejie, Kan, Xuan, Guo, Ying, van Rooij, Sanne, Yang, Carl

Brain networks, graphical models such as those constructed from MRI, have been widely used in pathological prediction and analysis of brain functions. Within the complex brain system, differences in neuronal connection strengths parcellate the brain into various functional modules (network communities), which are critical for brain analysis. However, identifying such communities within the brain has been a nontrivial issue due to the complexity of neuronal interactions. In this work, we propose a novel interpretable transformer-based model for joint hierarchical cluster identification and brain network classification. Extensive experimental results on real-world brain network datasets show that with the help of hierarchical clustering, the model achieves increased accuracy and reduced runtime complexity while providing plausible insight into the functional organization of brain regions. The implementation is available at https://github.com/DDVD233/THC.

artificial intelligence, assignment, machine learning, (18 more...)

2305.04142

Country: Europe > Netherlands > South Holland > Leiden (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Kumar, Mahender, Rani, Ruby, Botarelli, Mirko, Epiophaniou, Gregory, Maple, Carsten

Science and Technology Ontology: A Taxonomy of Emerging Topics

arXiv.org Artificial IntelligenceMay-6-2023

Ontologies play a critical role in Semantic Web technologies by providing a structured and standardized way to represent knowledge and enabling machines to understand the meaning of data. Several taxonomies and ontologies have been generated, but individuals target one domain, and only some of those have been found expensive in time and manual effort. Also, they need more coverage of unconventional topics representing a more holistic and comprehensive view of the knowledge landscape and interdisciplinary collaborations. Thus, there needs to be an ontology covering Science and Technology and facilitate multidisciplinary research by connecting topics from different fields and domains that may be related or have commonalities. To address these issues, we present an automatic Science and Technology Ontology (S&TO) that covers unconventional topics in different science and technology domains. The proposed S&TO can promote the discovery of new research areas and collaborations across disciplines. The ontology is constructed by applying BERTopic to a dataset of 393,991 scientific articles collected from Semantic Scholar from October 2021 to August 2022, covering four fields of science. Currently, S&TO includes 5,153 topics and 13,155 semantic relations. S&TO model can be updated by running BERTopic on more recent datasets.

artificial intelligence, machine learning, ontology, (14 more...)

2305.04055

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > California > Monterey County > Monterey (0.04)
Europe > United Kingdom > England > West Midlands > Coventry (0.04)
(6 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)

Riess, Lorenz, Beiglböck, Mathias, Temme, Johannes, Wolf, Andreas, Backhoff, Julio

The geometry of financial institutions -- Wasserstein clustering of financial data

The increasing availability of granular and big data on various objects of interest has made it necessary to develop methods for condensing this information into a representative and intelligible map. Financial regulation is a field that exemplifies this need, as regulators require diverse and often highly granular data from financial institutions to monitor and assess their activities. However, processing and analyzing such data can be a daunting task, especially given the challenges of dealing with missing values and identifying clusters based on specific features. To address these challenges, we propose a variant of Lloyd's algorithm that applies to probability distributions and uses generalized Wasserstein barycenters to construct a metric space which represents given data on various objects in condensed form. By applying our method to the financial regulation context, we demonstrate its usefulness in dealing with the specific challenges faced by regulators in this domain. We believe that our approach can also be applied more generally to other fields where large and complex data sets need to be represented in concise form.

artificial intelligence, barycenter, machine learning, (19 more...)

2305.03565

Country:

Europe > Austria > Vienna (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
Asia > China (0.04)

Genre: Research Report (0.64)

Industry: Banking & Finance (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Contrastive Graph Clustering in Curvature Spaces

Sun, Li, Wang, Feiyang, Ye, Junda, Peng, Hao, Yu, Philip S.

Graph clustering is a longstanding research topic, and has achieved remarkable success with the deep learning methods in recent years. Nevertheless, we observe that several important issues largely remain open. On the one hand, graph clustering from the geometric perspective is appealing but has rarely been touched before, as it lacks a promising space for geometric clustering. On the other hand, contrastive learning boosts the deep graph clustering but usually struggles in either graph augmentation or hard sample mining. To bridge this gap, we rethink the problem of graph clustering from geometric perspective and, to the best of our knowledge, make the first attempt to introduce a heterogeneous curvature space to graph clustering problem. Correspondingly, we present a novel end-to-end contrastive graph clustering model named CONGREGATE, addressing geometric graph clustering with Ricci curvatures. To support geometric clustering, we construct a theoretically grounded Heterogeneous Curvature Space where deep representations are generated via the product of the proposed fully Riemannian graph convolutional nets. Thereafter, we train the graph clusters by an augmentation-free reweighted contrastive approach where we pay more attention to both hard negatives and hard positives in our curvature space. Empirical results on real-world graphs show that our model outperforms the state-of-the-art competitors.

artificial intelligence, graph, machine learning, (17 more...)

2305.03555

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Japan > Honshū > Tōhoku (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Adaptive Graph Convolutional Subspace Clustering

Wei, Lai, Chen, Zhengwei, Yin, Jun, Zhu, Changming, Zhou, Rigui, Liu, Jin

Spectral-type subspace clustering algorithms have shown excellent performance in many subspace clustering applications. The existing spectral-type subspace clustering algorithms either focus on designing constraints for the reconstruction coefficient matrix or feature extraction methods for finding latent features of original data samples. In this paper, inspired by graph convolutional networks, we use the graph convolution technique to develop a feature extraction method and a coefficient matrix constraint simultaneously. And the graph-convolutional operator is updated iteratively and adaptively in our proposed algorithm. Hence, we call the proposed method adaptive graph convolutional subspace clustering (AGCSC). We claim that by using AGCSC, the aggregated feature representation of original data samples is suitable for subspace clustering, and the coefficient matrix could reveal the subspace structure of the original data set more faithfully. Finally, plenty of subspace clustering experiments prove our conclusions and show that AGCSC outperforms some related methods as well as some deep models.

artificial intelligence, data mining, machine learning, (20 more...)

2305.03414

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
(16 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.55)

Deep Multi-View Semi-Supervised Clustering with Sample Pairwise Constraints

Chen, Rui, Tang, Yongqiang, Zhang, Wensheng, Feng, Wenlong

Multi-view clustering has attracted much attention thanks to the capacity of multi-source information integration. Although numerous advanced methods have been proposed in past decades, most of them generally overlook the significance of weakly-supervised information and fail to preserve the feature properties of multiple views, thus resulting in unsatisfactory clustering performance. To address these issues, in this paper, we propose a novel Deep Multi-view Semi-supervised Clustering (DMSC) method, which jointly optimizes three kinds of losses during networks finetuning, including multi-view clustering loss, semi-supervised pairwise constraint loss and multiple autoencoders reconstruction loss. Specifically, a KL divergence based multi-view clustering loss is imposed on the common representation of multi-view data to perform heterogeneous feature optimization, multi-view weighting and clustering prediction simultaneously. Then, we innovatively propose to integrate pairwise constraints into the process of multi-view clustering by enforcing the learned multi-view representation of must-link samples (cannot-link samples) to be similar (dissimilar), such that the formed clustering architecture can be more credible. Moreover, unlike existing rivals that only preserve the encoders for each heterogeneous branch during networks finetuning, we further propose to tune the intact autoencoders frame that contains both encoders and decoders. In this way, the issue of serious corruption of view-specific and view-shared feature space could be alleviated, making the whole training procedure more stable. Through comprehensive experiments on eight popular image datasets, we demonstrate that our proposed approach performs better than the state-of-the-art multi-view and single-view competitors.

artificial intelligence, constraint, machine learning, (18 more...)

2206.04949

Country:

Pacific Ocean > North Pacific Ocean > South China Sea (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > Ontario > Toronto (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Generic and Robust Root Cause Localization for Multi-Dimensional Data in Online Service Systems

Li, Zeyan, Chen, Junjie, Chen, Yihao, Luo, Chengyang, Zhao, Yiwei, Sun, Yongqian, Sui, Kaixin, Wang, Xiping, Liu, Dapeng, Jin, Xing, Wang, Qi, Pei, Dan

Localizing root causes for multi-dimensional data is critical to ensure online service systems' reliability. When a fault occurs, only the measure values within specific attribute combinations are abnormal. Such attribute combinations are substantial clues to the underlying root causes and thus are called root causes of multidimensional data. This paper proposes a generic and robust root cause localization approach for multi-dimensional data, PSqueeze. We propose a generic property of root cause for multi-dimensional data, generalized ripple effect (GRE). Based on it, we propose a novel probabilistic cluster method and a robust heuristic search method. Moreover, we identify the importance of determining external root causes and propose an effective method for the first time in literature. Our experiments on two real-world datasets with 5400 faults show that the F1-score of PSqueeze outperforms baselines by 32.89%, while the localization time is around 10 seconds across all cases. The F1-score in determining external root causes of PSqueeze achieves 0.90. Furthermore, case studies in several production systems demonstrate that PSqueeze is helpful to fault diagnosis in the real world.

information retrieval, machine learning, natural language, (20 more...)

2305.03331

Country:

Asia > China > Beijing > Beijing (0.06)
Asia > China > Tianjin Province > Tianjin (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(8 more...)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.87)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Zhang, Yihao, Severinsen, Odin A., Leonard, John J., Carlone, Luca, Khosoussi, Kasra

Data-Association-Free Landmark-based SLAM

arXiv.org Artificial IntelligenceMay-4-2023

We study landmark-based SLAM with unknown data association: our robot navigates in a completely unknown environment and has to simultaneously reason over its own trajectory, the positions of an unknown number of landmarks in the environment, and potential data associations between measurements and landmarks. This setup is interesting since: (i) it arises when recovering from data association failures or from SLAM with information-poor sensors, (ii) it sheds light on fundamental limits (and hardness) of landmark-based SLAM problems irrespective of the front-end data association method, and (iii) it generalizes existing approaches where data association is assumed to be known or partially known. We approach the problem by splitting it into an inner problem of estimating the trajectory, landmark positions and data associations and an outer problem of estimating the number of landmarks. Our approach creates useful and novel connections with existing techniques from discrete-continuous optimization (e.g., k-means clustering), which has the potential to trigger novel research. We demonstrate the proposed approaches in extensive simulations and on real datasets and show that the proposed techniques outperform typical data association baselines and are even competitive against an "oracle" baseline which has access to the number of landmarks and an initial guess for each landmark.

artificial intelligence, machine learning, optimization problem, (18 more...)

2302.13264

Country:

Oceania > Australia (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.82)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.66)

Deng, Chengyuan, Gavva, Surya Teja, S., Karthik C., Patel, Parth, Srinivasan, Adarsh

Impossibility of Depth Reduction in Explainable Clustering

arXiv.org Artificial IntelligenceMay-4-2023

Over the last few years Explainable Clustering has gathered a lot of attention. Dasgupta et al. [ICML'20] initiated the study of explainable k-means and k-median clustering problems where the explanation is captured by a threshold decision tree which partitions the space at each node using axis parallel hyperplanes. Recently, Laber et al. [Pattern Recognition'23] made a case to consider the depth of the decision tree as an additional complexity measure of interest. In this work, we prove that even when the input points are in the Euclidean plane, then any depth reduction in the explanation incurs unbounded loss in the k-means and k-median cost. Formally, we show that there exists a data set X in the Euclidean plane, for which there is a decision tree of depth k-1 whose k-means/k-median cost matches the optimal clustering cost of X, but every decision tree of depth less than k-1 has unbounded cost w.r.t. the optimal cost of clustering. We extend our results to the k-center objective as well, albeit with weaker guarantees.

artificial intelligence, decision tree, machine learning, (17 more...)

2305.0285

Country:

North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)
North America > United States > Virginia > Alexandria County > Alexandria (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(3 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)