AITopics

These methods analysis by performing integration and clustering are notorious for finding structure where no structure exists simultaneously. We adapt the VampPrior (Tomczak (Chari & Pachter, 2023). When the embedding function & Welling, 2018) into a Dirichlet process does not account for systematic shifts in expression profiling Gaussian mixture model, resulting in the Vamp-between datasets and/or batches that use different scRNAseq Prior Mixture Model (VMM), a novel prior for technologies, misleading structure can arise, confounding DLVMs. We propose an inference procedure that standard analysis pipelines. Accordingly, Lähnemann alternates between variational inference and Empirical et al. (2020) identify atlas-level integration as one of the Bayes to cleanly distinguish variational grand challenges of single-cell data science.

dataset, scvi, vmm, (16 more...)

2402.04412

Country:

North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(2 more...)

Genre: Research Report (0.83)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)
Information Technology > Data Science (0.66)

Label-Free Multivariate Time Series Anomaly Detection

Zhou, Qihang, He, Shibo, Liu, Haoyu, Chen, Jiming, Meng, Wenchao

Anomaly detection in multivariate time series (MTS) has been widely studied in one-class classification (OCC) setting. The training samples in OCC are assumed to be normal, which is difficult to guarantee in practical situations. Such a case may degrade the performance of OCC-based anomaly detection methods which fit the training distribution as the normal distribution. In this paper, we propose MTGFlow, an unsupervised anomaly detection approach for MTS anomaly detection via dynamic Graph and entity-aware normalizing Flow. MTGFlow first estimates the density of the entire training samples and then identifies anomalous instances based on the density of the test samples within the fitted distribution. This relies on a widely accepted assumption that anomalous instances exhibit more sparse densities than normal ones, with no reliance on the clean training dataset. However, it is intractable to directly estimate the density due to complex dependencies among entities and their diverse inherent characteristics. To mitigate this, we utilize the graph structure learning model to learn interdependent and evolving relations among entities, which effectively captures complex and accurate distribution patterns of MTS. In addition, our approach incorporates the unique characteristics of individual entities by employing an entity-aware normalizing flow. This enables us to represent each entity as a parameterized normal distribution. Furthermore, considering that some entities present similar characteristics, we propose a cluster strategy that capitalizes on the commonalities of entities with similar characteristics, resulting in more precise and detailed density estimation. We refer to this cluster-aware extension as MTGFlow_cluster. Extensive experiments are conducted on six widely used benchmark datasets, in which MTGFlow and MTGFlow cluster demonstrate their superior detection performance.

anomaly detection, detection, mtgflow cluster, (12 more...)

2312.11549

Country:

Asia > China > Zhejiang Province > Hangzhou (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Arizona > Maricopa County > Tempe (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Information Technology (0.93)
Energy (0.68)
Water & Waste Management > Water Management > Lifecycle > Treatment (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Adaptive Denoising-Enhanced LiDAR Odometry for Degeneration Resilience in Diverse Terrains

Ji, Mazeyu, Shi, Wenbo, Cui, Yujie, Liu, Chengju, Chen, Qijun

The flexibility of Simultaneous Localization and Mapping (SLAM) algorithms in various environments has consistently been a significant challenge. To address the issue of LiDAR odometry drift in high-noise settings, integrating clustering methods to filter out unstable features has become an effective module of SLAM frameworks. However, reducing the amount of point cloud data can lead to potential loss of information and possible degeneration. As a result, this research proposes a LiDAR odometry that can dynamically assess the point cloud's reliability. The algorithm aims to improve adaptability in diverse settings by selecting important feature points with sensitivity to the level of environmental degeneration. Firstly, a fast adaptive Euclidean clustering algorithm based on range image is proposed, which, combined with depth clustering, extracts the primary structural points of the environment defined as ambient skeleton points. Then, the environmental degeneration level is computed through the dense normal features of the skeleton points, and the point cloud cleaning is dynamically adjusted accordingly. The algorithm is validated on the KITTI benchmark and real environments, demonstrating higher accuracy and robustness in different environments.

algorithm, cloud, point cloud, (16 more...)

2309.14641

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > Canada > Ontario > Toronto (0.14)
Asia > China > Shanghai > Shanghai (0.05)
(4 more...)

Genre: Research Report (0.82)

Industry: Transportation (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

John, Jeen Mary, Shobayo, Olamilekan, Ogunleye, Bayode

An Exploration of Clustering Algorithms for Customer Segmentation in the UK Retail Market

Recently, peoples awareness of online purchases has significantly risen. This has given rise to online retail platforms and the need for a better understanding of customer purchasing behaviour. Retail companies are pressed with the need to deal with a high volume of customer purchases, which requires sophisticated approaches to perform more accurate and efficient customer segmentation. Customer segmentation is a marketing analytical tool that aids customer-centric service and thus enhances profitability. In this paper, we aim to develop a customer segmentation model to improve decision-making processes in the retail market industry. To achieve this, we employed a UK-based online retail dataset obtained from the UCI machine learning repository. The retail dataset consists of 541,909 customer records and eight features. Our study adopted the RFM (recency, frequency, and monetary) framework to quantify customer values. Thereafter, we compared several state-of-the-art (SOTA) clustering algorithms, namely, K-means clustering, the Gaussian mixture model (GMM), density-based spatial clustering of applications with noise (DBSCAN), agglomerative clustering, and balanced iterative reducing and clustering using hierarchies (BIRCH). The results showed the GMM outperformed other approaches, with a Silhouette Score of 0.80.

algorithm, dataset, segmentation, (11 more...)

doi: 10.3390/analytics2040042

2402.04103

Country:

Europe > United Kingdom (1.00)
North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry: Retail > Online (0.74)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Tinós, Renato, Zhao, Liang, Chicano, Francisco, Whitley, Darrell

NK Hybrid Genetic Algorithm for Clustering

The NK hybrid genetic algorithm for clustering is proposed in this paper. In order to evaluate the solutions, the hybrid algorithm uses the NK clustering validation criterion 2 (NKCV2). NKCV2 uses information about the disposition of $N$ small groups of objects. Each group is composed of $K+1$ objects of the dataset. Experimental results show that density-based regions can be identified by using NKCV2 with fixed small $K$. In NKCV2, the relationship between decision variables is known, which in turn allows us to apply gray box optimization. Mutation operators, a partition crossover, and a local search strategy are proposed, all using information about the relationship between decision variables. In partition crossover, the evaluation function is decomposed into $q$ independent components; partition crossover then deterministically returns the best among $2^q$ possible offspring with computational complexity $O(N)$. The NK hybrid genetic algorithm allows the detection of clusters with arbitrary shapes and the automatic estimation of the number of clusters. In the experiments, the NK hybrid genetic algorithm produced very good results when compared to another genetic algorithm approach and to state-of-art clustering algorithms.

algorithm, criterion, dataset, (13 more...)

doi: 10.1109/TEVC.2018.2828643

2402.03813

Country:

South America > Brazil > São Paulo (0.04)
North America > United States > Colorado > Larimer County > Fort Collins (0.04)
North America > United States > Arizona (0.04)
(4 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Ogunleye, Bayode, Maswera, Tonderai, Hirsch, Laurence, Gaudoin, Jotham, Brunsdon, Teresa

Comparison of Topic Modelling Approaches in the Banking Context

Topic modelling is a prominent task for automatic topic extraction in many applications such as sentiment analysis and recommendation systems. The approach is vital for service industries to monitor their customer discussions. The use of traditional approaches such as Latent Dirichlet Allocation (LDA) for topic discovery has shown great performances, however, they are not consistent in their results as these approaches suffer from data sparseness and inability to model the word order in a document. Thus, this study presents the use of Kernel Principal Component Analysis (KernelPCA) and K-means Clustering in the BERTopic architecture. We have prepared a new dataset using tweets from customers of Nigerian banks and we use this to compare the topic modelling approaches. Our findings showed KernelPCA and K-means in the BERTopic architecture-produced coherent topics with a coherence score of 0.8463.

coherence score, extraction, proceedings, (15 more...)

doi: 10.3390/app13020797

2402.03176

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Middle East > Jordan (0.04)
Asia > Singapore (0.04)
(31 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Banking & Finance (1.00)
Government (0.68)
Education > Educational Setting > Online (0.68)
(2 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Takagi, Yoshiki, Tabalba, Roderick, Kirshenbaum, Nurit, Leigh, Jason

Abstracted Trajectory Visualization for Explainability in Reinforcement Learning

Explainable AI (XAI) has demonstrated the potential to help reinforcement learning (RL) practitioners to understand how RL models work. However, XAI for users who do not have RL expertise (non-RL experts), has not been studied sufficiently. This results in a difficulty for the non-RL experts to participate in the fundamental discussion of how RL models should be designed for an incoming society where humans and AI coexist. Solving such a problem would enable RL experts to communicate with the non-RL experts in producing machine learning solutions that better fit our society. We argue that abstracted trajectories, that depicts transitions between the major states of the RL model, will be useful for non-RL experts to build a mental model of the agents. Our early results suggest that by leveraging a visualization of the abstracted trajectories, users without RL expertise are able to infer the behavior patterns of RL.

abstracted trajectory, participant, trajectory, (13 more...)

2402.07928

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
(2 more...)

Aronsson, Linus, Chehreghani, Morteza Haghir

Effective Acquisition Functions for Active Correlation Clustering

Correlation clustering is a powerful unsupervised learning paradigm that supports positive and negative similarities. In this paper, we assume the similarities are not known in advance. Instead, we employ active learning to iteratively query similarities in a cost-efficient way. In particular, we develop three effective acquisition functions to be used in this setting. One is based on the notion of inconsistency (i.e., when similarities violate the transitive property). The remaining two are based on information-theoretic quantities, i.e., entropy and information gain.

acquisition function, chehreghani, correlation, (14 more...)

2402.03587

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(8 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Text-Guided Image Clustering

Stephan, Andreas, Miklautz, Lukas, Sidak, Kevin, Wahle, Jan Philip, Gipp, Bela, Plant, Claudia, Roth, Benjamin

Image clustering divides a collection of images into meaningful groups, typically interpreted post-hoc via human-given annotations. Those are usually in the form of text, begging the question of using text as an abstraction for image clustering. Current image clustering methods, however, neglect the use of generated textual descriptions. We, therefore, propose Text-Guided Image Clustering, i.e., generating text using image captioning and visual question-answering (VQA) models and subsequently clustering the generated text. Further, we introduce a novel approach to inject task- or domain knowledge for clustering by prompting VQA models. Across eight diverse image clustering datasets, our results show that the obtained text representations often outperform image features. Additionally, we propose a counting-based cluster explainability method. Our evaluations show that the derived keyword-based explanations describe clusters better than the respective cluster accuracy suggests. Overall, this research challenges traditional approaches and paves the way for a paradigm shift in image clustering, using generated text.

dataset, keyword, representation, (16 more...)

2402.02996

Country:

Europe > Austria > Vienna (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(6 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Leisure & Entertainment > Sports (1.00)
Transportation > Ground > Road (0.68)
Transportation > Freight & Logistics Services (0.67)
Health & Medicine (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Unsupervised semantic segmentation of high-resolution UAV imagery for road scene parsing

Ma, Zihan, Li, Yongshang, Ma, Ronggui, Liang, Chen

Two challenges are presented when parsing road scenes in UAV images. First, the high resolution of UAV images makes processing difficult. Second, supervised deep learning methods require a large amount of manual annotations to train robust and accurate models. In this paper, an unsupervised road parsing framework that leverages recent advances in vision language models and fundamental computer vision model is introduced.Initially, a vision language model is employed to efficiently process ultra-large resolution UAV images to quickly detect road regions of interest in the images. Subsequently, the vision foundation model SAM is utilized to generate masks for the road regions without category information. Following that, a self-supervised representation learning network extracts feature representations from all masked regions. Finally, an unsupervised clustering algorithm is applied to cluster these feature representations and assign IDs to each cluster. The masked regions are combined with the corresponding IDs to generate initial pseudo-labels, which initiate an iterative self-training process for regular semantic segmentation. The proposed method achieves an impressive 89.96% mIoU on the development dataset without relying on any manual annotation. Particularly noteworthy is the extraordinary flexibility of the proposed method, which even goes beyond the limitations of human-defined categories and is able to acquire knowledge of new categories from the dataset itself.

arxiv, segmentation, semantic segmentation, (12 more...)

2402.02985

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Maryland > Prince George's County > College Park (0.04)
Asia > Middle East > Iraq > Muthanna Governorate (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report (0.64)

Industry:

Transportation > Infrastructure & Services (0.67)
Transportation > Ground > Road (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)