AITopics

2501.10084

Country:

Asia (0.93)
Africa (0.68)
North America > United States > Hawaii (0.18)
North America > United States > Colorado (0.15)

Genre: Research Report (0.84)

Industry:

Energy > Renewable > Solar (1.00)
Energy > Power Industry (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Vardakas, Georgios, Karra, Antonia, Pitoura, Evaggelia, Likas, Aristidis

Counterfactual Explanations for k-means and Gaussian Clustering

arXiv.org Artificial IntelligenceJan-17-2025

Counterfactuals have been recognized as an effective approach to explain classifier decisions. Nevertheless, they have not yet been considered in the context of clustering. In this work, we propose the use of counterfactuals to explain clustering solutions. First, we present a general definition for counterfactuals for model-based clustering that includes plausibility and feasibility constraints. Then we consider the counterfactual generation problem for k-means and Gaussian clustering assuming Euclidean distance. Our approach takes as input the factual, the target cluster, a binary mask indicating actionable or immutable features and a plausibility factor specifying how far from the cluster boundary the counterfactual should be placed. In the k-means clustering case, analytical mathematical formulas are presented for computing the optimal solution, while in the Gaussian clustering case (assuming full, diagonal, or spherical covariances) our method requires the numerical solution of a nonlinear equation with a single parameter only. We demonstrate the advantages of our approach through illustrative examples and quantitative experimental comparisons.

artificial intelligence, counterfactual, machine learning, (15 more...)

2501.10234

Country: Europe > Greece (0.15)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Enhancing Lexicon-Based Text Embeddings with Large Language Models

Lei, Yibin, Shen, Tao, Cao, Yu, Yates, Andrew

Recent large language models (LLMs) have demonstrated exceptional performance on general-purpose text embedding tasks. While dense embeddings have dominated related research, we introduce the first Lexicon-based EmbeddiNgS (LENS) leveraging LLMs that achieve competitive performance on these tasks. Regarding the inherent tokenization redundancy issue and unidirectional attention limitations in traditional causal LLMs, LENS consolidates the vocabulary space through token embedding clustering, and investigates bidirectional attention and various pooling strategies. Specifically, LENS simplifies lexicon matching by assigning each dimension to a specific token cluster, where semantically similar tokens are grouped together, and unlocking the full potential of LLMs through bidirectional attention. Extensive experiments demonstrate that LENS outperforms dense embeddings on the Massive Text Embedding Benchmark (MTEB), delivering compact feature representations that match the sizes of dense counterparts. Notably, combining LENSE with dense embeddings achieves state-of-the-art performance on the retrieval subset of MTEB (i.e. BEIR).

dataset, language model, proceedings, (12 more...)

2501.09749

Country:

Europe > Portugal > Lisbon > Lisbon (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Kiyama, Hajime, Aida, Taichi, Komachi, Mamoru, Ogiso, Toshinobu, Takamura, Hiroya, Mochihashi, Daichi

Analyzing Continuous Semantic Shifts with Diachronic Word Similarity Matrices

The meanings and relationships of words shift over time. This phenomenon is referred to as semantic shift.Research focused on understanding how semantic shifts occur over multiple time periods is essential for gaining a detailed understanding of semantic shifts.However, detecting change points only between adjacent time periods is insufficient for analyzing detailed semantic shifts, and using BERT-based methods to examine word sense proportions incurs a high computational cost.To address those issues, we propose a simple yet intuitive framework for how semantic shifts occur over multiple time periods by leveraging a similarity matrix between the embeddings of the same word through time.We compute a diachronic word similarity matrix using fast and lightweight word embeddings across arbitrary time periods, making it deeper to analyze continuous semantic shifts.Additionally, by clustering the similarity matrices for different words, we can categorize words that exhibit similar behavior of semantic shift in an unsupervised manner.

matrix, semantic shift, similarity matrix, (14 more...)

2501.09538

Country:

Asia > Thailand > Bangkok > Bangkok (0.05)
North America > United States > Ohio (0.04)
North America > United States > New York > New York County > New York City (0.04)
(14 more...)

Genre: Research Report (0.64)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.70)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)

RE-POSE: Synergizing Reinforcement Learning-Based Partitioning and Offloading for Edge Object Detection

Shi, Jianrui, Zhao, Yong, Cui, Zeyang, Shen, Xiaoming, Zeng, Minhang, Liu, Xiaojie

Object detection plays a crucial role in smart video analysis, with applications ranging from autonomous driving and security to smart cities. However, achieving real-time object detection on edge devices presents significant challenges due to their limited computational resources and the high demands of deep neural network (DNN)-based detection models, particularly when processing high-resolution video. Conventional strategies, such as input down-sampling and network up-scaling, often compromise detection accuracy for faster performance or lead to higher inference latency. To address these issues, this paper introduces RE-POSE, a Reinforcement Learning (RL)-Driven Partitioning and Edge Offloading framework designed to optimize the accuracy-latency trade-off in resource-constrained edge environments. Our approach features an RL-Based Dynamic Clustering Algorithm (RL-DCA) that partitions video frames into non-uniform blocks based on object distribution and the computational characteristics of DNNs. Furthermore, a parallel edge offloading scheme is implemented to distribute these blocks across multiple edge servers for concurrent processing. Experimental evaluations show that RE-POSE significantly enhances detection accuracy and reduces inference latency, surpassing existing methods.

accuracy, detection, rl-dca, (11 more...)

2501.09465

Country:

Asia > China > Hong Kong (0.05)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
Asia > Singapore > Central Region > Singapore (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.50)

Industry:

Transportation > Ground > Road (0.34)
Information Technology > Robotics & Automation (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.49)

Discriminative Representation learning via Attention-Enhanced Contrastive Learning for Short Text Clustering

Yao, Zhihao

Contrastive learning has gained significant attention in short text clustering, yet it has an inherent drawback of mistakenly identifying samples from the same category as negatives and then separating them in the feature space (false negative separation), which hinders the generation of superior representations. To generate more discriminative representations for efficient clustering, we propose a novel short text clustering method, called Discriminative Representation learning via \textbf{A}ttention-\textbf{E}nhanced \textbf{C}ontrastive \textbf{L}earning for Short Text Clustering (\textbf{AECL}). The \textbf{AECL} consists of two modules which are the pseudo-label generation module and the contrastive learning module. Both modules build a sample-level attention mechanism to capture similarity relationships between samples and aggregate cross-sample features to generate consistent representations. Then, the former module uses the more discriminative consistent representation to produce reliable supervision information for assist clustering, while the latter module explores similarity relationships and consistent representations optimize the construction of positive samples to perform similarity-guided contrastive learning, effectively addressing the false negative separation issue. Experimental results demonstrate that the proposed \textbf{AECL} outperforms state-of-the-art methods. If the paper is accepted, we will open-source the code.

dataset, learning, representation, (14 more...)

2501.03584

Country:

Asia > China > Heilongjiang Province > Harbin (0.04)
North America > Dominican Republic (0.04)
North America > Canada > Ontario > Toronto (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.88)

Cetingoz, Adil Rengim, Lehalle, Charles-Albert

Synthetic Data for Portfolios: A Throw of the Dice Will Never Abolish Chance

arXiv.org Machine LearningJan-16-2025

Simulation methods have always been instrumental in finance, and data-driven methods with minimal model specification, commonly referred to as generative models, have attracted increasing attention, especially after the success of deep learning in a broad range of fields. However, the adoption of these models in financial applications has not kept pace with the growing interest, probably due to the unique complexities and challenges of financial markets. This paper aims to contribute to a deeper understanding of the limitations of generative models, particularly in portfolio and risk management. To this end, we begin by presenting theoretical results on the importance of initial sample size, and point out the potential pitfalls of generating far more data than originally available. We then highlight the inseparable nature of model development and the desired use case by touching on a paradox: generic generative models inherently care less about what is important for constructing portfolios (in particular the long-short ones). Based on these findings, we propose a pipeline for the generation of multivariate returns that meets conventional evaluation standards on a large universe of US equities while being compliant with stylized facts observed in asset returns and turning around the pitfalls we previously identified. Moreover, we insist on the need for more delicate evaluation methods, and suggest, through an example of mean-reversion strategies, a method designed to identify poor models for a given application based on regurgitative training, i.e. retraining the model using the data it has itself generated, which is commonly referred to in statistics as identifiability.

asset return, generative model, market generator, (14 more...)

arXiv.org Machine Learning

2501.03993

Country:

North America > United States (0.34)
Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.45)

Castellana, Daniele, Bianchi, Filippo Maria

BN-Pool: a Bayesian Nonparametric Approach to Graph Pooling

We introduce BN-Pool, the first clustering-based pooling method for Graph Neural Networks (GNNs) that adaptively determines the number of supernodes in a coarsened graph. By leveraging a Bayesian non-parametric framework, BN-Pool employs a generative model capable of partitioning graph nodes into an unbounded number of clusters. During training, we learn the node-to-cluster assignments by combining the supervised loss of the downstream task with an unsupervised auxiliary term, which encourages the reconstruction of the original graph topology while penalizing unnecessary proliferation of clusters. This adaptive strategy allows BN-Pool to automatically discover an optimal coarsening level, offering enhanced flexibility and removing the need to specify sensitive pooling ratios. We show that BN-Pool achieves superior performance across diverse benchmarks.

artificial intelligence, graph, machine learning, (17 more...)

2501.09821

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine LearningJan-16-2025

Village-Net Clustering: A Rapid approach to Non-linear Unsupervised Clustering of High-Dimensional Data

Ballal, Aditya, Datta, Esha, DePaul, Gregory A., Carlsson, Erik, Chen-Izu, Ye, López, Javier E., Izu, Leighton T.

Clustering large high-dimensional datasets with diverse variable is essential for extracting high-level latent information from these datasets. Here, we developed an unsupervised clustering algorithm, we call "Village-Net". Village-Net is specifically designed to effectively cluster high-dimension data without priori knowledge on the number of existing clusters. The algorithm operates in two phases: first, utilizing K-Means clustering, it divides the dataset into distinct subsets we refer to as "villages". Next, a weighted network is created, with each node representing a village, capturing their proximity relationships. To achieve optimal clustering, we process this network using a community detection algorithm called Walk-likelihood Community Finder (WLCF), a community detection algorithm developed by one of our team members. A salient feature of Village-Net Clustering is its ability to autonomously determine an optimal number of clusters for further analysis based on inherent characteristics of the data. We present extensive benchmarking on extant real-world datasets with known ground-truth labels to showcase its competitive performance, particularly in terms of the normalized mutual information (NMI) score, when compared to other state-of-the-art methods. The algorithm is computationally efficient, boasting a time complexity of O(N*k*d), where N signifies the number of instances, k represents the number of villages and d represents the dimension of the dataset, which makes it well suited for effectively handling large-scale datasets.

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Machine Learning

2501.10471

Country: North America > United States > California (0.28)

Genre: Research Report (0.84)

Industry: Health & Medicine > Therapeutic Area (0.69)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Neural Information Processing SystemsJan-15-2025, 05:29:07 GMT

Fair Hierarchical Clustering

As machine learning has become more prevalent, researchers have begun to recognize the necessity of ensuring machine learning systems are fair. Recently, there has been an interest in defining a notion of fairness that mitigates over-representation in traditional clustering. In this paper we extend this notion to hierarchical clustering, where the goal is to recursively partition the data to optimize a specific objective. For various natural objectives, we obtain simple, efficient algorithms to find a provably good fair hierarchical clustering. Empirically, we show that our algorithms can find a fair hierarchical clustering, with only a negligible loss in the objective.

algorithm, fair hierarchical clustering, objective

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)