AITopics

2312.10932

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > India (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.75)

arXiv.org Artificial Intelligence Dec-17-2023

Semi-Supervised Clustering via Structural Entropy with Different Constraints

Zeng, Guangjie, Peng, Hao, Li, Angsheng, Liu, Zhiwei, Yang, Runze, Liu, Chunyang, He, Lifang

Semi-supervised clustering techniques have emerged as valuable tools for leveraging prior information in the form of constraints to improve the quality of clustering outcomes. Despite the proliferation of such methods, the ability to seamlessly integrate various types of constraints remains limited. While structural entropy has proven to be a powerful clustering approach with wide-ranging applications, it has lacked a variant capable of accommodating these constraints. In this work, we present Semi-supervised clustering via Structural Entropy (SSE), a novel method that can incorporate different types of constraints from diverse sources to perform both partitioning and hierarchical clustering. Specifically, we formulate a uniform view for the commonly used pairwise and label constraints for both types of clustering. Then, we design objectives that incorporate these constraints into structural entropy and develop tailored algorithms for their optimization. We evaluate SSE on nine clustering datasets and compare it with eleven semi-supervised partitioning and hierarchical clustering methods. Experimental results demonstrate the superiority of SSE on clustering accuracy with different types of constraints. Additionally, the functionality of SSE for biological data analysis is demonstrated by cell clustering experiments conducted on four single-cell RNAseq datasets.
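As a rough illustration of the quantity the abstract builds on, the sketch below computes the two-dimensional structural entropy of a fixed graph partition, following the commonly cited definition from the structural information literature. It does not reproduce SSE's constraint-aware objectives or its optimization algorithms; the function name and the edge-list input format are assumptions made for this example.

```python
import math
from collections import defaultdict


def two_dim_structural_entropy(edges, partition):
    """Two-dimensional structural entropy of a partition (illustrative sketch).

    edges:     list of (u, v, weight) tuples for an undirected graph
    partition: dict mapping each node to a cluster id
    """
    degree = defaultdict(float)   # weighted degree of each node
    volume = defaultdict(float)   # sum of degrees inside each cluster
    cut = defaultdict(float)      # weight of edges leaving each cluster

    for u, v, w in edges:
        degree[u] += w
        degree[v] += w
        if partition[u] != partition[v]:
            cut[partition[u]] += w
            cut[partition[v]] += w

    total = sum(degree.values())  # equals twice the total edge weight
    for node, d in degree.items():
        volume[partition[node]] += d

    entropy = 0.0
    # Node-level term: uncertainty of a node within its own cluster.
    for node, d in degree.items():
        entropy -= (d / total) * math.log2(d / volume[partition[node]])
    # Cluster-level term: uncertainty of entering a cluster from outside.
    for cluster, vol in volume.items():
        entropy -= (cut[cluster] / total) * math.log2(vol / total)
    return entropy


cycle = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)]
one_cluster = two_dim_structural_entropy(cycle, {0: 0, 1: 0, 2: 0, 3: 0})
two_clusters = two_dim_structural_entropy(cycle, {0: 0, 1: 0, 2: 1, 3: 1})
```

On this 4-cycle, placing all nodes in one cluster gives 2.0 bits, while splitting into two adjacent pairs gives 1.5 bits; a lower value indicates a partition that encodes more of the graph's structure.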

constraint, dataset, sse, (13 more...)

2312.10917

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Li, Chengzhang, Peng, Zhenkang, Rong, Ying

Mostly Beneficial Clustering: Aggregating Data for Operational Decision Making

arXiv.org Artificial Intelligence Dec-17-2023

With increasingly volatile market conditions and rapid product innovations, operational decision-making for large-scale systems entails solving thousands of problems with limited data. Data aggregation is proposed to combine the data across problems to improve the decisions obtained by solving those problems individually. We propose a novel cluster-based Shrunken-SAA approach that can exploit the cluster structure among problems when implementing the data aggregation approaches. We prove that, as the number of problems grows, leveraging the given cluster structure among problems yields additional benefits over the data aggregation approaches that neglect such structure. When the cluster structure is unknown, we show that unveiling the cluster structure, even at the cost of a few data points, can be beneficial, especially when the distance between clusters of problems is substantial. Our proposed approach can be extended to general cost functions under mild conditions. When the number of problems gets large, the optimality gap of our proposed approach decreases exponentially in the distance between the clusters. We explore the performance of the proposed approach through the application of managing newsvendor systems via numerical experiments. We investigate the impacts of distance metrics between problem instances on the performance of the cluster-based Shrunken-SAA approach with synthetic data. We further validate our proposed approach with real data and highlight the advantages of cluster-based data aggregation, especially in the small-data large-scale regime, compared to the existing approaches.

beneficial clustering, cluster structure, cluster-based data, (16 more...)

2311.17326

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > California > Alameda County > Oakland (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.45)

Industry:

Banking & Finance (0.67)
Health & Medicine (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Robots (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Ramalingam, Srikumar, Awasthi, Pranjal, Kumar, Sanjiv

A Weighted K-Center Algorithm for Data Subset Selection

arXiv.org Artificial Intelligence Dec-16-2023

This makes us ponder the key factors behind this revolution: is it the availability of large datasets, the actual learning algorithms, or both? ML models rely on very deep networks and enormous labeled datasets, requiring exorbitant computational and human labeling efforts. For example, the data annotation market crossed one billion US dollars in 2020, and it is estimated to hit seven billion in 2027. Human annotation of semantic segmentation labels takes about 45-60 minutes [BGC10] for a single image. In most vision and NLP applications, unlabeled data is unlimited and is usually available at no cost. To directly reduce human annotation costs, this paper focuses on identifying smaller subsets of training data that can lead to accurate models with marginal or no loss in performance compared to the ones trained on the full dataset. Among the several approaches for subset selection, two are shown to achieve impressive results: (1) the classical margin sampling algorithm, which selects points based on the uncertainty in the class prediction scores [RS06b], and (2) the k-center clustering algorithm [SS18], which is based on core sets. One may wonder about the natural extension of these two powerful algorithms: is there a principled method that jointly uses both these measures for computing more informative subsets?
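To make the two ingredients named above concrete, here is a minimal sketch that combines a margin-based uncertainty score with greedy farthest-first k-center selection. The multiplicative combination (uncertainty times distance to the nearest selected point) is an assumption chosen for illustration and is not claimed to be the paper's weighted k-center algorithm; the function names are hypothetical.

```python
import math


def margin_uncertainty(probs):
    """Return 1 - (top1 - top2) for each row of class probabilities.

    A small margin between the two most likely classes means the model is
    uncertain, so the returned score is larger for uncertain points.
    """
    scores = []
    for row in probs:
        top = sorted(row, reverse=True)
        scores.append(1.0 - (top[0] - top[1]))
    return scores


def weighted_greedy_k_center(points, weights, k, first=0):
    """Greedy farthest-first selection with per-point weights (sketch).

    points:  list of coordinate tuples
    weights: non-negative weight per point (e.g. margin uncertainty)
    k:       number of points to select
    first:   index of the initial center (assumed given)
    """
    k = min(k, len(points))
    selected = [first]
    # Distance from every point to its nearest selected center so far.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(selected) < k:
        best, best_score = None, -1.0
        for i, (w, d) in enumerate(zip(weights, min_dist)):
            if i in selected:
                continue
            if w * d > best_score:
                best, best_score = i, w * d
        selected.append(best)
        min_dist = [min(d, math.dist(p, points[best]))
                    for p, d in zip(points, min_dist)]
    return selected
```

With uniform weights this reduces to the classical greedy 2-approximation for k-center; non-uniform weights bias the selection toward points the classifier is unsure about.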

active learning, algorithm, subset, (14 more...)

2312.10602

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.88)

arXiv.org Artificial Intelligence Dec-16-2023

Exploiting Label Skews in Federated Learning with Model Concatenation

Diao, Yiqun, Li, Qinbin, He, Bingsheng

Federated Learning (FL) has emerged as a promising solution to perform deep learning on different data owners without exchanging raw data. However, non-IID data has been a key challenge in FL, which could significantly degrade the accuracy of the final model. Among different non-IID types, label skews have been challenging and common in image classification and other tasks. Instead of averaging the local models in most previous studies, we propose FedConcat, a simple and effective approach that concatenates these local models as the base of the global model to effectively aggregate the local knowledge. To reduce the size of the global model, we adopt the clustering technique to group the clients by their label distributions and collaboratively train a model inside each cluster. We theoretically analyze the advantage of concatenation over averaging by analyzing the information bottleneck of deep neural networks. Experimental results demonstrate that FedConcat achieves significantly higher accuracy than previous state-of-the-art FL methods in various heterogeneous label skew distribution settings and meanwhile has lower communication costs. Our code is publicly available at https://github.com/sjtudyq/FedConcat.

encoder round, fedconcat-id, label distribution, (13 more...)

2312.0629

Country:

Asia > Singapore (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report > New Finding (0.87)

Industry:

Information Technology > Security & Privacy (1.00)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

arXiv.org Artificial Intelligence Dec-16-2023

Towards Generalized Multi-stage Clustering: Multi-view Self-distillation

Wang, Jiatai, Xu, Zhiwei, Wang, Xin, Li, Tao

Existing multi-stage clustering methods independently learn the salient features from multiple views and then perform the clustering task. In particular, multi-view clustering (MVC) has attracted a lot of attention in multi-view or multi-modal scenarios. MVC aims at exploring common semantics and pseudo-labels from multiple views and clustering in a self-supervised manner. However, limited by noisy data and inadequate feature learning, such a clustering paradigm generates overconfident pseudo-labels that misguide the model to produce inaccurate predictions. Therefore, it is desirable to have a method that can correct these misleading pseudo-labels in multi-stage clustering to avoid bias accumulation. To alleviate the effect of overconfident pseudo-labels and improve the generalization ability of the model, this paper proposes a novel multi-stage deep MVC framework where multi-view self-distillation (DistilMVC) is introduced to distill dark knowledge of label distribution. Specifically, in the feature subspace at different hierarchies, we explore the common semantics of multiple views through contrastive learning and obtain pseudo-labels by maximizing the mutual information between views. Additionally, a teacher network is responsible for distilling pseudo-labels into dark knowledge, supervising the student network and improving its predictive capabilities to enhance robustness. Extensive experiments on real-world multi-view datasets show that our method has better clustering performance than state-of-the-art methods.

distillation, information, proceedings, (14 more...)

2310.1889

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)
Asia > China > Beijing > Beijing (0.04)
(6 more...)

Genre: Research Report > Promising Solution (0.65)

Industry: Education (0.92)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.65)

Pixel-Superpixel Contrastive Learning and Pseudo-Label Correction for Hyperspectral Image Clustering

Guan, Renxiang, Li, Zihao, Li, Xianju, Tang, Chang

Hyperspectral image (HSI) clustering is gaining considerable attention owing to recent methods that overcome the inefficiency and misleading results from the absence of supervised information. Contrastive learning methods excel at existing pixel-level and superpixel-level HSI clustering tasks. The pixel-level contrastive learning method can effectively improve the ability of the model to capture fine features of HSI but requires a large time overhead. The superpixel-level contrastive learning method utilizes the homogeneity of HSI and reduces computing resources; however, it yields rough classification results. To exploit the strengths of both methods, we present a pixel-superpixel contrastive learning and pseudo-label correction (PSCPC) method for HSI clustering. PSCPC can reasonably capture domain-specific and fine-grained features through superpixels and the comparative learning of a small number of pixels within the superpixels. To improve the clustering performance of superpixels, this paper proposes a pseudo-label correction module that aligns the clustering pseudo-labels of pixels and superpixels. In addition, pixel-level clustering results are used to supervise superpixel-level clustering, improving the generalization ability of the model. Extensive experiments demonstrate the effectiveness and efficiency of PSCPC.

algorithm, dataset, superpixel, (14 more...)

2312.0963

Country: Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.30)

Unsupervised Social Event Detection via Hybrid Graph Contrastive Learning and Reinforced Incremental Clustering

Guo, Yuanyuan, Zang, Zehua, Gao, Hang, Xu, Xiao, Wang, Rui, Liu, Lixiang, Li, Jiangmeng

Detecting events from social media data streams is attracting growing research interest. The innate challenge in detecting events is to extract discriminative information from social media data, thereby assigning the data to different events. Due to the excessive diversity and high updating frequency of social data, using supervised approaches to detect events from social messages is hardly feasible. To this end, recent works explore learning discriminative information from social messages by leveraging graph contrastive learning (GCL) and embedding clustering in an unsupervised manner. However, two intrinsic issues exist in benchmark methods: conventional GCL can only roughly explore partial attributes, thereby insufficiently learning the discriminative information of social messages; and the learned embeddings are clustered in the latent space by taking advantage of certain specific prior knowledge, which conflicts with the principle of the unsupervised learning paradigm. In this paper, we propose a novel unsupervised social media event detection method via hybrid graph contrastive learning and reinforced incremental clustering (HCRC), which uses hybrid graph contrastive learning to comprehensively learn semantic and structural discriminative information from social messages and reinforced incremental clustering to perform efficient clustering in a solidly unsupervised manner. We conduct comprehensive experiments to evaluate HCRC on the Twitter and Maven datasets. The experimental results demonstrate that our approach yields consistent and significant performance boosts. In the traditional incremental, semi-supervised incremental, and solidly unsupervised settings, the model achieves maximum improvements of 53%, 45%, and 37%, respectively.

hcrc, information, social message, (14 more...)

doi: 10.1016/j.knosys.2023.111225

2312.08374

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Qatar (0.04)
(21 more...)

Genre: Research Report > New Finding (0.48)

Industry: Leisure & Entertainment > Social Events (0.41)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)

Jaffe, Sean, Singh, Ambuj K., Bullo, Francesco

IDKM: Memory Efficient Neural Network Quantization via Implicit, Differentiable k-Means

Compressing large neural networks with minimal performance loss is crucial to enabling their deployment on edge devices. Cho et al. (2022) proposed a weight quantization method that uses an attention-based clustering algorithm called differentiable $k$-means (DKM). Despite achieving state-of-the-art results, DKM's performance is constrained by its heavy memory dependency. We propose an implicit, differentiable $k$-means algorithm (IDKM), which eliminates the major memory restriction of DKM. Let $t$ be the number of $k$-means iterations, $m$ be the number of weight-vectors, and $b$ be the number of bits per cluster address. IDKM reduces the overall memory complexity of a single $k$-means layer from $\mathcal{O}(t \cdot m \cdot 2^b)$ to $\mathcal{O}(m \cdot 2^b)$. We also introduce a variant, IDKM with Jacobian-Free-Backpropagation (IDKM-JFB), for which the time complexity of the gradient calculation is independent of $t$ as well. We provide a proof of concept of our methods by showing that, under the same settings, IDKM achieves comparable performance to DKM with less compute time and less memory. We also use IDKM and IDKM-JFB to quantize a large neural network, ResNet18, on hardware where DKM cannot train at all.
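The memory argument is easier to see with the iteration written out. The sketch below runs a soft (attention-style) $k$-means update on scalar weights until it reaches a fixed point. It assumes a softmax over negative distances with a temperature, which is one common way to make $k$-means differentiable; the exact DKM and IDKM formulations, and all names here, are assumptions for illustration. Backpropagating through an unrolled loop would require keeping the soft assignments of every one of the $t$ iterations, whereas a method that differentiates only at the fixed point needs the assignments of a single step.

```python
import math


def soft_kmeans_step(weights, centers, temperature=0.05):
    """One soft k-means update: attention-weighted mean of the weights."""
    numer = [0.0] * len(centers)
    denom = [0.0] * len(centers)
    for w in weights:
        # Softmax over negative distances (shifted by the max for stability).
        logits = [-abs(w - c) / temperature for c in centers]
        peak = max(logits)
        exps = [math.exp(l - peak) for l in logits]
        norm = sum(exps)
        for j, e in enumerate(exps):
            a = e / norm          # soft assignment of w to center j
            numer[j] += a * w
            denom[j] += a
    # Keep a center unchanged if it received no assignment mass.
    return [n / d if d > 0 else c for n, d, c in zip(numer, denom, centers)]


def soft_kmeans(weights, centers, temperature=0.05, max_iter=100, tol=1e-8):
    """Iterate the update until the centers stop moving (a fixed point)."""
    for _ in range(max_iter):
        updated = soft_kmeans_step(weights, centers, temperature)
        if max(abs(a - b) for a, b in zip(updated, centers)) < tol:
            return updated
        centers = updated
    return centers


centers = soft_kmeans([0.0, 0.1, 1.0, 1.1], [0.2, 0.8])
```

In a quantization setting, each weight would then be replaced by its nearest (or attention-weighted) center, so that only $2^b$ distinct values need to be stored.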

gradient, iteration, neural network, (12 more...)

2312.07759

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.35)

Single-Cell Deep Clustering Method Assisted by Exogenous Gene Information: A Novel Approach to Identifying Cell Types

Hu, Dayu, Liang, Ke, Yu, Hao, Liu, Xinwang

In recent years, the field of single-cell data analysis has seen a marked advancement in the development of clustering methods. Despite advancements, most of these algorithms still concentrate on analyzing the provided single-cell matrix data. However, in medical applications, single-cell data often involves a wealth of exogenous information, including gene networks. Overlooking this aspect could lead to information loss and clustering results devoid of significant clinical relevance. An innovative single-cell deep clustering method, incorporating exogenous gene information, has been proposed to overcome this limitation. This model leverages exogenous gene network information to facilitate the clustering process, generating discriminative representations. Specifically, we have developed an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells. Concurrently, we conducted a random walk on an exogenous Protein-Protein Interaction (PPI) network, thereby acquiring the gene's topological features. Ultimately, during the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation. Extensive experiments have validated the effectiveness of our proposed method. This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.

graph, information, matrix, (16 more...)

2311.17104

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Pennsylvania (0.04)
Asia > Mongolia (0.04)
(3 more...)

Genre:

Research Report > Promising Solution (0.64)
Overview > Innovation (0.50)
Research Report > New Finding (0.47)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)