AITopics | co-clustering

Collaborating Authors

co-clustering

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learning A Structured Optimal Bipartite Graph for Co-Clustering

Neural Information Processing SystemsNov-20-2025, 23:43:51 GMT

Co-clustering methods have been widely applied to document clustering and gene expression analysis. These methods make use of the duality between features and samples such that the co-occurring structure of sample and feature clusters can be extracted. In graph based co-clustering methods, a bipartite graph is constructed to depict the relation between features and samples. Most existing co-clustering methods conduct clustering on the graph achieved from the original data matrix, which doesn't have explicit cluster structure, thus they require a post-processing step to obtain the clustering results. In this paper, we propose a novel co-clustering method to learn a bipartite graph with exactly k connected components, where k is the number of clusters. The new bipartite graph learned in our model approximates the original graph but maintains an explicit cluster structure, from which we can immediately get the clustering results without post-processing. Extensive empirical results are presented to verify the effectiveness and robustness of our model.

co-clustering, name change, structured optimal bipartite graph, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Scalable Robust Bayesian Co-Clustering with Compositional ELBOs

Vinod, Ashwin, Bajaj, Chandrajit

arXiv.org Machine LearningApr-8-2025

Co-clustering exploits the duality of instances and features to simultaneously uncover meaningful groups in both dimensions, often outperforming traditional clustering in high-dimensional or sparse data settings. Although recent deep learning approaches successfully integrate feature learning and cluster assignment, they remain susceptible to noise and can suffer from posterior collapse within standard autoencoders. In this paper, we present the first fully variational Co-clustering framework that directly learns row and column clusters in the latent space, leveraging a doubly reparameterized ELBO to improve gradient signal-to-noise separation. Our unsupervised model integrates a Variational Deep Embedding with a Gaussian Mixture Model (GMM) prior for both instances and features, providing a built-in clustering mechanism that naturally aligns latent modes with row and column clusters. Furthermore, our regularized end-to-end noise learning Compositional ELBO architecture jointly reconstructs the data while regularizing against noise through the KL divergence, thus gracefully handling corrupted or missing inputs in a single training pipeline. To counteract posterior collapse, we introduce a scale modification that increases the encoder's latent means only in the reconstruction pathway, preserving richer latent representations without inflating the KL term. Finally, a mutual information-based cross-loss ensures coherent co-clustering of rows and columns. Empirical results on diverse real-world datasets from multiple modalities, numerical, textual, and image-based, demonstrate that our method not only preserves the advantages of prior Co-clustering approaches but also exceeds them in accuracy and robustness, particularly in high-dimensional or noisy settings.

artificial intelligence, machine learning, row and column, (18 more...)

arXiv.org Machine Learning

2504.04079

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Wisconsin (0.04)
Europe > Germany > Bavaria > Regensburg (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)
Health & Medicine > Therapeutic Area > Neurology > Parkinson's Disease (0.94)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Add feedback

Co-clustering for Federated Recommender System

He, Xinrui, Liu, Shuo, Keung, Jackey, He, Jingrui

arXiv.org Artificial IntelligenceNov-3-2024

As data privacy and security attract increasing attention, Federated Recommender System (FRS) offers a solution that strikes a balance between providing high-quality recommendations and preserving user privacy. However, the presence of statistical heterogeneity in FRS, commonly observed due to personalized decision-making patterns, can pose challenges. To address this issue and maximize the benefit of collaborative filtering (CF) in FRS, it is intuitive to consider clustering clients (users) as well as items into different groups and learning group-specific models. Existing methods either resort to client clustering via user representations-risking privacy leakage, or employ classical clustering strategies on item embeddings or gradients, which we found are plagued by the curse of dimensionality. In this paper, we delve into the inefficiencies of the K-Means method in client grouping, attributing failures due to the high dimensionality as well as data sparsity occurring in FRS, and propose CoFedRec, a novel Co-clustering Federated Recommendation mechanism, to address clients heterogeneity and enhance the collaborative filtering within the federated framework. Specifically, the server initially formulates an item membership from the client-provided item networks. Subsequently, clients are grouped regarding a specific item category picked from the item membership during each communication round, resulting in an intelligently aggregated group model. Meanwhile, to comprehensively capture the global inter-relationships among items, we incorporate an additional supervised contrastive learning term based on the server-side generated item membership into the local training phase for each client. Extensive experiments on four datasets are provided, which verify the effectiveness of the proposed CoFedRec.

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3589334.3645626

2411.0169

Country:

Asia > Singapore > Central Region > Singapore (0.05)
North America > United States > New York > New York County > New York City (0.04)
Asia > China > Hong Kong (0.04)
(2 more...)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)

Add feedback

Reviews: Learning A Structured Optimal Bipartite Graph for Co-Clustering

Neural Information Processing SystemsOct-9-2024, 01:54:43 GMT

The authors propose a new method for co-clustering. The idea is to learn a bipartite graph with exactly k connected components. This way, the clusters can be directly inferred and no further preprocessing step (like executing k-means) is necessary. After introducing their approach the authors conduct experiments on a synthetic data set as well as on four benchmark data sets. I think that the proposed approach is interesting. However, there are some issues.

algorithm 1, structured optimal bipartite graph, synthetic data, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.53)

Add feedback

Scalable Co-Clustering for Large-Scale Data through Dynamic Partitioning and Hierarchical Merging

Wu, Zihan, Huang, Zhaoke, Yan, Hong

arXiv.org Artificial IntelligenceOct-9-2024

Co-clustering simultaneously clusters rows and columns, revealing more fine-grained groups. However, existing co-clustering methods suffer from poor scalability and cannot handle large-scale data. This paper presents a novel and scalable co-clustering method designed to uncover intricate patterns in high-dimensional, large-scale datasets. Specifically, we first propose a large matrix partitioning algorithm that partitions a large matrix into smaller submatrices, enabling parallel co-clustering. This method employs a probabilistic model to optimize the configuration of submatrices, balancing the computational efficiency and depth of analysis. Additionally, we propose a hierarchical co-cluster merging algorithm that efficiently identifies and merges co-clusters from these submatrices, enhancing the robustness and reliability of the process. Extensive evaluations validate the effectiveness and efficiency of our method. Experimental results demonstrate a significant reduction in computation time, with an approximate 83% decrease for dense matrices and up to 30% for sparse matrices.

artificial intelligence, machine learning, matrix, (16 more...)

arXiv.org Artificial Intelligence

2410.18113

Country:

Asia > China > Hong Kong (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > India > Telangana > Hyderabad (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.84)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Co-clustering based exploratory analysis of mixed-type data tables

Bouchareb, Aichetou, Boullé, Marc, Clérot, Fabrice, Rossi, Fabrice

arXiv.org Artificial IntelligenceDec-22-2022

Co-clustering is a class of unsupervised data analysis techniques that extract the existing underlying dependency structure between the instances and variables of a data table as homogeneous blocks. Most of those techniques are limited to variables of the same type. In this paper, we propose a mixed data co-clustering method based on a two-step methodology. In the first step, all the variables are binarized according to a number of bins chosen by the analyst, by equal frequency discretization in the numerical case, or keeping the most frequent values in the categorical case. The second step applies a co-clustering to the instances and the binary variables, leading to groups of instances and groups of variable parts. We apply this methodology on several data sets and compare with the results of a Multiple Correspondence Analysis applied to the same data.

artificial intelligence, machine learning, variable part, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-030-18129-1_2

2212.11728

Country:

Europe > France > Île-de-France > Paris > Paris (0.14)
Europe > Germany (0.04)
Asia > Philippines (0.04)
(8 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Elastic Coupled Co-clustering for Single-Cell Genomic Data

Zeng, Pengcheng, Lin, Zhixiang

arXiv.org Machine LearningMar-29-2020

The recent advances in single-cell technologies have enabled us to profile genomic features at unprecedented resolution and data sets from multiple domains are available, including data sets that profile different types of genomic features and data sets that profile the same type of genomic features across different species. These data sets typically have different powers in identifying the unknown cell types through clustering, and data integration can potentially lead to a better performance of clustering algorithms. In this work, we formulate the problem in an unsupervised transfer learning framework, which utilizes knowledge learned from auxiliary data set to improve the clustering performance of target data set. The degree of shared information among the target and auxiliary data sets can vary, and their distributions can also be different. To address these challenges, we propose an elastic coupled co-clustering based transfer learning algorithm, by elastically propagating clustering knowledge obtained from the auxiliary data set to the target data set. Implementation on single-cell genomic data sets shows that our algorithm greatly improves clustering performance over the traditional learning algorithms. The source code and data sets are available at https://github.com/cuhklinlab/elasticC3.

algorithm, auxiliary data, target data, (15 more...)

arXiv.org Machine Learning

2003.1297

Country:

Asia > China > Hong Kong (0.05)
North America > United States > New Jersey > Bergen County > Englewood (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Learning A Structured Optimal Bipartite Graph for Co-Clustering

Nie, Feiping, Wang, Xiaoqian, Deng, Cheng, Huang, Heng

Neural Information Processing SystemsFeb-14-2020, 14:43:29 GMT

co-clustering, explicit cluster structure, structured optimal bipartite graph, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Robust Continuous Co-Clustering

He, Xiao, Moreira-Matias, Luis

arXiv.org Machine LearningFeb-14-2018

Clustering consists of grouping together samples giving their similar properties. The problem of modeling simultaneously groups of samples and features is known as Co-Clustering. This paper introduces ROCCO - a Robust Continuous Co-Clustering algorithm. ROCCO is a scalable, hyperparameter-free, easy and ready to use algorithm to address Co-Clustering problems in practice over massive cross-domain datasets. It operates by learning a graph-based two-sided representation of the input matrix. The underlying proposed optimization problem is non-convex, which assures a flexible pool of solutions. Moreover, we prove that it can be solved with a near linear time complexity on the input size. An exhaustive large-scale experimental testbed conducted with both synthetic and real-world datasets demonstrates ROCCO's properties in practice: (i) State-of-the-art performance in cross-domain real-world problems including Biomedicine and Text Mining; (ii) very low sensitivity to hyperparameter settings; (iii) robustness to noise and (iv) a linear empirical scalability in practice. These results highlight ROCCO as a powerful general-purpose co-clustering algorithm for cross-domain practitioners, regardless of their technical background.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Machine Learning

1802.05036

Country:

Asia (0.04)
Europe > Sweden > Östergötland County > Linköping (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Co-Clustering Can Provide Industrial Data Pattern Discovery

#artificialintelligenceSep-14-2017, 17:05:28 GMT

In spite of the rapid development in data acquisition technology resulting in the explosive collection of acquired datasets, techniques such as data organization and classification, manipulation, and analysis of very large, diverse, heterogeneous datasets have only evolved modestly. This has led to hindrances in effective utility and better understanding of the acquired, large-scale data for knowledge discovery. In an industrial setting, an interesting visual from McKinsey illustrates that despite collecting data from tens of thousands of sensors, less than 1% is actually utilized. Data clustering is the classification of data objects into different groups (clusters) such that data objects in one group are similar together and dissimilar from another group. Typically, homogeneous data objects, i.e. data objects having the same data type, are grouped together using some of the well-known clustering algorithms.

artificial intelligence, data mining, machine learning, (14 more...)

#artificialintelligence

Industry: Media > Film (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.80)

Add feedback