AITopics

2405.1921

Country:

Europe > Switzerland > Vaud > Lausanne (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)

Genre: Research Report > Promising Solution (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Data Science > Data Quality > Data Cleaning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Sadeghi, Mohammadreza, Wang, Zihan, Armanfard, Narges

Forward-Backward Knowledge Distillation for Continual Clustering

arXiv.org Artificial Intelligence May-29-2024

Unsupervised Continual Learning (UCL) is a burgeoning field in machine learning, focusing on enabling neural networks to sequentially learn tasks without explicit label information. Catastrophic Forgetting (CF), where models forget previously learned tasks upon learning new ones, poses a significant challenge in continual learning, especially in UCL, where labeled information of data is not accessible. CF mitigation strategies, such as knowledge distillation and replay buffers, often face memory inefficiency and privacy issues. Although current research in UCL has endeavored to refine data representations and address CF in streaming data contexts, there is a noticeable lack of algorithms specifically designed for unsupervised clustering. To fill this gap, in this paper, we introduce the concept of Unsupervised Continual Clustering (UCC). We propose Forward-Backward Knowledge Distillation for unsupervised Continual Clustering (FBCC) to counteract CF within the context of UCC. FBCC employs a single continual learner (the "teacher") with a cluster projector, along with multiple student models, to address the CF issue. The proposed method consists of two phases: Forward Knowledge Distillation, where the teacher learns new clusters while retaining knowledge from previous tasks with guidance from specialized student models, and Backward Knowledge Distillation, where a student model mimics the teacher's behavior to retain task-specific knowledge, aiding the teacher in subsequent tasks. FBCC marks a pioneering approach to UCC, demonstrating enhanced performance and memory efficiency in clustering across various tasks, outperforming the application of clustering algorithms to the latent space of state-of-the-art UCL algorithms.
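The abstract above does not state which loss FBCC uses, so the following is only a generic illustration of what "a student model mimics the teacher's behavior" can look like in code. It assumes, hypothetically, that teacher and student each emit cluster logits for the same sample, and it uses the common temperature-softened KL divergence from standard knowledge distillation. The names `softmax`, `distillation_loss`, and `temperature` are illustrative and do not come from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw cluster logits into a probability distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over softened cluster assignments.

    Assumption: a generic distillation objective, not the loss
    defined in the FBCC paper.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A student that matches the teacher incurs zero loss;
# a student that disagrees incurs a positive loss.
matching = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
disagreeing = distillation_loss([2.0, 0.5, -1.0], [-1.0, 0.5, 2.0])
```

In a forward-backward scheme, the same kind of objective could plausibly be applied in either direction (students guiding the teacher, or a student copying the teacher), but how FBCC actually does this is specified only in the paper itself.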

continual learning, learning, representation, (13 more...)

2405.19234

Country:

North America > United States (0.04)
North America > Canada > Quebec > Montreal (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report > Promising Solution (1.00)

Industry:

Education > Educational Technology > Educational Software (0.75)
Information Technology > Security & Privacy (0.68)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Seif, Mohamed, Chen, Yanxi

Clustering Mixtures of Discrete Distributions: A Note on Mitra's Algorithm

arXiv.org Machine Learning May-29-2024

Clustering is a critical challenge in network science, pivotal for detecting underlying patterns and structures in unlabeled data. To explore the boundaries of this challenge, stochastic block models (SBMs) have been effectively utilized as a mathematical framework to assess the performance of clustering algorithms. Specifically, an SBM is a statistical model developed to reveal the structural dynamics of networks or graphs, where nodes represent individual entities and edges symbolize the connections between them. In a typical SBM, nodes are categorized into blocks or communities according to their connectivity patterns, with the probability of an edge existing between any two nodes depending on the blocks to which they belong [3]. For example, in a social network using an SBM, nodes might be organized by attributes such as age, gender, or geographic location, with friendship probabilities determined by their block memberships [1, 6]. The Bipartite Stochastic Block Model (B-SBM) [2] extends the conventional SBM to accommodate networks comprising two distinct node types, forming a bipartite graph structure. This adaptation is particularly beneficial in contexts such as recommendation systems, where nodes represent users and products, or in particular social networks, where nodes might denote individuals and the groups or events they participate in. In B-SBMs, the connections between nodes from different sets are governed by an "affinity matrix" that specifies the likelihood of linkage based on group affiliations. This matrix is integral to capturing interaction patterns within the network, allowing for a sophisticated estimation of model parameters from observed connections.
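The generative process described above can be sketched in a few lines. This is a minimal illustration, assuming each cross-set edge is drawn independently with the probability that the affinity matrix assigns to the two nodes' blocks. The function name `sample_bipartite_sbm` and the user/item framing are hypothetical, and the sketch covers only sampling, not the clustering algorithm analysed in the paper.

```python
import random

def sample_bipartite_sbm(user_blocks, item_blocks, affinity, seed=0):
    """Sample the edges of a bipartite stochastic block model.

    user_blocks[u] and item_blocks[i] give the block index of each node;
    affinity[a][b] is the probability of an edge between a user in
    block a and an item in block b. Edges are drawn independently.
    """
    rng = random.Random(seed)
    edges = []
    for u, user_block in enumerate(user_blocks):
        for i, item_block in enumerate(item_blocks):
            if rng.random() < affinity[user_block][item_block]:
                edges.append((u, i))
    return edges

# Two user communities and two item communities: users mostly
# connect to items in the community with the same index.
user_blocks = [0, 0, 1, 1]
item_blocks = [0, 1, 1]
affinity = [[0.9, 0.1],
            [0.1, 0.9]]
edges = sample_bipartite_sbm(user_blocks, item_blocks, affinity)
```

Clustering in this setting amounts to recovering `user_blocks` (and possibly `item_blocks`) from `edges` alone, which becomes harder as the rows of the affinity matrix grow more similar.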

algorithm, high probability, probability, (15 more...)

arXiv.org Machine Learning

2405.19559

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.49)

Dasula, Akshat Mohan, Tigulla, Hrushitha, Bhukya, Preethika

Judgement Citation Retrieval using Contextual Similarity

Traditionally in the domain of legal research, the retrieval of pertinent citations from intricate case descriptions has demanded manual effort and keyword-based search applications that mandate expertise in understanding legal jargon. Legal case descriptions hold pivotal information for legal professionals and researchers, necessitating more efficient and automated approaches. We propose a methodology that combines natural language processing (NLP) and machine learning techniques to enhance the organization and utilization of legal case descriptions. This approach revolves around the creation of textual embeddings with the help of state-of-the-art embedding models. Our methodology addresses two primary objectives: unsupervised clustering and supervised citation retrieval, both designed to automate the citation extraction process. Although the proposed methodology can be used for any dataset, we employed the Supreme Court of the United States (SCOTUS) dataset, yielding remarkable results. Our methodology achieved an impressive accuracy rate of 90.9%. By automating labor-intensive processes, we pave the way for a more efficient, time-saving, and accessible landscape in legal research, benefiting legal professionals, academics, and researchers.

application, case description, vector, (12 more...)

2406.01609

Country:

North America > United States (0.89)
Asia > India > Telangana (0.05)

Genre: Research Report (0.64)

Industry:

Law > Government & the Courts (0.89)
Government > Regional Government > North America Government > United States Government (0.55)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.93)

Adapting Differential Molecular Representation with Hierarchical Prompts for Multi-label Property Prediction

Kang, Linjia, Zhou, Songhua, Fang, Shuyan, Liu, Shichao, Zhang, Wen

Accurate prediction of molecular properties is critical in the field of drug discovery. However, existing methods do not fully consider the fact that molecules in the real world usually possess multiple property labels, and complex high-order relationships may exist among these labels. Therefore, molecular representation learning models should generate differential molecular representations that consider multi-granularity correlation information among tasks. To this end, our research introduces a Hierarchical Prompted Molecular Representation Learning Framework (HiPM), which enhances the differential expression of tasks in molecular representations through task-aware prompts, and utilizes shared information among labels to mitigate negative transfer between different tasks. HiPM primarily consists of two core components: the Molecular Representation Encoder (MRE) and the Task-Aware Prompter (TAP). The MRE employs a hierarchical message-passing network architecture to capture molecular features at both the atomic and motif levels, while the TAP uses agglomerative hierarchical clustering to build a prompt tree that reflects the affinity and distinctiveness of tasks, enabling the model to effectively handle the complexity of multi-label property predictions. Extensive experiments demonstrate that HiPM achieves state-of-the-art performance across various multi-label datasets, offering a new perspective on multi-label molecular representation learning.

information, molecular representation, prediction, (14 more...)

2405.18724

Country: Asia > China > Hubei Province (0.04)

Genre: Research Report > Promising Solution (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Communications (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.35)

Horwitz, Eliahu, Shul, Asaf, Hoshen, Yedid

On the Origin of Llamas: Model Tree Heritage Recovery

The rapid growth of neural network models shared on the internet has made model weights an important data modality. However, this information is underutilized as the weights are uninterpretable, and publicly available models are disorganized. Inspired by Darwin's tree of life, we define the Model Tree, which describes the origin of models, i.e., the parent model that was used to fine-tune the target model. Similarly to the natural world, the tree structure is unknown. In this paper, we introduce the task of Model Tree Heritage Recovery (MoTHer Recovery) for discovering Model Trees in the ever-growing universe of neural networks. Our hypothesis is that model weights encode this information; the challenge is to decode the underlying tree structure given the weights. Beyond the immediate application of model authorship attribution, MoTHer recovery holds exciting long-term applications akin to indexing the internet by search engines. Practically, for each pair of models, this task requires: i) determining if they are related, and ii) establishing the direction of the relationship. We find that certain distributional properties of the weights evolve monotonically during training, which enables us to classify the relationship between two given models. MoTHer recovery reconstructs entire model hierarchies, represented by a directed tree, where a parent model gives rise to multiple child models through additional training. Our approach successfully reconstructs complex Model Trees, as well as the structure of "in-the-wild" model families such as Llama 2 and Stable Diffusion.

arxiv preprint arxiv, model graph, model tree, (12 more...)

2405.18432

Country:

Oceania > Australia (0.04)
North America > United States > New York (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Guo, Xiaobo, Desai, Jay, Sengamedu, Srinivasan H.

JADS: A Framework for Self-supervised Joint Aspect Discovery and Summarization

To generate summaries that include multiple aspects or topics for text documents, most approaches use clustering or topic modeling to group relevant sentences and then generate a summary for each group. These approaches struggle to optimize the summarization and clustering algorithms jointly. On the other hand, aspect-based summarization requires known aspects. Our solution integrates topic discovery and summarization into a single step. Given text data, our Joint Aspect Discovery and Summarization algorithm (JADS) discovers aspects from the input and generates a summary of the topics, in one step. We propose a self-supervised framework that creates a labeled dataset by first mixing sentences from multiple documents (e.g., CNN/DailyMail articles) as the input and then uses the article summaries from the mixture as the labels. The JADS model outperforms the two-step baselines. With pretraining, the model achieves better performance and stability. Furthermore, embeddings derived from JADS exhibit superior clustering capabilities. Our proposed method achieves higher semantic alignment with ground truth and is factual.

dataset, summarization, summary number, (14 more...)

2405.18642

Country:

Asia > Middle East > Iraq (0.14)
Europe > United Kingdom > England > Tyne and Wear > Sunderland (0.04)
North America > United States > New York (0.04)
(23 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment > Sports > Soccer (1.00)
Government > Military (1.00)
Leisure & Entertainment > Sports > Football (0.93)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

SCE-MAE: Selective Correspondence Enhancement with Masked Autoencoder for Self-Supervised Landmark Estimation

Yin, Kejia, Rao, Varshanth R., Jiang, Ruowei, Liu, Xudong, Aarabi, Parham, Lindell, David B.

Self-supervised landmark estimation is a challenging task that demands the formation of locally distinct feature representations to identify sparse facial landmarks in the absence of annotated data. To tackle this task, existing state-of-the-art (SOTA) methods (1) extract coarse features from backbones that are trained with instance-level self-supervised learning (SSL) paradigms, which neglect the dense prediction nature of the task, (2) aggregate them into memory-intensive hypercolumn formations, and (3) supervise lightweight projector networks to naively establish full local correspondences among all pairs of spatial features. In this paper, we introduce SCE-MAE, a framework that (1) leverages the MAE, a region-level SSL method that naturally better suits the landmark prediction task, (2) operates on the vanilla feature map instead of on expensive hypercolumns, and (3) employs a Correspondence Approximation and Refinement Block (CARB) that utilizes a simple density peak clustering algorithm and our proposed Locality-Constrained Repellence Loss to directly hone only select local correspondences. We demonstrate through extensive experiments that SCE-MAE is highly effective and robust, outperforming existing SOTA methods by large margins of approximately 20%-44% on the landmark matching and approximately 9%-15% on the landmark detection tasks.

correspondence, landmark, representation, (16 more...)

2405.18322

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Switzerland > Basel-City > Basel (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.83)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

Odyurt, Uraz, Dobreva, Nadezhda, Wolffs, Zef, Zhao, Yue, Sánchez, Antonio Ferrer, Bazan, Roberto Ruiz de Austri, Martín-Guerrero, José D., Varbanescu, Ana-Lucia, Caron, Sascha

Novel Approaches for ML-Assisted Particle Track Reconstruction and Hit Clustering

arXiv.org Artificial Intelligence May-27-2024

Track reconstruction is a vital aspect of High-Energy Physics (HEP) and plays a critical role in major experiments. In this study, we delve into unexplored avenues for particle track reconstruction and hit clustering. Firstly, we enhance the algorithmic design effort by utilising a simplified simulator (REDVID) to generate training data that is specifically composed for simplicity. We demonstrate the effectiveness of this data in guiding the development of optimal network architectures. Additionally, we investigate the application of image segmentation networks for this task, exploring their potential for accurate track reconstruction. Moreover, we approach the task from a different perspective by treating it as a hit sequence to track sequence translation problem. Specifically, we explore the utilisation of Transformer architectures for tracking purposes. Our preliminary findings are covered in detail. By considering this novel approach, we aim to uncover new insights and potential advancements in track reconstruction. This research sheds light on previously unexplored methods and provides valuable insights for the field of particle track reconstruction and hit clustering in HEP.

architecture, model design, track parameter, (12 more...)

2405.17325

Country:

Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
North America > Cuba > Artemisa Province > Artemisa (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre:

Research Report > Promising Solution (0.60)
Overview > Innovation (0.60)
Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Faye, Bilal, Lebbah, Mustapha, Azzag, Hanane

Supervised Batch Normalization

arXiv.org Artificial Intelligence May-27-2024

Batch Normalization (BN), a widely-used technique in neural networks, enhances generalization and expedites training by normalizing each mini-batch to the same mean and variance. However, its effectiveness diminishes when confronted with diverse data distributions. To address this challenge, we propose Supervised Batch Normalization (SBN), a pioneering approach. We expand normalization beyond traditional single mean and variance parameters, enabling the identification of data modes prior to training. This ensures effective normalization for samples sharing common features. We define contexts as modes, categorizing data with similar characteristics. These contexts are explicitly defined, such as domains in domain adaptation or modalities in multimodal systems, or implicitly defined through clustering algorithms based on data similarity. We illustrate the superiority of our approach over BN and other commonly employed normalization techniques through various experiments on both single and multi-task datasets. Integrating SBN with Vision Transformer results in a remarkable 15.13% accuracy enhancement on CIFAR-100. Additionally, in domain adaptation scenarios, employing AdaMatch demonstrates an impressive 22.25% accuracy improvement on MNIST and SVHN compared to BN.
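To make the contrast with plain BN concrete, here is a small sketch of the general idea of normalizing each context with its own statistics instead of one mean and variance for the whole mini-batch. It is an assumption-laden illustration, not the authors' SBN layer: it handles a single scalar feature, omits learnable scale and shift parameters, and takes the context labels as given. The names `normalize` and `contextwise_normalize` are hypothetical.

```python
import math

def normalize(values, eps=1e-5):
    """Plain batch normalization of one feature: zero mean, unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def contextwise_normalize(values, contexts, eps=1e-5):
    """Normalize each sample with the statistics of its own context.

    contexts[k] is the context label of values[k], e.g. a domain id
    or a cluster index computed before training.
    """
    groups = {}
    for index, context in enumerate(contexts):
        groups.setdefault(context, []).append(index)

    out = [0.0] * len(values)
    for indices in groups.values():
        normed = normalize([values[k] for k in indices], eps)
        for k, z in zip(indices, normed):
            out[k] = z
    return out

# Two contexts whose raw values live on very different scales.
values = [1.0, 3.0, 101.0, 103.0]
contexts = [0, 0, 1, 1]
plain = normalize(values)                            # one mean/variance for all
per_context = contextwise_normalize(values, contexts)
```

With a single shared mean and variance, the two contexts collapse into two tight groups near -1 and +1, hiding the variation inside each context. With per-context statistics, every context is centred and scaled on its own, so within-context differences are preserved.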

batch normalization, dataset, normalization, (15 more...)

2405.17027

Country: Asia > Middle East > Jordan (0.04)

Genre:

Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)