AITopics

The ability of deep learning models to learn continuously is essential for adapting to new data categories and evolving data distributions. In recent years, approaches leveraging frozen feature extractors after an initial learning phase have been extensively studied. Many of these methods estimate per-class covariance matrices and prototypes based on backbone-derived feature representations. Within this paradigm, we introduce FeNeC (Feature Neighborhood Classifier) and FeNeC-Log, its variant based on the log-likelihood function. Our approach generalizes the existing concept by incorporating data clustering to capture greater intra-class variability. Utilizing the Mahalanobis distance, our models classify samples either through a nearest neighbor approach or through trainable logit values assigned to consecutive classes. Our approach reduces to the existing methods in a special case while extending them with more flexible adaptation to the data. We demonstrate that the two FeNeC variants achieve competitive performance in scenarios where task identities are unknown and establish state-of-the-art results on several benchmarks.
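The idea of classifying by Mahalanobis distance to several per-class cluster centroids can be illustrated with a minimal sketch. This is not the FeNeC implementation: the function names, the toy centroids, and the choice of one shared covariance matrix per class are all assumptions made for illustration.

```python
import numpy as np

def mahalanobis_sq(x, center, cov_inv):
    """Squared Mahalanobis distance between x and a center."""
    d = x - center
    return float(d @ cov_inv @ d)

def classify(x, centroids, cov_invs):
    """Return the label of the class owning the nearest centroid.

    centroids: dict label -> array of shape (k, d), k centroids per class
    cov_invs:  dict label -> inverse covariance of shape (d, d)
               (assumption: one covariance shared by a class's clusters)
    """
    best_label, best_dist = None, float("inf")
    for label, class_centroids in centroids.items():
        for center in class_centroids:
            dist = mahalanobis_sq(x, center, cov_invs[label])
            if dist < best_dist:
                best_label, best_dist = label, dist
    return best_label

# Toy 2-D "features": class 0 has two clusters, class 1 has one.
centroids = {
    0: np.array([[0.0, 0.0], [0.0, 4.0]]),
    1: np.array([[5.0, 0.0]]),
}
# Class 1 is stretched along the first axis (variance 4 instead of 1).
cov_invs = {
    0: np.linalg.inv(np.eye(2)),
    1: np.linalg.inv(np.array([[4.0, 0.0], [0.0, 1.0]])),
}

print(classify(np.array([0.5, 3.5]), centroids, cov_invs))
print(classify(np.array([2.0, 0.0]), centroids, cov_invs))
```

In this toy setup the point (2, 0) is closer to a class 0 centroid in Euclidean terms, but the wider covariance of class 1 makes class 1 the nearer class under the Mahalanobis distance. With a single centroid per class, the sketch reduces to a plain prototype classifier.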

artificial intelligence, deep learning, machine learning, (17 more...)

2503.14301

Country:

North America > United States (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > Poland > Lesser Poland Province > Kraków (0.14)

Genre: Research Report (0.64)

Industry: Education > Educational Setting (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.49)

Jayatilaka, Gihan, Shrivastava, Abhinav, Gwilliam, Matthew

Utilization of Neighbor Information for Image Classification with Different Levels of Supervision

We propose to bridge the gap between semi-supervised and unsupervised image recognition with a flexible method that performs well for both generalized category discovery (GCD) and image clustering. Despite the overlap in motivation between these tasks, the methods themselves are restricted to a single task -- GCD methods are reliant on the labeled portion of the data, and deep image clustering methods have no built-in way to leverage the labels efficiently. We connect the two regimes with an innovative approach that Utilizes Neighbor Information for Classification (UNIC) in both the unsupervised (clustering) and semi-supervised (GCD) settings. State-of-the-art clustering methods already rely heavily on nearest neighbors. We improve on their results substantially in two parts: first, with a sampling and cleaning strategy in which we identify accurate positive and negative neighbors, and second, by finetuning the backbone with clustering losses computed by sampling both types of neighbors. We then adapt this pipeline to GCD by utilizing the labeled images as ground truth neighbors. Our method yields state-of-the-art results for both clustering (+3% ImageNet-100, Imagenet200) and GCD (+0.8% ImageNet-100, +5% CUB, +2% SCars, +4% Aircraft).

artificial intelligence, machine learning, neighbor, (15 more...)

2503.145

Country:

North America > United States > Maryland (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.88)

On the clustering behavior of sliding windows

Alexeev, Boris, Luo, Wenyan, Mixon, Dustin G., Zhang, Yan X

Clustering is one of the most common tasks in data science, and given the ubiquity of timeseries data, one is naturally inclined to cluster it. In order to perform Euclidean clustering (such as k-means clustering) on timeseries data, one must first map the data into Euclidean space. This is traditionally accomplished with a sliding window.
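A sliding-window embedding of this kind can be sketched in a few lines. The function name and the `width` and `stride` parameters below are illustrative assumptions, not taken from the paper.

```python
def sliding_windows(series, width, stride=1):
    """Map a 1-D series to a list of length-`width` vectors.

    Each window is one point in `width`-dimensional Euclidean space,
    which is the input form that k-means style clustering expects.
    """
    if width <= 0 or stride <= 0:
        raise ValueError("width and stride must be positive")
    last_start = len(series) - width
    return [list(series[i:i + width]) for i in range(0, last_start + 1, stride)]

series = [1, 2, 3, 4, 5]
for window in sliding_windows(series, width=3):
    print(window)
```

A series of length n yields n - width + 1 windows at stride 1, and consecutive windows overlap in all but one entry, so the resulting points are far from independent samples.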

artificial intelligence, centroid, machine learning, (17 more...)

2503.14393

Country:

North America > United States > Ohio > Franklin County > Columbus (0.05)
North America > United States > California > Santa Clara County > San Jose (0.04)

Genre: Research Report (0.50)

Industry: Banking & Finance > Trading (0.69)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.49)

Oneto, Alfredo, Gjorgiev, Blazhe, Sansavini, Giovanni

Wasserstein-based Kernels for Clustering: Application to Power Distribution Graphs

Many data clustering applications must handle objects that cannot be represented as vector data. In this context, the bag-of-vectors representation can be leveraged to describe complex objects through discrete distributions, and the Wasserstein distance can effectively measure the dissimilarity between them. Additionally, kernel methods can be used to embed data into feature spaces that are easier to analyze. Despite significant progress in data clustering, a method that simultaneously accounts for distributional and vectorial dissimilarity measures is still lacking. To tackle this gap, this work explores kernel methods and Wasserstein distance metrics to develop a computationally tractable clustering framework. The compositional properties of kernels allow the simultaneous handling of different metrics, enabling the integration of both vectors and discrete distributions for object representation. This approach is flexible enough to be applied in various domains, such as graph analysis and image processing. The framework consists of three main components. First, we efficiently approximate pairwise Wasserstein distances using multiple reference distributions. Second, we employ kernel functions based on Wasserstein distances and present ways of composing kernels to express different types of information. Finally, we use the kernels to cluster data and evaluate the quality of the results using scalable and distance-agnostic validity indices. A case study involving two datasets of 879 and 34,920 power distribution graphs demonstrates the framework's effectiveness and efficiency.
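As a rough illustration of a kernel built on a Wasserstein distance, the sketch below handles only the simplest case: two equally sized samples of scalars, for which the 1-Wasserstein distance equals the mean absolute difference of the sorted values. The exponential kernel form and the `gamma` parameter are assumptions chosen for illustration; the paper's kernels, its bag-of-vectors setting, and its reference-distribution approximation are not reproduced here.

```python
import math

def wasserstein_1d(a, b):
    """1-Wasserstein distance between two equal-size scalar samples."""
    if len(a) != len(b) or not a:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)

def wasserstein_kernel(a, b, gamma=1.0):
    """Similarity in (0, 1]: exp(-gamma * W1(a, b)). Assumed kernel form."""
    return math.exp(-gamma * wasserstein_1d(a, b))

p = [0.0, 1.0, 2.0]
q = [3.0, 1.0, 2.0]  # the same values as p, shifted by 1 and reordered

print(wasserstein_1d(p, q))
print(wasserstein_kernel(p, q))
print(wasserstein_kernel(p, p))
```

A pairwise matrix of such kernel values could then be passed to any kernel-based clustering method.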

artificial intelligence, machine learning, wasserstein distance, (19 more...)

2503.14357

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Switzerland > Neuchâtel > Neuchâtel (0.04)

Genre: Research Report (0.40)

Industry: Energy > Power Industry (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Shawon, Reza E Rabbi, Hasan, MD Rokibul, Rahman, Md Anisur, Ghandri, Mohamed, Lamari, Iman Ahmed, Kawsar, Mohammed, Akter, Rubi

Designing and Deploying AI Models for Sustainable Logistics Optimization: A Case Study on Eco-Efficient Supply Chains in the USA

arXiv.org Artificial Intelligence Mar-17-2025

The rapid evolution of Artificial Intelligence (AI) and Machine Learning (ML) has significantly transformed logistics and supply chain management, particularly in the pursuit of sustainability and eco-efficiency. This study explores AI-based methodologies for optimizing logistics operations in the USA, focusing on reducing environmental impact, improving fuel efficiency, and minimizing costs. Key AI applications include predictive analytics for demand forecasting, route optimization through machine learning, and AI-powered fuel efficiency strategies. Various models, such as Linear Regression, XGBoost, Support Vector Machine, and Neural Networks, are applied to real-world logistics datasets to reduce the carbon emissions of logistics operations, optimize travel routes to minimize distance and travel time, and predict future deliveries to plan optimal routes. Clustering models such as K-Means and DBSCAN are also used for route optimization. The study uses datasets from logistics companies' databases and assesses model performance using metrics such as mean absolute error (MAE), mean squared error (MSE), and the R² score. It also explores how these models can be deployed to various platforms for real-time logistics and supply chain use. The models are further examined through a case study that highlights best practices and regulatory frameworks that promote sustainability. The findings demonstrate AI's potential to enhance logistics efficiency, reduce carbon footprints, and contribute to a more resilient and adaptive supply chain ecosystem.
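The three regression metrics named in the abstract follow standard definitions, shown here as a minimal sketch. The function names and the toy values are illustrative and are not data from the study.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |y - y_hat|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_squared_error(y_true, y_pred):
    """Average of (y - y_hat)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_score(y_true, y_pred):
    """1 - SS_res / SS_tot; undefined when y_true is constant."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined for constant targets")
    return 1.0 - ss_res / ss_tot

# Toy targets and predictions (made-up numbers).
y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]

print(mean_absolute_error(y_true, y_pred))
print(mean_squared_error(y_true, y_pred))
print(r2_score(y_true, y_pred))
```

MAE and MSE are in the units (or squared units) of the target, while R² is unitless and compares the model against always predicting the mean.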

artificial intelligence, emission, machine learning, (14 more...)

doi: 10.62754/joe.v4i2.6610

2503.14556

Country:

North America > United States > Pennsylvania > Erie County > Erie (0.04)
North America > United States > Illinois > McDonough County > Macomb (0.04)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Freight & Logistics Services (1.00)
Law (1.00)
Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

arXiv.org Artificial Intelligence Mar-17-2025

PASTA: Part-Aware Sketch-to-3D Shape Generation with Text-Aligned Prior

Lee, Seunggwan, Jung, Hwanhee, Koh, Byoungsoo, Huang, Qixing, Yoon, Sangho, Kim, Sangpil

A fundamental challenge in conditional 3D shape generation is to minimize information loss while preserving the intent of the user input. Existing approaches have predominantly focused on two types of isolated conditional signals, i.e., user sketches and text descriptions, neither of which offers flexible control of the generated shape on its own. In this paper, we introduce PASTA, a flexible approach that seamlessly integrates a user sketch and a text description for 3D shape generation. The key idea is to use text embeddings from a vision-language model to enrich the semantic representation of sketches. Specifically, these text-derived priors specify the part components of the object, compensating for missing visual cues from ambiguous sketches. In addition, we introduce ISG-Net, which employs two types of graph convolutional networks: IndivGCN, which processes fine-grained details, and PartGCN, which aggregates these details into parts and refines the structure of objects. Extensive experiments demonstrate that PASTA outperforms existing methods in part-level editing and achieves state-of-the-art results in sketch-to-3D shape generation.

large language model, machine learning, natural language, (19 more...)

2503.12834

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Altemeyer, Moritz, Eger, Steffen, Daxenberger, Johannes, Altendorf, Tim, Cimiano, Philipp, Schiller, Benjamin

Argument Summarization and its Evaluation in the Era of Large Language Models

arXiv.org Artificial Intelligence Mar-17-2025

Large Language Models (LLMs) have revolutionized various Natural Language Generation (NLG) tasks, including Argument Summarization (ArgSum), a key subfield of Argument Mining (AM). This paper investigates the integration of state-of-the-art LLMs into ArgSum, including for its evaluation. In particular, we propose a novel prompt-based evaluation scheme, and validate it through a novel human benchmark dataset. Our work makes three main contributions: (i) the integration of LLMs into existing ArgSum frameworks, (ii) the development of a new LLM-based ArgSum system, benchmarked against prior methods, and (iii) the introduction of an advanced LLM-based evaluation scheme. We demonstrate that the use of LLMs substantially improves both the generation and evaluation of argument summaries, achieving state-of-the-art results and advancing the field of ArgSum.

large language model, machine learning, natural language, (21 more...)

2503.00847

Country:

Asia > Singapore (0.04)
Asia > China > Hong Kong (0.04)
North America > Dominican Republic (0.04)

Genre:

Research Report (0.90)
Overview (0.68)

Industry:

Energy (1.00)
Health & Medicine > Therapeutic Area > Vaccines (0.46)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.93)

Huang, Jingzhou, Lu, Jiuyao, Tolbert, Alexander Williams

Causal Feature Learning in the Social Sciences

arXiv.org Artificial Intelligence Mar-16-2025

Variable selection poses a significant challenge in causal modeling, particularly within the social sciences, where constructs often rely on inter-related factors such as age, socioeconomic status, gender, and race. Indeed, it has been argued that such attributes must be modeled as macro-level abstractions of lower-level manipulable features, in order to preserve the modularity assumption essential to causal inference. This paper accordingly extends the theoretical framework of Causal Feature Learning (CFL). Empirically, we apply the CFL algorithm to diverse social science datasets, evaluating how CFL-derived macrostates compare with traditional microstates in downstream modeling tasks.

artificial intelligence, machine learning, macrostate, (14 more...)

2503.12784

Country: North America > United States > Pennsylvania (0.04)

Genre:

Research Report > Experimental Study (0.67)
Research Report > Strength High (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

arXiv.org Artificial Intelligence Mar-16-2025

Towards Learnable Anchor for Deep Multi-View Clustering

Wang, Bocheng, Zeng, Chusheng, Chen, Mulin, Li, Xuelong

Deep multi-view clustering incorporating graph learning has shown tremendous potential. However, most methods incur a costly quadratic time complexity with respect to data size. In theory, anchor-based graph learning can alleviate this limitation, but related deep models mainly rely on manual discretization approaches to select anchors, which means that 1) the anchors are fixed during model training and 2) they may deviate from the true cluster distribution. Consequently, the unreliable anchors may corrupt clustering results. In this paper, we propose the Deep Multi-view Anchor Clustering (DMAC) model, which performs clustering in linear time. Concretely, the initial anchors are perturbed by positive-incentive noise sampled from a Gaussian distribution, so that they can be optimized with a newly designed anchor learning loss, which promotes a clear relationship between samples and anchors. Afterwards, anchor graph convolution is devised to model the cluster structure formed by the anchors, and a mutual information maximization loss is built to provide cross-view clustering guidance. In this way, the learned anchors can better represent clusters. With the optimal anchors, the full sample graph is calculated to derive a discriminative embedding for clustering. Extensive experiments on several datasets demonstrate the superior performance and efficiency of DMAC compared to state-of-the-art competitors.

anchor, artificial intelligence, machine learning, (16 more...)

2503.12427

Country: Asia > China (0.05)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

arXiv.org Artificial Intelligence Mar-15-2025

Impact of Data Patterns on Biotype Identification Using Machine Learning

Yu, Yuetong, Ge, Ruiyang, Hacihaliloglu, Ilker, Rauscher, Alexander, Tam, Roger, Frangou, Sophia

Background: Patient stratification in brain disorders remains a significant challenge, despite advances in machine learning and multimodal neuroimaging. Automated machine learning algorithms have been widely applied for identifying patient subtypes (biotypes), but results have been inconsistent across studies. These inconsistencies are often attributed to algorithmic limitations, yet an overlooked factor may be the statistical properties of the input data. This study investigates the contribution of data patterns to algorithm performance by leveraging synthetic brain morphometry data as an exemplar. Methods: Four widely used algorithms (SuStaIn, HYDRA, SmileGAN, and SurrealGAN) were evaluated using multiple synthetic pseudo-patient datasets designed to include varying numbers and sizes of clusters and degrees of complexity of morphometric changes. Ground truth, representing predefined clusters, allowed for the evaluation of performance accuracy across algorithms and datasets. Results: SuStaIn failed to process datasets with more than 17 variables, highlighting computational inefficiencies. HYDRA was able to perform individual-level classification in multiple datasets, with no clear pattern explaining its failures. SmileGAN and SurrealGAN outperformed the other algorithms in identifying variable-based disease patterns, but these patterns could not provide individual-level classification. Conclusions: Dataset characteristics significantly influence algorithm performance, often more than algorithmic design. The findings emphasize the need for rigorous validation using synthetic data before real-world application and highlight the limitations of current clustering approaches in capturing the heterogeneity of brain disorders. These insights extend beyond neuroimaging and have implications for machine learning applications in biomedical research.

artificial intelligence, dataset, machine learning, (18 more...)

2503.12066

Country:

North America > Canada > British Columbia (0.05)
North America > United States > Massachusetts > Plymouth County > Hanover (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)