AITopics | twelfth international conference

Comprehensive and Efficient Distillation for Lightweight Sentiment Analysis Models

Xie, Guangyu, Zhang, Yice, Bao, Jianzhu, Wang, Qianlong, Sun, Yang, Wang, Bingbing, Xu, Ruifeng

arXiv.org Artificial Intelligence, Nov-4-2025

Recent efforts leverage knowledge distillation techniques to develop lightweight and practical sentiment analysis models. These methods are grounded in human-written instructions and large-scale user texts. Despite the promising results, two key challenges remain: (1) manually written instructions are limited in diversity and quantity, making them insufficient to ensure comprehensive coverage of distilled knowledge; (2) large-scale user texts incur high computational cost, hindering the practicality of these methods. To this end, we introduce CompEffDist, a comprehensive and efficient distillation framework for sentiment analysis. Our framework consists of two key modules: attribute-based automatic instruction construction and difficulty-based data filtering, which correspondingly tackle the aforementioned challenges. Applying our method across multiple model series (Llama-3, Qwen-3, and Gemma-3), we enable 3B student models to match the performance of 20x larger teacher models on most tasks. In addition, our approach greatly outperforms baseline methods in data efficiency, attaining the same performance level with only 10% of the data.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2510.24425

Country:

North America > Mexico (0.28)
Asia > China (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Education (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
(3 more...)

VIFO: Visual Feature Empowered Multivariate Time Series Forecasting with Cross-Modal Fusion

Wang, Yanlong, Yu, Hang, Xu, Jian, Ma, Fei, Zhang, Hongkang, Feng, Tongtong, Zhang, Zijian, Huang, Shao-Lun, Sun, Danny Dongning, Zhang, Xiao-Ping

arXiv.org Artificial Intelligence, Oct-7-2025

Large time series foundation models often adopt channel-independent architectures to handle varying data dimensions, but this design ignores crucial cross-channel dependencies. Concurrently, existing multimodal approaches have not fully exploited the power of large vision models (LVMs) to interpret spatiotemporal data. Additionally, there remains significant unexplored potential in leveraging the advantages of information extraction from different modalities to enhance time series forecasting performance. To address these gaps, we propose VIFO, a cross-modal forecasting model. VIFO uniquely renders multivariate time series into images, enabling a pre-trained LVM to extract complex cross-channel patterns that are invisible to channel-independent models. These visual features are then aligned and fused with representations from the time series modality. By freezing the LVM and training only 7.45% of its parameters, VIFO achieves competitive performance on multiple benchmarks, offering an efficient and effective solution for capturing cross-variable relationships in

data mining, forecasting, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2510.03244

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.95)
Information Technology > Artificial Intelligence > Natural Language (0.90)
(2 more...)

Review of Hallucination Understanding in Large Language and Vision Models

Ho, Zhengyi, Liang, Siyuan, Tao, Dacheng

arXiv.org Artificial Intelligence, Oct-2-2025

The widespread adoption of large language and vision models in real-world applications has made urgent the need to address hallucinations -- instances where models produce incorrect or nonsensical outputs. These errors can propagate misinformation during deployment, leading to both financial and operational harm. Although much research has been devoted to mitigating hallucinations, our understanding of them is still incomplete and fragmented. Without a coherent understanding of hallucinations, proposed solutions risk mitigating surface symptoms rather than underlying causes, limiting their effectiveness and generalizability in deployment. To tackle this gap, we first present a unified, multi-level framework for characterizing both image and text hallucinations across diverse applications, aiming to reduce conceptual fragmentation. We then link these hallucinations to specific mechanisms within a model's lifecycle, using a task-modality interleaved approach to promote a more integrated understanding. Our investigations reveal that hallucinations often stem from predictable patterns in data distributions and inherited biases. By deepening our understanding, this survey provides a foundation for developing more robust and effective solutions to hallucinations in real-world generative AI systems.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2510.00034

Country:

North America > United States (0.46)
Europe > Austria (0.28)
North America > Mexico (0.28)
Asia > Japan (0.27)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Leisure & Entertainment (0.67)
Information Technology (0.67)
Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(4 more...)

A Gradient Flow Approach to Solving Inverse Problems with Latent Diffusion Models

Wang, Tim Y. J., Akyildiz, O. Deniz

arXiv.org Machine Learning, Sep-24-2025

Solving ill-posed inverse problems requires powerful and flexible priors. We propose leveraging pretrained latent diffusion models for this task through a new training-free approach, termed Diffusion-regularized Wasserstein Gradient Flow (DWGF). Specifically, we formulate the posterior sampling problem as a regularized Wasserstein gradient flow of the Kullback-Leibler divergence in the latent space. We demonstrate the performance of our method on standard benchmarks using StableDiffusion (Rombach et al., 2022) as the prior.
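
For readers unfamiliar with the term, the unregularized Wasserstein gradient flow of the Kullback-Leibler divergence has a standard closed form, shown below as background. The notation ($q_t$, $\pi$, $Z_t$, $W_t$) is assumed here for illustration and does not come from the abstract, and the diffusion-based regularizer that defines DWGF is not stated in the abstract, so it is not reproduced.

```latex
% q_t   : distribution over latent variables z at flow time t (assumed notation)
% \pi   : target posterior over latents given the measurement y, \pi(z) = p(z \mid y)
% The flow of KL(q_t || \pi) in Wasserstein space is the Fokker-Planck equation
\partial_t q_t = \nabla \cdot \Big( q_t \, \nabla \log \frac{q_t}{\pi} \Big),
% which is the law of the Langevin diffusion (W_t a standard Brownian motion)
\qquad
\mathrm{d}Z_t = \nabla \log \pi(Z_t)\,\mathrm{d}t + \sqrt{2}\,\mathrm{d}W_t .
```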

conference, diffusion model, inverse problem, (11 more...)

arXiv.org Machine Learning

2509.19276

Country: North America > United States (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

FinZero: Launching Multi-modal Financial Time Series Forecast with Large Reasoning Model

Wang, Yanlong, Xu, Jian, Ma, Fei, Zhang, Hongkang, Yu, Hang, Gao, Tiantian, Wang, Yu, You, Haochen, Huang, Shao-Lun, Sun, Danny Dongning, Zhang, Xiao-Ping

arXiv.org Artificial Intelligence, Sep-11-2025

Financial time series forecasting is both highly significant and challenging. Previous approaches typically standardized time series data before feeding it into forecasting models, but this encoding process inherently leads to a loss of important information. Moreover, past time series models generally require fixed numbers of variables or lookback window lengths, which further limits the scalability of time series forecasting. Besides, the interpretability and the uncertainty in forecasting remain areas requiring further research, as these factors directly impact the reliability and practical value of predictions. To address these issues, we first construct a diverse financial image-text dataset (FVLDB) and develop the Uncertainty-adjusted Group Relative Policy Optimization (UARPO) method to enable the model not only to output predictions but also to analyze the uncertainty of those predictions. We then propose FinZero, a multimodal pre-trained model fine-tuned by UARPO to perform reasoning, prediction, and analytical understanding on the FVLDB financial time series. Extensive experiments validate that FinZero exhibits strong adaptability and scalability. After fine-tuning with UARPO, FinZero achieves an approximate 13.48% improvement in prediction accuracy over GPT-4o in the high-confidence group, demonstrating the effectiveness of reinforcement learning fine-tuning in multimodal large models, including in financial time series forecasting tasks.
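
As background for the method name, the sketch below shows the group-relative advantage used in standard Group Relative Policy Optimization: each sampled response is scored against the mean and standard deviation of its own group. It is a minimal illustration only. The uncertainty adjustment that distinguishes UARPO is not described in the abstract and is not reproduced, and the function name, the `eps` term, and the use of the population standard deviation are assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Standard GRPO-style advantages for one group of sampled responses.

    Each reward is centered on the group mean and divided by the group
    standard deviation (population form assumed here); eps avoids a
    division by zero when all rewards in the group are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy group: four sampled forecasts, reward 1.0 if correct, else 0.0.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Correct responses receive a positive advantage and incorrect ones a negative advantage, and the advantages within a group sum to approximately zero.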

forecasting, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2509.08742

Genre: Research Report (0.40)

Industry: Banking & Finance > Trading (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Strategies for Improving Communication Efficiency in Distributed and Federated Learning: Compression, Local Training, and Personalization

Yi, Kai

arXiv.org Artificial Intelligence, Sep-11-2025

Distributed and federated learning are essential paradigms for training models across decentralized data sources while preserving privacy, yet communication overhead remains a major bottleneck. This dissertation explores strategies to improve communication efficiency, focusing on model compression, local training, and personalization. We establish a unified framework for biased and unbiased compression operators with convergence guarantees, then propose adaptive local training strategies that incorporate personalization to accelerate convergence and mitigate client drift. In particular, Scafflix balances global and personalized objectives, achieving superior performance under both IID and non-IID settings. We further introduce privacy-preserving pruning frameworks that optimize sparsity while minimizing communication costs, with Cohort-Squeeze leveraging hierarchical aggregation to reduce cross-device overhead. Finally, SymWanda, a symmetric post-training pruning method, enhances robustness under high sparsity and maintains accuracy without retraining. Extensive experiments on benchmarks and large-scale language models demonstrate favorable trade-offs among accuracy, convergence, and communication, offering theoretical and practical insights for scalable, efficient distributed learning.
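
To make the distinction between biased and unbiased compression operators concrete, here is a minimal sketch of two textbook sparsifiers: Top-k (biased) and Rand-k (unbiased after rescaling by d/k). These are common examples from the communication-compression literature, not necessarily the operators analyzed in the dissertation; the function names and the toy gradient are assumptions.

```python
import random


def top_k(vec, k):
    """Biased compressor: keep the k largest-magnitude entries, zero the rest."""
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]


def rand_k(vec, k, rng):
    """Unbiased compressor: keep k uniformly random entries, scaled by d/k.

    Each entry survives with probability k/d, so the d/k scaling makes the
    expected output equal to the input vector.
    """
    d = len(vec)
    keep = set(rng.sample(range(d), k))
    scale = d / k
    return [v * scale if i in keep else 0.0 for i, v in enumerate(vec)]


grad = [0.5, -2.0, 0.1, 3.0]
sparse = top_k(grad, 2)  # only the entries -2.0 and 3.0 are transmitted

# Empirical check of unbiasedness: average many Rand-k outputs.
rng = random.Random(0)
trials = 20000
avg = [0.0] * len(grad)
for _ in range(trials):
    compressed = rand_k(grad, 2, rng)
    avg = [a + c / trials for a, c in zip(avg, compressed)]
```

Both operators send only k of the d coordinates per round. Top-k always discards the small entries, so its error is systematic, while the average of many Rand-k outputs approaches the original gradient.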

data mining, large language model, machine learning, (23 more...)

arXiv.org Artificial Intelligence

2509.08233

Country:

North America > United States (1.00)
Europe (0.92)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Uncovering Emergent Physics Representations Learned In-Context by Large Language Models

Song, Yeongwoo, Bae, Jaeyong, Kim, Dong-Kyum, Jeong, Hawoong

arXiv.org Artificial Intelligence, Aug-20-2025

Large language models (LLMs) exhibit impressive in-context learning (ICL) abilities, enabling them to solve a wide range of tasks via textual prompts alone. As these capabilities advance, the range of applicable domains continues to expand significantly. However, identifying the precise mechanisms or internal structures within LLMs that allow successful ICL across diverse, distinct classes of tasks remains elusive. Physics-based tasks offer a promising testbed for probing this challenge. Unlike synthetic sequences such as basic arithmetic or symbolic equations, physical systems provide experimentally controllable, real-world data based on structured dynamics grounded in fundamental principles. This makes them particularly suitable for studying the emergent reasoning behaviors of LLMs in a realistic yet tractable setting. Here, we mechanistically investigate the ICL ability of LLMs, especially focusing on their ability to reason about physics. Using a dynamics forecasting task in physical systems as a proxy, we evaluate whether LLMs can learn physics in context. We first show that the performance of dynamics forecasting in context improves with longer input contexts. To uncover how such capability emerges in LLMs, we analyze the model's residual stream activations using sparse autoencoders (SAEs). Our experiments reveal that the features captured by SAEs correlate with key physical variables, such as energy. These findings demonstrate that meaningful physical concepts are encoded within LLMs during in-context learning. In sum, our work provides a novel case study that broadens our understanding of how LLMs learn in context.
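
The kind of analysis described, checking whether a learned feature tracks a physical variable, can be illustrated with a plain Pearson correlation. The sketch below is a toy example under stated assumptions: the oscillator states and the feature activations are made-up numbers, not data from the paper, and the paper's actual procedure is not given in the abstract.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy (position, velocity) states of a unit-mass, unit-stiffness oscillator.
states = [(1.0, 0.0), (0.0, 2.0), (1.0, 1.0), (2.0, 2.0)]
# Total energy = kinetic + potential for each state.
energy = [0.5 * v * v + 0.5 * x * x for x, v in states]

# Hypothetical activations of one SAE feature on the same four inputs.
feature = [0.1, 0.45, 0.2, 0.8]

r = pearson(feature, energy)
```

A coefficient near 1 for some feature would suggest that it encodes something close to energy; in practice the comparison would be run over many features and many trajectories.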

correlation, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.12448

Country: Europe (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Education > Curriculum > Subject-Specific Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Improving Data and Parameter Efficiency of Neural Language Models Using Representation Analysis

Jukić, Josip

arXiv.org Artificial Intelligence, Jul-17-2025

This thesis addresses challenges related to data and parameter efficiency in neural language models, with a focus on representation analysis and the introduction of new optimization techniques. The first part examines the properties and dynamics of language representations within neural models, emphasizing their significance in enhancing robustness and generalization. It proposes innovative approaches based on representation smoothness, including regularization strategies that utilize Jacobian and Hessian matrices to stabilize training and mitigate sensitivity to input perturbations. The second part focuses on methods to significantly enhance data and parameter efficiency by integrating active learning strategies with parameter-efficient fine-tuning, guided by insights from representation smoothness analysis. It presents smoothness-informed early-stopping techniques designed to eliminate the need for labeled validation sets and proposes innovative combinations of active learning and parameter-efficient fine-tuning to reduce labeling efforts and computational resources. Extensive experimental evaluations across various NLP tasks demonstrate that these combined approaches substantially outperform traditional methods in terms of performance, stability, and efficiency. The third part explores weak supervision techniques enhanced by in-context learning to effectively utilize unlabeled data, further reducing dependence on extensive labeling. It shows that using in-context learning as a mechanism for weak supervision enables models to better generalize from limited labeled data by leveraging unlabeled examples more effectively during training. Comprehensive empirical evaluations confirm significant gains in model accuracy, adaptability, and robustness, especially in low-resource settings and dynamic data environments.

large language model, machine learning, natural language, (27 more...)

arXiv.org Artificial Intelligence

2507.12004

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota (0.27)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.92)
Law (0.67)
Education > Curriculum (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(6 more...)

A Survey of Automatic Evaluation Methods on Text, Visual and Speech Generations

Lan, Tian, Zhou, Yang-Hao, Ma, Zi-Ao, Sun, Fanshu, Sun, Rui-Qing, Luo, Junyu, Tu, Rong-Cheng, Huang, Heyan, Xu, Chen, Wu, Zhijing, Mao, Xian-Ling

arXiv.org Artificial Intelligence, Jun-13-2025

Recent advances in deep learning have significantly enhanced generative AI capabilities across text, images, and audio. However, automatically evaluating the quality of these generated outputs presents ongoing challenges. Although numerous automatic evaluation methods exist, current research lacks a systematic framework that comprehensively organizes these methods across text, visual, and audio modalities. To address this issue, we present a comprehensive review and a unified taxonomy of automatic evaluation methods for generated content across all three modalities. We identify five fundamental paradigms that characterize existing evaluation approaches across these domains. Our analysis begins by examining evaluation methods for text generation, where techniques are most mature. We then extend this framework to image and audio generation, demonstrating its broad applicability. Finally, we discuss promising directions for future research in cross-modal evaluation methodologies.

generation negative sampling random sampling, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2506.10019

Country:

Europe (1.00)
Asia > China (0.68)
Asia > Middle East (0.67)
(2 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.45)

Industry:

Leisure & Entertainment (1.00)
Media > Music (0.67)
Information Technology > Security & Privacy (0.67)
Education > Assessment & Standards > Student Performance (0.45)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(6 more...)

Optimized Local Updates in Federated Learning via Reinforcement Learning

Murad, Ali, Hui, Bo, Ku, Wei-Shinn

arXiv.org Artificial Intelligence, Jun-10-2025

Federated Learning (FL) is a distributed framework for collaborative model training over large-scale distributed data, enabling higher performance while maintaining client data privacy. However, the nature of model aggregation at the centralized server can result in a performance drop in the presence of non-IID data across different clients. We remark that training a client locally on more data than necessary does not benefit the overall performance of all clients. In this paper, we devise a novel framework that leverages a Deep Reinforcement Learning (DRL) agent to select an optimized amount of data necessary to train a client model without oversharing information with the server. Starting without awareness of the client's performance, the DRL agent utilizes the change in training loss as a reward signal and learns to optimize the amount of training data necessary for improving the client's performance. Specifically, after each aggregation round, the DRL algorithm considers the local performance as the current state and outputs the optimized weights for each class in the training data, to be used during the next round of local training. In doing so, the agent learns a policy that creates an optimized partition of the local training dataset during the FL rounds. After FL, the client utilizes the entire local training dataset to further enhance its performance on its own data distribution, mitigating the non-IID effects of aggregation. Through extensive experiments, we demonstrate that training FL clients through our algorithm results in superior performance on multiple benchmark datasets and FL frameworks. Our code is available at https://github.com/amuraddd/optimized_client_training.git.
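
One way to picture the per-class weights mentioned above is as sampling fractions that decide how much of each class a client trains on in the next round. The sketch below assumes that interpretation, which the abstract does not confirm; the function name `select_subset`, the toy labels, and the weights are hypothetical, and the DRL policy that would produce the weights is not shown. The authors' actual implementation is in the repository linked in the abstract.

```python
import random
from collections import defaultdict


def select_subset(labels, class_weights, rng):
    """Pick training indices so that class c contributes a fraction
    class_weights[c] of its samples (assumed meaning of the weights).

    Classes without an entry in class_weights are kept in full.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    chosen = []
    for label, indices in by_class.items():
        n_keep = round(class_weights.get(label, 1.0) * len(indices))
        chosen.extend(rng.sample(indices, n_keep))
    return sorted(chosen)


# Toy client dataset: ten samples of class 0 and ten of class 1.
labels = [0] * 10 + [1] * 10
# Hypothetical weights an agent might output after an aggregation round.
weights = {0: 0.5, 1: 1.0}
subset = select_subset(labels, weights, random.Random(0))
```

With these weights the client would train on half of its class-0 samples and all of its class-1 samples in the next round.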

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2506.06337

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry:

Education (0.77)
Information Technology > Security & Privacy (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
