AITopics

Financial institutions are required by regulation to report suspicious financial transactions related to money laundering. Therefore, they need to constantly monitor vast amounts of incoming and outgoing transactions. A particular challenge in detecting money laundering is that money launderers continuously adapt their tactics to evade detection. Hence, detection methods need constant fine-tuning. Traditional machine learning models suffer from catastrophic forgetting when fine-tuning the model on new data, thereby limiting their effectiveness in dynamic environments. Continual learning methods may address this issue and enhance current anti-money laundering (AML) practices, by allowing models to incorporate new information while retaining prior knowledge. Research on continual graph learning for AML, however, is still scarce. In this review, we critically evaluate state-of-the-art continual graph learning approaches for AML applications. We categorise methods into replay-based, regularization-based, and architecture-based strategies within the graph neural network (GNN) framework, and we provide in-depth experimental evaluations on both synthetic and real-world AML data sets that showcase the effect of the different hyperparameters. Our analysis demonstrates that continual learning improves model adaptability and robustness in the face of extreme class imbalances and evolving fraud patterns. Finally, we outline key challenges and propose directions for future research.

artificial intelligence, deep learning, machine learning, (13 more...)

2503.24259

Country:

North America > United States (0.14)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Europe > Belgium > Flanders > Antwerp Province > Antwerp (0.04)
(2 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.93)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Banking & Finance (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models

Wang, Rui, Wang, Hongru, Xue, Boyang, Pang, Jianhui, Liu, Shudong, Chen, Yi, Qiu, Jiahao, Wong, Derek Fai, Ji, Heng, Wong, Kam-Fai

Recent advancements in Large Language Models (LLMs) have significantly enhanced their ability to perform complex reasoning tasks, transitioning from fast and intuitive thinking (System 1) to slow and deep reasoning (System 2). While System 2 reasoning improves task accuracy, it often incurs substantial computational costs due to its slow thinking nature and inefficient or unnecessary reasoning behaviors. In contrast, System 1 reasoning is computationally efficient but leads to suboptimal performance. Consequently, it is critical to balance the trade-off between performance (benefits) and computational costs (budgets), giving rise to the concept of reasoning economy. In this survey, we provide a comprehensive analysis of reasoning economy in both the post-training and test-time inference stages of LLMs, encompassing i) the cause of reasoning inefficiency, ii) behavior analysis of different reasoning patterns, and iii) potential solutions to achieve reasoning economy. By offering actionable insights and highlighting open challenges, we aim to shed light on strategies for improving the reasoning economy of LLMs, thereby serving as a valuable resource for advancing research in this evolving area. We also provide a public repository to continually track developments in this fast-evolving field.

large language model, machine learning, natural language, (14 more...)

2503.24377

Country:

Asia > Thailand > Bangkok > Bangkok (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)
(8 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Knab, Patrick, Marton, Sascha, Schlegel, Udo, Bartelt, Christian

Which LIME should I trust? Concepts, Challenges, and Solutions

As neural networks become dominant in essential systems, Explainable Artificial Intelligence (XAI) plays a crucial role in fostering trust and detecting potential misbehavior of opaque models. LIME (Local Interpretable Model-agnostic Explanations) is among the most prominent model-agnostic approaches, generating explanations by approximating the behavior of black-box models around specific instances. Despite its popularity, LIME faces challenges related to fidelity, stability, and applicability to domain-specific problems. Numerous adaptations and enhancements have been proposed to address these issues, but the growing number of developments can be overwhelming, complicating efforts to navigate LIME-related research. To the best of our knowledge, this is the first survey to comprehensively explore and collect LIME's foundational concepts and known limitations. We categorize and compare its various enhancements, offering a structured taxonomy based on intermediate steps and key issues. Our analysis provides a holistic overview of advancements in LIME, guiding future research and helping practitioners identify suitable approaches. Additionally, we provide a continuously updated interactive website (https://patrick-knab.github.io/which-lime-to-trust/), offering a concise and accessible overview of the survey.

explanation, machine learning, natural language, (20 more...)

2503.24365

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Switzerland (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
(4 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.46)

Industry:

Leisure & Entertainment (0.67)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Phan, Hoang Hai, Vu, Nguyen Duc Minh, Phuong, Nam Dang

VNJPTranslate: A comprehensive pipeline for Vietnamese-Japanese translation

Neural Machine Translation (NMT) driven by Transformer architectures has advanced significantly, yet faces challenges with low-resource language pairs like Vietnamese-Japanese (Vi-Ja). Issues include sparse parallel data and handling linguistic/cultural nuances. Recent progress in Large Language Models (LLMs) with strong reasoning, often refined via Reinforcement Learning (RL), enables high-quality synthetic data generation. We introduce VNJPTranslate, a pipeline designed to systematically address the Vi-Ja translation task. It features a targeted data augmentation strategy using advanced LLMs with Chain-of-Thought prompting for challenging segments identified via corpus analysis. Subsequently, we employ efficient fine-tuning techniques (Unsloth with QLoRA) on a capable, low-parameter autoregressive model (specifically, a fine-tuned version of the 1.8B parameter Sailor model, which is based on the Qwen architecture) to create a practical and high-performing translation system. This integrated approach aims to improve Vi-Ja translation quality significantly over existing baselines.

large language model, machine learning, translation, (16 more...)

2504.00339

Country:

Europe > Finland > Uusimaa > Helsinki (0.06)
Asia > Vietnam > Thái Nguyên Province > Thái Nguyên (0.05)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
(3 more...)

Genre:

Overview (0.48)
Research Report (0.42)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Srinivas, Sakhinana Sagar, Das, Akash, Gupta, Shivam, Runkana, Venkataramana

Agentic Multimodal AI for Hyperpersonalized B2B and B2C Advertising in Competitive Markets: An AI-Driven Competitive Advertising Framework

The growing use of foundation models (FMs) in real-world applications demands adaptive, reliable, and efficient strategies for dynamic markets. In the chemical industry, AI-discovered materials drive innovation, but commercial success hinges on market adoption, requiring FM-driven advertising frameworks that operate in-the-wild. We present a multilingual, multimodal AI framework for autonomous, hyper-personalized advertising in B2B and B2C markets. By integrating retrieval-augmented generation (RAG), multimodal reasoning, and adaptive persona-based targeting, our system generates culturally relevant, market-aware ads tailored to shifting consumer behaviors and competition. Validation combines real-world product experiments with a Simulated Humanistic Colony of Agents to model consumer personas, optimize strategies at scale, and ensure privacy compliance. Synthetic experiments mirror real-world scenarios, enabling cost-effective testing of ad strategies without risky A/B tests. Combining structured retrieval-augmented reasoning with in-context learning (ICL), the framework boosts engagement, prevents market cannibalization, and maximizes ROAS. This work bridges AI-driven innovation and market adoption, advancing multimodal FM deployment for high-stakes decision-making in commercial marketing.

large language model, machine learning, natural language, (24 more...)

2504.00338

Country:

Europe > Western Europe (0.04)
Asia > India > Karnataka > Bengaluru (0.04)
Asia > India > NCT > New Delhi (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.46)
Overview > Innovation (0.34)

Industry:

Marketing (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)
(3 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

LLMs for Explainable AI: A Comprehensive Survey

Bilal, Ahsan, Ebert, David, Lin, Beiyu

Large Language Models (LLMs) offer a promising approach to enhancing Explainable AI (XAI) by transforming complex machine learning outputs into easy-to-understand narratives, making model predictions more accessible to users, and helping bridge the gap between sophisticated model behavior and human interpretability. AI models, such as state-of-the-art neural networks and deep learning models, are often seen as "black boxes" due to a lack of transparency. As users cannot fully understand how the models reach conclusions, users have difficulty trusting decisions from AI models, which leads to less effective decision-making processes, reduced accountabilities, and unclear potential biases. A challenge arises in developing explainable AI (XAI) models to gain users' trust and provide insights into how models generate their outputs. With the development of Large Language Models, we want to explore the possibilities of using human language-based models, LLMs, for model explainabilities. This survey provides a comprehensive overview of existing approaches regarding LLMs for XAI, and evaluation techniques for LLM-generated explanation, discusses the corresponding challenges and limitations, and examines real-world applications. Finally, we discuss future directions by emphasizing the need for more interpretable, automated, user-centric, and multidisciplinary approaches for XAI via LLMs.

explanation, large language model, machine learning, (16 more...)

2504.00125

Country:

North America > United States > Oklahoma > Cleveland County > Norman (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
(5 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.34)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

Kim, Yoonshik, Jung, Jaeyoon

KOFFVQA: An Objectively Evaluated Free-form VQA Benchmark for Large Vision-Language Models in the Korean Language

The recent emergence of Large Vision-Language Models(VLMs) has resulted in a variety of different benchmarks for evaluating such models. Despite this, we observe that most existing evaluation methods suffer from the fact that they either require the model to choose from pre-determined responses, sacrificing open-endedness, or evaluate responses using a judge model, resulting in subjective and unreliable evaluation. In addition, we observe a lack of benchmarks for VLMs in the Korean language, which are necessary as a separate metric from more common English language benchmarks, as the performance of generative language models can differ significantly based on the language being used. Therefore, we present KOFFVQA, a general-purpose free-form visual question answering benchmark in the Korean language for the evaluation of VLMs. Our benchmark consists of 275 carefully crafted questions each paired with an image and grading criteria covering 10 different aspects of VLM performance. The grading criteria eliminate the problem of unreliability by allowing the judge model to grade each response based on a pre-determined set of rules. By defining the evaluation criteria in an objective manner, even a small open-source model can be used to evaluate models on our benchmark reliably. In addition to evaluating a large number of existing VLMs on our benchmark, we also experimentally verify that our method of using pre-existing grading criteria for evaluation is much more reliable than existing methods. Our evaluation code is available at https://github.com/maum-ai/KOFFVQA

benchmark, large language model, machine learning, (20 more...)

2503.2373

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > Mexico > Mexico City > Mexico City (0.04)
Asia > South Korea (0.04)

Genre:

Research Report (0.50)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Rethinking Key-Value Cache Compression Techniques for Large Language Model Serving

Gao, Wei, Zhou, Xinyu, Sun, Peng, Zhang, Tianwei, Wen, Yonggang

Key-Value cache (\texttt{KV} \texttt{cache}) compression has emerged as a promising technique to optimize Large Language Model (LLM) serving. It primarily decreases the memory consumption of \texttt{KV} \texttt{cache} to reduce the computation cost. Despite the development of many compression algorithms, their applications in production environments are still not prevalent. In this paper, we revisit mainstream \texttt{KV} \texttt{cache} compression solutions from a practical perspective. Our contributions are three-fold. First, we comprehensively review existing algorithmic designs and benchmark studies for \texttt{KV} \texttt{cache} compression and identify missing pieces in their performance measurement, which could hinder their adoption in practice. Second, we empirically evaluate representative \texttt{KV} \texttt{cache} compression methods to uncover two key issues that affect the computational efficiency: (1) while compressing \texttt{KV} \texttt{cache} can reduce memory consumption, current implementations (e.g., FlashAttention, PagedAttention) do not optimize for production-level LLM serving, resulting in suboptimal throughput performance; (2) compressing \texttt{KV} \texttt{cache} may lead to longer outputs, resulting in increased end-to-end latency. We further investigate the accuracy performance of individual samples rather than the overall performance, revealing the intrinsic limitations in \texttt{KV} \texttt{cache} compression when handling specific LLM tasks. Third, we provide tools to shed light on future \texttt{KV} \texttt{cache} compression studies and facilitate their practical deployment in production. They are open-sourced in \href{https://github.com/LLMkvsys/rethink-kv-compression}{https://github.com/LLMkvsys/rethink-kv-compression}.

large language model, machine learning, natural language, (16 more...)

2503.24

Country:

North America > United States > California > Santa Clara County > Santa Clara (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Rubrik's Cube: Testing a New Rubric for Evaluating Explanations on the CUBE dataset

Galvan-Sosa, Diana, Gaudeau, Gabrielle, Kavumba, Pride, Li, Yunmeng, gu, Hongyi, Yuan, Zheng, Sakaguchi, Keisuke, Buttery, Paula

The performance and usability of Large-Language Models (LLMs) are driving their use in explanation generation tasks. However, despite their widespread adoption, LLM explanations have been found to be unreliable, making it difficult for users to distinguish good from bad explanations. To address this issue, we present Rubrik's CUBE, an education-inspired rubric and a dataset of 26k explanations, written and later quality-annotated using the rubric by both humans and six open- and closed-source LLMs. The CUBE dataset focuses on two reasoning and two language tasks, providing the necessary diversity for us to effectively test our proposed rubric. Using Rubrik, we find that explanations are influenced by both task and perceived difficulty. Low quality stems primarily from a lack of conciseness in LLM-generated explanations, rather than cohesion and word choice. The full dataset, rubric, and code will be made available upon acceptance.

explanation, large language model, machine learning, (19 more...)

2503.23899

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Italy > Tuscany > Florence (0.04)
(20 more...)

Genre:

Overview (0.92)
Research Report > Experimental Study (0.92)
Research Report > New Finding (0.67)

Industry:

Health & Medicine (1.00)
Education > Educational Setting (1.00)
Education > Curriculum > Subject-Specific Education (1.00)
Education > Educational Technology > Educational Software > Computer-Aided Assessment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Vartziotis, Tina, Schmidt, Maximilian, Dasoulas, George, Dellatolas, Ippolyti, Attademo, Stefano, Le, Viet Dung, Wiechmann, Anke, Hoffmann, Tim, Keckeisen, Michael, Kotsopoulos, Sotirios

Carbon Footprint Evaluation of Code Generation through LLM as a Service

arXiv.org Artificial IntelligenceMar-30-2025

Due to increased computing use, data centers consume and emit a lot of energy and carbon. These contributions are expected to rise as big data analytics, digitization, and large AI models grow and become major components of daily working routines. To reduce the environmental impact of software development, green (sustainable) coding and claims that AI models can improve energy efficiency have grown in popularity. Furthermore, in the automotive industry, where software increasingly governs vehicle performance, safety, and user experience, the principles of green coding and AI-driven efficiency could significantly contribute to reducing the sector's environmental footprint. We present an overview of green coding and metrics to measure AI model sustainability awareness. This study introduces LLM as a service and uses a generative commercial AI language model, GitHub Copilot, to auto-generate code. Using sustainability metrics to quantify these AI models' sustainability awareness, we define the code's embodied and operational carbon.

large language model, machine learning, natural language, (17 more...)

2504.01036

Country:

North America > United States (0.04)
North America > Dominican Republic (0.04)
North America > Canada > Quebec > Montreal (0.04)
(6 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Automobiles & Trucks (0.87)
Information Technology > Services (0.69)
Energy > Renewable (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)