AITopics

Thanks to the remarkable human-like capabilities of machine learning (ML) models in perceptual and cognitive tasks, frameworks integrating ML within rational agent architectures are gaining traction. Yet, the landscape remains fragmented and incoherent, often focusing on embedding ML into generic agent containers while overlooking the expressive power of rational architectures--such as Belief-Desire-Intention (BDI) agents. This paper presents a fine-grained systematisation of existing approaches, using the BDI paradigm as a reference. Our analysis illustrates the fast-evolving literature on rational agents enhanced by ML, and identifies key research opportunities and open challenges for designing effective rational ML agents.

large language model, machine learning, reinforcement learning, (23 more...)

2510.20641

Country:

Asia (0.67)
North America > United States (0.46)
Europe > Austria (0.28)
North America > Mexico (0.28)

Genre:

Overview (1.00)
Research Report (0.82)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(5 more...)

Beyond MedQA: Towards Real-world Clinical Decision Making in the Era of LLMs

Xiao, Yunpeng, Yang, Carl, Mai, Mark, Hu, Xiao, Shu, Kai

Large language models (LLMs) show promise for clinical use. They are often evaluated using datasets such as MedQA. However, Many medical datasets, such as MedQA, rely on simplified Question-Answering (Q\A) that underrepresents real-world clinical decision-making. Based on this, we propose a unifying paradigm that characterizes clinical decision-making tasks along two dimensions: Clinical Backgrounds and Clinical Questions. As the background and questions approach the real clinical environment, the difficulty increases. We summarize the settings of existing datasets and benchmarks along two dimensions. Then we review methods to address clinical decision-making, including training-time and test-time techniques, and summarize when they help. Next, we extend evaluation beyond accuracy to include efficiency, explainability. Finally, we highlight open challenges. Our paradigm clarifies assumptions, standardizes comparisons, and guides the development of clinically meaningful LLMs.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

2510.20001

Genre:

Research Report (1.00)
Overview (0.87)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Vrba, Vojtěch, Walter, Viktor, Štěpán, Petr, Saska, Martin

FIMD: Fast Isolated Marker Detection for UV-Based Visual Relative Localisation in Agile UAV Swarms

A novel approach for the fast onboard detection of isolated markers for visual relative localisation of multiple teammates in agile UAV swarms is introduced in this paper. As the detection forms a key component of real-time localisation systems, a three-fold innovation is presented, consisting of an optimised procedure for CPUs, a GPU shader program, and a functionally equivalent FPGA streaming architecture. For the proposed CPU and GPU solutions, the mean processing time per pixel of input camera frames was accelerated by two to three orders of magnitude compared to the \rev{unoptimised state-of-the-art approach}. For the localisation task, the proposed FPGA architecture offered the most significant overall acceleration by minimising the total delay from camera exposure to detection results. Additionally, the proposed solutions were evaluated on various 32-bit and 64-bit embedded platforms to demonstrate their efficiency, as well as their feasibility for applications using low-end UAVs and MAVs. Thus, it has become a crucial enabling technology for agile UAV swarming.

artificial intelligence, machine learning, pixel, (16 more...)

doi: 10.1007/s10514-025-10197-5

2510.19663

Country: Europe (0.28)

Genre:

Research Report > Promising Solution (0.54)
Overview > Innovation (0.54)

Industry:

Aerospace & Defense (0.46)
Semiconductors & Electronics (0.46)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

HauntAttack: When Attack Follows Reasoning as a Shadow

Ma, Jingyuan, Li, Rui, Li, Zheng, Liu, Junfeng, Xia, Heming, Sha, Lei, Sui, Zhifang

Emerging Large Reasoning Models (LRMs) consistently excel in mathematical and reasoning tasks, showcasing remarkable capabilities. However, the enhancement of reasoning abilities and the exposure of internal reasoning processes introduce new safety vulnerabilities. A critical question arises: when reasoning becomes intertwined with harmfulness, will LRMs become more vulnerable to jailbreaks in reasoning mode? To investigate this, we introduce HauntAttack, a novel and general-purpose black-box adversarial attack framework that systematically embeds harmful instructions into reasoning questions. Specifically, we modify key reasoning conditions in existing questions with harmful instructions, thereby constructing a reasoning pathway that guides the model step by step toward unsafe outputs. We evaluate HauntAttack on 11 LRMs and observe an average attack success rate of 70\%, achieving up to 12 percentage points of absolute improvement over the strongest prior baseline. Our further analysis reveals that even advanced safety-aligned models remain highly susceptible to reasoning-based attacks, offering insights into the urgent challenge of balancing reasoning capability and safety in future model development.

information, large language model, machine learning, (22 more...)

2506.07031

Country:

Europe > Austria (0.28)
North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Overview (0.93)

Industry:

Law Enforcement & Public Safety (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Government > Military (0.88)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

Haase, Jennifer, Pokutta, Sebastian

Beyond Static Responses: Multi-Agent LLM Systems as a New Paradigm for Social Science Research

As large language models (LLMs) transition from static tools to fully agentic systems, their potential for transforming social science research has become increasingly evident. This paper introduces a structured framework for understanding the diverse applications of LLM-based agents, ranging from simple data processors to complex, multi-agent systems capable of simulating emergent social dynamics. By mapping this developmental continuum across six levels, the paper clarifies the technical and methodological boundaries between different agentic architectures, providing a comprehensive overview of current capabilities and future potential. It highlights how lower-tier systems streamline conventional tasks like text classification and data annotation, while higher-tier systems enable novel forms of inquiry, including the study of group dynamics, norm formation, and large-scale social processes. However, these advancements also introduce significant challenges, including issues of reproducibility, ethical oversight, and the risk of emergent biases. The paper critically examines these concerns, emphasizing the need for robust validation protocols, interdisciplinary collaboration, and standardized evaluation metrics. It argues that while LLM-based agents hold transformative potential for the social sciences, realizing this promise will require careful, context-sensitive deployment and ongoing methodological refinement. The paper concludes with a call for future research that balances technical innovation with ethical responsibility, encouraging the development of agentic systems that not only replicate but also extend the frontiers of social science, offering new insights into the complexities of human behavior.

large language model, machine learning, natural language, (18 more...)

2506.01839

Country:

North America > United States (0.46)
Europe > Germany (0.46)

Genre:

Research Report > Experimental Study (1.00)
Overview (1.00)
Research Report > New Finding (0.92)

Industry:

Government (1.00)
Education (1.00)
Leisure & Entertainment > Games > Computer Games (0.92)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)

DesignX: Human-Competitive Algorithm Designer for Black-Box Optimization

Guo, Hongshu, Ma, Zeyuan, Ma, Yining, Zhang, Xinglin, Chen, Wei-Neng, Gong, Yue-Jiao

Designing effective black-box optimizers is hampered by limited problem-specific knowledge and manual control that spans months for almost every detail. In this paper, we present \textit{DesignX}, the first automated algorithm design framework that generates an effective optimizer specific to a given black-box optimization problem within seconds. Rooted in the first principles, we identify two key sub-tasks: 1) algorithm structure generation and 2) hyperparameter control. To enable systematic construction, a comprehensive modular algorithmic space is first built, embracing hundreds of algorithm components collected from decades of research. We then introduce a dual-agent reinforcement learning system that collaborates on structural and parametric design through a novel cooperative training objective, enabling large-scale meta-training across 10k diverse instances. Remarkably, through days of autonomous learning, the DesignX-generated optimizers continuously surpass human-crafted optimizers by orders of magnitude, either on synthetic testbed or on realistic optimization scenarios such as Protein-docking, AutoML and UAV path planning. Further in-depth analysis reveals DesignX's capability to discover non-trivial algorithm patterns beyond expert intuition, which, conversely, provides valuable design insights for the optimization community. We provide DesignX's Python project at~ https://github.com/MetaEvo/DesignX.

evolutionary algorithm, large language model, machine learning, (18 more...)

2505.17866

Country: Europe (1.00)

Genre:

Overview (1.00)
Instructional Material (0.66)
Research Report (0.64)

Industry: Transportation > Air (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

Geng, Mingmeng, Poibeau, Thierry

On the Detectability of LLM-Generated Text: What Exactly Is LLM-Generated Text?

With the widespread use of large language models (LLMs), many researchers have turned their attention to detecting text generated by them. However, there is no consistent or precise definition of their target, namely "LLM-generated text". Differences in usage scenarios and the diversity of LLMs further increase the difficulty of detection. What is commonly regarded as the detecting target usually represents only a subset of the text that LLMs can potentially produce. Human edits to LLM outputs, together with the subtle influences that LLMs exert on their users, are blurring the line between LLM-generated and human-written text. Existing benchmarks and evaluation approaches do not adequately address the various conditions in real-world detector applications. Hence, the numerical results of detectors are often misunderstood, and their significance is diminishing. Therefore, detectors remain useful under specific conditions, but their results should be interpreted only as references rather than decisive indicators.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

2510.2081

Genre: Overview (0.93)

Industry:

Information Technology > Security & Privacy (0.94)
Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Learning to Triage Taint Flows Reported by Dynamic Program Analysis in Node.js Packages

Ni, Ronghao, Yang, Aidan Z. H., Hsu, Min-Chien, Sabino, Nuno, Jia, Limin, Martins, Ruben, Cassel, Darion, Cheang, Kevin

Program analysis tools often produce large volumes of candidate vulnerability reports that require costly manual review, creating a practical challenge: how can security analysts prioritize the reports most likely to be true vulnerabilities? This paper investigates whether machine learning can be applied to prioritizing vulnerabilities reported by program analysis tools. We focus on Node.js packages and collect a benchmark of 1,883 Node.js packages, each containing one reported ACE or ACI vulnerability. We evaluate a variety of machine learning approaches, including classical models, graph neural networks (GNNs), large language models (LLMs), and hybrid models that combine GNN and LLMs, trained on data based on a dynamic program analysis tool's output. The top LLM achieves $F_{1} {=} 0.915$, while the best GNN and classical ML models reaching $F_{1} {=} 0.904$. At a less than 7% false-negative rate, the leading model eliminates 66.9% of benign packages from manual review, taking around 60 ms per package. If the best model is tuned to operate at a precision level of 0.8 (i.e., allowing 20% false positives amongst all warnings), our approach can detect 99.2% of exploitable taint flows while missing only 0.8%, demonstrating strong potential for real-world vulnerability triage.

large language model, machine learning, natural language, (19 more...)

2510.20739

Country: North America (0.46)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Large Multimodal Models-Empowered Task-Oriented Autonomous Communications: Design Methodology and Implementation Challenges

Yang, Hyun Jong, Kim, Hyunsoo, Noh, Hyeonho, Kim, Seungnyun, Shim, Byonghyo

Notice: This work has been submitted to the IEEE for possible publication. Abstract--Large language models (LLMs) and large multi-modal models (LMMs) have achieved unprecedented breakthrough, showcasing remarkable capabilities in natural language understanding, generation, and complex reasoning. This transformative potential has positioned them as key enablers for 6G autonomous communications among machines, vehicles, and humanoids. In this article, we provide an overview of task-oriented autonomous communications with LLMs/LMMs, focusing on multimodal sensing integration, adaptive reconfiguration, and prompt/fine-tuning strategies for wireless tasks. We demonstrate the framework through three case studies: LMM-based traffic control, LLM-based robot scheduling, and LMM-based environment-aware channel estimation. From experimental results, we show that the proposed LLM/LMM-aided autonomous systems significantly outperform conventional and discriminative deep learning (DL) model-based techniques, maintaining robustness under dynamic objectives, varying input parameters, and heterogeneous multimodal conditions where conventional static optimization degrades. Driven by the huge success of ChatGPT, large language models (LLMs) have gained widespread attention, reshaping various fields by solving problems with zero-shot or few-shot prompting. Recently, large multimodal models (LMMs) extend this capability by embracing various modalities such as images, videos, and audio. These models can handle changing objectives and input variations using diverse multimodal observations.

large language model, machine learning, natural language, (16 more...)

2510.20637

Country:

Asia > South Korea (0.69)
North America > United States > Massachusetts (0.28)

Genre: Overview (1.00)

Industry:

Telecommunications (0.68)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Deep Learning in Dental Image Analysis: A Systematic Review of Datasets, Methodologies, and Emerging Challenges

Zhou, Zhenhuan, Zhu, Jingbo, Zhang, Yuchen, Guan, Xiaohang, Wang, Peng, Li, Tao

Efficient analysis and processing of dental images are crucial for dentists to achieve accurate diagnosis and optimal treatment planning. However, dental imaging inherently poses several challenges, such as low contrast, metallic artifacts, and variations in projection angles. Combined with the subjectivity arising from differences in clinicians' expertise, manual interpretation often proves time-consuming and prone to inconsistency. Artificial intelligence (AI)-based automated dental image analysis (DIA) offers a promising solution to these issues and has become an integral part of computer-aided dental diagnosis and treatment. Among various AI technologies, deep learning (DL) stands out as the most widely applied and influential approach due to its superior feature extraction and representation capabilities. To comprehensively summarize recent progress in this field, we focus on the two fundamental aspects of DL research-datasets and models. In this paper, we systematically review 260 studies on DL applications in DIA, including 49 papers on publicly available dental datasets and 211 papers on DL-based algorithms. We first introduce the basic concepts of dental imaging and summarize the characteristics and acquisition methods of existing datasets. Then, we present the foundational techniques of DL and categorize relevant models and algorithms according to different DIA tasks, analyzing their network architectures, optimization strategies, training methods, and performance. Furthermore, we summarize commonly used training and evaluation metrics in the DIA domain. Finally, we discuss the current challenges of existing research and outline potential future directions. We hope that this work provides a valuable and systematic reference for researchers in this field. All supplementary materials and detailed comparison tables will be made publicly available on GitHub.

data mining, machine learning, natural language, (22 more...)

2510.20634

Country:

Europe (1.00)
Asia > China (0.46)
North America > United States (0.45)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.34)

Industry:

Health & Medicine > Therapeutic Area > Dental and Oral Health (1.00)
Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Health Care Technology (1.00)
(3 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(4 more...)