AITopics

2509.09037

Country: North America > United States > Colorado > Boulder County > Boulder (0.15)

Genre: Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.95)

Task-based Loss Functions in Computer Vision: A Comprehensive Review

Elharrouss, Omar, Mahmood, Yasir, Bechqito, Yassine, Serhani, Mohamed Adel, Badidi, Elarbi, Riffi, Jamal, Tairi, Hamid

Loss functions are at the heart of deep learning, shaping how models learn and perform across diverse tasks. They are used to quantify the difference between predicted outputs and ground truth labels, guiding the optimization process to minimize errors. Selecting the right loss function is critical, as it directly impacts model convergence, generalization, and overall performance across various applications, from computer vision to time series forecasting. This paper presents a comprehensive review of loss functions, ranging from fundamental metrics like Mean Squared Error and Cross-Entropy to advanced functions such as Adversarial and Diffusion losses. We explore their mathematical foundations, impact on model training, and strategic selection for various applications, including computer vision (discriminative and generative), tabular data prediction, and time series forecasting. For each of these categories, we discuss the most widely used loss functions in recent deep learning advances. This review also explores the historical evolution, computational efficiency, and ongoing challenges in loss function design, underlining the need for more adaptive and robust solutions. Emphasis is placed on complex scenarios involving multi-modal data, class imbalances, and real-world constraints. Finally, we identify key future directions, advocating for loss functions that enhance interpretability, scalability, and generalization, leading to more effective and resilient deep learning models.

2504.04242

Genre: Overview (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Nolan, Matthew, Yao, Lina, Davidson, Robert

Ensemble Distribution Distillation for Self-Supervised Human Activity Recognition

Human Activity Recognition (HAR) has seen significant advancements with the adoption of deep learning techniques, yet challenges remain in terms of data requirements, reliability and robustness. This paper explores a novel application of Ensemble Distribution Distillation (EDD) within a self-supervised learning framework for HAR aimed at overcoming these challenges. By leveraging unlabeled data and a partially supervised training strategy, our approach yields an increase in predictive accuracy, robust estimates of uncertainty, and substantial increases in robustness against adversarial perturbation, thereby significantly improving reliability in real-world scenarios without increasing computational complexity at inference. We demonstrate this with an evaluation on several publicly available datasets. The contributions of this work include the development of a self-supervised EDD framework, an innovative data augmentation technique designed for HAR, and empirical validation of the proposed method's effectiveness in increasing robustness and reliability.

artificial intelligence, inductive learning, machine learning, (19 more...)

2509.08225

Country:

Oceania > Australia (0.46)
Asia (0.28)

Genre:

Research Report > New Finding (0.94)
Overview (0.88)

Industry:

Information Technology (0.93)
Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.89)

Delavari, Elahe, Khanzada, Feeza Khan, Kwon, Jaerock

A Comprehensive Review of Reinforcement Learning for Autonomous Driving in the CARLA Simulator

Autonomous-driving research has recently embraced deep Reinforcement Learning (RL) as a promising framework for data-driven decision making, yet a clear picture of how these algorithms are currently employed, benchmarked and evaluated is still missing. This survey fills that gap by systematically analysing around 100 peer-reviewed papers that train, test or validate RL policies inside the open-source CARLA simulator. We first categorize the literature by algorithmic family (model-free, model-based, hierarchical, and hybrid) and quantify their prevalence, highlighting that more than 80% of existing studies still rely on model-free methods such as DQN, PPO and SAC. Next, we explain the diverse state, action and reward formulations adopted across works, illustrating how choices of sensor modality (RGB, LiDAR, BEV, semantic maps, and CARLA kinematics states), control abstraction (discrete vs. continuous) and reward shaping vary across the literature. We also consolidate the evaluation landscape by listing the most common metrics (success rate, collision rate, lane deviation, driving score) and the towns, scenarios and traffic configurations used in CARLA benchmarks. Persistent challenges including sparse rewards, sim-to-real transfer, safety guarantees and limited behaviour diversity are distilled into a set of open research questions, and promising directions such as model-based RL, meta-learning and richer multi-agent simulations are outlined. By providing a unified taxonomy, quantitative statistics and a critical discussion of limitations, this review aims to serve both as a reference for newcomers and as a roadmap for advancing RL-based autonomous driving toward real-world deployment.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2509.08221

Country: North America > United States (1.00)

Genre:

Overview (1.00)
Research Report > New Finding (0.47)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Robotics & Automation (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Engineering Automotive Digital Twins on Standardized Architectures: A Case Study

Ramdhan, Stefan, Trandinh, Winnie, David, Istvan, Pantelic, Vera, Lawford, Mark

Digital twin (DT) technology has become of interest in the automotive industry. There is a growing need for smarter services that utilize the unique capabilities of DTs, ranging from computer-aided remote control to cloud-based fleet coordination. Developing such services starts with the software architecture. However, the scarcity of DT architectural guidelines poses a challenge for engineering automotive DTs. Currently, the only DT architectural standard is the one defined in ISO 23247. Though not developed for automotive systems, it is one of the few feasible starting points for automotive DTs. In this work, we investigate the suitability of the ISO 23247 reference architecture for developing automotive DTs. Through the case study of developing an Adaptive Cruise Control DT for a 1/10th-scale autonomous vehicle, we identify some strengths and limitations of the reference architecture and begin distilling future directions for researchers, practitioners, and standard developers.

artificial intelligence, real time system, vehicle, (18 more...)

2508.18662

Country:

North America > United States (0.46)
North America > Canada > Ontario > Hamilton (0.28)

Genre:

Research Report (0.50)
Overview (0.46)

Industry:

Transportation (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Sensing and Signal Processing (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Architecture > Real Time Systems (0.70)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.49)

A Survey on Training-free Alignment of Large Language Models

Pan, Birong, Li, Yongqi, Zhang, Weiyu, Lu, Wenpeng, Xu, Mayi, Zhou, Shen, Zhu, Yuanyuan, Zhong, Ming, Qian, Tieyun

The alignment of large language models (LLMs) aims to ensure their outputs adhere to human values, ethical standards, and legal norms. Traditional alignment methods often rely on resource-intensive fine-tuning (FT), which may suffer from knowledge degradation and face challenges in scenarios where model accessibility or computational resources are constrained. In contrast, training-free (TF) alignment techniques--leveraging in-context learning, decoding-time adjustments, and post-generation corrections--offer a promising alternative by enabling alignment without heavily retraining LLMs, making them adaptable to both open-source and closed-source environments. This paper presents the first systematic review of TF alignment methods, categorizing them by stages of pre-decoding, in-decoding, and post-decoding. For each stage, we provide a detailed examination from the viewpoint of LLMs and multimodal LLMs (MLLMs), highlighting their mechanisms and limitations. Furthermore, we identify key challenges and future directions, paving the way for more inclusive and effective TF alignment techniques. By synthesizing and organizing the rapidly growing body of research, this survey offers guidance for practitioners and advances the development of safer and more reliable LLMs.

large language model, machine learning, natural language, (17 more...)

2508.09016

Country:

Europe (1.00)
North America > United States (0.47)
Asia > China (0.46)

Genre: Overview (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Beyond One-Size-Fits-All: Inversion Learning for Highly Effective NLG Evaluation Prompts

Hong, Hanhua, Xiao, Chenghao, Wang, Yang, Liu, Yiqi, Rong, Wenge, Lin, Chenghua

Evaluating natural language generation systems is challenging due to the diversity of valid outputs. While human evaluation is the gold standard, it suffers from inconsistencies, lack of standardisation, and demographic biases, limiting reproducibility. LLM-based evaluators offer a scalable alternative but are highly sensitive to prompt design, where small variations can lead to significant discrepancies. In this work, we propose an inversion learning method that learns effective reverse mappings from model outputs back to their input instructions, enabling the automatic generation of highly effective, model-specific evaluation prompts. Our method requires only a single evaluation sample and eliminates the need for time-consuming manual prompt engineering, thereby improving both efficiency and robustness. Our work contributes toward a new direction for more robust and efficient LLM-based evaluation.

large language model, machine learning, translation, (19 more...)

2504.21117

Country:

Asia (1.00)
North America > United States (0.46)
Europe > Austria (0.28)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry: Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Sucipto, Willy, Zhou, Jianlong, Kwon, Ray Seung Min, Chen, Fang

A Survey of TinyML Applications in Beekeeping for Hive Monitoring and Management

Honey bee colonies are essential for global food security and ecosystem stability, yet they face escalating threats from pests, diseases, and environmental stressors. Traditional hive inspections are labor-intensive and disruptive, while cloud-based monitoring solutions remain impractical for remote or resource-limited apiaries. Recent advances in Internet of Things (IoT) and Tiny Machine Learning (TinyML) enable low-power, real-time monitoring directly on edge devices, offering scalable and non-invasive alternatives. This survey synthesizes current innovations at the intersection of TinyML and apiculture, organized around four key functional areas: monitoring hive conditions, recognizing bee behaviors, detecting pests and diseases, and forecasting swarming events. We further examine supporting resources, including publicly available datasets, lightweight model architectures optimized for embedded deployment, and benchmarking strategies tailored to field constraints. Critical limitations such as data scarcity, generalization challenges, and deployment barriers in off-grid environments are highlighted, alongside emerging opportunities in ultra-efficient inference pipelines, adaptive edge learning, and dataset standardization. By consolidating research and engineering practices, this work provides a foundation for scalable, AI-driven, and ecologically informed monitoring systems to support sustainable pollinator management.

data mining, machine learning, real time system, (20 more...)

2509.08822

Country: Oceania > Australia (0.94)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Information Technology (1.00)
Health & Medicine (1.00)
Food & Agriculture > Agriculture (1.00)
Energy (1.00)

Technology:

Information Technology > Internet of Things (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Cloud Computing (1.00)

Martin, Rebecca, Patrikar, Jay, Scherer, Sebastian

AutoODD: Agentic Audits via Bayesian Red Teaming in Black-Box Models

Specialized machine learning models, regardless of architecture and training, are susceptible to failures in deployment. With their increasing use in high-risk situations, the ability to audit these models by determining their operational design domain (ODD) is crucial in ensuring safety and compliance. However, given the high-dimensional input spaces, this process often requires significant human resources and domain expertise. To alleviate this, we introduce AutoODD, an LLM-Agent centric framework for automated generation of semantically relevant test cases to search for failure modes in specialized black-box models. By leveraging LLM-Agents as tool orchestrators, we aim to fit an uncertainty-aware failure distribution model on a learned text-embedding manifold by projecting the high-dimensional input space to a low-dimensional text-embedding latent space. The LLM-Agent is tasked with iteratively building the failure landscape by leveraging tools for generating test cases to probe the model-under-test (MUT) and recording the response. The agent also guides the search using tools to probe uncertainty estimates on the low-dimensional manifold. We demonstrate this process in a simple case using models trained with missing digits on the MNIST dataset and in the real-world setting of vision-based intruder detection for aerial vehicles.

digit, large language model, machine learning, (16 more...)

2509.08638

Genre:

Research Report (0.52)
Overview (0.47)

Industry: Transportation > Air (0.73)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

A Structured Review of Underwater Object Detection Challenges and Solutions: From Traditional to Large Vision Language Models

Nabahirwa, Edwine, Song, Wei, Zhang, Minghua, Fang, Yi, Ni, Zhou

Despite its significance, the underwater world remains largely overlooked as a result of the challenging conditions that hinder traditional research methods. Historically, the study of marine ecosystems relied on labor-intensive research [1], which provided limited data and had a high error margin. In recent years, advances in autonomous and remotely operated vehicles (AUVs and ROVs) have revolutionized underwater exploration. These technologies, equipped with object detection systems, now allow real-time monitoring, which includes capturing images of marine organisms, recording environmental conditions, and even assessing biodiversity [2], [3]. However, the quality of images and videos captured underwater remains a significant obstacle. Light absorption, scattering, and water-related distortions, such as haze and color shifts [4], create noisy low-contrast images, further compounded by complex underwater backgrounds and camera motion. These challenges call for advanced detection techniques capable of accurately identifying and localizing objects despite underwater noise. Efficient underwater object detection (UOD) is crucial for a variety of marine applications, including biodiversity monitoring, conservation efforts, and resource management.

large language model, machine learning, natural language, (21 more...)

2509.0849

Country:

Asia > China (1.00)
North America > United States (0.67)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.92)

Industry:

Media > Photography (0.48)
Health & Medicine > Diagnostic Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.92)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.47)