AITopics | sth

Collaborating Authors

sth

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learning Personalized Ad Impact via Contextual Reinforcement Learning under Delayed Rewards

Neural Information Processing SystemsJun-22-2026, 19:45:44 GMT

Online advertising platforms use automated auctions to connect advertisers with potential customers, requiring effective bidding strategies to maximize profits. Accurate ad impact estimation requires considering three key factors: delayed and long-term effects, cumulative ad impacts such as reinforcement or fatigue, and customer heterogeneity. However, these effects are often not jointly addressed in previous studies. To capture these factors, we model ad bidding as a Contextual Markov Decision Process (CMDP) with delayed Poisson rewards. For efficient estimation, we propose a two-stage maximum likelihood estimator combined with data-splitting strategies, ensuring controlled estimation error based on the first-stage estimator's (in)accuracy. Building on this, we design a reinforcement learning algorithm to derive efficient personalized bidding strategies. This approach achieves a near-optimal regret bound of O(dH2 T), where d is the contextual dimension, H is the number of rounds, and T is the number of customers. Our theoretical findings are validated by simulation experiments.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Country: North America > United States (0.93)

Genre: Research Report > Experimental Study (1.00)

Industry:

Marketing (1.00)
Information Technology > Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Optimal Posterior Sampling for Policy Identification in Tabular Markov Decision Processes

Kone, Cyrille, Jamieson, Kevin

arXiv.org Machine LearningMay-6-2026

We study the $(\varepsilon, δ)$-PAC policy identification problem in finite-horizon episodic Markov Decision Processes. Existing approaches provide finite-time guarantees for approximate settings ($\varepsilon>0$) but suffer from high computational cost, rendering them hard to implement, and also suffer from suboptimal dependence on $\log(1/δ)$. We propose a randomized and computationally efficient algorithm for best policy identification that combines posterior sampling with an online learning algorithm to guide exploration in the MDP. Our method achieves asymptotic optimality in sample complexity, also in terms of posterior contraction rate, and runs in $O(S^2AH)$ per episode, matching standard model-based approaches. Unlike prior algorithms such as MOCA and PEDEL, our guarantees remain meaningful in the asymptotic regime and avoid sub-optimal polynomial dependence on $\log(1/δ)$. Our results provide both theoretical insights and practical tools for efficient policy identification in tabular MDPs.

artificial intelligence, machine learning, sth, (15 more...)

arXiv.org Machine Learning

2605.03921

Genre: Research Report > New Finding (0.48)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Add feedback

DARLING: Detection Augmented Reinforcement Learning with Non-Stationary Guarantees

Gerogiannis, Argyrios, Huang, Yu-Han, Veeravalli, Venugopal V.

arXiv.org Machine LearningApr-21-2026

We study model-free reinforcement learning (RL) in non-stationary finite-horizon episodic Markov decision processes (MDPs) without prior knowledge of the non-stationarity. We focus on the piecewise-stationary (PS) setting, where both the reward and transition dynamics can change an arbitrary number of times. We propose Detection Augmented Reinforcement Learning (DARLING), a modular wrapper for PS-RL that applies to both tabular and linear MDPs, without knowledge of the changes. Under certain change-point separation and reachability conditions, DARLING improves the best available dynamic regret bounds in both settings and yields strong empirical performance. We further establish the first minimax lower bounds for PS-RL in tabular and linear MDPs, showing that DARLING is the first nearly optimal algorithm. Experiments on standard benchmarks demonstrate that DARLING consistently surpasses the state-of-the-art methods across diverse non-stationary scenarios.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Machine Learning

2604.16684

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback

MobILE: Model-BasedImitationLearning From ObservationAlone

Neural Information Processing SystemsFeb-11-2026, 20:22:52 GMT

Weprovide aunified analysis for MobILE, and demonstrate that MobILE enjoys strong performance guarantees for classes of MDP dynamics that satisfy certain well studied notions of structural complexity. We also show that the ILFO problem isstrictly harder than the standard IL problem by presenting an exponential sample complexity separation between ILand ILFO.

ath, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)

Add feedback

UnpackingRewardShaping

Neural Information Processing SystemsFeb-9-2026, 09:55:56 GMT

Much of this work is based on upper confidence bound (UCB) principles and prescribes some kind of exploration bonus to prioritize exploration of rarely visited regions.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Oceania > Australia > Queensland > Brisbane (0.04)
North America > United States > Washington > King County > Seattle (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)

Add feedback

A Selective Temporal Hamming distance to find patterns in state transition event timeseries, at scale

Marié, Sylvain, Knecht, Pablo

arXiv.org Machine LearningDec-2-2025

Discrete event systems are present both in observations of nature, socio economical sciences, and industrial systems. Standard analysis approaches do not usually exploit their dual event / state nature: signals are either modeled as transition event sequences, emphasizing event order alignment, or as categorical or ordinal state timeseries, usually resampled a distorting and costly operation as the observation period and number of events grow. In this work we define state transition event timeseries (STE-ts) and propose a new Selective Temporal Hamming distance (STH) leveraging both transition time and duration-in-state, avoiding costly and distorting resampling on large databases. STH generalizes both resampled Hamming and Jaccard metrics with better precision and computation time, and an ability to focus on multiple states of interest. We validate these benefits on simulated and real-world datasets.

application, hamming distance, timesery, (10 more...)

arXiv.org Machine Learning

2512.0144

Country:

Europe > Austria > Tyrol > Innsbruck (0.05)
Europe > Netherlands > South Holland > Leiden (0.04)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)
(2 more...)

Add feedback

Scheduling Agile Earth Observation Satellites with Onboard Processing and Real-Time Monitoring

Mercado-Martínez, Antonio M., Soret, Beatriz, Jurado-Navas, Antonio

arXiv.org Artificial IntelligenceJun-16-2025

--The emergence of Agile Earth Observation Satellites (AEOSs) has marked a significant turning point in the field of Earth Observation (EO), offering enhanced flexibility in data acquisition. Concurrently, advancements in onboard satellite computing and communication technologies have greatly enhanced data compression efficiency, reducing network latency and congestion while supporting near real-time information delivery. In this paper, we address the Agile Earth Observation Satellite Scheduling Problem (AEOSSP), which involves determining the optimal sequence of target observations to maximize overall observation profit. T o this end, we define a set of priority indicators and develop a constructive heuristic method, further enhanced with a Local Search (LS) strategy. The results show that the proposed algorithm provides high-quality information by increasing the resolution of the collected frames by up to 10% on average, while reducing the variance in the monitoring frequency of the targets within the instance by up to 83%, ensuring more up-to-date information across the entire set compared to a First-In First-Out (FIFO) method.

artificial intelligence, planning & scheduling, real time system, (17 more...)

arXiv.org Artificial Intelligence

2506.11556

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Architecture > Real Time Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.66)

Add feedback

SOP-Agent: Empower General Purpose AI Agent with Domain-Specific SOPs

Ye, Anbang, Ma, Qianran, Chen, Jia, Li, Muqi, Li, Tong, Liu, Fujiao, Mai, Siqi, Lu, Meichen, Bao, Haitao, You, Yang

arXiv.org Artificial IntelligenceJan-16-2025

Despite significant advancements in general-purpose AI agents, several challenges still hinder their practical application in real-world scenarios. First, the limited planning capabilities of Large Language Models (LLM) restrict AI agents from effectively solving complex tasks that require long-horizon planning. Second, general-purpose AI agents struggle to efficiently utilize domain-specific knowledge and human expertise. In this paper, we introduce the Standard Operational Procedure-guided Agent (SOP-agent), a novel framework for constructing domain-specific agents through pseudocode-style Standard Operational Procedures (SOPs) written in natural language. Formally, we represent a SOP as a decision graph, which is traversed to guide the agent in completing tasks specified by the SOP. We conduct extensive experiments across tasks in multiple domains, including decision-making, search and reasoning, code generation, data cleaning, and grounded customer service. The SOP-agent demonstrates excellent versatility, achieving performance superior to general-purpose agent frameworks and comparable to domain-specific agent systems. Additionally, we introduce the Grounded Customer Service Benchmark, the first benchmark designed to evaluate the grounded decision-making capabilities of AI agents in customer service scenarios based on SOPs.

agent, api, instruction, (17 more...)

arXiv.org Artificial Intelligence

2501.09316

Country: Asia > Singapore (0.04)

Genre: Research Report (0.84)

Industry: Transportation (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Different thresholding methods on Nearest Shrunken Centroid algorithm

Sahtout, Mohammad Omar, Wang, Haiyan, Ghimire, Santosh

arXiv.org Machine LearningDec-31-2024

This article considers the impact of different thresholding methods to the Nearest Shrunken Centroid algorithm, which is popularly referred as the Prediction Analysis of Microarrays (PAM) for high-dimensional classification. PAM uses soft thresholding to achieve high computational efficiency and high classification accuracy but in the price of retaining too many features. When applied to microarray human cancers, PAM selected 2611 features on average from 10 multi-class datasets. Such a large number of features make it difficult to perform follow up study. One reason behind this problem is the soft thresholding, which is known to produce biased parameter estimate in regression analysis. In this article, we extend the PAM algorithm with two other thresholding methods, hard and order thresholding, and a deep search algorithm to achieve better thresholding parameter estimate. The modified algorithms are extensively tested and compared to the original one based on real data and Monte Carlo studies. In general, the modification not only gave better cancer status prediction accuracy, but also resulted in more parsimonious models with significantly smaller number of features.

algorithm, informative gene, test error, (15 more...)

arXiv.org Machine Learning

doi: 10.1080/03610918.2022.2047201

2501.00632

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.48)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)

Add feedback

Star-shaped Tilted Hexarotor Maneuverability: Analysis of the Role of the Tilt Cant Angles

Perin, Marco, Bertoni, Massimiliano, Viezzer, Nicolas, Michieletto, Giulia, Cenedese, Angelo

arXiv.org Artificial IntelligenceAug-22-2024

Star-shaped Tilted Hexarotors are rapidly emerging for applications highly demanding in terms of robustness and maneuverability. To ensure improvement in such features, a careful selection of the tilt angles is mandatory. In this work, we present a rigorous analysis of how the force subspace varies with the tilt cant angles, namely the tilt angles along the vehicle arms, taking into account gravity compensation and torque decoupling to abide by the hovering condition. Novel metrics are introduced to assess the performance of existing tilted platforms, as well as to provide some guidelines for the selection of the tilt cant angle in the design phase.

maneuverability, propeller, selection, (15 more...)

arXiv.org Artificial Intelligence

2408.12294

Country: Europe > Italy > Apulia > Bari (0.04)

Genre: Research Report (0.40)

Industry: Aerospace & Defense (0.46)

Technology: Information Technology > Artificial Intelligence (0.69)

Add feedback