AITopics

2508.1962

Genre:

Overview (1.00)
Research Report > Promising Solution (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Chakraborty, Sourav, Rege, Amit Kiran, Monteleoni, Claire, Chen, Lijun

Incentivized Lipschitz Bandits

arXiv.org Artificial IntelligenceAug-28-2025

We study incentivized exploration in multi-armed bandit (MAB) settings with infinitely many arms modeled as elements in continuous metric spaces. Unlike classical bandit models, we consider scenarios where the decision-maker (principal) incentivizes myopic agents to explore beyond their greedy choices through compensation, but with the complication of reward drift--biased feedback arising due to the incentives. We propose novel incentivized exploration algorithms that discretize the infinite arm space uniformly and demonstrate that these algorithms simultaneously achieve sublinear cumulative regret and sublinear total compensation. Specifically, we derive regret and compensation bounds of $\Tilde{O}(T^{d+1/d+2})$, with $d$ representing the covering dimension of the metric space. Furthermore, we generalize our results to contextual bandits, achieving comparable performance guarantees. We validate our theoretical findings through numerical simulations.

artificial intelligence, data mining, machine learning, (20 more...)

2508.19466

Country:

Europe (0.93)
North America > United States > Colorado (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.46)

Li, Mengmeng, Schneider, Philipp, Aleksić, Jelisaveta, Kuhn, Daniel

Efficient Best-of-Both-Worlds Algorithms for Contextual Combinatorial Semi-Bandits

arXiv.org Machine LearningAug-27-2025

We introduce the first best-of-both-worlds algorithm for contextual combinatorial semi-bandits that simultaneously guarantees $\widetilde{\mathcal{O}}(\sqrt{T})$ regret in the adversarial regime and $\widetilde{\mathcal{O}}(\ln T)$ regret in the corrupted stochastic regime. Our approach builds on the Follow-the-Regularized-Leader (FTRL) framework equipped with a Shannon entropy regularizer, yielding a flexible method that admits efficient implementations. Beyond regret bounds, we tackle the practical bottleneck in FTRL (or, equivalently, Online Stochastic Mirror Descent) arising from the high-dimensional projection step encountered in each round of interaction. By leveraging the Karush-Kuhn-Tucker conditions, we transform the $K$-dimensional convex projection problem into a single-variable root-finding problem, dramatically accelerating each round. Empirical evaluations demonstrate that this combined strategy not only attains the attractive regret bounds of best-of-both-worlds algorithms but also delivers substantial per-round speed-ups, making it well-suited for large-scale, real-time applications.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Machine Learning

2508.18768

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.46)

Sankagiri, Suryanarayana, Etesami, Jalal, Fatemi, Pouria, Grossglauser, Matthias

Recycling History: Efficient Recommendations from Contextual Dueling Bandits

The contextual duelling bandit problem models adaptive recommender systems, where the algorithm presents a set of items to the user, and the user's choice reveals their preference. This setup is well suited for implicit choices users make when navigating a content platform, but does not capture other possible comparison queries. Motivated by the fact that users provide more reliable feedback after consuming items, we propose a new bandit model that can be described as follows. The algorithm recommends one item per time step; after consuming that item, the user is asked to compare it with another item chosen from the user's consumption history. Importantly, in our model, this comparison item can be chosen without incurring any additional regret, potentially leading to better performance. However, the regret analysis is challenging because of the temporal dependency in the user's history. To overcome this challenge, we first show that the algorithm can construct informative queries provided the history is rich, i.e., satisfies a certain diversity condition. We then show that a short initial random exploration phase is sufficient for the algorithm to accumulate a rich history with high probability. This result, proven via matrix concentration bounds, yields $O(\sqrt{T})$ regret guarantees. Additionally, our simulations show that reusing past items for comparisons can lead to significantly lower regret than only comparing between simultaneously recommended items.

artificial intelligence, data mining, machine learning, (19 more...)

2508.18841

Country: Europe (0.46)

Genre: Research Report (0.64)

Industry: Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.35)
Information Technology > Data Science > Data Mining > Big Data (0.34)

STARec: An Efficient Agent Framework for Recommender Systems via Autonomous Deliberate Reasoning

Wu, Chenghao, Ren, Ruiyang, Zhang, Junjie, Wang, Ruirui, Ma, Zhongrui, Ye, Qi, Zhao, Wayne Xin

While modern recommender systems are instrumental in navigating information abundance, they remain fundamentally limited by static user modeling and reactive decision-making paradigms. Current large language model (LLM)-based agents inherit these shortcomings through their overreliance on heuristic pattern matching, yielding recommendations prone to shallow correlation bias, limited causal inference, and brittleness in sparse-data scenarios. We introduce STARec, a slow-thinking augmented agent framework that endows recommender systems with autonomous deliberative reasoning capabilities. Each user is modeled as an agent with parallel cognitions: fast response for immediate interactions and slow reasoning that performs chain-of-thought rationales. To cultivate intrinsic slow thinking, we develop anchored reinforcement training - a two-stage paradigm combining structured knowledge distillation from advanced reasoning models with preference-aligned reward shaping. This hybrid approach scaffolds agents in acquiring foundational capabilities (preference summarization, rationale generation) while enabling dynamic policy adaptation through simulated feedback loops. Experiments on MovieLens 1M and Amazon CDs benchmarks demonstrate that STARec achieves substantial performance gains compared with state-of-the-art baselines, despite using only 0.4% of the full training data.

large language model, machine learning, natural language, (16 more...)

doi: 10.1145/3746252.3760995

2508.18812

Country: Asia > China (0.48)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Hsu, Yi-Ping, Wang, Po-Wei, Eksombatchai, Chantat, Xu, Jiajing

Taming the One-Epoch Phenomenon in Online Recommendation System by Two-stage Contrastive ID Pre-training

ID-based embeddings are widely used in web-scale online recommendation systems. However, their susceptibility to overfitting, particularly due to the long-tail nature of data distributions, often limits training to a single epoch, a phenomenon known as the "one-epoch problem." This challenge has driven research efforts to optimize performance within the first epoch by enhancing convergence speed or feature sparsity. In this study, we introduce a novel two-stage training strategy that incorporates a pre-training phase using a minimal model with contrastive loss, enabling broader data coverage for the embedding system. Our offline experiments demonstrate that multi-epoch training during the pre-training phase does not lead to overfitting, and the resulting embeddings improve online generalization when fine-tuned for more complex downstream recommendation tasks. We deployed the proposed system in live traffic at Pinterest, achieving significant site-wide engagement gains.

artificial intelligence, downstream model, machine learning, (12 more...)

doi: 10.1145/3640457.3688053

2508.187

Country:

Europe > Italy (0.17)
North America > United States (0.16)

Genre: Research Report > New Finding (0.49)

Industry: Information Technology (0.39)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Kumar, Vivek, Sahu, Himanshu, Gupta, Hari Prabhat, Srivastava, Biplav

Technology-assisted Personalized Yoga for Better Health -- Challenges and Outlook

Yoga is a discipline of physical postures, breathing techniques, and meditative practices rooted in ancient Indian traditions, now embraced worldwide for promoting overall well-being and inner balance. The practices are a large set of items, our term for executable actions like physical poses or breath exercises, to offer for a person's well-being. However, to get benefits of Yoga tailored to a person's unique needs, a person needs to (a) discover their subset from the large and seemingly complex set with inter-dependencies, (b) continue to follow them with interest adjusted to their changing abilities and near-term objectives, and (c) as appropriate, adapt to alternative items based on changing environment and the person's health conditions. In this vision paper, we describe the challenges for the Yoga personalization problem. Next, we sketch a preliminary approach and use the experience to provide an outlook on solving the challenging problem using existing and novel techniques from a multidisciplinary computing perspective. To the best of our knowledge, this is the first paper that comprehensively examines decision support issues around Yoga personalization, from pose sensing to recommendation of corrections for a complete regimen, and illustrates with a case study of Surya Namaskar -- a set of 12 choreographed poses.

artificial intelligence, asana, machine learning, (17 more...)

2508.18283

Country: North America > United States (0.46)

Genre: Research Report > Promising Solution (0.34)

Industry:

Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.46)

arXiv.org Machine LearningAug-26-2025

Multi-User Contextual Cascading Bandits for Personalized Recommendation

Park, Jiho, Jia, Huiwen

We introduce a Multi-User Contextual Cascading Bandit model, a new combinatorial bandit framework that captures realistic online advertising scenarios where multiple users interact with sequentially displayed items simultaneously. Unlike classical contextual bandits, MCCB integrates three key structural elements: (i) cascading feedback based on sequential arm exposure, (ii) parallel context sessions enabling selective exploration, and (iii) heterogeneous arm-level rewards. We first propose Upper Confidence Bound with Backward Planning (UCBBP), a UCB-style algorithm tailored to this setting, and prove that it achieves a regret bound of $\widetilde{O}(\sqrt{THN})$ over $T$ episodes, $H$ session steps, and $N$ contexts per episode. Motivated by the fact that many users interact with the system simultaneously, we introduce a second algorithm, termed Active Upper Confidence Bound with Backward Planning (AUCBBP), which shows a strict efficiency improvement in context scaling, i.e., user scaling, with a regret bound of $\widetilde{O}(\sqrt{T+HN})$. We validate our theoretical findings via numerical experiments, demonstrating the empirical effectiveness of both algorithms under various settings.

artificial intelligence, data mining, machine learning, (14 more...)

arXiv.org Machine Learning

2508.13981

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(4 more...)

Genre: Research Report (0.50)

Industry: Marketing (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.84)

arXiv.org Artificial IntelligenceAug-26-2025

HLLM-Creator: Hierarchical LLM-based Personalized Creative Generation

Chen, Junyi, Chi, Lu, Xu, Siliang, Ran, Shiwei, Peng, Bingyue, Yuan, Zehuan

AI-generated content technologies are widely used in content creation. However, current AIGC systems rely heavily on creators' inspiration, rarely generating truly user-personalized content. In real-world applications such as online advertising, a single product may have multiple selling points, with different users focusing on different features. This underscores the significant value of personalized, user-centric creative generation. Effective personalized content generation faces two main challenges: (1) accurately modeling user interests and integrating them into the content generation process while adhering to factual constraints, and (2) ensuring high efficiency and scalability to handle the massive user base in industrial scenarios. Additionally, the scarcity of personalized creative data in practice complicates model training, making data construction another key hurdle. We propose HLLM-Creator, a hierarchical LLM framework for efficient user interest modeling and personalized content generation. During inference, a combination of user clustering and a user-ad-matching-prediction based pruning strategy is employed to significantly enhance generation efficiency and reduce computational overhead, making the approach suitable for large-scale deployment. Moreover, we design a data construction pipeline based on chain-of-thought reasoning, which generates high-quality, user-specific creative titles and ensures factual consistency despite limited personalized data. This pipeline serves as a critical foundation for the effectiveness of our model. Extensive experiments on personalized title generation for Douyin Search Ads show the effectiveness of HLLM-Creator. Online A/B test shows a 0.476% increase on Adss, paving the way for more effective and efficient personalized generation in industrial scenarios. Codes for academic dataset are available at https://github.com/bytedance/HLLM.

large language model, machine learning, natural language, (19 more...)

2508.18118

Genre: Research Report (1.00)

Industry:

Marketing (1.00)
Information Technology > Services (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Gkillas, Alexandros, Anagnostopoulos, Christos, Piperigkos, Nikos, Tsiktsiris, Dimitris, Christodoulou, Theofilos, Siamatras, Theofanis, Triantafyllou, Dimitrios, Basdekis, Christos, Marinopoulou, Theoktisti, Lepentsiotis, Panagiotis, Blitsis, Elefterios, Zacharaki, Aggeliki, Stylianidis, Nearchos, Katelaris, Leonidas, Salvan, Lamberto, Lalos, Aris S., Laoudias, Christos, Lalas, Antonios, Votis, Konstantinos

A holistic perception system of internal and external monitoring for ground autonomous vehicles: AutoTRUST paradigm

arXiv.org Artificial IntelligenceAug-26-2025

This paper introduces a holistic perception system for internal and external monitoring of autonomous vehicles, with the aim of demonstrating a novel AI-leveraged self-adaptive framework of advanced vehicle technologies and solutions that optimize perception and experience on-board. Internal monitoring system relies on a multi-camera setup designed for predicting and identifying driver and occupant behavior through facial recognition, exploiting in addition a large language model as virtual assistant. Moreover, the in-cabin monitoring system includes AI-empowered smart sensors that measure air-quality and perform thermal comfort analysis for efficient on and off-boarding. On the other hand, external monitoring system perceives the surrounding environment of vehicle, through a LiDAR-based cost-efficient semantic segmentation approach, that performs highly accurate and efficient super-resolution on low-quality raw 3D point clouds. The holistic perception framework is developed in the context of EU's Horizon Europe programm AutoTRUST, and has been integrated and deployed on a real electric vehicle provided by ALKE. Experimental validation and evaluation at the integration site of Joint Research Centre at Ispra, Italy, highlights increased performance and efficiency of the modular blocks of the proposed perception architecture.

large language model, machine learning, natural language, (21 more...)

2508.17969

Country: Europe > Italy (0.24)

Genre: Research Report (0.40)

Industry: Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
(3 more...)