AITopics | offline phase

Collaborating Authors

offline phase

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Minimax PAC Bounds for Learning in Exogenous Contextual MDPs

Pla, Corentin, Richard, Hugo, Abeille, Marc, Perchet, Vianney

arXiv.org Machine LearningJun-25-2026

We study PAC learning in tabular discounted Markov decision processes with exogenous i.i.d. contexts, with discount factor $γ$, finite state space $\mathcal X$, action space $\mathcal A$, and context space $\mathcal Z$. At each time step, a context is drawn independently from an unknown distribution $μ$ and revealed before the agent acts. This context may affect both rewards and transitions, while remaining uncontrolled by the agent. Depending on the regime, the learner has access either to a sampling oracle for $μ$, to a sampling oracle for the transition kernel conditioned on state-context-action tuples, or to both. Oracles can be accessed before and during policy execution. The sample complexity is measured by a couple $(n,m)$, where $n$ is the number of calls to the sampling oracles before execution and $m$ is the number of calls to the sampling oracles during execution. When rewards and transitions are known and only the context distribution $μ$ is sampled, we give a variance-reduced algorithm that solves policy evaluation (PE), best-value estimation (BVE), and best-policy extraction (BPE) with $\left(\widetilde O\left(1/((1-γ)^3\varepsilon^2)\right), 0 \right) $ sample complexity. The rate is independent of $|\mathcal Z|$ and minimax optimal up to logarithmic factors. As a corollary, we also obtain tight rates in the case of one-step perfect look-ahead, improving upon the existing guarantees. In the fully unknown regime, where both $μ$ and P must be learned, we show that PE remains $|\mathcal Z|$-free, with matching upper and lower bounds $\bigl(\widetilde O(|\mathcal X|/((1-γ)^3\varepsilon^2)),\, \widetilde O(1/((1-γ)^2\varepsilon^2))\bigr)$.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Machine Learning

2606.2517

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

8433bb4f7477bf8202614ce1ae8b1169-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-10-2026, 09:42:58 GMT

assumption, online phase, rfo live, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > Michigan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)

Add feedback

8433bb4f7477bf8202614ce1ae8b1169-Paper-Conference.pdf

Neural Information Processing SystemsFeb-10-2026, 09:42:55 GMT

assumption, international conference, rfo live, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > Michigan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

Add feedback

An Empirical Study on the Effectiveness of Incorporating Offline RL As Online RL Subroutines

Su, Jianhai, Luo, Jinzhu, Zhang, Qi

arXiv.org Machine LearningDec-2-2025

We take the novel perspective of incorporating offline RL algorithms as subroutines of tabula rasa online RL. This is feasible because an online learning agent can repurpose its historical interactions as offline dataset. We formalize this idea into a framework that accommodates several variants of offline RL incorporation such as final policy recommendation and online fine-tuning. We further introduce convenient techniques to improve its effectiveness in enhancing online learning efficiency. Our extensive and systematic empirical analyses show that 1) the effectiveness of the proposed framework depends strongly on the nature of the task, 2) our proposed techniques greatly enhance its effectiveness, and 3) existing online fine-tuning methods are overall ineffective, calling for more research therein.

dataset, fine-tuning, sac, (14 more...)

arXiv.org Machine Learning

2512.00383

Country: North America > United States > South Carolina (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Education > Educational Setting (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.91)

Add feedback

PrivGNN: High-Performance Secure Inference for Cryptographic Graph Neural Networks

Wang, Fuyi, Chen, Zekai, Fan, Mingyuan, Zhou, Jianying, Pan, Lei, Zhang, Leo Yu

arXiv.org Artificial IntelligenceNov-5-2025

Graph neural networks (GNNs) are powerful tools for analyzing and learning from graph-structured (GS) data, facilitating a wide range of services. Deploying such services in privacy-critical cloud environments necessitates the development of secure inference (SI) protocols that safeguard sensitive GS data. However, existing SI solutions largely focus on convolutional models for image and text data, leaving the challenge of securing GNNs and GS data relatively underexplored. In this work, we design, implement, and evaluate $\sysname$, a lightweight cryptographic scheme for graph-centric inference in the cloud. By hybridizing additive and function secret sharings within secure two-party computation (2PC), $\sysname$ is carefully designed based on a series of novel 2PC interactive protocols that achieve $1.5\times \sim 1.7\times$ speedups for linear layers and $2\times \sim 15\times$ for non-linear layers over state-of-the-art (SotA) solutions. A thorough theoretical analysis is provided to prove $\sysname$'s correctness, security, and lightweight nature. Extensive experiments across four datasets demonstrate $\sysname$'s superior efficiency with $1.3\times \sim 4.7\times$ faster secure predictions while maintaining accuracy comparable to plaintext graph property inference.

artificial intelligence, machine learning, privgnn, (18 more...)

arXiv.org Artificial Intelligence

2511.02185

Country: Asia > China (0.28)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Adapt under Attack and Domain Shift: Unified Adversarial Meta-Learning and Domain Adaptation for Robust Automatic Modulation Classification

Owfi, Ali, Bamdad, Amirmohammad, Seyfi, Tolunay, Afghah, Fatemeh

arXiv.org Artificial IntelligenceNov-4-2025

Deep learning has emerged as a leading approach for Automatic Modulation Classification (AMC), demonstrating superior performance over traditional methods. However, vulnerability to adversarial attacks and susceptibility to data distribution shifts hinder their practical deployment in real-world, dynamic environments. To address these threats, we propose a novel, unified framework that integrates meta-learning with domain adaptation, making AMC systems resistant to both adversarial attacks and environmental changes. Our framework utilizes a two-phase strategy. First, in an offline phase, we employ a meta-learning approach to train the model on clean and adversarially perturbed samples from a single source domain. This method enables the model to generalize its defense, making it resistant to a combination of previously unseen attacks. Subsequently, in the online phase, we apply domain adaptation to align the model's features with a new target domain, allowing it to adapt without requiring substantial labeled data. As a result, our framework achieves a significant improvement in modulation classification accuracy against these combined threats, offering a critical solution to the deployment and operational challenges of modern AMC systems.

amc model, artificial intelligence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2511.01172

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.90)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

IP-Basis PINNs: Efficient Multi-Query Inverse Parameter Estimation

Manor, Shalev, Kohandel, Mohammad

arXiv.org Artificial IntelligenceSep-10-2025

Solving inverse problems with Physics-Informed Neural Networks (PINNs) is computationally expensive for multi-query scenarios, as each new set of observed data requires a new, expensive training procedure. We present Inverse-Parameter Basis PINNs (IP-Basis PINNs), a meta-learning framework that extends the foundational work of Desai et al. (2022) to enable rapid and efficient inference for inverse problems. Our method employs an offline-online decomposition: a deep network is first trained offline to produce a rich set of basis functions that span the solution space of a parametric differential equation. For each new inverse problem online, this network is frozen, and solutions and parameters are inferred by training only a lightweight linear output layer against observed data. Key innovations that make our approach effective for inverse problems include: (1) a novel online loss formulation for simultaneous solution reconstruction and parameter identification, (2) a significant reduction in computational overhead via forward-mode automatic differentiation for PDE loss evaluation, and (3) a non-trivial validation and early-stopping mechanism for robust offline training. We demonstrate the efficacy of IP-Basis PINNs on three diverse benchmarks, including an extension to universal PINNs for unknown functional terms-showing consistent performance across constant and functional parameter estimation, a significant speedup per query over standard PINNs, and robust operation with scarce and noisy data.

artificial intelligence, machine learning, pinn, (18 more...)

arXiv.org Artificial Intelligence

2509.07245

Country: North America > Canada (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.32)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL

Neural Information Processing SystemsAug-16-2025, 14:01:47 GMT

Our analyses indicate that the explorability or reachability assumptions, previously made for the latter two settings, are not necessary statistically for reward-free exploration.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > Michigan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)

Add feedback

On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL

Neural Information Processing SystemsAug-16-2025, 14:01:43 GMT

Our analyses indicate that the explorability or reachability assumptions, previously made for the latter two settings, are not necessary statistically for reward-free exploration.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > Michigan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

Add feedback

WiFi-based Global Localization in Large-Scale Environments Leveraging Structural Priors from osmAG

Ma, Xu, Zhang, Jiajie, Xie, Fujing, Schwertfeger, Sören

arXiv.org Artificial IntelligenceAug-15-2025

Global localization is essential for autonomous robotics, especially in indoor environments where the GPS signal is denied. We propose a novel WiFi-based localization framework that leverages ubiquitous wireless infrastructure and the OpenStreetMap Area Graph (osmAG) for large-scale indoor environments. Our approach integrates signal propagation modeling with osmAG's geometric and topological priors. In the offline phase, an iterative optimization algorithm localizes WiFi Access Points (APs) by modeling wall attenuation, achieving a mean localization error of 3.79 m (35.3\% improvement over trilateration). In the online phase, real-time robot localization uses the augmented osmAG map, yielding a mean error of 3.12 m in fingerprinted areas (8.77\% improvement over KNN fingerprinting) and 3.83 m in non-fingerprinted areas (81.05\% improvement). Comparison with a fingerprint-based method shows that our approach is much more space efficient and achieves superior localization accuracy, especially for positions where no fingerprint data are available. Validated across a complex 11,025 &m^2& multi-floor environment, this framework offers a scalable, cost-effective solution for indoor robotic localization, solving the kidnapped robot problem. The code and dataset are available at https://github.com/XuMa369/osmag-wifi-localization.

artificial intelligence, localization, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.10144

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback