AITopics | pa1

pa1

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Supplementary material: Inverse Reinforcement Learning in a Continuous State Space with Formal Guarantees. A Proofs of lemmas and theorems

Neural Information Processing Systems | Apr-25-2026, 11:51:13 GMT

A.1 Additional lemma. Lemma 9. Let $s_0$ be the starting state, let $(a)_n$ represent a sequence of actions, and let $M = Z(a_r) Z(a_{r-1}) \cdots Z(a_1)$, i.e., the product of matrices in $\{Z(a)\}$ left-multiplied in order of the sequence … Proof. Here we use proof by induction. We note that the interchange of the integral and infinite summation is justified by Section 3.7 in [5], since the coefficients $Z$ … We can then conclude the statement of the lemma by induction. A.2 Proof of Proposition 1. Proof. By Lemma 9, given a fixed sequence of actions $(a)_n$, the $r$-th state $s_r$ under this sequence of actions starting from state $s_0$ has a distribution that can be represented over the basis $\{\phi_n(s)\}$. Therefore, the expected reward under any sequence of actions for reward $R$ is the same as for the projected reward R0 for any state $s_r$ where $r > 0$. The reward at the starting state, $R(s_0)$, does not depend on the policy. Therefore, the value of $R(s_0)$ does not change whether a policy is optimal or not.
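
The left-multiplied product in the snippet above can be illustrated with a minimal sketch. It assumes small hypothetical 2 x 2 matrices standing in for the (truncated) coefficient matrices Z(a); the action names, matrix entries, and helper functions are illustrative only and are not taken from the paper.

```python
from functools import reduce

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Hypothetical 2 x 2 stand-ins for the matrices Z(a), one per action.
Z = {
    "left":  [[1.0, 0.5], [0.0, 0.5]],
    "right": [[0.5, 0.0], [0.5, 1.0]],
}

def sequence_product(actions):
    """Return M = Z(a_r) ... Z(a_1): each later action multiplies on the left."""
    identity = [[1.0, 0.0], [0.0, 1.0]]
    return reduce(lambda M, a: matmul(Z[a], M), actions, identity)

actions = ["left", "right", "left"]   # a_1, a_2, a_3
M = sequence_product(actions)

# Applying M once gives the same vector as applying Z(a_1), Z(a_2), ... in turn.
v0 = [1.0, 0.0]
v = v0
for a in actions:
    v = matvec(Z[a], v)

print(M, matvec(M, v0), v)
```

Folding from the identity keeps the order explicit: the first action ends up as the rightmost factor, which is what "left multiplied in order of the sequence" describes.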

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.86)

Collapsing Bandits and Their Application to Public Health Interventions

Neural Information Processing Systems | Feb-9-2026, 22:15:09 GMT

Neither (i) nor (ii) are known for general RMABs. Therefore, to capture the scheduling problems addressed in this work, we introduce a new subclass of RMABs, Collapsing Bandits, distinguished by the following feature: when an arm is played, the agent fully observes its state, "collapsing" any uncertainty, but when an arm is passive, no observation is made and uncertainty evolves.
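
The collapsing behaviour described above can be sketched as a belief update. This is a minimal illustration, assuming a two-state arm (1 = good, 0 = bad) and hypothetical transition probabilities `p_stay_good` and `p_recover`; none of these names or values come from the snippet, and the sketch ignores any transition that happens in the round an arm is played.

```python
def passive_step(belief, p_stay_good, p_recover):
    """Evolve the probability that the arm is good when no observation is made."""
    return belief * p_stay_good + (1.0 - belief) * p_recover

def active_step(observed_state):
    """Playing the arm reveals its state, so the belief collapses to 0 or 1."""
    return 1.0 if observed_state == 1 else 0.0

# The arm is played and observed in the good state: no uncertainty remains.
belief = active_step(1)
history = [belief]

# The arm is then left passive for three rounds: uncertainty evolves.
for _ in range(3):
    belief = passive_step(belief, p_stay_good=0.75, p_recover=0.25)
    history.append(belief)

# Playing it again and observing the bad state collapses the belief once more.
collapsed = active_step(0)

print(history, collapsed)
```

With these assumed probabilities the belief drifts from 1.0 toward 0.5 while the arm is passive, then snaps back to a certain value as soon as the arm is played.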

artificial intelligence, machine learning, pa1, (16 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Africa > Kenya (0.04)
South America > Peru (0.04)
(4 more...)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.94)
Health & Medicine > Therapeutic Area > Immunology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.34)

Supplementary material: Inverse Reinforcement Learning in a Continuous State Space with Formal Guarantees. A Proofs of lemmas and theorems

Neural Information Processing Systems | Feb-8-2026, 06:05:29 GMT

We note that the interchange of the integral and infinite summation is justified by Section 3.7 in [5], since the coefficients $Z$ … Now, define action sequence $(a)_n$ such that $a_1 = a$ and $a_n = a_1$ for all $n > 1$. Then we can use subadditivity of measure to bound the maximum difference across all entries of [kZ]. Therefore, the induced infinity norm error of $\hat{Z}$ is less than $\varepsilon$ if the element-wise error is less than $\varepsilon/k$. Therefore, $\hat{\alpha}^\top F \phi(s)$ is $\rho$-Lipschitz if the absolute value of its derivative is bounded by $\rho$, i.e. … Since $\hat{F}$ has all zeros beyond the $k$-th column and row, each infinite matrix $\hat{F}$ can be treated as a $k \times k$ matrix.
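
The step from an element-wise error of ε/k to an induced infinity norm error of ε can be spelled out as follows. This is a sketch under two assumptions that the snippet does not state outright: that the matrix in question is treated as k × k, and that $\hat{Z}$ denotes the approximation of a target matrix $Z$. The induced infinity norm of a matrix is its maximum absolute row sum, and each row has k entries.

```latex
\[
\|\hat{Z} - Z\|_\infty
  = \max_{1 \le i \le k} \sum_{j=1}^{k} \bigl|\hat{Z}_{ij} - Z_{ij}\bigr|
  < \max_{1 \le i \le k} \sum_{j=1}^{k} \frac{\varepsilon}{k}
  = \varepsilon .
\]
```

The factor 1/k in the element-wise tolerance is therefore exactly what offsets the k terms in each row sum.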

machine learning, reinforcement learning, transition function, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.86)
