AITopics | tmax

Collaborating Authors

tmax

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Improving planning and MBRL with temporally-extended actions

Neural Information Processing SystemsJun-22-2026, 19:56:34 GMT

Continuous time systems are often modeled using discrete time dynamics but this requires a small simulation step to maintain accuracy. In turn, this requires a large planning horizon which leads to computationally demanding planning problems and reduced performance. Previous work in model-free reinforcement learning has partially addressed this issue using action repeats where a policy is learned to determine a discrete action duration. Instead we propose to control the continuous decision timescale directly by using temporally-extended actions and letting the planner treat the duration of the action as an additional optimization variable along with the standard action variables. This additional structure has multiple advantages. It speeds up simulation time of trajectories and, importantly, it allows for deep horizon search in terms of primitive actions while using a shallow search depth in the planner. In addition, in the model-based reinforcement learning (MBRL) setting, it reduces compounding errors from model learning and improves training time for models. We show that this idea is effective and that the range for action durations can be automatically selected using a multi-armed bandit formulation and integrated into the MBRL framework. An extensive experimental evaluation both in planning and in MBRL, shows that our approach yields faster planning, better solutions, and that it enables solutions to problems that are not solved in the standard formulation.

data mining, machine learning, reinforcement learning, (21 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Industry: Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.66)

Add feedback

Global Convergence for Average Reward Constrained MDPs with Primal-Dual Actor Critic Algorithm

Neural Information Processing SystemsJun-19-2026, 07:16:18 GMT

This paper investigates infinite-horizon average reward Constrained Markov Decision Processes (CMDPs) under general parametrized policies with smooth and bounded policy gradients. We propose a Primal-Dual Natural Actor-Critic algorithm that adeptly manages constraints while ensuring a high convergence rate. In particular, our algorithm achieves global convergence and constraint violation rates of O(1/ T) over a horizon of length T when the mixing time, τmix, is known to the learner. In absence of knowledge of τmix, the achievable rates change to O(1/T0.5 ϵ) provided that T O τ2/ϵmix . Our results match the theoretical lower bound for Markov Decision Processes and establish a new benchmark in the theoretical exploration of average reward CMDPs.

assumption 4, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.66)

Industry:

Transportation (0.46)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

Optimal Estimation of the Best Mean in Multi-Armed Bandits

Neural Information Processing SystemsJun-14-2026, 09:53:03 GMT

We study the problem of estimating the mean reward of the best arm in a multiarmed bandit (MAB) setting. Specifically, given a target precision εand confidence level 1 δ, the goal is to return an ε-accurate estimate of the largest mean reward with probability at least 1 δ, while minimizing the number of samples. We first establish an instance-dependent lower bound on the sample complexity, which requires handling the infinitely many possible candidates of the estimated best mean. This lower bound is expressed in a non-convex optimization problem, which becomes the main difficulty of this problem, preventing the direct application of standard techniques such as Track-and-Stop to provably achieve optimality. To overcome this difficulty, we introduce several new algorithmic and analytical techniques and propose an algorithm that achieves the asymptotic lower bound with matching constants in the leading term. Our method combines a confidence ellipsoid-based stopping condition with a two-phase sampling strategy tailored to manage non-convexity proposed algorithm is simple, nearly free of hyperparameters, and achieves the instance-dependent, asymptotically optimal sample complexity. Experimental results support our theoretical guarantees and demonstrate the practical effectiveness of our method.

artificial intelligence, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.49)
Information Technology > Data Science > Data Mining > Big Data (0.40)

Add feedback

Adaptive Experimentation for Censored Survival Outcomes

Wang, Yuxin, Frauen, Dennis, Schweisthal, Jonas, Schröder, Maresa, Javurek, Emil, Feuerriegel, Stefan

arXiv.org Machine LearningMay-19-2026

Adaptive experimentation enables efficient estimation of causal effects, but existing methods are not designed for survival data with censoring, where event times are only partially observed (e.g., overall survival in cancer trials but with dropout). In this paper, we develop a novel framework for adaptive experimentation to estimate causal effects under right censoring. For this, we derive the semiparametric efficiency bound for the average survival effect curve as a function of the treatment allocation policy and thereby obtain a closed-form efficiency-optimal allocation policy. The policy generalizes classical Neyman allocation to survival settings by prioritizing patient strata where both event and censoring dynamics induce high uncertainty. Building on this, we propose the Adaptive Survival Estimator (ASE), an adaptive framework that learns the allocation policy and estimates the average survival effect curve sequentially. Our framework has three main benefits: (i) it accommodates arbitrary machine learning models for nuisance estimation; (ii) it is guided by a closed-form efficiency-optimal allocation policy; and (iii) it admits strong theoretical guarantees, including asymptotic normality via a martingale central limit theorem. We demonstrate our framework across various numerical experiments to show consistent efficiency gains over uniform randomization and censoring-agnostic baselines.

allocation, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

2605.18459

Country:

North America > United States (0.67)
Europe > United Kingdom > England (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

Add feedback

Supplementary Material for " Path following algorithms for ℓ2-regularized M-estimation with approximation guarantee "

Neural Information Processing SystemsApr-24-2026, 04:41:48 GMT

Figure S2: Number of iterations at each grid point for the Newton and gradient descent methods applying to the ℓ2-regularized logistic regression over simulated data generated in Example 2. We summarize the results in Figure S1-S3. Figure S1 presents the results for ridge regression. In this case, the number of iterations by gradient method first increases and then stays flat as tk grows. Newton method, however, only takes one 1.51.5 iteration at each grid point. Moreover, the level of approximation (i.e., ϵ) seems to have no impact onthe number of iterations at each grid point, which is highly desirable.

artificial intelligence, machine learning, tmax, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)

Add feedback

A.1 Hyper-Parameters For all datasets, the surrogate gradient function isσ(x) = 1π arctan(π2αx) + 12, thus σ0(x) = α 2(1+(π

Neural Information Processing SystemsFeb-10-2026, 17:29:37 GMT

A.1 Hyper-Parameters For all datasets, the surrogate gradient function isσ(x) = 1π arctan(π2αx) + 12, thus σ0(x) = The results on the three networks are consistent, indicating that RTD is a general sequential data augmentationmethod. We compare different surrogate functions, including Rectangular (σ0(x) = sign(|x| < 12)),ArcTan(σ0(x) = 11+(πx)2)and Constant 1(σ0(x) 1),intheSNNs on CIFAR-10. The results are shown in Tab.9. Tab.9 indicates that the choice of surrogate function has a considerable influence on the SNN's performance. Although Rectangular and Constant 1 can avoid the gradient exploding/vanishing problems in Eq.(8), they still cause lower accuracy or even make the optimization not converges.

artificial intelligence, machine learning, vth, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.31)

Add feedback

StochasticBias-ReducedGradientMethods

Neural Information Processing SystemsFeb-8-2026, 19:58:49 GMT

We develop a new primitive for stochastic optimization: a low-bias, low-cost estimator of the minimizerx? of any Lipschitz strongly-convex function.

artificial intelligence, estimator, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Add feedback

Leveraging GPT-4 for Food Effect Summarization to Enhance Product-Specific Guidance Development via Iterative Prompting

Shi, Yiwen, Ren, Ping, Wang, Jing, Han, Biao, ValizadehAslani, Taha, Agbavor, Felix, Zhang, Yi, Hu, Meng, Zhao, Liang, Liang, Hualou

arXiv.org Artificial IntelligenceJun-28-2023

Food effect summarization from New Drug Application (NDA) is an essential component of product-specific guidance (PSG) development and assessment. However, manual summarization of food effect from extensive drug application review documents is time-consuming, which arouses a need to develop automated methods. Recent advances in large language models (LLMs) such as ChatGPT and GPT-4, have demonstrated great potential in improving the effectiveness of automated text summarization, but its ability regarding the accuracy in summarizing food effect for PSG assessment remains unclear. In this study, we introduce a simple yet effective approach, iterative prompting, which allows one to interact with ChatGPT or GPT-4 more effectively and efficiently through multi-turn interaction. Specifically, we propose a three-turn iterative prompting approach to food effect summarization in which the keyword-focused and length-controlled prompts are respectively provided in consecutive turns to refine the quality of the generated summary. We conduct a series of extensive evaluations, ranging from automated metrics to FDA professionals and even evaluation by GPT-4, on 100 NDA review documents selected over the past five years. We observe that the summary quality is progressively improved throughout the process. Moreover, we find that GPT-4 performs better than ChatGPT, as evaluated by FDA professionals (43% vs. 12%) and GPT-4 (64% vs. 35%). Importantly, all the FDA professionals unanimously rated that 85% of the summaries generated by GPT-4 are factually consistent with the golden reference summary, a finding further supported by GPT-4 rating of 72% consistency. These results strongly suggest a great potential for GPT-4 to draft food effect summaries that could be reviewed by FDA professionals, thereby improving the efficiency of PSG assessment cycle and promoting the generic drug product development.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2306.16275

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Maryland > Montgomery County > Silver Spring (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Government Relations & Public Policy (1.00)
Government > Regional Government > North America Government > United States Government > FDA (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

An operational framework to automatically evaluate the quality of weather observations from third-party stations

Shao, Quanxi, Li, Ming, Dabrowski, Joel Janek, Bakar, Shuvo, Rahman, Ashfaqur, Powell, Andrea, Henderson, Brent

arXiv.org Machine LearningDec-4-2022

With increasing number of crowdsourced private automatic weather stations (called TPAWS) established to fill the gap of official network and obtain local weather information for various purposes, the data quality is a major concern in promoting their usage. Proper quality control and assessment are necessary to reach mutual agreement on the TPAWS observations. To derive near real-time assessment for operational system, we propose a simple, scalable and interpretable framework based on AI/Stats/ML models. The framework constructs separate models for individual data from official sources and then provides the final assessment by fusing the individual models. The performance of our proposed framework is evaluated by synthetic data and demonstrated by applying it to a re-al TPAWS network.

artificial intelligence, machine learning, social media, (19 more...)

arXiv.org Machine Learning

2212.01998

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > Australia > Western Australia (0.04)
Oceania > Australia > Queensland > Cairns Region > Cairns (0.04)
North America > United States > New York (0.04)

Genre:

Research Report (0.83)
Workflow (0.51)

Industry: Banking & Finance > Insurance (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Social Media > Crowdsourcing (0.35)

Add feedback

Learning to solve the single machine scheduling problem with release times and sum of completion times

Parmentier, Axel, T'Kindt, Vincent

arXiv.org Artificial IntelligenceJan-4-2021

In this paper, we focus on the solution of a hard single machine scheduling problem by new heuristic algorithms embedding techniques from machine learning field and scheduling theory. These heuristics transform an instance of the hard problem into an instance of a simpler one solved to optimality. The obtained schedule is then transposed to the original problem. Computational experiments show that they are competitive with state-of-the-art heuristics, notably on large instances.

algorithm, optimal solution, scheduling problem, (15 more...)

arXiv.org Artificial Intelligence

2101.01082

Country: Europe > France (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.95)

Add feedback