Group-Fair Online Allocation in Continuous Time
The theory of discrete-time online learning has been successfully applied to many problems that involve sequential decision-making under uncertainty. However, in many applications, including contractual hiring on online freelancing platforms and server allocation in cloud computing systems, the outcome of each action is observed only after a random and action-dependent time. Furthermore, as a consequence of certain ethical and economic concerns, the controller may impose deadlines on the completion of each task and require fairness across different groups in the allocation of the total time budget $B$. To address these applications, we consider a continuous-time online learning problem with fairness considerations, and present a novel framework based on continuous-time utility maximization. We show that this formulation recovers reward-maximizing, max-min fair, and proportionally fair allocation rules across different groups as special cases. We characterize the optimal offline policy, which allocates the total time among different actions in an optimally fair way (as defined by the utility function) and imposes deadlines to maximize time efficiency. In the absence of any statistical knowledge, we propose a novel online learning algorithm based on dual ascent optimization for time averages, and prove that it achieves an $\tilde{O}(B^{-1/2})$ regret bound.
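The special case of proportional fairness mentioned in the abstract can be illustrated with a small dual-ascent sketch. The weights `a`, step size, and iteration count below are illustrative assumptions, not the paper's algorithm; the sketch solves the offline problem max Σ_g a_g log(x_g) subject to Σ_g x_g = B, whose closed-form optimum is x_g = a_g B / Σ a.

```python
import numpy as np

def dual_ascent_fair_allocation(a, B, eta=0.01, iters=5000):
    """Dual-ascent sketch for max sum_g a_g*log(x_g) s.t. sum_g x_g = B.

    `a` are hypothetical per-group weights; the known optimum is
    x_g = a_g * B / sum(a), which the dual iteration should approach.
    """
    a = np.asarray(a, dtype=float)
    lam = 1.0  # dual variable: the "price" of one unit of time
    for _ in range(iters):
        x = a / lam                   # primal maximizer of a_g*log(x) - lam*x
        lam += eta * (x.sum() - B)    # dual (sub)gradient step on the budget
        lam = max(lam, 1e-8)          # keep the price positive
    return a / lam

x = dual_ascent_fair_allocation([1.0, 2.0, 1.0], B=8.0)
# allocations approach [2, 4, 2], i.e. proportional to the weights
```

The group with weight 2 receives twice the time of the others, which is exactly the proportionally fair split of the budget.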
POMDPs in Continuous Time and Discrete Spaces
Many processes, such as discrete event systems in engineering or population dynamics in biology, evolve in discrete space and continuous time. We consider the problem of optimal decision making in such discrete state and action space systems under partial observability. This places our work at the intersection of optimal filtering and optimal control. At the current state of research, a mathematical description for simultaneous decision making and filtering in continuous time with finite state and action spaces is still missing. In this paper, we give a mathematical description of a continuous-time partially observable Markov decision process (POMDP). By leveraging optimal filtering theory, we derive a Hamilton-Jacobi-Bellman (HJB) type equation that characterizes the optimal solution. Using techniques from deep learning, we approximately solve the resulting partial integro-differential equation. We present (i) an approach solving the decision problem offline by learning an approximation of the value function and (ii) an online algorithm which provides a solution in belief space using deep reinforcement learning. We demonstrate applicability on a set of toy examples, which pave the way for future methods providing solutions for high-dimensional problems.
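The filtering half of this setting can be sketched concretely for a finite-state system: between observations the belief follows the master equation dp/dt = pQ for a continuous-time Markov chain with generator Q, and an arriving observation triggers a discrete Bayes correction. The 2-state generator and observation likelihoods below are invented toy values, not from the paper.

```python
import numpy as np

def propagate_belief(p, Q, dt, steps):
    """Euler sketch of the belief/master equation dp/dt = p @ Q for a
    finite-state CTMC with generator Q (rows sum to zero)."""
    for _ in range(steps):
        p = p + dt * (p @ Q)
    return p / p.sum()   # renormalize against accumulated Euler error

def bayes_update(p, likelihood):
    """Discrete Bayes correction when a noisy observation arrives."""
    post = p * likelihood
    return post / post.sum()

Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])        # hypothetical 2-state generator
p = np.array([1.0, 0.0])            # start certain in state 0
p = propagate_belief(p, Q, dt=0.001, steps=2000)   # predict up to t = 2
p = bayes_update(p, np.array([0.9, 0.2]))          # observation favoring state 0
```

A controller in this setting would then act on the belief `p` rather than on the (unobserved) state, which is what moves the HJB equation into belief space.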
Nash Equilibrium and Belief Evolution in Differential Games
Zhou, Jiangjing, Petrosian, Ovanes, Zhang, Ye, Gao, Hongwei
Differential games [4, 6] involve multiple players controlling a dynamical system through their actions, which are described by differential state equations. These games evolve over a continuous-time horizon, where each player seeks to optimize an objective function that depends on the system's state, their own actions, and potentially the actions of others. In this study, we extend the classic differential game model to scenarios involving motion-payoff uncertainty, where players face uncertainties in both the dynamic equations and the payoff functions, and are unaware of certain parameters in the environment or in their opponents' payoff structures. In dynamic games, optimal control techniques are generalized to accommodate multiple players with both shared and conflicting interests. As shown in [9], if a set of interconnected partial differential equations, commonly referred to as the Hamilton-Jacobi-Bellman (HJB) equations, has solutions, then a Nash equilibrium can be achieved. At this equilibrium, no player can improve their outcome by unilaterally changing their strategy. However, traditional dynamic game models often assume that all players possess complete knowledge of the game. In many real-world scenarios, players face rapidly changing and uncertain environments, leading to incomplete information about the system's dynamics and payoffs [22, 3, 15, 1]. To address this uncertainty, we apply Bayesian updating methods, where players update their beliefs about unknown parameters as new information becomes available.
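The Bayesian updating step the authors describe can be sketched over a finite set of parameter hypotheses: the posterior over candidate parameters is proportional to the prior times the likelihood of each new observation. The two candidate payoff parameters and their likelihood values below are made-up illustrations.

```python
import numpy as np

def bayes_belief_update(prior, likelihoods):
    """One Bayesian update over a finite set of parameter hypotheses
    theta_i: posterior ∝ prior * P(observation | theta_i)."""
    post = prior * likelihoods
    return post / post.sum()

# Hypothetical example: a player weighing two candidate values of an
# unknown parameter in the opponent's payoff function.
belief = np.array([0.5, 0.5])
for obs_lik in [np.array([0.8, 0.3]),   # likelihood of 1st observed action
                np.array([0.7, 0.4])]:  # likelihood of 2nd observed action
    belief = bayes_belief_update(belief, obs_lik)
```

After two observations that are both more likely under the first hypothesis, the belief concentrates on it; in the game, each player's strategy is then conditioned on this evolving belief.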
ZORMS-LfD: Learning from Demonstrations with Zeroth-Order Random Matrix Search
Dry, Olivia, Molloy, Timothy L., Jin, Wanxin, Shames, Iman
We propose Zeroth-Order Random Matrix Search for Learning from Demonstrations (ZORMS-LfD). ZORMS-LfD enables the costs, constraints, and dynamics of constrained optimal control problems, in both continuous and discrete time, to be learned from expert demonstrations without requiring smoothness of the learning-loss landscape. In contrast, existing state-of-the-art first-order methods require the existence and computation of gradients of the costs, constraints, dynamics, and learning loss with respect to states, controls and/or parameters. Most existing methods are also tailored to discrete time, with constrained problems in continuous time receiving only cursory attention. We demonstrate that ZORMS-LfD matches or surpasses the performance of state-of-the-art methods in terms of both learning loss and compute time across a variety of benchmark problems. On unconstrained continuous-time benchmark problems, ZORMS-LfD achieves similar loss performance to state-of-the-art first-order methods with an over $80$\% reduction in compute time. On constrained continuous-time benchmark problems where there is no specialized state-of-the-art method, ZORMS-LfD is shown to outperform the commonly used gradient-free Nelder-Mead optimization method.
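The flavor of a zeroth-order random matrix search can be conveyed with a minimal sketch: perturb the parameter matrix along a random direction, estimate a directional derivative from two loss evaluations, and step against it. This is a generic random-search step under assumed step sizes, not the authors' exact ZORMS-LfD update, and the quadratic toy loss stands in for the (possibly non-smooth) learning loss.

```python
import numpy as np

def zo_random_matrix_step(theta, loss, delta=1e-2, step=1e-1, rng=None):
    """One zeroth-order step: sample a random perturbation matrix U,
    form a two-point finite-difference estimate of the directional
    derivative of `loss` along U, and move against it. Only loss
    evaluations are needed; no gradients of costs or dynamics."""
    rng = np.random.default_rng() if rng is None else rng
    U = rng.standard_normal(theta.shape)
    U /= np.linalg.norm(U)                       # unit Frobenius norm
    g = (loss(theta + delta * U) - loss(theta - delta * U)) / (2 * delta)
    return theta - step * g * U

# Toy matrix-valued loss, just to exercise the step.
loss = lambda M: float(np.sum((M - 1.0) ** 2))
theta = np.zeros((2, 2))
rng = np.random.default_rng(0)
for _ in range(2000):
    theta = zo_random_matrix_step(theta, loss, rng=rng)
```

Because only `loss` evaluations are used, the same loop applies unchanged whether the underlying optimal control problem is in discrete or continuous time, which is the property the abstract emphasizes.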
Time After Time: Deep-Q Effect Estimation for Interventions on When and What to do
Wald, Yoav, Goldstein, Mark, Efroni, Yonathan, van Amsterdam, Wouter A. C., Ranganath, Rajesh
Problems in fields such as healthcare, robotics, and finance require reasoning about the value both of what decision or action to take and when to take it. The prevailing hope is that artificial intelligence will support such decisions by estimating the causal effect of policies such as how to treat patients or how to allocate resources over time. However, existing methods for estimating the effect of a policy struggle with \emph{irregular time}. They either discretize time, or disregard the effect of timing policies. We present a new deep-Q algorithm, Earliest Disagreement Q-Evaluation (EDQ), that estimates the effect of both when and what to do. EDQ makes use of a recursion for the Q-function that is compatible with flexible sequence models, such as transformers. EDQ provides accurate estimates under standard assumptions. We validate the approach through experiments on survival time and tumor growth tasks.
Export Reviews, Discussions, Author Feedback and Meta-Reviews
This paper presents a hybrid approach for using both crowdsourced labels and an incrementally (online) trained model to address prediction problems; the core idea is to lean heavily on the crowd as the system is ramping up, learn from the labels thus acquired, and then use the crowd less and less often as the model becomes more confident. This is done via a sophisticated framing of the problem as a stochastic game based on a CRF prediction model in which the system and the crowd are both players. The system can issue one or more queries q for tokens x (with true label y) which elicit responses r, where there is a utility U(q,r) for each outcome; the system thus attempts to pick the actions that will maximize the expected utility. Furthermore, the queries are not issued all at once, but at times s (with response times t); utility is maximized with respect to a deadline t_deadline by which an answer needs to be computed (this thus determines how many queries are sent out, at what rate, etc.). Computing this expected utility requires using the simulation dynamics model P(y, r, t | x, q, s) in order to compute the utilities as in (4). Given the utility values, the optimal action could be chosen; however, the introduction of continuous time makes this intractable to optimize, and as such an approximation based on Monte Carlo Tree Search and TD learning is used (Algorithm 1).
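The expected-utility computation the review describes can be sketched by Monte Carlo: sample (response, response-time) outcomes from a simulation model and average the utility, zeroing out answers that miss the deadline. All names, the toy response model, and the utility values below are illustrative assumptions, not the paper's API.

```python
import random

DEADLINE = 1.0   # hypothetical t_deadline in the same time units as t

def expected_utility(action, simulate, utility, n=2000, rng=None):
    """Monte Carlo sketch of E[U(q, r)] under a simulation model of
    responses r and response times t; late responses earn nothing."""
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(n):
        response, t = simulate(action, rng)
        total += utility(action, response) if t <= DEADLINE else 0.0
    return total / n

# Toy model: correct response w.p. 0.7, exponential response times.
simulate = lambda a, rng: (rng.random() < 0.7, rng.expovariate(2.0))
utility = lambda a, r: 1.0 if r else -0.5
ev = expected_utility("query", simulate, utility)
```

Evaluating this estimate for every candidate query schedule is what becomes intractable in continuous time, motivating the paper's MCTS-plus-TD approximation.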