
 collision


A Dirac-Frenkel-Onsager principle: Instantaneous residual minimization with gauge momentum for nonlinear parametrizations of PDE solutions

arXiv.org Machine Learning

Dirac-Frenkel instantaneous residual minimization evolves nonlinear parametrizations of PDE solutions in time, but ill-conditioning can render the parameter dynamics non-unique. We interpret this non-uniqueness as a gauge freedom: nullspace directions that leave the time derivative unchanged can be used to select better-conditioned parameter velocities. Building on Onsager's minimum-dissipation principle, we introduce a history variable -- interpretable as momentum -- and inject it only along the nullspace directions. The resulting Dirac-Frenkel-Onsager dynamics preserve instantaneous residual minimization, in contrast to standard regularization that can introduce bias, while promoting temporally smooth parameter evolutions. Examples demonstrate that the approach leads to increased robustness in singular and near-singular regimes.
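The gauge idea can be illustrated with a toy sketch. It assumes a single residual equation J · θ̇ = f with a row-vector Jacobian, so the nullspace is easy to write down by hand. The names `j_row`, `f`, and `momentum` are hypothetical, and the way the paper updates its history variable is not reproduced here; the sketch only shows that adding a nullspace-projected vector changes the parameter velocity without changing the instantaneous residual.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def min_norm_velocity(j_row, f):
    """Minimal-norm solution of j_row . theta_dot = f (pseudoinverse of a row vector)."""
    jj = dot(j_row, j_row)
    return [f * j / jj for j in j_row]

def project_to_nullspace(j_row, m):
    """Remove the component of m along j_row, so that j_row . result = 0."""
    scale = dot(j_row, m) / dot(j_row, j_row)
    return [mi - scale * ji for mi, ji in zip(m, j_row)]

def gauged_velocity(j_row, f, momentum):
    """Minimal-norm velocity plus a momentum term injected only along the nullspace."""
    base = min_norm_velocity(j_row, f)
    gauge = project_to_nullspace(j_row, momentum)
    return [b + g for b, g in zip(base, gauge)]

# Hypothetical toy values: one residual equation, three parameters.
j_row = [1.0, 2.0, 2.0]
f = 3.0
momentum = [0.5, -1.0, 0.25]

v_plain = min_norm_velocity(j_row, f)
v_gauged = gauged_velocity(j_row, f, momentum)

# Both velocities satisfy the residual equation up to rounding.
residual_plain = dot(j_row, v_plain) - f
residual_gauged = dot(j_row, v_gauged) - f
```

With a full Jacobian over many sample points, the same construction would use a pseudoinverse and the projector onto its nullspace; the row-vector case is chosen here only to keep the example dependency-free.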


e197fe307eb3467035f892dc100d570a-Supplemental-Conference.pdf

Neural Information Processing Systems

In addition to the radar plot, Table 1 reports the numerical values of the prediction and driving performance metrics to give a more detailed analysis of the system's performance. The static evaluation metrics, ADE and FDE, are computed for predictors trained and validated on the Alignment dataset collected from the SUMMIT simulator. The task-driven evaluation metrics, including safety, efficiency, comfort, and driving performance, are derived from interactive closed-loop scenarios. The process for calculating these metrics is described in Appendix C. The results in Table 1 are used to plot the correlation map between ADE/FDE and driving performance, which surprisingly indicates no strong correlation between static evaluation metrics and real driving performance. Moreover, to make the prediction performance metrics and driving performance metrics comparable in the radar plot, we normalize all metrics to the range [0, 1].

B.1 The RVOPlanner

The Reciprocal Velocity Obstacle (RVO) planner is developed based on [8], which expands on the concept of velocity obstacles [4] to consider the reactive behaviors of exo-agents.
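As a rough illustration of the static metrics and the rescaling mentioned above, the sketch below uses the standard definitions of ADE (mean point-wise Euclidean error over the horizon) and FDE (error at the final step), followed by min-max scaling to [0, 1]. The function names and the sample trajectories are hypothetical, and the excerpt does not say which normalization the authors used, so min-max scaling is an assumption.

```python
import math

def displacement_errors(pred, truth):
    """Return (ADE, FDE) for two equal-length lists of (x, y) waypoints."""
    dists = [math.hypot(px - tx, py - ty)
             for (px, py), (tx, ty) in zip(pred, truth)]
    ade = sum(dists) / len(dists)   # average over all time steps
    fde = dists[-1]                 # error at the final time step
    return ade, fde

def min_max_normalize(values):
    """Rescale a list of metric values to [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical trajectories: the prediction drifts away from the ground truth.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)

# Hypothetical metric values for three predictors, rescaled for plotting.
scaled = min_max_normalize([2.0, 4.0, 6.0])
```

For error-type metrics such as ADE and FDE, a radar plot may also invert the scaled value so that larger means better; that choice is left out here.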


details

Neural Information Processing Systems

A.1 MONet

To segment each $w \times h$ frame $F_t$ into $N_o$ object representations, MONet uses a recurrent attention network to obtain $N_o$ attention masks $A_{ti} \in [0,1]^{w \times h}$ for $i = 1, \dots, N_o$ that represent the probability of each pixel in $F_t$ belonging to the $i$-th object, with $\sum_{i=1}^{N_o} A_{ti} = 1$. This attention network is coupled with a component VAE with latents $z_{ti} \in \mathbb{R}^d$ for $i = 1, \dots, N_o$ that reconstructs $A_{ti} \odot F_t$, the $i$-th object in the image. The latent posterior distribution $q(z_t \mid F_t, A_{ti})$ is a diagonal Gaussian with mean $\mu_{ti}$, and we use $\mu_{ti}$ as the representation of the $i$-th object. When these representations are fed into the transformer, we use a linear projection to map the raw object/word embeddings, which lie in $\mathbb{R}^d$, to a vector in $\mathbb{R}^{d N_H}$, where $N_H$ is the number of self-attention heads. This step is necessary because the latent dimensionality of MONet, $d$, is generally less than $N_H$, whereas a transformer expects the embedding size to be divisible by $N_H$.

A.2 Self-supervised training

Recall that in the main text we wrote the auxiliary self-supervised loss as $\text{auxiliary loss} = \sum \dots$ A comparison of these losses and the masking schemes is given in Figure 4. We also tested a few variations of the contrastive loss inspired by the literature and tested all combinations of variations.
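The projection step can be sketched as follows. This is a minimal illustration, assuming a bias-free linear map with random weights; the sizes `d = 3` and `n_heads = 4`, the initialization, and all function names are hypothetical and not taken from the paper. It only shows that a d-dimensional embedding mapped to d · N_H dimensions can be split evenly across the attention heads.

```python
import random

def make_projection(rng, d, n_heads):
    """Random weight matrix of shape (d * n_heads, d); initialization is an assumption."""
    return [[rng.gauss(0.0, d ** -0.5) for _ in range(d)]
            for _ in range(d * n_heads)]

def project(weights, embedding):
    """Apply the linear map to one object (or word) embedding."""
    return [sum(w * x for w, x in zip(row, embedding)) for row in weights]

d, n_heads = 3, 4                 # here d is smaller than the number of heads
rng = random.Random(0)
weights = make_projection(rng, d, n_heads)

mu = [0.2, -1.0, 0.7]             # stand-in for the posterior mean of one object
token = project(weights, mu)

# The projected size is divisible by n_heads, so each head gets an equal slice.
head_dim = len(token) // n_heads
heads = [token[h * head_dim:(h + 1) * head_dim] for h in range(n_heads)]
```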





251c5ffd6b62cc21c446c963c76cf214-Supplemental.pdf

Neural Information Processing Systems

A.1 Network Architecture

Here, we describe in more detail the architecture of the eVAE presented in Figure 1 of the main paper.

Event Context Network: We adapt the architecture proposed in [21] for the event context network, but without the feature transformation preprocessing steps. In our implementation, we use three Conv1d layers of 64, 128 and 1024 channels, each followed by BatchNorm and a ReLU activation. At the end of the ECN, we add the temporal features (see Appendix A.2) to the $N \times 1024$ feature tensor and apply the max operation to obtain a context vector. The sizes of the intermediate features and the context feature are hyperparameters that can be varied based on the application, data complexity, etc.

Encoder: The encoder for the VAE is composed of two layers, of sizes 1024 and 256 respectively, resulting in two output vectors of size $1 \times 8$ each, corresponding to the mean and standard deviation of the latent space vector.
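The shape of the event context network can be sketched in plain Python. Several things here are assumptions rather than details from the text: the Conv1d layers are taken to have kernel size 1 (a shared per-event linear map), each event is given four input features, the weights are random, and BatchNorm and the temporal features are left out. The sketch only illustrates how the channel-wise max turns an N × 1024 feature tensor into a single context vector that does not depend on event order.

```python
import random

def make_layer(rng, n_in, n_out):
    """Random (n_out x n_in) weights standing in for a kernel-size-1 Conv1d."""
    scale = n_in ** -0.5
    return [[rng.uniform(-scale, scale) for _ in range(n_in)]
            for _ in range(n_out)]

def pointwise_layer(weights, points):
    """Apply the same linear map followed by ReLU to every event independently."""
    return [[max(0.0, sum(w * x for w, x in zip(row, p))) for row in weights]
            for p in points]

def context_vector(layers, events):
    """Run the shared layers, then take the channel-wise max over all events."""
    feats = events
    for weights in layers:
        feats = pointwise_layer(weights, feats)
    return [max(channel) for channel in zip(*feats)]

rng = random.Random(0)
sizes = [4, 64, 128, 1024]        # 4 input features per event is an assumption
layers = [make_layer(rng, a, b) for a, b in zip(sizes, sizes[1:])]

# Six hypothetical events with four features each.
events = [[rng.random() for _ in range(4)] for _ in range(6)]
context = context_vector(layers, events)

# Reordering the events leaves the context vector unchanged.
context_reordered = context_vector(layers, list(reversed(events)))
```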



Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language

Neural Information Processing Systems

In this work, we propose a unified framework, called Visual Reasoning with Differentiable Physics (VRDP), that can jointly learn visual concepts and infer physics models of objects and their interactions from videos and language. This is achieved by seamlessly integrating three components: a visual perception module, a concept learner, and a differentiable physics engine. The visual perception module parses each video frame into object-centric trajectories and represents them as latent scene representations. The concept learner grounds visual concepts (e.g., color, shape, and material) from these object-centric representations based on the language, thus providing prior knowledge for the physics engine. The differentiable physics model, implemented as an impulse-based differentiable rigid-body simulator, performs differentiable physical simulation based on the grounded concepts to infer physical properties, such as mass, restitution, and velocity, by fitting the simulated trajectories to the video observations. Consequently, these learned concepts and physical models can explain what we have seen and imagine what is about to happen in future and counterfactual scenarios.
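The idea of inferring a physical property by fitting simulated trajectories to observations can be shown with a deliberately small sketch. It is not the VRDP simulator: it assumes 1-D motion at constant velocity, a hand-derived gradient instead of an impulse-based rigid-body engine, and hypothetical names and step sizes throughout.

```python
def simulate(x0, v, steps, dt):
    """Positions of a point moving at constant velocity v from x0."""
    return [x0 + v * dt * k for k in range(steps)]

def fit_velocity(observed, x0, dt, lr=0.05, iters=500):
    """Gradient descent on the mean squared error between simulation and observation."""
    v = 0.0
    n = len(observed)
    for _ in range(iters):
        sim = simulate(x0, v, n, dt)
        # d(sim_k)/dv = dt * k, so the loss gradient follows from the chain rule.
        grad = sum(2.0 * (s - o) * dt * k
                   for k, (s, o) in enumerate(zip(sim, observed))) / n
        v -= lr * grad
    return v

# Synthetic "observations" generated with a known velocity of 1.5.
observed = simulate(0.0, 1.5, 20, 0.1)
v_hat = fit_velocity(observed, x0=0.0, dt=0.1)
```

In a full differentiable simulator the gradient would come from differentiating through the simulation steps, including collisions, rather than from a closed-form expression.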


#AAAI2026 invited talk: machine learning for particle physics

AIHub

Daniel Whiteson is a particle physicist who uses machine learning and statistical tools to analyze high-energy particle collisions. He is also a dedicated science communicator, having published books and comics, and is co-host of a science podcast. In his invited talk at the Fortieth AAAI Conference on Artificial Intelligence (AAAI-26), Daniel shared insights on both of these aspects of his career. Daniel works at the Large Hadron Collider (LHC) at CERN, primarily looking at proton-proton collisions, which occur at 13 TeV, a massive 13,000 times the energy stored in a single proton. The majority of collisions result in known particles, such as electrons or muons.