domino
DOMINO: Decomposed Mutual Information Optimization for Generalized Context in Meta-Reinforcement Learning
Adapting to changes in transition dynamics is essential in robotic applications. By learning a conditional policy with a compact context, context-aware meta-reinforcement learning provides a flexible way to adjust behavior according to dynamics changes. However, in real-world applications, the agent may encounter complex dynamics changes: multiple confounders can influence the transition dynamics, making it challenging to infer an accurate context for decision-making. This paper addresses this challenge with decomposed mutual information optimization (DOMINO) for context learning, which explicitly learns a disentangled context that maximizes the mutual information between the context and historical trajectories while minimizing the state-transition prediction error. Our theoretical analysis shows that, by learning a disentangled context, DOMINO can overcome the underestimation of mutual information caused by multiple confounders and reduce the number of samples that must be collected across environments. Extensive experiments show that the context learned by DOMINO benefits both model-based and model-free reinforcement learning algorithms for dynamics generalization, in terms of both sample efficiency and performance in unseen environments.
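The decomposed mutual-information objective described above can be illustrated with a small sketch. The following is a minimal, hedged example (NumPy; all names and the toy data are illustrative, not the authors' implementation) of an InfoNCE-style lower bound on the mutual information between context embeddings and trajectory embeddings, plus a decomposed variant that sums one bound per disentangled context component:

```python
import numpy as np

def infonce_lower_bound(ctx, traj):
    """InfoNCE lower bound on I(context; trajectory).
    ctx, traj: (N, d) arrays of paired embeddings; row i of each
    comes from the same environment."""
    scores = ctx @ traj.T                                 # (N, N) similarities
    scores = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    log_softmax = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(np.mean(np.diag(log_softmax))) + np.log(len(ctx))

def decomposed_bound(components, traj):
    """Sum of per-component bounds: each disentangled context
    component contributes its own InfoNCE term."""
    return sum(infonce_lower_bound(c, traj) for c in components)

rng = np.random.default_rng(0)
traj = rng.normal(size=(64, 8))
good_ctx = traj + 0.1 * rng.normal(size=(64, 8))   # informative context
bad_ctx = rng.normal(size=(64, 8))                 # uninformative context
bound_good = infonce_lower_bound(good_ctx, traj)
bound_bad = infonce_lower_bound(bad_ctx, traj)
print(bound_good > bound_bad)   # informative context yields a larger bound
```

Note that the single-pair InfoNCE bound saturates at log N, which is one way to see why a single entangled context can underestimate the mutual information when several confounders vary at once; summing per-component bounds raises that ceiling.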
Appendix of Decomposed Mutual Information Optimization for Generalized Context in Meta-Reinforcement Learning. Yao Mu, The University of Hong Kong.
Ping Luo is the corresponding author. With Equation 3 and Jensen's inequality applied to Equation 1, we obtain a lower bound on I(x, y). Therefore, if the number of confounders increases, the demand for data grows exponentially; when the data are not rich enough, the necessary condition may not be satisfied. We provide the pseudo-code of DOMINO combined with model-based methods. First, the past state-action pairs are encoded into disentangled context vectors by the context encoder. Initialize batch B; for i = 1 to B, sample V ... Listing 1: PyTorch-style pseudo-code for dynamics changes based on the MuJoCo engine.
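The encoding step described above (a window of past state-action pairs mapped to several disentangled context vectors) can be sketched as follows. This is a hedged toy illustration: the real encoder is a learned network, whereas the fixed, seeded random projections here merely show the input/output shapes.

```python
import numpy as np

def encode_context(history, n_components=2, dim=4):
    """Stand-in for the context encoder: map a window of past
    (state, action) pairs to n_components disentangled context
    vectors of dimension dim. Purely illustrative."""
    flat = np.concatenate([np.concatenate(pair) for pair in history])
    components = []
    for k in range(n_components):
        # One fixed projection per component, in place of a learned head.
        proj = np.random.default_rng(k).normal(size=(dim, flat.size))
        components.append(proj @ flat)
    return components

rng = np.random.default_rng(1)
# Five past (state, action) pairs: 3-dim states, 2-dim actions.
history = [(rng.normal(size=3), rng.normal(size=2)) for _ in range(5)]
context = encode_context(history)
print(len(context), context[0].shape)   # 2 components, each of dimension 4
```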
ExoPredicator: Learning Abstract Models of Dynamic Worlds for Robot Planning
Liang, Yichao, Nguyen, Dat, Yang, Cambridge, Li, Tianyang, Tenenbaum, Joshua B., Rasmussen, Carl Edward, Weller, Adrian, Tavares, Zenna, Silver, Tom, Ellis, Kevin
Long-horizon embodied planning is challenging because the world does not only change through an agent's actions: exogenous processes (e.g., water heating, dominoes cascading) unfold concurrently with the agent's actions. We propose a framework for abstract world models that jointly learns (i) symbolic state representations and (ii) causal processes for both endogenous actions and exogenous mechanisms. Each causal process models the time course of a stochastic cause-effect relation. We learn these world models from limited data via variational Bayesian inference combined with LLM proposals. Across five simulated tabletop robotics environments, the learned models enable fast planning that generalizes to held-out tasks with more objects and more complex goals, outperforming a range of baselines.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Massachusetts (0.04)
- Research Report (0.63)
- Workflow (0.46)
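The central idea in the ExoPredicator abstract, that exogenous processes unfold concurrently with the agent's actions, can be sketched in a few lines. This is a toy illustration under assumed semantics (a process fires on every tick while its precondition holds), not the paper's formalism; all names are illustrative.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Process:
    """Stand-in for a causal process: a cause-effect rule that fires
    on every tick while its precondition holds."""
    name: str
    precondition: Callable[[dict], bool]
    effect: Callable[[dict], None]

def step(state: dict, action: Callable[[dict], None], processes: list):
    """Advance the world one tick: apply the agent's action, then let
    every enabled exogenous process run concurrently with it."""
    action(state)
    for p in processes:
        if p.precondition(state):
            p.effect(state)

# Toy example: water keeps heating while the kettle is on,
# regardless of what the agent does.
heating = Process(
    name="water-heating",
    precondition=lambda s: s["kettle_on"],
    effect=lambda s: s.update(temp=s["temp"] + 10),
)
state = {"kettle_on": True, "temp": 20}
for _ in range(3):
    step(state, lambda s: None, [heating])   # agent does nothing for 3 ticks
print(state["temp"])  # 50
```

A planner over such a model must account for state changes it did not cause, which is exactly what makes long-horizon embodied planning with exogenous mechanisms hard.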
MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning
Cai, Zikui, Wang, Andrew, Satheesh, Anirudh, Nakhawa, Ankit, Jae, Hyunwoo, Powell, Keenan, Liu, Minghui, Jay, Neel, Oh, Sungbin, Wang, Xiyao, Liang, Yongyuan, Goldstein, Tom, Huang, Furong
Despite rapid advances in vision-language models (VLMs), current benchmarks for multimodal reasoning fall short in three key dimensions. First, they overwhelmingly rely on static images, failing to capture the temporal complexity of real-world environments. Second, they narrowly focus on mathematical problem-solving, neglecting the broader spectrum of reasoning skills -- including abstract, physical, planning, spatial, and temporal capabilities -- required for robust multimodal intelligence. Third, many benchmarks quickly saturate, offering limited headroom for diagnosing failure modes or measuring continued progress. We introduce MORSE-500 (Multimodal Reasoning Stress-test Environment), a video benchmark composed of 500 fully scripted clips with embedded questions spanning six complementary reasoning categories. Each instance is programmatically generated using deterministic Python scripts (via Manim, Matplotlib, MoviePy), generative video models, and curated real footage. This script-driven design allows fine-grained control over visual complexity, distractor density, and temporal dynamics -- enabling difficulty to be scaled systematically as models improve. Unlike static benchmarks that become obsolete once saturated, MORSE-500 is built to evolve: its controllable generation pipeline supports the creation of arbitrarily challenging new instances, making it ideally suited for stress-testing next-generation models. Initial experiments with state-of-the-art systems -- including Gemini 2.5 Pro and OpenAI o3, which represented the strongest models available at the time, alongside strong open-source models -- reveal substantial performance gaps across all categories, with particularly large deficits in abstract and planning tasks. We release the full dataset, generation scripts, and evaluation harness to support transparent, reproducible, and forward-looking multimodal reasoning research.
- North America > United States > Maryland > Prince George's County > College Park (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Health & Medicine (0.69)
- Leisure & Entertainment (0.46)
- Media (0.46)
- (2 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.92)
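The script-driven design described in the MORSE-500 abstract can be sketched as a deterministic function from (difficulty, seed) to a fully specified instance, so that arbitrarily harder variants can be generated on demand. The field names below are hypothetical, not the benchmark's actual schema.

```python
import random

def make_instance(difficulty: int, seed: int) -> dict:
    """Deterministic script mapping (difficulty, seed) to a fully
    specified question instance. Purely illustrative schema."""
    rng = random.Random(seed)
    n_objects = 2 + difficulty      # visual complexity scales with difficulty
    n_distractors = difficulty      # so does distractor density
    appearance_order = rng.sample(range(n_objects), n_objects)
    return {
        "category": "temporal",
        "objects": appearance_order,
        "distractors": n_distractors,
        "question": f"Which of the {n_objects} objects appeared third?",
        "answer": appearance_order[2],
    }

a = make_instance(difficulty=3, seed=42)
b = make_instance(difficulty=3, seed=42)
print(a == b)  # True: identical script inputs give the identical instance
```

Because every instance is a pure function of its inputs, the ground-truth answer is known by construction and difficulty can be raised simply by increasing the control parameters.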
DoMINO: A Decomposable Multi-scale Iterative Neural Operator for Modeling Large Scale Engineering Simulations
Ranade, Rishikesh, Nabian, Mohammad Amin, Tangsali, Kaustubh, Kamenev, Alexey, Hennigh, Oliver, Cherukuri, Ram, Choudhry, Sanjay
Numerical simulations play a critical role in the design and development of engineering products and processes. Traditional computational methods, such as CFD, can provide accurate predictions but are computationally expensive, particularly for complex geometries. Several machine learning (ML) models have been proposed in the literature to significantly reduce computation time while maintaining acceptable accuracy. However, ML models often face limitations in terms of accuracy and scalability and depend on significant mesh downsampling, which can negatively affect prediction accuracy and generalization. In this work, we propose a novel ML model architecture, DoMINO (Decomposable Multi-scale Iterative Neural Operator), developed in NVIDIA Modulus to address the various challenges of machine-learning-based surrogate modeling of engineering simulations. DoMINO is a point-cloud-based ML model that uses local geometric information to predict flow fields on discrete points. The DoMINO model is validated for the automotive aerodynamics use case using the DrivAerML dataset. Through our experiments we demonstrate the scalability, performance, accuracy and generalization of our model on both in-distribution and out-of-distribution testing samples. Moreover, the results are analyzed using a range of engineering-specific metrics important for validating numerical simulations.
- Energy > Oil & Gas (0.47)
- Information Technology > Services (0.34)
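DoMINO's use of local geometric information at multiple scales, rather than downsampling the whole mesh, can be illustrated with a toy aggregation. This is a hedged sketch: the plain mean used here stands in for the learned neural operator, and the data is synthetic.

```python
import numpy as np

def multiscale_features(points, values, query, radii=(0.5, 1.0, 2.0)):
    """For one query point, aggregate a scalar field over local
    neighborhoods at several radii. One feature per scale; the mean
    pooling stands in for the learned aggregation."""
    dists = np.linalg.norm(points - query, axis=1)
    feats = []
    for r in radii:
        mask = dists < r
        feats.append(values[mask].mean() if mask.any() else 0.0)
    return np.array(feats)

rng = np.random.default_rng(0)
points = rng.uniform(-3.0, 3.0, size=(500, 3))   # toy point cloud
values = points[:, 0]                            # toy field varying with x
feats = multiscale_features(points, values, query=np.zeros(3))
print(feats.shape)  # (3,): one feature per scale
```

Because each query point only touches its local neighborhoods, this kind of operator scales to large point clouds without the global mesh downsampling the abstract identifies as a weakness of prior ML surrogates.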
Domino: Eliminating Communication in LLM Training via Generic Tensor Slicing and Overlapping
Wang, Guanhua, Zhang, Chengming, Shen, Zheyu, Li, Ang, Ruwase, Olatunji
Given the popularity of generative AI, Large Language Models (LLMs) often consume hundreds or thousands of GPUs to parallelize and accelerate the training process. Communication overhead becomes more pronounced when training LLMs at scale. To eliminate communication overhead in distributed LLM training, we propose Domino, which provides a generic scheme to hide communication behind computation. By breaking the data dependency of a single batch's training into smaller independent pieces, Domino pipelines the training of these independent pieces and provides a generic strategy for fine-grained communication and computation overlapping. Extensive results show that, compared with Megatron-LM, Domino achieves up to 1.3x speedup for LLM training on Nvidia DGX-H100 GPUs.
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- North America > United States > Maryland (0.04)
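The benefit Domino targets, hiding communication behind computation, shows up even in a back-of-the-envelope timing model. The formulas below encode a simplified assumption (piece i's communication runs behind piece i+1's computation, leaving only one stage exposed), not Domino's actual schedule.

```python
def serial_time(n_pieces, compute, comm):
    """Every piece's communication blocks the next piece's computation."""
    return n_pieces * (compute + comm)

def overlapped_time(n_pieces, compute, comm):
    """Simplified overlap model: communication for piece i is hidden
    behind computation for piece i+1, so only the final communication
    (or the first computation, if comm dominates) remains exposed."""
    if compute >= comm:
        return n_pieces * compute + comm
    return compute + n_pieces * comm

# Four independent pieces, 10 time units of compute and 3 of comm each.
print(serial_time(4, 10, 3), overlapped_time(4, 10, 3))  # 52 43
```

Under this model the speedup approaches (compute + comm) / max(compute, comm) as the number of pieces grows, which is why slicing a batch into more independent pieces helps until per-piece overheads dominate.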