Goto

Collaborating Authors

 Model-Based Reasoning


Introducing the NewsPaLM MBR and QE Dataset: LLM-Generated High-Quality Parallel Data Outperforms Traditional Web-Crawled Data

arXiv.org Artificial Intelligence

Recent research in neural machine translation (NMT) has shown that training on high-quality machine-generated data can outperform training on human-generated data. This work accompanies the first-ever release of a LLM-generated, MBR-decoded and QE-reranked dataset with both sentence-level and multi-sentence examples. We perform extensive experiments to demonstrate the quality of our dataset in terms of its downstream impact on NMT model performance. We find that training from scratch on our (machine-generated) dataset outperforms training on the (web-crawled) WMT'23 training dataset (which is 300 times larger), and also outperforms training on the top-quality subset of the WMT'23 training dataset. We also find that performing self-distillation by finetuning the LLM which generated this dataset outperforms the LLM's strong few-shot baseline. These findings corroborate the quality of our dataset, and demonstrate the value of high-quality machine-generated data in improving performance of NMT models.


Machine Learning with Physics Knowledge for Prediction: A Survey

arXiv.org Artificial Intelligence

This survey examines the broad suite of methods and models for combining machine learning with physics knowledge for prediction and forecast, with a focus on partial differential equations. These methods have attracted significant interest due to their potential impact on advancing scientific research and industrial practices by improving predictive models with small- or large-scale datasets and expressive predictive models with useful inductive biases. The survey has two parts. The first considers incorporating physics knowledge on an architectural level through objective functions, structured predictive models, and data augmentation. The second considers data as physics knowledge, which motivates looking at multi-task, meta, and contextual learning as an alternative approach to incorporating physics knowledge in a data-driven fashion. Finally, we also provide an industrial perspective on the application of these methods and a survey of the open-source ecosystem for physics-informed machine learning.


Quantification of total uncertainty in the physics-informed reconstruction of CVSim-6 physiology

arXiv.org Artificial Intelligence

When predicting physical phenomena through simulation, quantification of the total uncertainty due to multiple sources is as crucial as making sure the underlying numerical model is accurate. Possible sources include irreducible aleatoric uncertainty due to noise in the data, epistemic uncertainty induced by insufficient data or inadequate parameterization, and model-form uncertainty related to the use of misspecified model equations. Physics-based regularization interacts in nontrivial ways with aleatoric, epistemic and model-form uncertainty and their combination, and a better understanding of this interaction is needed to improve the predictive performance of physics-informed digital twins that operate under real conditions. With a specific focus on biological and physiological models, this study investigates the decomposition of total uncertainty in the estimation of states and parameters of a differential system simulated with MC X-TFC, a new physics-informed approach for uncertainty quantification based on random projections and Monte-Carlo sampling. MC X-TFC is applied to a six-compartment stiff ODE system, the CVSim-6 model, developed in the context of human physiology. The system is analyzed by progressively removing data while estimating an increasing number of parameters and by investigating total uncertainty under model-form misspecification of non-linear resistance in the pulmonary compartment. In particular, we focus on the interaction between the formulation of the discrepancy term and quantification of model-form uncertainty, and show how additional physics can help in the estimation process. The method demonstrates robustness and efficiency in estimating unknown states and parameters, even with limited, sparse, and noisy data. It also offers great flexibility in integrating data with physics for improved estimation, even in cases of model misspecification.


Beyond Closure Models: Learning Chaotic-Systems via Physics-Informed Neural Operators

arXiv.org Artificial Intelligence

Accurately predicting the long-term behavior of chaotic systems is crucial for various applications such as climate modeling. However, achieving such predictions typically requires iterative computations over a dense spatiotemporal grid to account for the unstable nature of chaotic systems, which is expensive and impractical in many real-world situations. An alternative approach to such a full-resolved simulation is using a coarse grid and then correcting its errors through a \textit{closure model}, which approximates the overall information from fine scales not captured in the coarse-grid simulation. Recently, ML approaches have been used for closure modeling, but they typically require a large number of training samples from expensive fully-resolved simulations (FRS). In this work, we prove an even more fundamental limitation, i.e., the standard approach to learning closure models suffers from a large approximation error for generic problems, no matter how large the model is, and it stems from the non-uniqueness of the mapping. We propose an alternative end-to-end learning approach using a physics-informed neural operator (PINO) that overcomes this limitation by not using a closure model or a coarse-grid solver. We first train the PINO model on data from a coarse-grid solver and then fine-tune it with (a small amount of) FRS and physics-based losses on a fine grid. The discretization-free nature of neural operators means that they do not suffer from the restriction of a coarse grid that closure models face, and they can provably approximate the long-term statistics of chaotic systems. In our experiments, our PINO model achieves a 120x speedup compared to FRS with a relative error $\sim 5\%$. In contrast, the closure model coupled with a coarse-grid solver is $58$x slower than PINO while having a much higher error $\sim205\%$ when the closure model is trained on the same FRS dataset.


mbrs: A Library for Minimum Bayes Risk Decoding

arXiv.org Artificial Intelligence

Minimum Bayes risk (MBR) decoding is a decision rule of text generation tasks that outperforms conventional maximum a posterior (MAP) decoding using beam search by selecting high-quality outputs based on a utility function rather than those with high-probability. Typically, it finds the most suitable hypothesis from the set of hypotheses under the sampled pseudo-references. mbrs is a library of MBR decoding, which can flexibly combine various metrics, alternative expectation estimations, and algorithmic variants. It is designed with a focus on speed measurement and calling count of code blocks, transparency, reproducibility, and extensibility, which are essential for researchers and developers. We published our mbrs as an MIT-licensed open-source project, and the code is available on GitHub. GitHub: https://github.com/naist-nlp/mbrs


A Multi-class Ride-hailing Service Subsidy System Utilizing Deep Causal Networks

arXiv.org Machine Learning

In the ride-hailing industry, subsidies are predominantly employed to incentivize consumers to place more orders, thereby fostering market growth. Causal inference techniques are employed to estimate the consumer elasticity with different subsidy levels. However, the presence of confounding effects poses challenges in achieving an unbiased estimate of the uplift effect. We introduce a consumer subsidizing system to capture relationships between subsidy propensity and the treatment effect, which proves effective while maintaining a lightweight online environment.


Robust Point Cloud Registration in Robotic Inspection with Locally Consistent Gaussian Mixture Model

arXiv.org Artificial Intelligence

In robotic inspection of aviation parts, achieving accurate pairwise point cloud registration between scanned and model data is essential. However, noise and outliers generated in robotic scanned data can compromise registration accuracy. To mitigate this challenge, this article proposes a probability-based registration method utilizing Gaussian Mixture Model (GMM) with local consistency constraint. This method converts the registration problem into a model fitting one, constraining the similarity of posterior distributions between neighboring points to enhance correspondence robustness. We employ the Expectation Maximization algorithm iteratively to find optimal rotation matrix and translation vector while obtaining GMM parameters. Both E-step and M-step have closed-form solutions. Simulation and actual experiments confirm the method's effectiveness, reducing root mean square error by 20% despite the presence of noise and outliers. The proposed method excels in robustness and accuracy compared to existing methods.


Strategy and Skill Learning for Physics-based Table Tennis Animation

arXiv.org Artificial Intelligence

Recent advancements in physics-based character animation leverage deep learning to generate agile and natural motion, enabling characters to execute movements such as backflips, boxing, and tennis. However, reproducing the selection and use of diverse motor skills in dynamic environments to solve complex tasks, as humans do, still remains a challenge. We present a strategy and skill learning approach for physics-based table tennis animation. Our method addresses the issue of mode collapse, where the characters do not fully utilize the motor skills they need to perform to execute complex tasks. More specifically, we demonstrate a hierarchical control system for diversified skill learning and a strategy learning framework for effective decision-making. We showcase the efficacy of our method through comparative analysis with state-of-the-art methods, demonstrating its capabilities in executing various skills for table tennis. Our strategy learning framework is validated through both agent-agent interaction and human-agent interaction in Virtual Reality, handling both competitive and cooperative tasks.


Generalizable Physics-Informed Learning for Stochastic Safety-Critical Systems

arXiv.org Artificial Intelligence

Accurate estimate of long-term risk is critical for safe decision-making, but sampling from rare risk events and long-term trajectories can be prohibitively costly. Risk gradient can be used in many first-order techniques for learning and control methods, but gradient estimate is difficult to obtain using Monte Carlo (MC) methods because the infinitesimal divisor may significantly amplify sampling noise. Motivated by this gap, we propose an efficient method to evaluate long-term risk probabilities and their gradients using short-term samples without sufficient risk events. We first derive that four types of long-term risk probability are solutions of certain partial differential equations (PDEs). Then, we propose a physics-informed learning technique that integrates data and physics information (aforementioned PDEs). The physics information helps propagate information beyond available data and obtain provable generalization beyond available data, which in turn enables long-term risk to be estimated using short-term samples of safe events. Finally, we demonstrate in simulation that the proposed technique has improved sample efficiency, generalizes well to unseen regions, and adapts to changing system parameters.


Physics-Informed Machine Learning for Smart Additive Manufacturing

arXiv.org Artificial Intelligence

Compared to physics-based computational manufacturing, data-driven models such as machine learning (ML) are alternative approaches to achieve smart manufacturing. However, the data-driven ML's "black box" nature has presented a challenge to interpreting its outcomes. On the other hand, governing physical laws are not effectively utilized to develop data-efficient ML algorithms. To leverage the advantages of ML and physical laws of advanced manufacturing, this paper focuses on the development of a physics-informed machine learning (PIML) model by integrating neural networks and physical laws to improve model accuracy, transparency, and generalization with case studies in laser metal deposition (LMD).