AITopics | self-consistency loss

Collaborating Authors

self-consistency loss

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Integration Matters for Learning PDEs with Backwards SDEs

Park, Sungje, Tu, Stephen

arXiv.org Machine LearningMay-5-2025

Backward stochastic differential equation (BSDE)-based deep learning methods provide an alternative to Physics-Informed Neural Networks (PINNs) for solving high-dimensional partial differential equations (PDEs), offering algorithmic advantages in settings such as stochastic optimal control, where the PDEs of interest are tied to an underlying dynamical system. However, existing BSDE-based solvers have empirically been shown to underperform relative to PINNs in the literature. In this paper, we identify the root cause of this performance gap as a discretization bias introduced by the standard Euler-Maruyama (EM) integration scheme applied to short-horizon self-consistency BSDE losses, which shifts the optimization landscape off target. We find that this bias cannot be satisfactorily addressed through finer step sizes or longer self-consistency horizons. To properly handle this issue, we propose a Stratonovich-based BSDE formulation, which we implement with stochastic Heun integration. We show that our proposed approach completely eliminates the bias issues faced by EM integration. Furthermore, our empirical results show that our Heun-based BSDE method consistently outperforms EM-based variants and achieves competitive results with PINNs across multiple high-dimensional benchmarks. Our findings highlight the critical role of integration schemes in BSDE-based PDE solvers, an algorithmic detail that has received little attention thus far in the literature.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Machine Learning

2505.01078

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.86)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Robust Amortized Bayesian Inference with Self-Consistency Losses on Unlabeled Data

Mishra, Aayush, Habermann, Daniel, Schmitt, Marvin, Radev, Stefan T., Bürkner, Paul-Christian

arXiv.org Machine LearningFeb-11-2025

Neural amortized Bayesian inference (ABI) can solve probabilistic inverse problems orders of magnitude faster than classical methods. However, neural ABI is not yet sufficiently robust for widespread and safe applicability. In particular, when performing inference on observations outside of the scope of the simulated data seen during training, for example, because of model misspecification, the posterior approximations are likely to become highly biased. Due to the bad pre-asymptotic behavior of current neural posterior estimators in the out-of-simulation regime, the resulting estimation biases cannot be fixed in acceptable time by just simulating more training data. In this proof-of-concept paper, we propose a semi-supervised approach that enables training not only on (labeled) simulated data generated from the model, but also on unlabeled data originating from any source, including real-world data. To achieve the latter, we exploit Bayesian self-consistency properties that can be transformed into strictly proper losses without requiring knowledge of true parameter values, that is, without requiring data labels. The results of our initial experiments show remarkable improvements in the robustness of ABI on out-of-simulation data. Even if the observed data is far away from both labeled and unlabeled training data, inference remains highly accurate. If our findings also generalize to other scenarios and model classes, we believe that our new method represents a major breakthrough in neural ABI.

artificial intelligence, machine learning, self-consistency loss, (16 more...)

arXiv.org Machine Learning

2501.13483

Country:

North America > United States (0.14)
Europe > Germany (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Transportation (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Add feedback

Self-Consistency Training for Density-Functional-Theory Hamiltonian Prediction

Zhang, He, Liu, Chang, Wang, Zun, Wei, Xinran, Liu, Siyuan, Zheng, Nanning, Shao, Bin, Liu, Tie-Yan

arXiv.org Artificial IntelligenceJun-5-2024

Predicting the mean-field Hamiltonian matrix in density functional theory is a fundamental formulation to leverage machine learning for solving molecular science problems. Yet, its applicability is limited by insufficient labeled data for training. In this work, we highlight that Hamiltonian prediction possesses a self-consistency principle, based on which we propose self-consistency training, an exact training method that does not require labeled data. It distinguishes the task from predicting other molecular properties by the following benefits: (1) it enables the model to be trained on a large amount of unlabeled data, hence addresses the data scarcity challenge and enhances generalization; (2) it is more efficient than running DFT to generate labels for supervised training, since it amortizes DFT calculation over a set of queries. We empirically demonstrate the better generalization in data-scarce and out-of-distribution scenarios, and the better efficiency over DFT labeling. These benefits push forward the applicability of Hamiltonian prediction to an ever-larger scale.

molecule, prediction, self-consistency training, (16 more...)

arXiv.org Artificial Intelligence

2403.0956

Country:

North America > United States (0.45)
Europe > Austria > Vienna (0.14)
Asia > China > Guangxi Province > Nanning (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.46)
Government > Regional Government > North America Government > United States Government (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Regression Transformer: Concurrent Conditional Generation and Regression by Blending Numerical and Textual Tokens

Born, Jannis, Manica, Matteo

arXiv.org Artificial IntelligenceFeb-1-2022

We report the Regression Transformer (RT), a method that abstracts regression as a conditional sequence modeling problem. The RT casts continuous properties as sequences of numerical tokens and encodes them jointly with conventional tokens. This yields a dichotomous model that can seamlessly transition between solving regression tasks and conditional generation tasks; solely governed by the mask location. We propose several extensions to the XLNet objective and adopt an alternating training scheme to concurrently optimize property prediction and conditional text generation based on a self-consistency loss. Our experiments on both chemical and protein languages demonstrate that the performance of traditional regression models can be surpassed despite training with cross entropy loss. Importantly, priming the same model with continuous properties yields a highly competitive conditional generative models that outperforms specialized approaches in a constrained property optimization benchmark. In sum, the Regression Transformer opens the door for "swiss army knife" models that excel at both regression and conditional generation. This finds application particularly in property-driven, local exploration of the chemical or protein space.

dataset, objective, regression transformer, (13 more...)

arXiv.org Artificial Intelligence

2202.01338

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Switzerland > Basel-City > Basel (0.04)
Asia > Middle East > Oman (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback