AITopics | average training time

For completeness of the proof, we restate Lemma 3.1 here. If we use Lemma A.1 with diagonal covariance matrices for ... In this section, we outline additional details of the experimental settings, including the datasets (Appendix B.1), the hyperparameters of the models used (Appendix B.2), the metrics (Appendix B.3), and a brief analysis of the computational complexity of MGP and MNPs (Appendix B.4). We generated 1,000 synthetic training samples (i.e., ... Robustness to Noisy Samples Dataset: In Section 5.1, we evaluated the models' robustness to ... The details of each dataset are outlined in Table 1. These datasets lie within a feature space where each feature extraction method can be found in [5]. Table 1: Multimodal datasets used for evaluating robustness to noisy samples.

accuracy, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States > California > San Diego County > San Diego (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Visualizing the Local Atomic Environment Features of Machine Learning Interatomic Potential

Shao, Xuqiang, Zhang, Yuqi, Zhang, Di, Gao, Tianxiang, Liu, Xinyuan, Gan, Zhiran, Meng, Fanshun, Li, Hao, Yang, Weijie

arXiv.org Artificial Intelligence, Jan-26-2025

This paper addresses the challenges of creating efficient and high-quality datasets for machine learning potential functions. We present a novel approach, termed DV-LAE (Difference Vectors based on Local Atomic Environments), which utilizes the properties of atomic local environments and employs histogram statistics to generate difference vectors. This technique facilitates dataset screening and optimization, effectively minimizing redundancy while maintaining data diversity. We have validated the optimized datasets in high-temperature and high-pressure hydrogen systems as well as the α-Fe/H binary system, demonstrating a significant reduction in computational resource usage without compromising prediction accuracy. Additionally, our method has revealed new structures that emerge during simulations but were underrepresented in the initial training datasets. The redundancy in the datasets and the distribution of these new structures can be visually analyzed through the visualization of difference vectors. This approach enhances our understanding of the characteristics of these newly formed structures and their impact on physical processes.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2501.16398

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry: Energy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Efficient transformer with reinforced position embedding for language models

Hsiao, Yen-Che, Dutta, Abhishek

arXiv.org Artificial Intelligence, Oct-6-2024

In this paper, we propose an efficient transformer architecture that uses reinforced positional embedding to obtain superior performance with half the number of encoder-decoder layers. We demonstrate that concatenating positional encoding with trainable token embeddings, normalizing columns in the token embedding matrix, and using the normalized token embedding matrix as the value of the attention layer improve the training and validation loss and the training time in an encoder-decoder Transformer model for a Portuguese-English translation task with 10 epochs or 12 hours of training across 10 trials. Our method, with roughly a threefold parameter reduction compared to the baseline model, yields a mean training loss of 1.21, a mean validation loss of 1.51, and an average training time of 1352.27 ... Additionally, we evaluated our proposed architecture and the baseline across 14 diverse translation datasets from TensorFlow. The results indicate that our method consistently achieves lower or comparable training and validation losses, suggesting enhanced learning efficiency.
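The three ingredients named in this abstract (concatenating a positional encoding with token embeddings, normalizing the columns of the embedding matrix, and reusing the normalized embeddings as attention values) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The sinusoidal encoding, the vocabulary-by-dimension matrix layout, the choice to concatenate the normalized embeddings, and the names `sinusoidal_encoding`, `normalize_columns`, and `embed` are all assumptions made for illustration.

```python
import math

def sinusoidal_encoding(seq_len, dim):
    """Fixed sinusoidal positional encoding with shape (seq_len, dim)."""
    pe = []
    for pos in range(seq_len):
        row = []
        for i in range(dim):
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe

def normalize_columns(matrix, eps=1e-12):
    """Scale each column of a row-major matrix to unit L2 norm."""
    n_cols = len(matrix[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    return [[row[j] / max(norms[j], eps) for j in range(n_cols)] for row in matrix]

def embed(token_ids, embedding_matrix, pos_dim):
    """Return (attention inputs, attention values) for a token sequence.

    Assumption: embedding_matrix is laid out as (vocab_size, embed_dim).
    Inputs are [normalized embedding ; positional encoding] concatenated
    per position; values are the normalized embeddings alone.
    """
    normalized = normalize_columns(embedding_matrix)
    pe = sinusoidal_encoding(len(token_ids), pos_dim)
    values = [normalized[t] for t in token_ids]
    # List "+" concatenates, so each input row has embed_dim + pos_dim entries.
    inputs = [values[k] + pe[k] for k in range(len(token_ids))]
    return inputs, values

# Toy example: vocabulary of 2 tokens, embedding dimension 2.
E = [[3.0, 0.0],
     [4.0, 2.0]]
inputs, values = embed([1, 0], E, pos_dim=2)
```

Concatenation keeps the positional signal in its own coordinates rather than adding it to the token embedding, which is one plausible reading of "concatenating" in the abstract.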

matrix, training time, validation loss, (16 more...)

arXiv.org Artificial Intelligence

2410.04731

Country: North America > United States > Connecticut > Tolland County > Storrs (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Better by Default: Strong Pre-Tuned MLPs and Boosted Trees on Tabular Data

Holzmüller, David, Grinsztajn, Léo, Steinwart, Ingo

arXiv.org Artificial Intelligence, Jul-5-2024

For classification and regression on tabular data, the dominance of gradient-boosted decision trees (GBDTs) has recently been challenged by often much slower deep learning methods with extensive hyperparameter tuning. We address this discrepancy by introducing (a) RealMLP, an improved multilayer perceptron (MLP), and (b) improved default parameters for GBDTs and RealMLP. We tune RealMLP and the default parameters on a meta-train benchmark with 71 classification and 47 regression datasets and compare them to hyperparameter-optimized versions on a disjoint meta-test benchmark with 48 classification and 42 regression datasets, as well as the GBDT-friendly benchmark by Grinsztajn et al. (2022). Our benchmark results show that RealMLP offers a better time-accuracy tradeoff than other neural nets and is competitive with GBDTs. Moreover, a combination of RealMLP and GBDTs with improved default parameters can achieve excellent results on medium-sized tabular datasets (1K–500K samples) without hyperparameter tuning.

benchmark, confidence interval, dataset, (14 more...)

arXiv.org Artificial Intelligence

2407.04491

Country:

Asia > Thailand (0.04)
North America > United States > California (0.04)
Asia > Middle East > Republic of Türkiye (0.04)
(8 more...)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

a9be4c2a4041cadbf9d61ae16dd1389e-Reviews.html

Neural Information Processing Systems, Mar-13-2024, 19:35:28 GMT

Recall that in the implementation of the proposed convex method, CVX2, each boosting step (which adds a single rank to the solution) is interleaved with local optimization. For the local optimization we use a standard L-BFGS implementation with default termination conditions. For the outer boosting iterations we terminate when the relative objective improvement is less than 5e-5 or the absolute improvement is less than 1e-3. The average rank results in the table above correspond to the number of boosting rounds used by CVX2, which also determines the rank of its final solutions. From these results, one can see that the method uses significantly less than the full O(t^2) storage.
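The outer-loop termination rule described here can be sketched as follows. The thresholds 5e-5 and 1e-3 come from the text. Everything else is an assumption for illustration: the objective is taken to be minimized, the relative improvement is measured against the previous objective value, and the function names `should_stop` and `rounds_until_stop` are hypothetical.

```python
def should_stop(prev_obj, curr_obj, rel_tol=5e-5, abs_tol=1e-3):
    """Return True when one boosting round improved the objective too little.

    Assumes minimization, so a positive difference means progress.
    """
    abs_improvement = prev_obj - curr_obj
    # Guard against division by zero when the previous objective is 0.
    rel_improvement = abs_improvement / max(abs(prev_obj), 1e-12)
    return rel_improvement < rel_tol or abs_improvement < abs_tol

def rounds_until_stop(objectives):
    """Return the round at which the rule fires, given per-round objectives.

    objectives[0] is the value before any boosting round. If the rule
    never fires, all available rounds are counted.
    """
    for k in range(1, len(objectives)):
        if should_stop(objectives[k - 1], objectives[k]):
            return k
    return len(objectives) - 1

# Hypothetical objective trace: large gains, then a negligible one.
n_rounds = rounds_until_stop([10.0, 5.0, 4.0, 3.9999])
```

Because each boosting round adds a single rank, the round count returned by a rule like this would also be the rank of the final solution, which matches the correspondence the text draws between average rank and number of boosting rounds.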

average training time, formulation, implementation, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Beyond Unimodal: Generalising Neural Processes for Multimodal Uncertainty Estimation Appendix A Lemma and Proof
