February 26
Theory-guided Pseudo-spectral Full Waveform Inversion via Deep Neural Networks
Zerafa, Christopher, Galea, Pauline, Sebu, Cristiana
Full-Waveform Inversion seeks to achieve a high-resolution model of the subsurface through the application of multi-variate optimization to the seismic inverse problem. Although now a mature technology, FWI has limitations: choosing an appropriate solver for the forward problem in challenging environments requires complex assumptions, and the very wide-angle, multi-azimuth data necessary for full reconstruction are often not available. Deep Learning techniques have emerged as excellent optimization frameworks. Data-driven methods do not impose a wave propagation model and are not exposed to modelling errors. By contrast, deterministic models are governed by the laws of physics. Seismic FWI has recently started to be investigated as a Deep Learning framework. Focus has been on the time domain, while the pseudo-spectral domain has not yet been explored. However, classical FWI experienced major breakthroughs when pseudo-spectral approaches were employed. This work addresses that lacuna by re-formulating the pseudo-spectral FWI problem as a Deep Learning algorithm for a theory-driven pseudo-spectral approach. A novel Recurrent Neural Network framework is proposed, qualitatively assessed on synthetic data, applied to a two-dimensional Marmousi dataset, and evaluated against deterministic and time-based approaches. Pseudo-spectral theory-guided FWI using RNN was shown to be more accurate than classical FWI, with only 0.05 error tolerance and 1.45\% relative percentage error. It also converges more stably, identifies faults better, and retains more low-frequency content than classical FWI. Moreover, the RNN approach was better suited than classical FWI to edge detection in the shallow and deep sections, owing to cleaner receiver residuals.
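For context, the general recipe behind FWI-as-an-RNN is to unroll the forward-modelling time loop as a recurrence whose trainable parameter is the velocity model, with spatial derivatives evaluated pseudo-spectrally via FFTs and the loss taken on receiver residuals. The sketch below illustrates that recipe only; the grid size, source wavelet, acquisition geometry, and optimizer settings are assumptions, not the authors' implementation.

# Minimal sketch: pseudo-spectral acoustic forward modelling unrolled as an RNN,
# with the velocity model as the trainable parameter. Illustrative only; grid,
# wavelet, geometry, and learning rate are made-up assumptions.
import torch

nx, nz, nt, dt, dx = 128, 128, 300, 1e-3, 10.0

# Wavenumber grid for the spectral Laplacian.
kx = 2 * torch.pi * torch.fft.fftfreq(nx, d=dx)
kz = 2 * torch.pi * torch.fft.fftfreq(nz, d=dx)
KX, KZ = torch.meshgrid(kx, kz, indexing="ij")
k2 = -(KX ** 2 + KZ ** 2)

def spectral_laplacian(u):
    # Pseudo-spectral Laplacian: FFT, scale by -(kx^2 + kz^2), inverse FFT.
    return torch.fft.ifft2(k2 * torch.fft.fft2(u)).real

def forward_model(vel, wavelet, src_mask, rec_z):
    # One RNN "cell" per time step; the hidden state is the (u_prev, u_curr) pair.
    u_prev = torch.zeros(nx, nz)
    u_curr = torch.zeros(nx, nz)
    traces = []
    for t in range(nt):
        lap = spectral_laplacian(u_curr)
        u_next = 2 * u_curr - u_prev + (dt * vel) ** 2 * lap
        u_next = u_next + wavelet[t] * src_mask      # inject source
        traces.append(u_next[:, rec_z])              # record at receiver depth
        u_prev, u_curr = u_curr, u_next
    return torch.stack(traces)                       # (nt, nx) shot record

# "Observed" data from a reference model; in practice this is field data.
true_vel = 2000.0 * torch.ones(nx, nz)
true_vel[:, 64:] = 2500.0
t_idx = torch.arange(nt, dtype=torch.float32)
wavelet = torch.exp(-((t_idx - 100.0) ** 2) / 500.0) * torch.sin(2 * torch.pi * 10.0 * dt * t_idx)
src_mask = torch.zeros(nx, nz)
src_mask[64, 2] = 1.0
rec_z = 2
with torch.no_grad():
    d_obs = forward_model(true_vel, wavelet, src_mask, rec_z)

# FWI loop: gradient descent on the velocity model through the unrolled recurrence.
vel = torch.full((nx, nz), 2200.0, requires_grad=True)
opt = torch.optim.Adam([vel], lr=5.0)
for it in range(10):
    opt.zero_grad()
    residual = forward_model(vel, wavelet, src_mask, rec_z) - d_obs
    loss = 0.5 * (residual ** 2).sum()
    loss.backward()
    opt.step()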
Even your Teacher Needs Guidance: Ground-Truth Targets Dampen Regularization Imposed by Self-Distillation
Borup, Kenneth, Andersen, Lars N.
Knowledge distillation is classically a procedure where a neural network is trained on the output of another network along with the original targets in order to transfer knowledge between the architectures. The special case of self-distillation, where the network architectures are identical, has been observed to improve generalization accuracy. In this paper, we consider an iterative variant of self-distillation in a kernel regression setting, in which successive steps incorporate both model outputs and the ground-truth targets. This allows us to provide the first theoretical results on the importance of using the weighted ground-truth targets in self-distillation. Our focus is on fitting nonlinear functions to training data with a weighted mean square error objective function suitable for distillation, subject to $\ell_2$ regularization of the model parameters. We show that any such function obtained with self-distillation can be calculated directly as a function of the initial fit, and that in the limit of infinitely many distillation steps the optimization problem coincides with the original one, but with amplified regularization. Finally, we examine empirically, both in a regression setting and with ResNet networks, how the choice of weighting parameter influences the generalization performance after self-distillation.
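One way to make the iterative step concrete is as kernel ridge regression on a weighted mixture of the ground-truth targets and the previous fit; the weighting parameter $\alpha$, the regularization strength $\lambda$, and the RKHS norm below are illustrative notation, not necessarily the paper's:

\[
\hat f_{0} = \arg\min_{f \in \mathcal{H}} \; \frac{1}{n}\sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^2 + \lambda \lVert f \rVert_{\mathcal{H}}^2,
\qquad
\hat f_{t+1} = \arg\min_{f \in \mathcal{H}} \; \frac{1}{n}\sum_{i=1}^{n} \bigl(\alpha\, y_i + (1-\alpha)\,\hat f_t(x_i) - f(x_i)\bigr)^2 + \lambda \lVert f \rVert_{\mathcal{H}}^2.
\]

In this reading, each step re-fits the model to a convex combination of the original labels and its own previous predictions, which is the sense in which the ground-truth targets dampen the regularization that repeated self-distillation would otherwise amplify.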
LazyFormer: Self Attention with Lazy Update
Ying, Chengxuan, Ke, Guolin, He, Di, Liu, Tie-Yan
Improving the efficiency of Transformer-based language pre-training is an important task in NLP, especially for the self-attention module, which is computationally expensive. In this paper, we propose a simple but effective solution, called \emph{LazyFormer}, which computes the self-attention distribution infrequently. LazyFormer is composed of multiple lazy blocks, each of which contains multiple Transformer layers. In each lazy block, the self-attention distribution is computed only once, in the first layer, and then reused in all upper layers. In this way, the computational cost can be largely reduced. We also provide several training tricks for LazyFormer. Extensive experiments demonstrate the effectiveness of the proposed method.
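To illustrate the lazy-update idea, the following single-head sketch computes the attention distribution only in the first layer of a block and reuses it in the remaining layers; dimensions, normalization placement, and module names are assumptions rather than the paper's released code.

# Minimal single-head sketch of a "lazy block": the attention distribution is
# computed once in the first layer and reused by the remaining layers of the
# block. Illustrative; structure and hyperparameters are assumptions.
import torch
import torch.nn as nn

class LazyLayer(nn.Module):
    def __init__(self, d_model, d_ff, compute_attention):
        super().__init__()
        self.compute_attention = compute_attention
        if compute_attention:
            self.q = nn.Linear(d_model, d_model)
            self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.out = nn.Linear(d_model, d_model)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x, attn=None):
        if self.compute_attention:
            q, k = self.q(x), self.k(x)
            scores = q @ k.transpose(-2, -1) / (x.size(-1) ** 0.5)
            attn = scores.softmax(dim=-1)                 # computed once per lazy block
        x = self.norm1(x + self.out(attn @ self.v(x)))    # reuse the cached distribution
        x = self.norm2(x + self.ffn(x))
        return x, attn

class LazyBlock(nn.Module):
    def __init__(self, num_layers, d_model=256, d_ff=1024):
        super().__init__()
        self.layers = nn.ModuleList(
            [LazyLayer(d_model, d_ff, compute_attention=(i == 0)) for i in range(num_layers)]
        )

    def forward(self, x):
        attn = None
        for layer in self.layers:
            x, attn = layer(x, attn)
        return x

# Usage: a LazyFormer-style encoder is simply a stack of such blocks.
model = nn.Sequential(LazyBlock(3), LazyBlock(3))
hidden = model(torch.randn(2, 16, 256))   # (batch, seq_len, d_model)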
Knowledge engineering mixed-integer linear programming: constraint typology
Mak-Hau, Vicky, Yearwood, John, Moran, William
In this paper, we investigate the constraint typology of mixed-integer linear programming (MILP) formulations. MILP is a commonly used mathematical programming technique for modelling and solving real-life scheduling, routing, planning, resource allocation, and timetabling optimization problems, providing optimized business solutions for industry sectors such as manufacturing, agriculture, defence, healthcare, medicine, energy, finance, and transportation. Despite the numerous real-life combinatorial optimization problems found and solved, and the millions yet to be discovered and formulated, the number of constraint types, the building blocks of a MILP, is relatively much smaller. In the search for a suitable machine-readable knowledge representation for MILPs, we propose an optimization modelling tree built upon an MILP ontology that can guide automated systems in eliciting an MILP model from end-users for their combinatorial business optimization problems.
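As a reminder of what such constraint building blocks look like in practice, the toy model below combines two common constraint types, a knapsack-style capacity constraint and simple linking constraints, using the PuLP modelling library; the scenario and numbers are invented for illustration and are not drawn from the paper or its ontology.

# Tiny illustrative MILP with two common constraint "building blocks":
# a capacity (knapsack) constraint and linking constraints tying item
# selection to a binary facility variable. Data are made up.
from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpStatus

items = ["a", "b", "c"]
value = {"a": 10, "b": 6, "c": 4}
weight = {"a": 5, "b": 4, "c": 3}

prob = LpProblem("toy_selection", LpMaximize)
x = {i: LpVariable(f"x_{i}", cat="Binary") for i in items}   # pick item i?
y = LpVariable("y", cat="Binary")                            # open the facility?

prob += lpSum(value[i] * x[i] for i in items)                # objective: total value
prob += lpSum(weight[i] * x[i] for i in items) <= 9          # capacity constraint
for i in items:
    prob += x[i] <= y                                        # linking: items require the facility

prob.solve()
print(LpStatus[prob.status], {i: x[i].value() for i in items}, y.value())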