AITopics | sgem

Scalable physics-informed deep generative model for solving forward and inverse stochastic differential equations

Zhou, Shaoqian, You, Wen, Guo, Ling, Meng, Xuhui

arXiv.org Machine Learning, Mar-23-2025

Physics-informed deep learning approaches have been developed to solve forward and inverse stochastic differential equation (SDE) problems with high-dimensional stochastic space. However, the existing deep learning models have difficulties solving SDEs with high-dimensional spatial space. In the present study, we propose a scalable physics-informed deep generative model (sPI-GeM), which is capable of solving SDE problems with both high-dimensional stochastic and spatial space. The sPI-GeM consists of two deep learning models, i.e., (1) physics-informed basis networks (PI-BasisNet), which are used to learn the basis functions as well as the coefficients given data on a certain stochastic process or random field, and (2) physics-informed deep generative model (PI-GeM), which learns the distribution over the coefficients obtained from the PI-BasisNet. The new samples for the learned stochastic process can then be obtained using the inner product between the output of the generator and the basis functions from the trained PI-BasisNet. The sPI-GeM addresses the scalability in the spatial space in a similar way as in the widely used dimensionality reduction technique, i.e., principal component analysis (PCA). A series of numerical experiments, including approximation of Gaussian and non-Gaussian stochastic processes, forward and inverse SDE problems, are performed to demonstrate the accuracy of the proposed model. Furthermore, we also show the scalability of the sPI-GeM in both the stochastic and spatial space using an example of a forward SDE problem with 38- and 20-dimension stochastic and spatial space, respectively.
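The abstract compares the way sPI-GeM handles spatial scalability to PCA: new samples are formed as an inner product of generated coefficients with learned basis functions. A minimal NumPy sketch of that PCA analogy follows. It is not the authors' model: the basis comes from a plain SVD rather than a PI-BasisNet, a fitted Gaussian stands in for the deep generative model, and the grid, kernel, and mode count are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 500 realizations of a zero-mean Gaussian process on 64 grid points.
x = np.linspace(0.0, 1.0, 64)
cov = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.2 ** 2)
cov += 1e-6 * np.eye(x.size)  # small jitter for numerical stability
samples = rng.multivariate_normal(np.zeros(x.size), cov, size=500)

# PCA via SVD: rows of `basis` play the role of learned basis functions.
mean = samples.mean(axis=0)
_, s, vt = np.linalg.svd(samples - mean, full_matrices=False)
n_modes = 8  # assumed truncation level
basis = vt[:n_modes]                   # shape (n_modes, 64)
coeffs = (samples - mean) @ basis.T    # shape (500, n_modes)

# Stand-in "generator": a Gaussian fitted to the coefficients.
coeff_cov = np.cov(coeffs, rowvar=False)
new_coeffs = rng.multivariate_normal(np.zeros(n_modes), coeff_cov, size=1000)

# New samples = inner product of generated coefficients with the basis functions.
new_samples = mean + new_coeffs @ basis

# Fraction of variance captured by the retained modes.
explained = (s[:n_modes] ** 2).sum() / (s ** 2).sum()
```

The generator only has to model an 8-dimensional coefficient vector, while the 64-point spatial resolution is carried entirely by the basis; this is the sense in which a PCA-like split decouples spatial dimension from the generative model.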

artificial intelligence, deep learning, machine learning

2503.18012

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > California (0.04)
Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.83)

Personalized Speech Recognition for Children with Test-Time Adaptation

Shi, Zhonghao, Srivastava, Harshvardhan, Shi, Xuan, Narayanan, Shrikanth, Matarić, Maja J.

arXiv.org Artificial Intelligence, Sep-19-2024

Accurate automatic speech recognition (ASR) for children is crucial for effective real-time child-AI interaction, especially in educational applications. However, off-the-shelf ASR models primarily pre-trained on adult data tend to generalize poorly to children's speech due to the data domain shift from adults to children. Recent studies have found that supervised fine-tuning on children's speech data can help bridge this domain shift, but human annotations may be impractical to obtain for real-world applications and adaptation at training time can overlook additional domain shifts occurring at test time. We devised a novel ASR pipeline to apply unsupervised test-time adaptation (TTA) methods for child speech recognition, so that ASR models pre-trained on adult speech can be continuously adapted to each child speaker at test time without further human annotations. Our results show that ASR models adapted with TTA methods significantly outperform the unadapted off-the-shelf ASR baselines both on average and statistically across individual child speakers. Our analysis also discovered significant data domain shifts both between child speakers and within each child speaker, which further motivates the need for test-time adaptation.

child speaker, recognition, speech recognition

2409.13095

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Bristol (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

SGEM: Test-Time Adaptation for Automatic Speech Recognition via Sequential-Level Generalized Entropy Minimization

Kim, Changhun, Park, Joonhyung, Shim, Hajin, Yang, Eunho

arXiv.org Artificial Intelligence, Jun-21-2023

Automatic speech recognition (ASR) models are frequently exposed to data distribution shifts in many real-world scenarios, leading to erroneous predictions. To tackle this issue, an existing test-time adaptation (TTA) method has recently been proposed to adapt the pre-trained ASR model on unlabeled test instances without source data. Despite decent performance gain, this work relies solely on naive greedy decoding and performs adaptation across timesteps at a frame level, which may not be optimal given the sequential nature of the model output. Motivated by this, we propose a novel TTA framework, dubbed SGEM, for general ASR models. To treat the sequential output, SGEM first exploits beam search to explore candidate output logits and selects the most plausible one. Then, it utilizes generalized entropy minimization and negative sampling as unsupervised objectives to adapt the model. SGEM achieves state-of-the-art performance for three mainstream ASR models under various domain shifts.
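To make the idea of entropy minimization as an unsupervised objective concrete, here is a small NumPy sketch. It assumes that "generalized entropy" refers to the Rényi entropy, which the abstract does not state, and it adapts raw per-frame logits directly instead of model parameters. Beam search and negative sampling are omitted, so this illustrates only the frame-level objective, not the SGEM pipeline. All function names and values are hypothetical.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def shannon_entropy(p, eps=1e-12):
    return -(p * np.log(p + eps)).sum(axis=-1)

def renyi_entropy(p, alpha, eps=1e-12):
    # Renyi entropy of order alpha (alpha != 1); tends to Shannon as alpha -> 1.
    return np.log((p ** alpha).sum(axis=-1) + eps) / (1.0 - alpha)

def entropy_grad_wrt_logits(logits, eps=1e-12):
    # Analytic gradient of the Shannon entropy w.r.t. logits: -p_j * (log p_j + H).
    p = softmax(logits)
    h = shannon_entropy(p)[..., None]
    return -p * (np.log(p + eps) + h)

# Hypothetical logits for 2 frames over a 4-token vocabulary.
logits = np.array([[2.0, 1.0, 0.5, 0.1],
                   [0.2, 0.1, 0.0, -0.1]])

# A few gradient-descent steps that sharpen each frame's distribution.
adapted = logits.copy()
for _ in range(20):
    adapted -= 0.5 * entropy_grad_wrt_logits(adapted)

before = shannon_entropy(softmax(logits))
after = shannon_entropy(softmax(adapted))
```

In a real test-time adaptation setting the gradient would flow into a subset of model parameters, and the choice of entropy order changes how strongly confident versus uncertain frames contribute to the update.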

artificial intelligence, asr model, machine learning

2306.01981

Country:

Asia > South Korea (0.04)
Europe > Netherlands > South Holland > Delft (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

SGEM: stochastic gradient with energy and momentum

Liu, Hailiang, Tian, Xuping

arXiv.org Artificial Intelligence, Aug-3-2022

In this paper, we propose SGEM, Stochastic Gradient with Energy and Momentum, to solve a large class of general non-convex stochastic optimization problems, based on the AEGD method that originated in the work [AEGD: Adaptive Gradient Descent with Energy. arXiv: 2010.05109]. SGEM incorporates both energy and momentum at the same time so as to inherit their dual advantages. We show that SGEM features an unconditional energy stability property, and derive energy-dependent convergence rates in the general nonconvex stochastic setting, as well as a regret bound in the online convex setting. A lower threshold for the energy variable is also provided. Our experimental results show that SGEM converges faster than AEGD and generalizes better or at least as well as SGDM in training some deep neural networks.
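The abstract gives no update rule, so the following is only a sketch of what an energy-plus-momentum step could look like, assuming an AEGD-style energy variable `r` that is divided by a factor of at least one at each step, combined with a bias-corrected momentum average of the gradient. The function name `sgem_like_step`, the constant `c`, and all hyperparameters are hypothetical, and the exact form in the paper may differ.

```python
import numpy as np

def sgem_like_step(theta, m, r, grad, loss, k, lr=0.1, beta=0.9, c=1.0):
    """One energy-plus-momentum step (assumed form, not taken from the paper)."""
    m = beta * m + (1.0 - beta) * grad          # momentum: running average of gradients
    m_hat = m / (1.0 - beta ** k)               # bias correction for early steps
    v = m_hat / (2.0 * np.sqrt(loss + c))       # scaled direction; requires loss + c > 0
    r = r / (1.0 + 2.0 * lr * v ** 2)           # energy update: r never increases
    theta = theta - 2.0 * lr * r * v            # parameter update scaled by the energy
    return theta, m, r

# Toy deterministic problem: f(theta) = 0.5 * ||theta||^2, gradient = theta.
def f(theta):
    return 0.5 * float(theta @ theta)

theta = np.array([3.0, -2.0])
m = np.zeros_like(theta)
c = 1.0
r = np.full_like(theta, np.sqrt(f(theta) + c))  # element-wise energy, initialized from the loss

initial_loss = f(theta)
r_history = [r.copy()]
for k in range(1, 201):
    theta, m, r = sgem_like_step(theta, m, r, grad=theta, loss=f(theta), k=k, c=c)
    r_history.append(r.copy())
final_loss = f(theta)
r_history = np.array(r_history)
```

Because `r` is divided by `1 + 2 * lr * v**2`, which is at least one for any positive step size, the energy variable in this sketch is non-increasing regardless of `lr`. That is one simple way to read the phrase "unconditional energy stability" in the abstract.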

aegd, convergence, sgem

2208.02208

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Iowa > Story County > Ames (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
