AITopics | command recognition

Collaborating Authors

command recognition

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Speech Command Recognition Using LogNNet Reservoir Computing for Embedded Systems

Izotov, Yuriy, Velichko, Andrei

arXiv.org Artificial IntelligenceSep-3-2025

This paper presents a low-resource speech-command recognizer combining energy-based voice activity detection (VAD), an optimized Mel-Frequency Cepstral Coefficients (MFCC) pipeline, and the LogNNet reservoir-computing classifier. Using four commands from the Speech Commands da-taset downsampled to 8 kHz, we evaluate four MFCC aggregation schemes and find that adaptive binning (64-dimensional feature vector) offers the best accuracy-to-compactness trade-off. The LogNNet classifier with architecture 64:33:9:4 reaches 92.04% accuracy under speaker-independent evaluation, while requiring significantly fewer parameters than conventional deep learn-ing models. Hardware implementation on Arduino Nano 33 IoT (ARM Cor-tex-M0+, 48 MHz, 32 KB RAM) validates the practical feasibility, achieving ~90% real-time recognition accuracy while consuming only 18 KB RAM (55% utilization). The complete pipeline (VAD -> MFCC -> LogNNet) thus enables reliable on-device speech-command recognition under strict memory and compute limits, making it suitable for battery-powered IoT nodes, wire-less sensor networks, and hands-free control interfaces.

accuracy, artificial intelligence, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2509.00862

Country: Europe > Russia (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Energy (0.88)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Large Language Model Data Generation for Enhanced Intent Recognition in German Speech

Rosin, Theresa Pekarek, Kaplan, Burak Can, Wermter, Stefan

arXiv.org Artificial IntelligenceAug-11-2025

Intent recognition (IR) for speech commands is essential for artificial intelligence (AI) assistant systems; however, most existing approaches are limited to short commands and are predominantly developed for English. This paper addresses these limitations by focusing on IR from speech by elderly German speakers. We propose a novel approach that combines an adapted Whisper ASR model, fine-tuned on elderly German speech (SVC-de), with Transformer-based language models trained on synthetic text datasets generated by three well-known large language models (LLMs): LeoLM, Llama3, and ChatGPT. To evaluate the robustness of our approach, we generate synthetic speech with a text-to-speech model and conduct extensive cross-dataset testing. Our results show that synthetic LLM-generated data significantly boosts classification performance and robustness to different speaking styles and unseen vocabulary. Notably, we find that LeoLM, a smaller, domain-specific 13B LLM, surpasses the much larger ChatGPT (175B) in dataset quality for German intent recognition. Our approach demonstrates that generative AI can effectively bridge data gaps in low-resource domains. We provide detailed documentation of our data generation and training process to ensure transparency and reproducibility.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2508.06277

Country:

Asia > Middle East (0.46)
Europe > Austria > Vienna (0.14)

Genre:

Research Report > New Finding (0.54)
Research Report > Promising Solution (0.48)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Add feedback

Advancing Airport Tower Command Recognition: Integrating Squeeze-and-Excitation and Broadcasted Residual Learning

Lin, Yuanxi, Zhou, Tonglin, Xiao, Yang

arXiv.org Artificial IntelligenceJun-28-2024

Accurate recognition of aviation commands is vital for flight safety and efficiency, as pilots must follow air traffic control instructions precisely. This paper addresses challenges in speech command recognition, such as noisy environments and limited computational resources, by advancing keyword spotting technology. We create a dataset of standardized airport tower commands, including routine and emergency instructions. We enhance broadcasted residual learning with squeeze-and-excitation and time-frame frequency-wise squeeze-and-excitation techniques, resulting in our BC-SENet model. This model focuses on crucial information with fewer parameters. Our tests on five keyword spotting models, including BC-SENet, demonstrate superior accuracy and efficiency. These findings highlight the effectiveness of our model advancements in improving speech command recognition for aviation safety and efficiency in noisy, high-stakes environments. Additionally, BC-SENet shows comparable performance on the common Google Speech Command dataset.

convolution, dataset, keyword, (15 more...)

arXiv.org Artificial Intelligence

2406.18313

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Asia > Singapore (0.04)
Asia > Russia (0.04)
Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Transportation > Air (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Speech (0.70)

Add feedback

Classical-to-Quantum Transfer Learning for Spoken Command Recognition Based on Quantum Neural Networks

Qi, Jun, Tejedor, Javier

arXiv.org Artificial IntelligenceOct-16-2021

This work investigates an extension of transfer learning applied in machine learning algorithms to the emerging hybrid end-to-end quantum neural network (QNN) for spoken command recognition (SCR). Our QNN-based SCR system is composed of classical and quantum components: (1) the classical part mainly relies on a 1D convolutional neural network (CNN) to extract speech features; (2) the quantum part is built upon the variational quantum circuit with a few learnable parameters. Since it is inefficient to train the hybrid end-to-end QNN from scratch on a noisy intermediate-scale quantum (NISQ) device, we put forth a hybrid transfer learning algorithm that allows a pre-trained classical network to be transferred to the classical part of the hybrid QNN model. The pre-trained classical network is further modified and augmented through jointly fine-tuning with a variational quantum circuit (VQC). The hybrid transfer learning methodology is particularly attractive for the task of QNN-based SCR because low-dimensional classical features are expected to be encoded into quantum states. We assess the hybrid transfer learning algorithm applied to the hybrid classical-quantum QNN for SCR on the Google speech command dataset, and our classical simulation results suggest that the hybrid transfer learning can boost our baseline performance on the SCR task.

hybrid transfer, qnn, quantum circuit, (13 more...)

arXiv.org Artificial Intelligence

2110.08689

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > Spain > Galicia > Madrid (0.04)
Europe > Portugal > Coimbra > Coimbra (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

A Study of Low-Resource Speech Commands Recognition based on Adversarial Reprogramming

Yen, Hao, Ku, Pin-Jui, Yang, Chao-Han Huck, Hu, Hu, Siniscalchi, Sabato Marco, Chen, Pin-Yu, Tsao, Yu

arXiv.org Artificial IntelligenceOct-8-2021

In this study, we propose a novel adversarial reprogramming (AR) approach for low-resource spoken command recognition (SCR), and build an AR-SCR system. The AR procedure aims to modify the acoustic signals (from the target domain) to repurpose a pretrained SCR model (from the source domain). To solve the label mismatches between source and target domains, and further improve the stability of AR, we propose a novel similarity-based label mapping technique to align classes. In addition, the transfer learning (TL) technique is combined with the original AR process to improve the model adaptation capability. We evaluate the proposed AR-SCR system on three low-resource SCR datasets, including Arabic, Lithuanian, and dysarthric Mandarin speech. Experimental results show that with a pretrained AM trained on a large-scale English dataset, the proposed AR-SCR system outperforms the current state-of-the-art results on Arabic and Lithuanian speech commands datasets, with only a limited amount of training data.

command recognition, dataset, recognition, (13 more...)

arXiv.org Artificial Intelligence

2110.03894

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States (0.04)
Europe > Portugal > Coimbra > Coimbra (0.04)
(2 more...)

Genre: Research Report > New Finding (0.87)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback