AITopics | aed

aed

Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers

Neural Information Processing Systems | Mar-22-2026, 04:38:21 GMT

Modern systems for automatic speech recognition, including the RNN-Transducer and Attention-based Encoder-Decoder (AED), are designed so that the encoder is not required to alter the time-position of information from the audio sequence into the embedding; alignment to the final text output is processed during decoding. We discover that the transformer-based encoder adopted in recent years is actually capable of performing the alignment internally during the forward pass, prior to decoding. This new phenomenon enables a simpler and more efficient model, the "Aligner-Encoder". To train it, we discard the dynamic programming of RNN-T in favor of the frame-wise cross-entropy loss of AED, while the decoder employs the lighter text-only recurrence of RNN-T without learned cross-attention: it simply scans embedding frames in order from the beginning, producing one token each until predicting the end-of-message. We conduct experiments demonstrating performance remarkably close to the state of the art, including a special inference configuration enabling long-form recognition. In a representative comparison, we measure the total inference time for our model to be 2x faster than RNN-T and 16x faster than AED. Lastly, we find that the audio-text alignment is clearly visible in the self-attention weights of a certain layer, which could be said to perform "self-transduction".
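The decoding procedure the abstract describes (scan the embedding frames in order, emit one token per frame, stop at end-of-message) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: `step_fn`, `EOS`, and the reuse of the EOS id as the start token are hypothetical stand-ins for the model's decoder and its text-only recurrence.

```python
from typing import Callable, List, Sequence

EOS = 0  # hypothetical end-of-message token id (assumption, not from the paper)


def greedy_scan_decode(
    frames: Sequence[Sequence[float]],
    step_fn: Callable[[Sequence[float], int], Sequence[float]],
    eos: int = EOS,
) -> List[int]:
    """Read frames left to right, emitting one token per frame until EOS.

    step_fn(frame, previous_token) returns one score per vocabulary entry.
    There is no cross-attention: frame i is only ever used for output i.
    """
    tokens: List[int] = []
    prev = eos  # assumption: the start token reuses the EOS id
    for frame in frames:
        scores = step_fn(frame, prev)
        # Greedy choice: index of the highest-scoring token.
        token = max(range(len(scores)), key=scores.__getitem__)
        if token == eos:
            break
        tokens.append(token)
        prev = token
    return tokens


# Toy usage: each "frame" is already a score vector and the previous token
# is ignored, so the example only exercises the scan-until-EOS control flow.
toy_frames = [
    [0.1, 0.9, 0.0],
    [0.2, 0.1, 0.7],
    [0.9, 0.05, 0.05],  # EOS wins here, so decoding stops
    [0.0, 1.0, 0.0],    # never reached
]
decoded = greedy_scan_decode(toy_frames, lambda frame, prev: frame)
```

Because the loop consumes exactly one frame per output token, the encoder has to have moved each token's information to the matching frame position already, which is the alignment behavior the abstract reports.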

artificial intelligence, machine learning, proceedings, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.59)
Information Technology > Artificial Intelligence > Machine Learning (0.39)

Drones are delivering life-saving defibrillators to 911 calls

A new pilot program aims to help EMS respond quicker, not act as a replacement. When they aren't baffling the public or grounding wildfire planes, drones have some pretty solid uses. Apart from unnecessarily fast same-day deliveries, the pilotless aircraft may soon become a lifesaving emergency response tool. A collaborative team of health experts, community organizations, and universities is in the middle of a pilot program using drones and automated external defibrillators (AEDs).

andrew paul, artificial intelligence, defibrillator, (16 more...)

Popular Science

Country:

North America > United States > Virginia (0.17)
North America > United States > North Carolina (0.15)

Genre:

Research Report > Experimental Study (0.72)
Research Report > New Finding (0.69)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.51)

Enhanced Hybrid Transducer and Attention Encoder Decoder with Text Data

Tang, Yun, Kim, Eesung, Apsingekar, Vijendra Raj

arXiv.org Artificial Intelligence | Jun-25-2025

A joint speech and text optimization method is proposed for hybrid transducer and attention-based encoder decoder (TAED) modeling to leverage large text corpora and enhance ASR accuracy. The joint TAED (J-TAED) is trained with both speech and text input modalities together, while it only takes speech data as input during inference. The trained model can unify the internal representations from different modalities, and be further extended to text-based domain adaptation. It can effectively alleviate data scarcity for mismatched-domain tasks since no speech data is required. Our experiments show J-TAED successfully integrates speech and linguistic information into one model, and reduces the WER by 5.8% to 12.8% on the LibriSpeech dataset. The model is also evaluated on two out-of-domain datasets: one finance-focused and one named-entity-focused. The text-based domain adaptation brings 15.3% and 17.8% WER reduction on those two datasets respectively.
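For readers unfamiliar with the metric, word error rate (WER) is conventionally the word-level edit distance between hypothesis and reference, divided by the number of reference words. The sketch below computes it and a relative reduction between two WER values. The function names are hypothetical, and treating the reported percentages as relative reductions is an assumption; the abstract does not say whether they are relative or absolute.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref: List[str] = reference.split()
    hyp: List[str] = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        curr = [i] + [0] * len(hyp)
        for j in range(1, len(hyp) + 1):
            substitution = prev[j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer


# Toy usage: one substitution ("sat" -> "sit") and one deletion ("the")
# against a six-word reference.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Under the relative reading, a baseline WER of 10.0% dropping to 8.5% would count as a 15% reduction.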

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2506.19159

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers

Neural Information Processing Systems | May-27-2025, 13:42:37 GMT

aligner-encoder, self-attention transformer, self-transducer, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.62)
Information Technology > Artificial Intelligence > Machine Learning (0.42)

Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers

Stooke, Adam, Prabhavalkar, Rohit, Sim, Khe Chai, Mengibar, Pedro Moreno

arXiv.org Artificial Intelligence | Feb-6-2025

Modern systems for automatic speech recognition, including the RNN-Transducer and Attention-based Encoder-Decoder (AED), are designed so that the encoder is not required to alter the time-position of information from the audio sequence into the embedding; alignment to the final text output is processed during decoding. We discover that the transformer-based encoder adopted in recent years is actually capable of performing the alignment internally during the forward pass, prior to decoding. This new phenomenon enables a simpler and more efficient model, the "Aligner-Encoder". To train it, we discard the dynamic programming of RNN-T in favor of the frame-wise cross-entropy loss of AED, while the decoder employs the lighter text-only recurrence of RNN-T without learned cross-attention -- it simply scans embedding frames in order from the beginning, producing one token each until predicting the end-of-message. We conduct experiments demonstrating performance remarkably close to the state of the art, including a special inference configuration enabling long-form recognition. In a representative comparison, we measure the total inference time for our model to be 2x faster than RNN-T and 16x faster than AED. Lastly, we find that the audio-text alignment is clearly visible in the self-attention weights of a certain layer, which could be said to perform "self-transduction".

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2502.05232

Country:

North America > United States (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > South Korea > Incheon > Incheon (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Enhancing CTC-based speech recognition with diverse modeling units

Han, Shiyi, Lei, Zhihong, Xu, Mingbin, Na, Xingyu, Huang, Zhen

arXiv.org Artificial Intelligence | Jun-11-2024

In recent years, the evolution of end-to-end (E2E) automatic speech recognition (ASR) models has been remarkable, largely due to advances in deep learning architectures such as the Transformer. On top of E2E systems, researchers have achieved substantial accuracy improvements by rescoring an E2E model's N-best hypotheses with a phoneme-based model. This raises an interesting question about where the improvements come from other than the system combination effect. We examine the underlying mechanisms driving these gains and propose an efficient joint training approach, where E2E models are trained jointly with diverse modeling units. This methodology not only aligns the strengths of both phoneme and grapheme-based models but also reveals that using these diverse modeling units in a synergistic way can significantly enhance model accuracy. Our findings offer new insights into the optimal integration of heterogeneous modeling units in the development of more robust and accurate ASR systems.

recognition, representation, speech recognition, (15 more...)

arXiv.org Artificial Intelligence

2406.03274

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Germany > Berlin (0.04)
(7 more...)

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future

Klie, Jan-Christoph, Webber, Bonnie, Gurevych, Iryna

arXiv.org Artificial Intelligence | Sep-25-2022

Annotated data is an essential ingredient in natural language processing for training and evaluating machine learning models. It is therefore very desirable for the annotations to be of high quality. Recent work, however, has shown that several popular datasets contain a surprising number of annotation errors or inconsistencies. To alleviate this issue, many methods for annotation error detection have been devised over the years. While researchers show that their approaches work well on their newly introduced datasets, they rarely compare their methods to previous work or on the same datasets. This raises strong concerns about the methods' general performance and makes it difficult to assess their strengths and weaknesses. We therefore reimplement 18 methods for detecting potential annotation errors and evaluate them on 9 English datasets for text classification as well as token and span labeling. In addition, we define a uniform evaluation setup including a new formalization of the annotation error detection task, evaluation protocol and general best practices. To facilitate future research and reproducibility, we release our datasets and implementations in an easy-to-use and open source software package.

machine learning, natural language, text classification, (18 more...)

arXiv.org Artificial Intelligence

2206.0228

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Greater London > London > Wimbledon (0.04)
(48 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Education (1.00)
Leisure & Entertainment > Sports (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.67)
(2 more...)

Drone saves the life of man, 71, suffering a heart attack by delivering defibrillator to his home

Daily Mail - Science & tech | Jan-7-2022, 22:42:51 GMT

A 71-year-old Swedish man who suffered a heart attack while shoveling snow in his driveway was saved by an unlikely hero - a delivery drone. Sven, a retiree who asked for his last name to be withheld, collapsed outside his home in the western town of Trollhättan in early December. Within moments of receiving the call from Sven's wife, emergency services dispatched the unmanned aerial vehicle carrying an AED, or automated external defibrillator, which arrived in less than four minutes. The system, called Emergency Medical Aerial Delivery (EMADE), was developed by Everdrones to assist patients within 10 minutes of experiencing cardiac arrest. 'Everything from the first 112 call to the drone getting the signal to start and go took about 15-30 seconds and then the whole process took about three and a half minutes,' Sven told AFP.

defibrillator, drone, heart attack, (13 more...)

Daily Mail - Science & tech

Country: Europe > Sweden (0.07)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.74)

A New First Responder: How Drones May Revolutionize Healthcare

#artificialintelligence | Sep-7-2021, 00:30:17 GMT

A new article published last week in the European Heart Journal discusses the use of drones for delivering life-saving automated external defibrillators (AED) to out-of-hospital cardiac arrest (OHCA) patients. As the study describes, "Early treatment in line with the 'chain-of-survival' concept such as cardiopulmonary resuscitation (CPR) and defibrillation by an automated external defibrillator (AED) prior to ambulance arrival is associated with increased survival. Use of AEDs in the early-cardiac-arrest electrical phase can increase survival rates to up to 50–70%. Although hundreds of thousands of AEDs are available in high-income countries, their accessibility and use are still low." Thus, the investigators of the study designed a system to deploy drones to real-life suspected OHCA patients in order to determine whether this was a viable solution to the accessibility problem.

delivery, drone, drone technology, (15 more...)

#artificialintelligence

Country:

Africa > Ghana > Greater Accra > Accra (0.06)
South America > Brazil > Minas Gerais > Belo Horizonte (0.05)
Europe > Germany > Saxony > Leipzig (0.05)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)

Reinforcement Learning for Robot Navigation with Adaptive Execution Duration (AED) in a Semi-Markov Model

Chen, Yu'an, Ye, Ruosong, Tao, Ziyang, Liu, Hongjian, Chen, Guangda, Peng, Jie, Ma, Jun, Zhang, Yu, Zhang, Yanyong, Ji, Jianmin

arXiv.org Artificial Intelligence | Aug-13-2021

Deep reinforcement learning (DRL) algorithms have proven effective in robot navigation, especially in unknown environments, through directly mapping perception inputs into robot control commands. Most existing methods adopt uniform execution duration with robots taking commands at fixed intervals. As such, the length of execution duration becomes a crucial parameter to the navigation algorithm. In particular, if the duration is too short, then the navigation policy would be executed at a high frequency, with increased training difficulty and high computational cost. Meanwhile, if the duration is too long, then the policy becomes unable to handle complex situations, like those with crowded obstacles. It is thus tricky to find the "sweet" duration range; some duration values may cause a DRL model to fail to find a navigation path. In this paper, we propose to employ adaptive execution duration to overcome this problem. Specifically, we formulate the navigation task as a Semi-Markov Decision Process (SMDP) problem to handle adaptive execution duration. We also improve the distributed proximal policy optimization (DPPO) algorithm and provide its theoretical guarantee for the specified SMDP problem. We evaluate our approach both in the simulator and on an actual robot. The results show that our approach outperforms the other DRL-based method (with fixed execution duration) by 10.3% in terms of the navigation success rate.
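In a semi-Markov decision process each action may last a different number of time steps, so discounting is usually applied over elapsed time and not per decision. The sketch below shows that idea only. It assumes each decision yields one lumped reward at the moment it is taken, the names are hypothetical, and it is not the paper's modified DPPO algorithm.

```python
from typing import Iterable, Tuple


def smdp_discounted_return(
    decisions: Iterable[Tuple[float, int]], gamma: float
) -> float:
    """Discounted return where each decision is (reward, duration_in_steps).

    A reward is discounted by gamma raised to the time elapsed before its
    decision, so long-running actions push later rewards further away.
    """
    total = 0.0
    elapsed = 0  # time steps consumed by all earlier decisions
    for reward, duration in decisions:
        if duration < 1:
            raise ValueError("duration must be at least one time step")
        total += (gamma ** elapsed) * reward
        elapsed += duration
    return total


# Toy usage: the first command runs for 2 steps, so the second reward is
# discounted by gamma**2 rather than gamma.
adaptive = smdp_discounted_return([(1.0, 2), (1.0, 1)], gamma=0.5)
# With every duration fixed to 1 this reduces to the ordinary MDP return.
fixed = smdp_discounted_return([(1.0, 1), (1.0, 1), (1.0, 1)], gamma=0.5)
```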

adaptive execution duration, reinforcement learning, semi-markov model, (2 more...)

arXiv.org Artificial Intelligence

2108.06161

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)
