Improving the Performance of the LSTM and HMM Models via Hybridization

Liu, Larkin, Lin, Yu-Chung, Reid, Joshua

Jul-9-2019–arXiv.org Machine Learning

Language modelling is integral to understanding the nature of language and capturing its meaning. To improve machine understanding of language using sequential models, we explore two prominent approaches to statistical language modelling: the Hidden Markov Model (HMM) and a Recurrent Neural Network (RNN) architecture commonly known as Long Short-Term Memory (LSTM). Under a discrete stochastic modelling framework, HMMs were first introduced in Rabiner [1] to classify speech signals. First used to automate AT&T's voice-activated call center, the revolutionary technology allowed computers to robustly characterise speech and form a basic understanding of spoken words. HMMs have since become a definitive benchmark for the state of the art in speech recognition and text recognition. Around the same period, RNNs were introduced by Rumelhart et al. [2]; however, the training complexity of the model was far too high for the hardware capabilities of the time. In the 21st century, the introduction of more advanced hardware for deep learning model training brought a wave of applications of the RNN to voice and text recognition [3], [4], [5] and machine translation [6]. In parallel, an early form of neural language model was developed in Bengio et al. [7], displaying promising results in statistical language modelling. LSTMs were first introduced in Hochreiter and Schmidhuber [8], specifically to combat the vanishing gradient problem, which will be further addressed in Section 1.2.

artificial intelligence, deep learning, machine learning

Country:
- North America
  - United States > Massachusetts
    - Middlesex County > Cambridge (0.04)
  - Canada > Ontario
    - Toronto (0.29)
- Asia > Middle East
  - Qatar > Ad-Dawhah > Doha (0.04)

Genre:
- Overview (0.67)
- Research Report (0.65)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks > Deep Learning (1.00)
  - Learning Graphical Models > Undirected Networks
    - Markov Models (0.94)
