Ma, Yukun
Adaptive Knowledge Distillation between Text and Speech Pre-trained Models
Ni, Jinjie, Ma, Yukun, Wang, Wen, Chen, Qian, Ng, Dianwen, Lei, Han, Nguyen, Trung Hieu, Zhang, Chong, Ma, Bin, Cambria, Erik
Learning from massive speech corpora underlies the recent success of many self-supervised speech models. With knowledge distillation, these models may also benefit from the knowledge encoded by language models pre-trained on rich text sources. The distillation process, however, is challenging due to the modal disparity between the textual and speech embedding spaces. This paper studies metric-based distillation to align the text and speech embedding spaces with only a small amount of data and without modifying the model structure. Because prior work overlooks the semantic and granularity gaps between text and speech, which impairs distillation, we propose the Prior-informed Adaptive knowledge Distillation (PAD), which adaptively leverages text/speech units of variable granularity and prior distributions to achieve better global and local alignment between text and speech pre-trained models. We evaluate PAD on three spoken language understanding benchmarks and show that it transfers linguistic knowledge more effectively than other metric-based distillation approaches.
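To make the metric-based distillation idea concrete, the following is a minimal sketch, not the PAD method itself: it aligns pooled embeddings from a speech student to those of a frozen text teacher through a learned projection and a cosine-distance loss. The projection module, embedding dimensions, and the pooled-embedding stand-ins are all assumptions for illustration; PAD's prior-informed, variable-granularity alignment is not reproduced here.

```python
# Minimal sketch of metric-based cross-modal distillation (NOT PAD itself):
# align a speech student's pooled utterance embeddings to a frozen text
# teacher's embeddings of the paired transcripts. Dimensions are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Projector(nn.Module):
    """Maps speech embeddings into the text embedding space."""
    def __init__(self, speech_dim: int, text_dim: int):
        super().__init__()
        self.proj = nn.Linear(speech_dim, text_dim)

    def forward(self, x):
        return self.proj(x)

def alignment_loss(speech_emb, text_emb):
    """Cosine-distance loss between paired utterance/transcript embeddings."""
    speech_emb = F.normalize(speech_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    return (1.0 - (speech_emb * text_emb).sum(dim=-1)).mean()

# --- toy training step (batch size and dims are assumptions) ---
speech_dim, text_dim, batch = 768, 1024, 8
projector = Projector(speech_dim, text_dim)
opt = torch.optim.Adam(projector.parameters(), lr=1e-4)

# stand-ins for pooled encoder outputs; in practice the speech student's
# parameters would also receive gradients through this loss
speech_emb = torch.randn(batch, speech_dim)      # speech student output
with torch.no_grad():
    text_emb = torch.randn(batch, text_dim)      # frozen text teacher output

loss = alignment_loss(projector(speech_emb), text_emb)
loss.backward()
opt.step()
```

Only the small projector is introduced on top of the two encoders, consistent with the abstract's constraint of aligning the spaces without modifying the model structure.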
Learning Classifiers on Positive and Unlabeled Data with Policy Gradient
Li, Tianyu, Wang, Chien-Chih, Ma, Yukun, Ortal, Patricia, Zhao, Qifang, Stenger, Bjorn, Hirate, Yu
Existing algorithms aiming to learn a binary classifier from positive (P) and unlabeled (U) data require estimating the class prior or label noise before building a classification model. However, the estimation and classifier learning are normally conducted in a pipeline instead of being jointly optimized. In this paper, we propose to alternately train the two steps using reinforcement learning. Our proposal adopts a policy network to adaptively make assumptions on the labels of unlabeled data, while a classifier is built upon the output of the policy network and provides rewards to learn a better policy. The dynamic and interactive training between the policy maker and the classifier can exploit the unlabeled data more effectively and yield a significant improvement in classification performance. Furthermore, we present two different approaches to represent the actions taken by the policy: the first considers continuous actions as soft labels, while the other uses discrete actions as hard label assignments for unlabeled examples. We validate the effectiveness of the proposed method on two public benchmark datasets as well as one e-commerce dataset. The results show that the proposed method consistently outperforms state-of-the-art methods in various settings.

PU learning refers to the problem of learning from a dataset where only a subset of examples is positively labeled and the rest are not annotated at all. It is a critical task due to its prevalence in various real-world applications [1], [2], [3]. In many common situations only positive data are available; for instance, an e-commerce website may only record users who have clicked on advertisements or purchased items. Meanwhile, it is not possible to simply assume that unlabeled instances are negative.
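As an illustration of the alternating scheme the abstract describes, here is a toy sketch, not the paper's implementation: a policy network samples hard (discrete) labels for the unlabeled examples, a classifier is trained on the positives plus the policy-assigned labels, and the classifier's recall on the known positives serves as a stand-in reward for a REINFORCE update. The reward definition, network sizes, and synthetic data are all assumptions.

```python
# Toy sketch of alternating policy/classifier training for PU data
# (NOT the paper's implementation); hard-label (discrete-action) variant.
import torch
import torch.nn as nn

dim = 16
policy = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, 1))
clf = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, 1))
opt_p = torch.optim.Adam(policy.parameters(), lr=1e-3)
opt_c = torch.optim.Adam(clf.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

x_pos = torch.randn(64, dim) + 1.0   # labeled positives (synthetic)
x_unl = torch.randn(256, dim)        # unlabeled pool (synthetic)

for step in range(100):
    # 1) policy samples hard labels for the unlabeled examples
    dist = torch.distributions.Bernoulli(logits=policy(x_unl).squeeze(-1))
    actions = dist.sample()                      # discrete 0/1 actions
    log_prob = dist.log_prob(actions).sum()

    # 2) classifier step on positives + policy-labeled unlabeled data
    opt_c.zero_grad()
    logits = clf(torch.cat([x_pos, x_unl])).squeeze(-1)
    targets = torch.cat([torch.ones(len(x_pos)), actions.detach()])
    bce(logits, targets).backward()
    opt_c.step()

    # 3) stand-in reward: how well the classifier recovers known positives
    with torch.no_grad():
        reward = (clf(x_pos).squeeze(-1) > 0).float().mean()

    # 4) REINFORCE update of the policy using the classifier's reward
    opt_p.zero_grad()
    (-reward * log_prob).backward()
    opt_p.step()
```

The soft-label (continuous-action) variant described in the abstract would instead feed the policy's probabilities directly as training targets for the classifier rather than sampling hard assignments.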
Targeted Aspect-Based Sentiment Analysis via Embedding Commonsense Knowledge into an Attentive LSTM
Ma, Yukun (Nanyang Technological University) | Peng, Haiyun (Nanyang Technological University) | Cambria, Erik (Nanyang Technological University)
Analyzing people's opinions and sentiments towards particular aspects is an important task in natural language understanding. In this paper, we propose a novel solution to targeted aspect-based sentiment analysis, which tackles the challenges of both aspect-based sentiment analysis and targeted sentiment analysis by exploiting commonsense knowledge. We augment the long short-term memory (LSTM) network with a hierarchical attention mechanism consisting of a target-level attention and a sentence-level attention. Commonsense knowledge of sentiment-related concepts is incorporated into the end-to-end training of a deep neural network for sentiment classification. To tightly integrate this commonsense knowledge into the recurrent encoder, we propose an extension of the LSTM, termed Sentic LSTM. Experiments on two publicly released datasets show that the combination of the proposed attention architecture and Sentic LSTM outperforms state-of-the-art methods on targeted aspect-based sentiment tasks.
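For a concrete picture of the two-level attention, below is a compact sketch, not Sentic LSTM itself (the commonsense-gated cell is omitted): target-level attention pools the hidden states of the target mention, and the resulting target vector conditions a sentence-level attention over all hidden states. The bilinear scoring form and all dimensions are assumptions for illustration.

```python
# Compact sketch of hierarchical (target-level + sentence-level) attention
# over a BiLSTM encoder; the commonsense-gated Sentic LSTM cell is omitted.
import torch
import torch.nn as nn

class HierAttnClassifier(nn.Module):
    def __init__(self, emb_dim=100, hid=64, n_classes=3):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hid, batch_first=True, bidirectional=True)
        self.tgt_score = nn.Linear(2 * hid, 1)                # target-level
        self.sent_score = nn.Bilinear(2 * hid, 2 * hid, 1)    # sentence-level

        self.out = nn.Linear(2 * hid, n_classes)

    def forward(self, emb, tgt_mask):
        # emb: (B, T, emb_dim); tgt_mask: (B, T), 1 on target tokens
        h, _ = self.lstm(emb)                                 # (B, T, 2*hid)
        # target-level attention restricted to target tokens
        a_t = self.tgt_score(h).squeeze(-1).masked_fill(tgt_mask == 0, -1e9)
        tgt_vec = (torch.softmax(a_t, -1).unsqueeze(-1) * h).sum(1)
        # sentence-level attention conditioned on the target vector
        q = tgt_vec.unsqueeze(1).expand_as(h)
        a_s = self.sent_score(h, q.contiguous()).squeeze(-1)
        sent_vec = (torch.softmax(a_s, -1).unsqueeze(-1) * h).sum(1)
        return self.out(sent_vec)

# toy usage: random embeddings stand in for word (+ concept) embeddings
model = HierAttnClassifier()
emb = torch.randn(2, 12, 100)
mask = torch.zeros(2, 12)
mask[:, 3:5] = 1                     # assumed target span: tokens 3-4
logits = model(emb, mask)            # (2, n_classes)
```

In the paper's full model, embeddings of sentiment-related concepts from a commonsense resource would additionally gate the recurrent cell; here plain attention over BiLSTM states stands in for that mechanism.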