AITopics | Jin, Hongxia

Plotting

Jin, Hongxia

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Robust Semantic Frame Parsing Pipeline on a New Complex Twitter Dataset

Wang, Yu, Jin, Hongxia

arXiv.org Artificial IntelligenceDec-17-2022

Most recent semantic frame parsing systems for spoken language understanding (SLU) are designed based on recurrent neural networks. These systems display decent performance on benchmark SLU datasets such as ATIS or SNIPS, which contain short utterances with relatively simple patterns. However, the current semantic frame parsing models lack a mechanism to handle out-of-distribution (\emph{OOD}) patterns and out-of-vocabulary (\emph{OOV}) tokens. In this paper, we introduce a robust semantic frame parsing pipeline that can handle both \emph{OOD} patterns and \emph{OOV} tokens in conjunction with a new complex Twitter dataset that contains long tweets with more \emph{OOD} patterns and \emph{OOV} tokens. The new pipeline demonstrates much better results in comparison to state-of-the-art baseline SLU models on both the SNIPS dataset and the new Twitter dataset (Our new Twitter dataset can be downloaded from https://1drv.ms/u/s!AroHb-W6_OAlavK4begsDsMALfE?e=c8f2XX ). Finally, we also build an E2E application to demo the feasibility of our algorithm and show why it is useful in real application.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2212.08987

Genre: Research Report (0.40)

Industry: Energy > Oil & Gas (0.37)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Numerical Optimizations for Weighted Low-rank Estimation on Language Model

Hua, Ting, Hsu, Yen-Chang, Wang, Felicity, Lou, Qian, Shen, Yilin, Jin, Hongxia

arXiv.org Artificial IntelligenceDec-15-2022

Singular value decomposition (SVD) is one of the most popular compression methods that approximate a target matrix with smaller matrices. However, standard SVD treats the parameters within the matrix with equal importance, which is a simple but unrealistic assumption. The parameters of a trained neural network model may affect task performance unevenly, which suggests non-equal importance among the parameters. Compared to SVD, the decomposition method aware of parameter importance is the more practical choice in real cases. Unlike standard SVD, weighted value decomposition is a non-convex optimization problem that lacks a closed-form solution. We systematically investigated multiple optimization strategies to tackle the problem and examined our method by compressing Transformer-based language models. Further, we designed a metric to predict when the SVD may introduce a significant performance drop, for which our method can be a rescue strategy. The extensive evaluations demonstrate that our method can perform better than current SOTA methods in compressing Transformer-based language models.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2211.09718

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Add feedback

Hyperparameter-free Continuous Learning for Domain Classification in Natural Language Understanding

Hua, Ting, Shen, Yilin, Zhao, Changsheng, Hsu, Yen-Chang, Jin, Hongxia

arXiv.org Artificial IntelligenceJan-4-2022

Domain classification is the fundamental task in natural language understanding (NLU), which often requires fast accommodation to new emerging domains. This constraint makes it impossible to retrain all previous domains, even if they are accessible to the new model. Most existing continual learning approaches suffer from low accuracy and performance fluctuation, especially when the distributions of old and new data are significantly different. In fact, the key real-world problem is not the absence of old data, but the inefficiency to retrain the model with the whole old dataset. Is it potential to utilize some old data to yield high accuracy and maintain stable performance, while at the same time, without introducing extra hyperparameters? In this paper, we proposed a hyperparameter-free continual learning model for text data that can stably produce high performance under various environments. Specifically, we utilize Fisher information to select exemplars that can "record" key information of the original model. Also, a novel scheme called dynamical weight consolidation is proposed to enable hyperparameter-free learning during the retrain process. Extensive experiments demonstrate that baselines suffer from fluctuated performance and therefore useless in practice. On the contrary, our proposed model CCFI significantly and consistently outperforms the best state-of-the-art method by up to 20% in average accuracy, and each component of CCFI contributes effectively to overall performance.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.18653/v1/2021.naacl-main.212

2201.0142

Country: Asia > China (0.14)

Genre: Research Report > Promising Solution (0.54)

Industry: Education > Educational Setting > Continuing Education (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Understanding (0.61)

Add feedback

ISEEQ: Information Seeking Question Generation using Dynamic Meta-Information Retrieval and Knowledge Graphs

Gaur, Manas, Gunaratna, Kalpa, Srinivasan, Vijay, Jin, Hongxia

arXiv.org Artificial IntelligenceDec-12-2021

Conversational Information Seeking (CIS) is a relatively new research area within conversational AI that attempts to seek information from end-users in order to understand and satisfy users' needs. If realized, such a system has far-reaching benefits in the real world; for example, a CIS system can assist clinicians in pre-screening or triaging patients in healthcare. A key open sub-problem in CIS that remains unaddressed in the literature is generating Information Seeking Questions (ISQs) based on a short initial query from the end-user. To address this open problem, we propose Information SEEking Question generator (ISEEQ), a novel approach for generating ISQs from just a short user query, given a large text corpus relevant to the user query. Firstly, ISEEQ uses a knowledge graph to enrich the user query. Secondly, ISEEQ uses the knowledge-enriched query to retrieve relevant context passages to ask coherent ISQs adhering to a conceptual flow. Thirdly, ISEEQ introduces a new deep generative-adversarial reinforcement learning-based approach for generating ISQs. We show that ISEEQ can generate high-quality ISQs to promote the development of CIS agents. ISEEQ significantly outperforms comparable baselines on five ISQ evaluation metrics across four datasets having user queries from diverse domains. Further, we argue that ISEEQ is transferable across domains for generating ISQs, as it shows the acceptable performance when trained and tested on different pairs of domains. The qualitative human evaluation confirms ISEEQ-generated ISQs are comparable in quality to human-generated questions and outperform the best comparable baseline.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2112.07622

Country: North America > United States (0.28)

Genre:

Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Industry: Health & Medicine > Health Care Providers & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Enhancing the Generalization for Intent Classification and Out-of-Domain Detection in SLU

Shen, Yilin, Hsu, Yen-Chang, Ray, Avik, Jin, Hongxia

arXiv.org Artificial IntelligenceJun-28-2021

Intent classification is a major task in spoken language understanding (SLU). Since most models are built with pre-collected in-domain (IND) training utterances, their ability to detect unsupported out-of-domain (OOD) utterances has a critical effect in practical use. Recent works have shown that using extra data and labels can improve the OOD detection performance, yet it could be costly to collect such data. This paper proposes to train a model with only IND data while supporting both IND intent classification and OOD detection. Our method designs a novel domain-regularized module (DRM) to reduce the overconfident phenomenon of a vanilla classifier, achieving a better generalization in both cases. Besides, DRM can be used as a drop-in replacement for the last layer in any neural network-based intent classifier, providing a low-cost strategy for a significant improvement. The evaluation on four datasets shows that our method built on BERT and RoBERTa models achieves state-of-the-art performance against existing approaches and the strong baselines we created for the comparisons.

artificial intelligence, generalization, intent classification and out-of-domain detection, (2 more...)

arXiv.org Artificial Intelligence

2106.14464

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Speech (0.53)

Add feedback

An Adversarial Learning based Multi-Step Spoken Language Understanding System through Human-Computer Interaction

Wang, Yu, Shen, Yilin, Jin, Hongxia

arXiv.org Artificial IntelligenceJun-5-2021

Most of the existing spoken language understanding systems can perform only semantic frame parsing based on a single-round user query. They cannot take users' feedback to update/add/remove slot values through multiround interactions with users. In this paper, we introduce a novel multi-step spoken language understanding system based on adversarial learning that can leverage the multiround user's feedback to update slot values. We perform two experiments on the benchmark ATIS dataset and demonstrate that the new system can improve parsing performance by at least $2.5\%$ in terms of F1, with only one round of feedback. The improvement becomes even larger when the number of feedback rounds increases. Furthermore, we also compare the new system with state-of-the-art dialogue state tracking systems and demonstrate that the new interactive system can perform better on multiround spoken language understanding tasks in terms of slot- and sentence-level accuracy.

artificial intelligence, natural language, speech recognition, (2 more...)

arXiv.org Artificial Intelligence

2106.14611

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback

A Coarse to Fine Question Answering System based on Reinforcement Learning

Wang, Yu, Jin, Hongxia

arXiv.org Artificial IntelligenceJun-1-2021

In this paper, we present a coarse to fine question answering (CFQA) system based on reinforcement learning which can efficiently processes documents with different lengths by choosing appropriate actions. The system is designed using an actor-critic based deep reinforcement learning model to achieve multi-step question answering. Compared to previous QA models targeting on datasets mainly containing either short or long documents, our multi-step coarse to fine model takes the merits from multiple system modules, which can handle both short and long documents. The system hence obtains a much better accuracy and faster trainings speed compared to the current state-of-the-art models. We test our model on four QA datasets, WIKEREADING, WIKIREADING LONG, CNN and SQuAD, and demonstrate 1.3$\%$-1.7$\%$ accuracy improvements with 1.5x-3.4x training speed-ups in comparison to the baselines using state-of-the-art models.

dataset, deep learning, neural network, (21 more...)

arXiv.org Artificial Intelligence

2106.00257

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Negative Data Augmentation

Sinha, Abhishek, Ayush, Kumar, Song, Jiaming, Uzkent, Burak, Jin, Hongxia, Ermon, Stefano

arXiv.org Artificial IntelligenceFeb-9-2021

Data augmentation is often used to enlarge datasets with synthetic samples generated in accordance with the underlying data distribution. To enable a wider range of augmentations, we explore negative data augmentation strategies (NDA) that intentionally create out-of-distribution samples. We show that such negative out-of-distribution samples provide information on the support of the data distribution, and can be leveraged for generative modeling and representation learning. We introduce a new GAN training objective where we use NDA as an additional source of synthetic data for the discriminator. We prove that under suitable conditions, optimizing the resulting objective still recovers the true data distribution but can directly bias the generator towards avoiding samples that lack the desired structure. Empirically, models trained with our method achieve improved conditional/unconditional image generation along with improved anomaly detection capabilities. Further, we incorporate the same negative data augmentation strategy in a contrastive learning framework for self-supervised representation learning on images and videos, achieving improved performance on downstream image classification, object detection, and action recognition tasks. These results suggest that prior knowledge on what does not constitute valid data is an effective form of weak supervision across a range of unsupervised learning tasks. Data augmentation strategies for synthesizing new data in a way that is consistent with an underlying task are extremely effective in both supervised and unsupervised learning (Oord et al., 2018; Zhang et al., 2016; Noroozi & Favaro, 2016; Asano et al., 2019).

deep learning, neural network, representation, (20 more...)

arXiv.org Artificial Intelligence

2102.05113

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Modeling Token-level Uncertainty to Learn Unknown Concepts in SLU via Calibrated Dirichlet Prior RNN

Shen, Yilin, Chen, Wenhu, Jin, Hongxia

arXiv.org Artificial IntelligenceOct-15-2020

One major task of spoken language understanding (SLU) in modern personal assistants is to extract semantic concepts from an utterance, called slot filling. Although existing slot filling models attempted to improve extracting new concepts that are not seen in training data, the performance in practice is still not satisfied. Recent research collected question and answer annotated data to learn what is unknown and should be asked, yet not practically scalable due to the heavy data collection effort. In this paper, we incorporate softmax-based slot filling neural architectures to model the sequence uncertainty without question supervision. We design a Dirichlet Prior RNN to model high-order uncertainty by degenerating as softmax layer for RNN model training. To further enhance the uncertainty modeling robustness, we propose a novel multi-task training to calibrate the Dirichlet concentration parameters. We collect unseen concepts to create two test datasets from SLU benchmark datasets Snips and ATIS. On these two and another existing Concept Learning benchmark datasets, we show that our approach significantly outperforms state-of-the-art approaches by up to 8.18%. Our method is generic and can be applied to any RNN or Transformer based slot filling models with a softmax layer.

deep learning, neural network, rnn, (19 more...)

arXiv.org Artificial Intelligence

2010.08101

Country:

North America > United States > California > Santa Clara County (0.14)
North America > United States > California > Santa Barbara County > Santa Barbara (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > Experimental Study (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Complex KBQA System using Multiple Reasoning Paths

Qin, Kechen, Wang, Yu, Li, Cheng, Gunaratna, Kalpa, Jin, Hongxia, Pavlu, Virgil, Aslam, Javed A.

arXiv.org Machine LearningMay-21-2020

Multi-hop knowledge based question answering (KBQA) is a complex task for natural language understanding. Many KBQA approaches have been proposed in recent years, and most of them are trained based on labeled reasoning path. This hinders the system's performance as many correct reasoning paths are not labeled as ground truth, and thus they cannot be learned. In this paper, we introduce an end-to-end KBQA system which can leverage multiple reasoning paths' information and only requires labeled answer as supervision. We conduct experiments on several benchmark datasets containing both single-hop simple questions as well as muti-hop complex questions, including WebQuestionSP (WQSP), ComplexWebQuestion-1.1 (CWQ), and PathQuestion-Large (PQL), and demonstrate strong performance.

artificial intelligence, inductive learning, reasoning path, (19 more...)

arXiv.org Machine Learning

2005.1097

Country:

Europe (0.93)
North America > United States > California (0.14)
North America > United States > Texas (0.14)
(2 more...)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Add feedback