Collaborating Authors: Lane, Ian


Don't Believe Everything You Read: Enhancing Summarization Interpretability through Automatic Identification of Hallucinations in Large Language Models

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are adept at text manipulation tasks such as machine translation and text summarization. However, these models are also prone to hallucination, which can undermine the faithfulness of any answers they provide. Recent work on combating hallucination in LLMs focuses on identifying hallucinated sentences and categorizing the different ways in which models hallucinate. This paper takes a deep dive into LLM behavior with respect to hallucinations, defines a token-level approach to identifying different kinds of hallucinations, and uses this token-level tagging to improve the interpretability and faithfulness of LLMs in dialogue summarization tasks. Through this, the paper presents a new, enhanced dataset and a new training paradigm.
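
To make the token-level idea concrete, here is a minimal Python sketch that tags each summary token as supported or potentially hallucinated. It uses a naive lexical-support heuristic rather than the paper's trained tagger, and the label names, example dialogue, and normalization are illustrative assumptions; a real system would use a learned sequence-labeling model.

    def normalize(token):
        # Illustrative normalization: lowercase and strip edge punctuation.
        return token.lower().strip(".,:;!?'\"")

    def tag_tokens(source, summary):
        # Flag a summary token as potentially hallucinated if no token in the
        # source dialogue matches it (a crude stand-in for a trained tagger).
        source_vocab = {normalize(t) for t in source.split()}
        return [(t, "O" if normalize(t) in source_vocab else "HALLUCINATED")
                for t in summary.split()]

    dialogue = "Alice: I'll meet Bob at noon. Bob: See you at the cafe."
    summary = "Alice will meet Bob at midnight at the cafe."
    for token, label in tag_tokens(dialogue, summary):
        print(f"{token}\t{label}")

On this toy pair, "midnight" is correctly flagged as a factual hallucination, while "will" shows the heuristic's noise (it paraphrases "I'll"); closing exactly that gap is what a learned token-level tagger is for.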


Understanding and Improving Recurrent Networks for Human Activity Recognition by Continuous Attention

arXiv.org Artificial Intelligence

Deep neural networks, including recurrent networks, have been successfully applied to human activity recognition. Unfortunately, the final representation learned by a recurrent network may encode noise (irrelevant signal components, unimportant sensor modalities, etc.). Moreover, recurrent networks are difficult to interpret, which makes it hard to gain insight into a model's behavior. To address these issues, we propose two attention models for human activity recognition: temporal attention and sensor attention. These two mechanisms adaptively focus on important signals and sensor modalities. To further improve understandability and the mean F1 score, we add continuity constraints, considering that continuous sensor signals are more robust than discrete ones. We evaluate the approaches on three datasets and obtain state-of-the-art results. Furthermore, qualitative analysis shows that the attention learned by the models agrees well with human intuition.
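
As a rough illustration of the temporal-attention idea, the following PyTorch sketch scores one attention weight per time step over LSTM states and adds a continuity penalty on adjacent weights. The layer sizes, penalty weight, and random data are assumptions for self-containment, not the paper's configuration.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TemporalAttentionHAR(nn.Module):
        def __init__(self, n_sensors=9, hidden=64, n_classes=6):
            super().__init__()
            self.lstm = nn.LSTM(n_sensors, hidden, batch_first=True)
            self.attn = nn.Linear(hidden, 1)      # one score per time step
            self.classifier = nn.Linear(hidden, n_classes)

        def forward(self, x):                     # x: (batch, time, sensors)
            h, _ = self.lstm(x)                   # (batch, time, hidden)
            a = torch.softmax(self.attn(h).squeeze(-1), dim=1)  # (batch, time)
            context = (a.unsqueeze(-1) * h).sum(dim=1)  # attention-weighted sum
            # Continuity constraint: penalize jumps between adjacent attention
            # weights so the model focuses on contiguous spans of the signal.
            continuity = ((a[:, 1:] - a[:, :-1]) ** 2).sum(dim=1).mean()
            return self.classifier(context), continuity

    model = TemporalAttentionHAR()
    logits, penalty = model(torch.randn(4, 100, 9))   # 4 windows, 100 steps
    loss = F.cross_entropy(logits, torch.randint(0, 6, (4,))) + 0.1 * penalty

Sensor attention would follow the same pattern, with the softmax taken over the sensor (modality) dimension instead of time.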


Customized Nonlinear Bandits for Online Response Selection in Neural Conversation Models

AAAI Conferences

Dialog response selection is an important step toward natural response generation in conversational agents. Existing work on neural conversational models mainly focuses on offline supervised learning using a large set of context-response pairs. In this paper, we focus on online learning of response selection in retrieval-based dialog systems. We propose a contextual multi-armed bandit model with a nonlinear reward function that uses distributed representations of text for online response selection. A bidirectional LSTM produces the distributed representations of the dialog context and candidate responses, which serve as the input to the contextual bandit. To learn the bandit, we propose a customized Thompson sampling method that is applied to a polynomial feature space to approximate the reward. Experimental results on the Ubuntu Dialogue Corpus demonstrate significant performance gains of the proposed method over conventional linear contextual bandits. Moreover, we report encouraging response selection performance of the proposed neural bandit model, measured by Recall@k, for a small set of online training samples.
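
The numpy sketch below shows Thompson sampling with a Bayesian linear reward model over degree-2 polynomial features, which is one plausible reading of the nonlinear reward described above. The embedding dimension, priors, and synthetic rewards are assumptions, and random vectors stand in for the paper's BiLSTM representations.

    import numpy as np

    rng = np.random.default_rng(0)
    d_embed, noise_var = 4, 0.25

    def poly_features(x):
        # Degree-2 expansion: bias, raw features, and all pairwise products.
        pairs = np.outer(x, x)[np.triu_indices(len(x))]
        return np.concatenate(([1.0], x, pairs))

    d = len(poly_features(np.zeros(d_embed)))
    A = np.eye(d)                     # posterior precision (identity prior)
    b = np.zeros(d)                   # precision-weighted mean accumulator
    true_theta = rng.normal(size=d)   # hidden reward parameters (synthetic)

    for step in range(500):
        candidates = rng.normal(size=(10, d_embed))  # candidate embeddings
        feats = np.array([poly_features(c) for c in candidates])
        # Thompson sampling: draw parameters from the posterior, then pick
        # the candidate response whose sampled reward is highest.
        mean = np.linalg.solve(A, b)
        theta = rng.multivariate_normal(mean, noise_var * np.linalg.inv(A))
        arm = int(np.argmax(feats @ theta))
        reward = feats[arm] @ true_theta + rng.normal(scale=noise_var ** 0.5)
        A += np.outer(feats[arm], feats[arm])        # Bayesian posterior update
        b += reward * feats[arm]

Polynomial features keep the posterior in closed form (linear-Gaussian) while still letting the reward be nonlinear in the underlying text embeddings.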


Semi-Supervised Convolutional Neural Networks for Human Activity Recognition

arXiv.org Machine Learning

Labeled data used for training activity recognition classifiers are usually limited in size and diversity, so the learned model may not generalize well in real-world use cases. Semi-supervised learning augments labeled examples with unlabeled examples, often resulting in improved performance. However, the semi-supervised methods studied in the activity recognition literature assume that feature engineering is already done. In this paper, we lift this assumption and present two semi-supervised methods based on convolutional neural networks (CNNs) that learn discriminative hidden features. Our semi-supervised CNNs learn from both labeled and unlabeled data while also performing feature learning on raw sensor data. In experiments on three real-world datasets, we show that our CNNs outperform supervised methods and traditional semi-supervised learning methods by up to 18% in mean F1-score (Fm).
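
As a sketch of how a CNN can learn from labeled and unlabeled raw sensor data at once, the PyTorch snippet below uses self-training with confidence-thresholded pseudo-labels. This is one common semi-supervised scheme, not necessarily either of the two methods the paper proposes, and the architecture, threshold, and random data are assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SensorCNN(nn.Module):
        def __init__(self, n_channels=3, n_classes=6):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv1d(n_channels, 32, kernel_size=5, padding=2), nn.ReLU(),
                nn.MaxPool1d(2),
                nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),
            )
            self.classifier = nn.Linear(64, n_classes)

        def forward(self, x):                     # x: (batch, channels, time)
            return self.classifier(self.features(x).squeeze(-1))

    model = SensorCNN()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x_lab, y_lab = torch.randn(16, 3, 128), torch.randint(0, 6, (16,))
    x_unlab = torch.randn(64, 3, 128)             # raw, unlabeled sensor windows

    for epoch in range(5):
        loss = F.cross_entropy(model(x_lab), y_lab)   # supervised term
        with torch.no_grad():                         # pseudo-label confident
            probs = F.softmax(model(x_unlab), dim=1)  # unlabeled windows
            conf, pseudo = probs.max(dim=1)
            keep = conf > 0.9
        if keep.any():
            loss = loss + F.cross_entropy(model(x_unlab[keep]), pseudo[keep])
        opt.zero_grad(); loss.backward(); opt.step()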


An Oral Exam for Measuring a Dialog System’s Capabilities

AAAI Conferences

This paper suggests a model and methodology for measuring the breadth and flexibility of a dialog system's capabilities. The approach relies on human evaluators administering a targeted oral exam to a system and providing their subjective views of the system's performance on each test problem. We present results from one instantiation of this test, performed on two publicly accessible dialog systems and a human, and show that the suggested metrics provide useful insights into the relative strengths and weaknesses of these systems. The results suggest that this approach can be carried out with reasonable reliability and modest effort. We hope that authors will augment their reporting with this approach to improve clarity and make more direct progress toward broadly capable dialog systems.


Transferring Knowledge from a RNN to a DNN

arXiv.org Machine Learning

Deep Neural Network (DNN) acoustic models have yielded many state-of-the-art results in Automatic Speech Recognition (ASR) tasks. More recently, Recurrent Neural Network (RNN) models have been shown to outperform DNNs counterparts. However, state-of-the-art DNN and RNN models tend to be impractical to deploy on embedded systems with limited computational capacity. Traditionally, the approach for embedded platforms is to either train a small DNN directly, or to train a small DNN that learns the output distribution of a large DNN. In this paper, we utilize a state-of-the-art RNN to transfer knowledge to small DNN. We use the RNN model to generate soft alignments and minimize the Kullback-Leibler divergence against the small DNN. The small DNN trained on the soft RNN alignments achieved a 3.93 WER on the Wall Street Journal (WSJ) eval92 task compared to a baseline 4.54 WER or more than 13% relative improvement.