Harrison, Brent
The Goofus & Gallant Story Corpus for Practical Value Alignment
Nahian, Md Sultan Al, Tasrin, Tasmia, Frazier, Spencer, Riedl, Mark, Harrison, Brent
Values and principles are key elements of human society: they influence people to behave according to an accepted set of social rules and thereby help maintain social order. As AI systems become ubiquitous in human society, a major concern is that they could violate these norms or values and potentially cause harm. Thus, to prevent intentional or unintentional harm, AI systems are expected to take actions that align with these principles. Training systems to exhibit this type of behavior is difficult and often requires a specialized dataset. This work presents a multi-modal dataset illustrating normative and non-normative behavior in real-life situations described through natural language and artistic images. The dataset contains curated sets of images that were designed to teach young children about social principles, which we argue makes it an ideal resource for training socially normative agents.
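To make the structure of such a dataset concrete, below is a minimal sketch of how one multi-modal example (an image, a natural-language description, and a normative/non-normative label) might be represented and loaded. The field names, file layout, and the `GNGExample`/`load_examples` helpers are illustrative assumptions, not the released schema.

```python
# A minimal sketch of one Goofus & Gallant-style example.
# Field names and the tab-separated layout are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class GNGExample:
    image_path: str      # artistic illustration of the situation
    description: str     # natural-language description of the behavior
    is_normative: bool   # True for normative ("Gallant"), False for non-normative ("Goofus")

def load_examples(tsv_path: str) -> list[GNGExample]:
    """Read tab-separated rows of (image_path, description, label)."""
    examples = []
    with open(tsv_path, encoding="utf-8") as f:
        for line in f:
            image_path, description, label = line.rstrip("\n").split("\t")
            examples.append(GNGExample(image_path, description, label == "1"))
    return examples
```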
Guiding Reinforcement Learning Using Uncertainty-Aware Large Language Models
Shoaeinaeini, Maryam, Harrison, Brent
Human guidance in reinforcement learning (RL) is often impractical for large-scale applications due to high costs and time constraints. Large Language Models (LLMs) offer a promising alternative to mitigate RL sample inefficiency and potentially replace human trainers. However, applying LLMs as RL trainers is challenging due to their overconfidence and less reliable solutions in sequential tasks. We address this limitation by introducing a calibrated guidance system that uses Monte Carlo Dropout to enhance LLM advice reliability by assessing prediction variances from multiple forward passes. Additionally, we develop a novel RL policy shaping method based on dynamic model average entropy to adjust the LLM's influence on RL policies according to guidance uncertainty. This approach ensures robust RL training by relying on reliable LLM guidance. To validate our contributions, we conduct extensive experiments in a Minigrid environment with three goals in varying environment sizes. The results showcase superior model performance compared to uncalibrated LLMs, unguided RL, and calibrated LLMs with different shaping policies. Moreover, we analyze various uncertainty estimation methods, demonstrating the effectiveness of average entropy in reflecting higher uncertainty in incorrect guidance. These findings highlight the persistent overconfidence in fine-tuned LLMs and underscore the importance of effective calibration in sequential decision-making problems.
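As a rough illustration of the two ideas in this abstract, the sketch below estimates an advice distribution with Monte Carlo Dropout and then mixes it with an RL policy using a weight that shrinks as the average entropy grows. The `AdviceHead` stand-in, the dropout rate, and the linear entropy-based mixing rule are assumptions for illustration, not the paper's exact formulation.

```python
# Monte Carlo Dropout calibration plus entropy-weighted policy shaping (toy sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdviceHead(nn.Module):
    """Stand-in for an LLM advice model that scores discrete actions."""
    def __init__(self, state_dim: int = 8, n_actions: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(), nn.Dropout(p=0.2),
            nn.Linear(64, n_actions),
        )

    def forward(self, state):
        return self.net(state)

def mc_dropout_advice(model, state, passes: int = 20):
    """Average action distribution and its entropy over stochastic forward passes."""
    model.train()  # keep dropout active at inference time
    with torch.no_grad():
        probs = torch.stack([F.softmax(model(state), dim=-1) for _ in range(passes)])
    mean_probs = probs.mean(dim=0)
    entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)
    return mean_probs, entropy

def shape_policy(rl_probs, llm_probs, entropy, n_actions: int):
    """Mixture policy whose LLM weight decays as average-entropy uncertainty rises."""
    max_entropy = torch.log(torch.tensor(float(n_actions)))
    weight = (1.0 - entropy / max_entropy).clamp(0.0, 1.0)
    return weight * llm_probs + (1.0 - weight) * rl_probs

state = torch.randn(1, 8)
llm_probs, ent = mc_dropout_advice(AdviceHead(), state)
rl_probs = torch.full((1, 4), 0.25)
print(shape_policy(rl_probs, llm_probs, ent, n_actions=4))
```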
Controllable Neural Story Plot Generation via Reward Shaping
Tambwekar, Pradyumna, Dhuliawala, Murtaza, Martin, Lara J., Mehta, Animesh, Harrison, Brent, Riedl, Mark O.
Language-modeling-based approaches to story plot generation attempt to construct a plot by sampling from a language model (LM) to predict the next character, word, or sentence to add to the story. LM techniques lack the ability to receive guidance from the user to achieve a specific goal, resulting in stories that don't have a clear sense of progression and lack coherence. We present a reward-shaping technique that analyzes a story corpus and produces intermediate rewards that guide a pre-trained LM toward a given goal event. By themselves, large neural language models have been shown to work well on a variety of short-term tasks, such as understanding short children's stories [Radford et al., 2019]. However, while recurrent neural networks (RNNs) using LSTM or GRU cells can theoretically maintain long-term context in their hidden layers, in practice RNNs use only a relatively small part of the history of tokens [Khandelwal et al., 2018]. Consequently, stories or plots generated by RNNs tend to lose coherence as the generation continues.
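To make the reward-shaping idea concrete, here is a toy sketch in which events that a corpus suggests lie closer to a goal event earn larger intermediate rewards. The toy corpus, the distance estimate, and the reward form are illustrative assumptions, not the paper's actual formulation.

```python
# Toy goal-directed reward shaping for plot events.
from collections import defaultdict

def estimate_steps_to_goal(corpus_plots, goal_event):
    """Average number of events between each event and the goal across a corpus."""
    totals, counts = defaultdict(float), defaultdict(int)
    for plot in corpus_plots:
        if goal_event not in plot:
            continue
        goal_idx = plot.index(goal_event)
        for idx, event in enumerate(plot[:goal_idx]):
            totals[event] += goal_idx - idx
            counts[event] += 1
    return {e: totals[e] / counts[e] for e in totals}

def intermediate_reward(event, avg_steps, goal_event):
    if event == goal_event:
        return 1.0
    if event not in avg_steps:
        return 0.0
    return 1.0 / (1.0 + avg_steps[event])  # closer to the goal -> larger reward

corpus = [["meet", "argue", "reconcile", "marry"], ["meet", "travel", "marry"]]
steps = estimate_steps_to_goal(corpus, "marry")
print([intermediate_reward(e, steps, "marry") for e in ["meet", "argue", "marry"]])
```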
Machine Learning Approaches for Principle Prediction in Naturally Occurring Stories
Nahian, Md Sultan Al, Frazier, Spencer, Harrison, Brent, Riedl, Mark
Value alignment is the task of creating autonomous systems whose values align with those of humans. Past work has shown that stories are a potentially rich source of information on human values; however, past work has been limited to considering values in a binary sense. In this work, we explore the use of machine learning models for the task of normative principle prediction on naturally occurring story data. To do this, we extend a dataset that has previously been used to train a binary normative classifier with annotations of moral principles. We then use this dataset to train a variety of machine learning models, evaluate these models, and compare their results against those of humans who were asked to perform the same task. We show that while individual principles can be classified, the ambiguity of what "moral principles" represent poses a challenge for both human participants and autonomous systems faced with the same task.
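As a rough illustration of principle prediction framed as multi-label text classification, the sketch below fits a simple TF-IDF and logistic-regression pipeline on toy sentences. The example sentences, principle labels, and model choice are assumptions for illustration; they are not the dataset or the models evaluated in the paper.

```python
# Principle prediction as multi-label text classification (toy sketch).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

sentences = [
    "She returned the lost wallet to its owner.",
    "He shoved past the line to get in first.",
    "They shared their lunch with a hungry classmate.",
    "She lied to her friend about where she had been.",
]
principles = [["honesty"], ["fairness"], ["generosity"], ["honesty"]]

binarizer = MultiLabelBinarizer()
y = binarizer.fit_transform(principles)

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)
model.fit(sentences, y)
pred = model.predict(["He told the truth even though it was embarrassing."])
print(binarizer.inverse_transform(pred))
```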
StyleM: Stylized Metrics for Image Captioning Built with Contrastive N-grams
Li, Chengxi, Harrison, Brent
StyleCIDEr scores the similarity of two captions with respect to their styles. We evaluate these two metrics using three stylized captioning methods trained on the PERSONALITY-CAPTIONS and FlickrStyle10K datasets: UPDOWN, MULTI-UPDOWN, and SVinVL. We also perform a human study to explore how well each metric aligns with human judgments in similar situations.
Modelling Cournot Games as Multi-agent Multi-armed Bandits
Taywade, Kshitija, Harrison, Brent, Bagh, Adib
We investigate the use of a multi-agent multi-armed bandit (MA-MAB) setting for modeling repeated Cournot oligopoly games, where firms acting as agents choose from a set of arms representing discrete production quantities. Agents interact with separate and independent bandit problems. In this formulation, each agent makes sequential choices among arms to maximize its own reward. Agents do not have any information about the environment; they can only see their own rewards after taking an action. However, the market demand is a stationary function of total industry output, and random entry or exit from the market is not allowed. Given these assumptions, we found that an $\epsilon$-greedy approach offers a more viable learning mechanism than other traditional MAB approaches, as it does not require any additional knowledge of the system to operate. We also propose two novel approaches that take advantage of the ordered action space: $\epsilon$-greedy+HL and $\epsilon$-greedy+EL. These new approaches help firms focus on more profitable actions by eliminating less profitable choices, and hence are designed to optimize exploration. We use computer simulations to study the emergence of various equilibria in the outcomes and perform an empirical analysis of joint cumulative regret.
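A small sketch of repeated Cournot competition as independent $\epsilon$-greedy bandits follows. The linear inverse-demand parameters, the quantity grid, and the fixed epsilon are illustrative assumptions; the $\epsilon$-greedy+HL and $\epsilon$-greedy+EL variants are not reproduced here.

```python
# Repeated Cournot game as independent epsilon-greedy bandits (toy sketch).
import random

def price(total_quantity, a=100.0, b=1.0):
    """Assumed linear inverse demand: P = max(0, a - b * total output)."""
    return max(0.0, a - b * total_quantity)

class EpsilonGreedyFirm:
    def __init__(self, quantities, epsilon=0.1):
        self.quantities = quantities           # discrete arms (production levels)
        self.epsilon = epsilon
        self.counts = [0] * len(quantities)
        self.values = [0.0] * len(quantities)  # running mean reward per arm

    def choose(self):
        if random.random() < self.epsilon:
            return random.randrange(len(self.quantities))
        return max(range(len(self.quantities)), key=lambda i: self.values[i])

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

arms = list(range(0, 51, 5))
firms = [EpsilonGreedyFirm(arms) for _ in range(2)]
for _ in range(5000):
    choices = [f.choose() for f in firms]
    total = sum(f.quantities[c] for f, c in zip(firms, choices))
    p = price(total)
    for f, c in zip(firms, choices):
        f.update(c, p * f.quantities[c])   # profit = price * own quantity
print([f.quantities[max(range(len(arms)), key=lambda i: f.values[i])] for f in firms])
```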
Explore, Exploit or Listen: Combining Human Feedback and Policy Model to Speed up Deep Reinforcement Learning in 3D Worlds
Lin, Zhiyu, Harrison, Brent, Keech, Aaron, Riedl, Mark O.
We describe a method to use discrete human feedback to enhance the performance of deep learning agents in virtual three-dimensional environments by extending deep reinforcement learning to model the confidence and consistency of human feedback. This enables deep reinforcement learning algorithms to determine the most appropriate time to listen to the human feedback, exploit the current policy model, or explore the agent's environment. Managing the trade-off between these three strategies allows DRL agents to be robust to inconsistent or intermittent human feedback. Through experimentation using a synthetic oracle, we show that our technique improves the training speed and overall performance of deep reinforcement learning in navigating three-dimensional environments using Minecraft. We further show that our technique is robust to highly inaccurate human feedback and can also operate when no human feedback is given.
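Below is a minimal sketch of the explore/exploit/listen arbitration idea: when recorded feedback for a state is consistent enough, the agent listens; otherwise it falls back to epsilon-greedy exploration or exploitation of its policy. The agreement-rate confidence estimate and the thresholds are illustrative assumptions, not the arbitration rule used in the paper.

```python
# Explore / exploit / listen arbitration over discrete human feedback (toy sketch).
import random
from collections import defaultdict

class FeedbackModel:
    def __init__(self):
        self.votes = defaultdict(lambda: defaultdict(int))  # state -> action -> +/- tally

    def record(self, state, action, positive: bool):
        self.votes[state][action] += 1 if positive else -1

    def best_action(self, state):
        actions = self.votes.get(state)
        if not actions:
            return None, 0.0
        action, score = max(actions.items(), key=lambda kv: kv[1])
        total = sum(abs(v) for v in actions.values())
        confidence = max(score, 0) / total if total else 0.0   # consistency of feedback
        return action, confidence

def select_action(state, q_values, feedback, n_actions, epsilon=0.1, listen_threshold=0.7):
    advised, confidence = feedback.best_action(state)
    if advised is not None and confidence >= listen_threshold:
        return advised                                          # listen to the human
    if random.random() < epsilon:
        return random.randrange(n_actions)                      # explore
    return max(range(n_actions), key=lambda a: q_values[state][a])  # exploit the policy

q = defaultdict(lambda: [0.0] * 4)
fb = FeedbackModel()
fb.record("room_1", 2, positive=True)
fb.record("room_1", 2, positive=True)
print(select_action("room_1", q, fb, n_actions=4))
```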
Training Value-Aligned Reinforcement Learning Agents Using a Normative Prior
Nahian, Md Sultan Al, Frazier, Spencer, Harrison, Brent, Riedl, Mark
As more machine learning agents interact with humans, there is a growing prospect that an agent trained to perform a task optimally, using only a measure of task performance as feedback, can violate societal norms for acceptable behavior or cause harm. Value alignment is a property of intelligent agents wherein they solely pursue non-harmful behaviors or human-beneficial goals. We introduce an approach to value-aligned reinforcement learning in which we train an agent with two reward signals: a standard task performance reward plus a normative behavior reward. The normative behavior reward is derived from a value-aligned prior model previously shown to classify text as normative or non-normative. We show how variations on a policy shaping technique can balance these two sources of reward and produce policies that are both effective and perceived as being more normative. We test our value-alignment technique on three interactive text-based worlds; each world is designed specifically to challenge agents with a task as well as provide opportunities to deviate from the task to engage in normative and/or altruistic behavior.
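The sketch below illustrates the idea of pairing a task reward with a normative-prior reward. The keyword-based classifier stub and the fixed weighted sum are assumptions for illustration; the paper balances the two signals through variations on policy shaping rather than a fixed linear blend.

```python
# Combining a task reward with a normative-prior reward (toy sketch).
def normative_score(action_text: str) -> float:
    """Stand-in for the value-aligned prior: 1.0 ~ normative, 0.0 ~ non-normative."""
    return 0.0 if "steal" in action_text else 1.0

def shaped_reward(task_reward: float, action_text: str, beta: float = 0.5) -> float:
    """Assumed blend: task reward plus a weighted normative bonus."""
    return task_reward + beta * normative_score(action_text)

# Two candidate actions in a text-based world with equal task reward.
for action in ["steal the medicine", "buy the medicine"]:
    print(action, shaped_reward(task_reward=1.0, action_text=action))
```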
Influencing Reinforcement Learning through Natural Language Guidance
Tasrin, Tasmia, Nahian, Md Sultan Al, Perera, Habarakadage, Harrison, Brent
Interactive reinforcement learning agents use human feedback or instruction to help them learn in complex environments. Often, this feedback comes in the form of a discrete signal that is either positive or negative. While informative, this information can be difficult to generalize on its own. In this work, we explore how natural language advice can be used to provide a richer feedback signal to a reinforcement learning agent by extending policy shaping, a well-known interactive reinforcement learning technique. Policy shaping typically employs a human feedback policy to help an agent learn how to achieve its goal. In our case, we replace this human feedback policy with a policy generated from natural language advice. We aim to examine whether generated natural language reasoning can support a deep reinforcement learning agent in choosing its actions successfully in a given environment. To do this, we design our model with three networks: an experience-driven network, an advice generator, and an advice-driven network. While the experience-driven reinforcement learning agent chooses its actions based on the environmental reward, the advice-driven network selects actions from the advice generated for each new state, assisting the reinforcement learning agent through policy shaping.
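A small sketch of the shaping step follows: the combined policy multiplies the experience-driven policy by an advice-driven policy, in the spirit of policy shaping. The advice generator and advice encoder are stubbed out, and the keyword matching is an illustrative assumption.

```python
# Advice-driven policy shaping (toy sketch).
import numpy as np

ACTIONS = ["left", "right", "up", "down"]

def advice_policy(advice_text: str, temperature: float = 0.5) -> np.ndarray:
    """Turn generated advice into a distribution over actions (toy keyword match)."""
    scores = np.array([1.0 if a in advice_text.lower() else 0.0 for a in ACTIONS])
    weights = np.exp(scores / temperature)
    return weights / weights.sum()

def shape(experience_probs: np.ndarray, advice_probs: np.ndarray) -> np.ndarray:
    """Policy-shaping-style combination: elementwise product, renormalized."""
    combined = experience_probs * advice_probs
    return combined / combined.sum()

experience = np.array([0.4, 0.3, 0.2, 0.1])          # from the experience-driven network
advice = advice_policy("Move right to reach the key.")
print(shape(experience, advice))
```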
Decentralized Marriage Models
Taywade, Kshitija, Goldsmith, Judy, Harrison, Brent
Most matching algorithms are centralized in that a single agent determines how other agents are matched together. This is contrary to how humans form matches in the real world. In this work, we propose three decentralized approaches for finding matchings that are inspired by three techniques that humans use to find matches. The first has individuals wander a grid environment, interacting with and forming preferences over potential partners. The second uses affiliation networks in which agencies recommend potential partners. The third is based on small-world social networks, where we assume that individuals probabilistically introduce their friends to one another. We introduce a heuristic algorithm that can be used in each of these environments, and we also explore how this algorithm can scale to a large number of agents.
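A toy sketch of decentralized matching through random pairwise encounters (standing in for wandering a grid) is given below. The "pair up only if both strictly prefer the new partner" rule and the example preference lists are illustrative assumptions, not the paper's heuristic.

```python
# Decentralized matching via random pairwise encounters (toy sketch).
import random

def decentralized_match(men, women, prefs, rounds=10000):
    partner = {p: None for p in men + women}

    def prefers(person, candidate):
        """True if the candidate ranks higher than the current partner (or person is single)."""
        current = partner[person]
        return current is None or prefs[person].index(candidate) < prefs[person].index(current)

    for _ in range(rounds):
        m, w = random.choice(men), random.choice(women)   # a chance encounter
        if prefers(m, w) and prefers(w, m):
            for old in (partner[m], partner[w]):
                if old is not None:
                    partner[old] = None                    # former partners become single
            partner[m], partner[w] = w, m
    return {m: partner[m] for m in men}

men, women = ["m1", "m2"], ["w1", "w2"]
prefs = {
    "m1": ["w1", "w2"], "m2": ["w1", "w2"],
    "w1": ["m2", "m1"], "w2": ["m1", "m2"],
}
print(decentralized_match(men, women, prefs))
```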