AITopics | Wichers, Nevan

Collaborating Authors

Wichers, Nevan


Visualizing Neural Network Imagination

Wichers, Nevan, Tao, Victor, Volpato, Riccardo, Barez, Fazl

arXiv.org Artificial Intelligence, May-10-2024

In certain situations, neural networks represent environment states in their hidden activations. Our goal is to visualize which environment states the networks are representing. After training, we apply the decoder to the intermediate representations of the network to visualize what they represent. We define a quantitative interpretability metric and use it to demonstrate that hidden states can be highly interpretable on a simple task. We also develop autoencoder and adversarial techniques and show that they benefit interpretability.

artificial intelligence, gol state, machine learning, (18 more...)

2405.06409

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Gradient-Based Language Model Red Teaming

Wichers, Nevan, Denison, Carson, Beirami, Ahmad

arXiv.org Artificial Intelligence, Jan-29-2024

Red teaming is a common strategy for identifying weaknesses in generative language models (LMs), where adversarial prompts are produced that trigger an LM to generate unsafe responses. Red teaming is instrumental for both model alignment and evaluation, but is labor-intensive and difficult to scale when done by humans. In this paper, we present Gradient-Based Red Teaming (GBRT), a red teaming method for automatically generating diverse prompts that are likely to cause an LM to output unsafe responses. GBRT is a form of prompt learning, trained by scoring an LM response with a safety classifier and then backpropagating through the frozen safety classifier and LM to update the prompt. To improve the coherence of input prompts, we introduce two variants that add a realism loss and fine-tune a pretrained model to generate the prompts instead of learning the prompts directly. Our experiments show that GBRT is more effective at finding prompts that trigger an LM to generate unsafe responses than a strong reinforcement learning-based red teaming approach, and succeeds even when the LM has been fine-tuned to produce safer outputs.

classifier, large language model, machine learning, (19 more...)

2401.16656

Country:

Europe (0.68)
Asia > Middle East > UAE (0.14)
North America > United States > Hawaii (0.14)

Genre: Research Report (0.82)

Industry: Government (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

DRLC: Reinforcement Learning with Dense Rewards from LLM Critic

Cao, Meng, Shu, Lei, Yu, Lei, Zhu, Yun, Wichers, Nevan, Liu, Yinxiao, Meng, Lei

arXiv.org Artificial Intelligence, Jan-14-2024

Reinforcement learning (RL) can align language models with non-differentiable reward signals, such as human preferences. However, a major challenge arises from the sparsity of these reward signals - typically, there is only one reward for the entire generation. This sparsity of rewards can lead to inefficient and unstable learning. In this paper, we introduce a novel framework leveraging the critique ability of LLMs to produce dense rewards throughout the learning process. Our approach incorporates a critic language model alongside the policy model. This critic is prompted with the task description, question, policy model's output, and environment's reward signal as input, and provides token or span-level dense rewards that reflect the quality of each segment of the output. We assess our approach on three text generation tasks: sentiment control, language model detoxification, and summarization. Experimental results show that incorporating artificial dense rewards in training yields consistent performance gains over the PPO baseline with holistic rewards. Furthermore, in a setting where the same model serves as both policy and critic, we demonstrate that "self-critique" rewards also boost learning efficiency.
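The contrast between a single holistic reward and dense span-level rewards can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `dense_rewards`, the `(start, end, reward)` span format, the `critic_weight` parameter, and the choice to add the environment's reward at the final token are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical span format: (start_token, end_token_exclusive, critic_reward).
Span = Tuple[int, int, float]


def dense_rewards(
    num_tokens: int,
    final_reward: float,
    critic_spans: List[Span],
    critic_weight: float = 1.0,
) -> List[float]:
    """Build a per-token reward sequence for one generated output.

    Assumption: every token inside a critic span receives that span's
    reward (scaled by critic_weight), and the holistic environment
    reward is added at the last token only.
    """
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    rewards = [0.0] * num_tokens
    for start, end, span_reward in critic_spans:
        # Clamp the span to the generated sequence.
        for t in range(max(start, 0), min(end, num_tokens)):
            rewards[t] += critic_weight * span_reward
    # The sparse, sequence-level reward arrives once, at the end.
    rewards[-1] += final_reward
    return rewards


# Sparse baseline: only the last token carries any signal.
sparse = dense_rewards(5, final_reward=1.0, critic_spans=[])

# Dense variant: a (made-up) critic praised tokens 0-1 and penalized token 3.
dense = dense_rewards(
    5,
    final_reward=1.0,
    critic_spans=[(0, 2, 0.5), (3, 4, -1.0)],
    critic_weight=0.5,
)
```

With the sparse baseline, four of the five tokens receive zero reward; the dense variant gives the policy a learning signal at most positions.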

large language model, machine learning, reinforcement learning, (19 more...)

2401.07382

Country:

Europe (1.00)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > Texas (0.14)
(3 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Leisure & Entertainment (0.67)
Media > Film (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SiRA: Sparse Mixture of Low Rank Adaptation

Zhu, Yun, Wichers, Nevan, Lin, Chu-Cheng, Wang, Xinyi, Chen, Tianlong, Shu, Lei, Lu, Han, Liu, Canoee, Luo, Liangchen, Chen, Jindong, Meng, Lei

arXiv.org Artificial Intelligence, Nov-15-2023

Parameter-efficient tuning has been a prominent approach to adapting large language models to downstream tasks. Most previous work adds dense trainable parameters, where all parameters are used to adapt to a given task. We found this less effective empirically: using LoRA as an example, introducing more trainable parameters does not help. Motivated by this, we investigate the importance of leveraging "sparse" computation and propose SiRA: sparse mixture of low rank adaptation. SiRA leverages the Sparse Mixture of Experts (SMoE) to boost the performance of LoRA. Specifically, it enforces top-$k$ expert routing with a capacity limit restricting the maximum number of tokens each expert can process. We propose a novel and simple expert dropout on top of the gating network to reduce over-fitting. Through extensive experiments, we verify that SiRA performs better than LoRA and other mixture-of-experts approaches across different single-task and multitask settings.
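Top-$k$ routing with a per-expert capacity limit can be illustrated with a small sketch. It is a toy under stated assumptions, not SiRA's code: the function name `route_top_k_with_capacity`, the first-come-first-served handling of overflow, and the example gate scores are hypothetical, and expert dropout is not modeled.

```python
from typing import Dict, List, Tuple


def route_top_k_with_capacity(
    gate_scores: List[List[float]], k: int, capacity: int
) -> Tuple[List[List[int]], Dict[int, List[int]]]:
    """Route each token to at most k experts, honoring a capacity limit.

    gate_scores[t][e] is the gating score of expert e for token t.
    Assumption: tokens are processed in order, and a token is dropped
    by an expert that has already accepted `capacity` tokens.
    """
    num_experts = len(gate_scores[0])
    load: Dict[int, List[int]] = {e: [] for e in range(num_experts)}
    assignments: List[List[int]] = []
    for t, scores in enumerate(gate_scores):
        # Rank experts by gate score, highest first, and keep the top k.
        ranked = sorted(range(num_experts), key=lambda e: scores[e], reverse=True)
        kept: List[int] = []
        for e in ranked[:k]:
            if len(load[e]) < capacity:
                load[e].append(t)
                kept.append(e)
        assignments.append(kept)
    return assignments, load


# Three tokens, three experts; all three tokens prefer experts 0 and 1.
scores = [
    [0.70, 0.20, 0.10],
    [0.60, 0.30, 0.10],
    [0.80, 0.15, 0.05],
]
assignments, load = route_top_k_with_capacity(scores, k=2, capacity=2)
```

In this example the third token finds both of its preferred experts full and is processed by no expert, which shows how a capacity limit bounds the number of tokens each expert handles.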

large language model, machine learning, natural language, (16 more...)

2311.09179

Country: North America > Canada (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)

Fusion-Eval: Integrating Evaluators with LLMs

Shu, Lei, Wichers, Nevan, Luo, Liangchen, Zhu, Yun, Liu, Yinxiao, Chen, Jindong, Meng, Lei

arXiv.org Artificial Intelligence, Nov-15-2023

Evaluating Large Language Models (LLMs) is a complex task, especially considering the intricacies of natural language understanding and the expectations for high-level reasoning. Traditional evaluations typically lean on human-based, model-based, or automatic-metrics-based paradigms, each with its own advantages and shortcomings. We introduce "Fusion-Eval", a system that employs LLMs not solely for direct evaluations, but to skillfully integrate insights from diverse evaluators. This gives Fusion-Eval flexibility, enabling it to work effectively across diverse tasks and make optimal use of multiple references. In testing on the SummEval dataset, Fusion-Eval achieved a Spearman correlation of 0.96, outperforming other evaluators. The success of Fusion-Eval underscores the potential of LLMs to produce evaluations that closely align with human perspectives, setting a new standard in the field of LLM evaluation.
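For readers unfamiliar with the metric, a Spearman correlation is the Pearson correlation computed on ranks. The sketch below shows the calculation on made-up scores; the data, the function names, and the pairing of "human" and "evaluator" scores are illustrative assumptions and do not reproduce the paper's SummEval setup.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for m in range(i, j + 1):
            ranks[order[m]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for five items; the evaluator swaps items 2 and 3.
human_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
evaluator_scores = [1.0, 3.0, 2.0, 4.0, 5.0]
rho = spearman(human_scores, evaluator_scores)  # 0.9 for this toy data
```

A value near 1 means the evaluator orders items almost exactly as the human raters do, regardless of the absolute scale of either score.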

large language model, machine learning, natural language, (20 more...)

2311.09204

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

ActionBert: Leveraging User Actions for Semantic Understanding of User Interfaces

He, Zecheng, Sunkara, Srinivas, Zang, Xiaoxue, Xu, Ying, Liu, Lijuan, Wichers, Nevan, Schubiner, Gabriel, Lee, Ruby, Chen, Jindong

arXiv.org Artificial Intelligence, Dec-22-2020

As mobile devices are becoming ubiquitous, regularly interacting with a variety of user interfaces (UIs) is a common aspect of daily life for many people. To improve the accessibility of these devices and to enable their usage in a variety of settings, building models that can assist users and accomplish tasks through the UI is vitally important. However, there are several challenges in achieving this. First, UI components of similar appearance can have different functionalities, making understanding their function more important than just analyzing their appearance. Second, domain-specific features like the Document Object Model (DOM) in web pages and the View Hierarchy (VH) in mobile applications provide important signals about the semantics of UI elements, but these features are not in a natural language format. Third, owing to a large diversity in UIs and the absence of standard DOM or VH representations, building a UI understanding model with high coverage requires large amounts of training data. Inspired by the success of pre-training based approaches in NLP for tackling a variety of problems in a data-efficient way, we introduce a new pre-trained UI representation model called ActionBert. Our methodology is designed to leverage visual, linguistic and domain-specific features in user interaction traces to pre-train generic feature representations of UIs and their components. Our key intuition is that user actions, e.g., a sequence of clicks on different UI components, reveal important information about their functionality. We evaluate the proposed model on a wide variety of downstream tasks, ranging from icon classification to UI component retrieval based on its natural language description. Experiments show that the proposed ActionBert model outperforms multi-modal baselines across all downstream tasks by up to 15.5%.

artificial intelligence, neural network, ui component, (18 more...)

2012.1235

Genre: Research Report (0.50)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Human Computer Interaction > Interfaces (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Resolving Spurious Correlations in Causal Models of Environments via Interventions

Volodin, Sergei, Wichers, Nevan, Nixon, Jeremy

arXiv.org Machine Learning, Feb-12-2020

Causality (Halpern & Pearl, 2005) is an important concept (Pearl, 2018) for Machine Learning, since it resolves many issues in performance and Artificial Intelligence (AI) safety (Amodei et al., 2016) such as interpretability (Madumal et al., 2019; Bengio, 2017), robustness to distributional shift (de Haan et al., 2019a) and sample-efficiency (Buesing et al., 2018). It is particularly well suited for Reinforcement Learning (RL), compared to supervised learning, because in RL there is an opportunity to take actions and influence the environment in a directed way. Since causality is a cornerstone in science, an agent that reasons causally is expected to be superior to noncausal agents (Marino et al., 2019). Spurious correlations are a major obstacle in learning causal models. If present, they make learning from purely observational data impossible (Pearl & Mackenzie, 2018). We take advantage of the fact that it is possible to uncover the causal graph by executing interventions (Halpern & Pearl, 2005) which change the data distribution. We design a method to automatically resolve spurious correlations when learning the causal graph of the environment.
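The idea that an intervention changes the data distribution and thereby exposes a spurious correlation can be shown with a toy simulation. This is only an illustration of the general principle, not the paper's method: the three-variable model (a hidden common cause Z driving both X and Y), the noise levels, and the function names are assumptions chosen for the example.

```python
import random
from typing import List, Sequence, Tuple


def correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def sample(n: int, intervene: bool, seed: int) -> Tuple[List[float], List[float]]:
    """Draw (X, Y) pairs from a toy environment with hidden confounder Z.

    Y depends on Z only, never on X. Without intervention X also
    depends on Z, so X and Y correlate spuriously. With intervene=True
    the agent sets X itself, which cuts the Z -> X edge.
    """
    rng = random.Random(seed)
    xs: List[float] = []
    ys: List[float] = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)  # hidden common cause
        if intervene:
            x = rng.gauss(0.0, 1.0)  # do(X): chosen independently of Z
        else:
            x = z + rng.gauss(0.0, 0.5)
        y = z + rng.gauss(0.0, 0.5)
        xs.append(x)
        ys.append(y)
    return xs, ys


obs_x, obs_y = sample(5000, intervene=False, seed=0)
int_x, int_y = sample(5000, intervene=True, seed=0)
observational_corr = correlation(obs_x, obs_y)   # strong, but spurious
interventional_corr = correlation(int_x, int_y)  # close to zero
```

In the observational data X looks predictive of Y, yet once X is set by intervention the correlation largely disappears, indicating that there is no causal edge from X to Y in this toy model.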

artificial intelligence, graph, reinforcement learning, (14 more...)

2002.05217

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
