AITopics | Boutilier, Craig

Collaborating Authors

Boutilier, Craig

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Content Prompting: Modeling Content Provider Dynamics to Improve User Welfare in Recommender Ecosystems

Prasad, Siddharth, Mladenov, Martin, Boutilier, Craig

arXiv.org Artificial IntelligenceSep-2-2023

Users derive value from a recommender system (RS) only to the extent that it is able to surface content (or items) that meet their needs/preferences. While RSs often have a comprehensive view of user preferences across the entire user base, content providers, by contrast, generally have only a local view of the preferences of users that have interacted with their content. This limits a provider's ability to offer new content to best serve the broader population. In this work, we tackle this information asymmetry with content prompting policies. A content prompt is a hint or suggestion to a provider to make available novel content for which the RS predicts unmet user demand. A prompting policy is a sequence of such prompts that is responsive to the dynamics of a provider's beliefs, skills and incentives. We aim to determine a joint prompting policy that induces a set of providers to make content available that optimizes user social welfare in equilibrium, while respecting the incentives of the providers themselves. Our contributions include: (i) an abstract model of the RS ecosystem, including content provider behaviors, that supports such prompting; (ii) the design and theoretical analysis of sequential prompting policies for individual providers; (iii) a mixed integer programming formulation for optimal joint prompting using path planning in content space; and (iv) simple, proof-of-concept experiments illustrating how such policies improve ecosystem health and user welfare.

artificial intelligence, machine learning, provider, (18 more...)

arXiv.org Artificial Intelligence

2309.0094

Country: North America > United States (1.00)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Reinforcement Learning with History-Dependent Dynamic Contexts

Tennenholtz, Guy, Merlis, Nadav, Shani, Lior, Mladenov, Martin, Boutilier, Craig

arXiv.org Artificial IntelligenceMay-17-2023

We introduce Dynamic Contextual Markov Decision Processes (DCMDPs), a novel reinforcement learning framework for history-dependent environments that generalizes the contextual MDP framework to handle non-Markov environments, where contexts change over time. We consider special cases of the model, with a focus on logistic DCMDPs, which break the exponential dependence on history length by leveraging aggregation functions to determine context transitions. This special structure allows us to derive an upper-confidence-bound style algorithm for which we establish regret bounds. Motivated by our theoretical results, we introduce a practical model-based algorithm for logistic DCMDPs that plans in a latent space and uses optimism over history-dependent features. We demonstrate the efficacy of our approach on a recommendation task (using MovieLens data) where user behavior dynamics evolve in response to recommendations.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2302.02061

Country: North America > United States (0.67)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Add feedback

Aligning Text-to-Image Models using Human Feedback

Lee, Kimin, Liu, Hao, Ryu, Moonkyung, Watkins, Olivia, Du, Yuqing, Boutilier, Craig, Abbeel, Pieter, Ghavamzadeh, Mohammad, Gu, Shixiang Shane

arXiv.org Artificial IntelligenceFeb-23-2023

Deep generative models have shown impressive results in text-to-image synthesis. However, current text-to-image models often generate images that are inadequately aligned with text prompts. We propose a fine-tuning method for aligning such models using human feedback, comprising three stages. First, we collect human feedback assessing model output alignment from a set of diverse text prompts. We then use the human-labeled image-text dataset to train a reward function that predicts human feedback. Lastly, the text-to-image model is fine-tuned by maximizing reward-weighted likelihood to improve image-text alignment. Our method generates objects with specified colors, counts and backgrounds more accurately than the pre-trained model. We also analyze several design choices and find that careful investigations on such design choices are important in balancing the alignment-fidelity tradeoffs. Our results demonstrate the potential for learning from human feedback to significantly improve text-to-image models.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2302.12192

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Add feedback

Gathering Strength, Gathering Storms: The One Hundred Year Study on Artificial Intelligence (AI100) 2021 Study Panel Report

Littman, Michael L., Ajunwa, Ifeoma, Berger, Guy, Boutilier, Craig, Currie, Morgan, Doshi-Velez, Finale, Hadfield, Gillian, Horowitz, Michael C., Isbell, Charles, Kitano, Hiroaki, Levy, Karen, Lyons, Terah, Mitchell, Melanie, Shah, Julie, Sloman, Steven, Vallor, Shannon, Walsh, Toby

arXiv.org Artificial IntelligenceOct-27-2022

In September 2021, the "One Hundred Year Study on Artificial Intelligence" project (AI100) issued the second report of its planned long-term periodic assessment of artificial intelligence (AI) and its impact on society. It was written by a panel of 17 study authors, each of whom is deeply rooted in AI research, chaired by Michael Littman of Brown University. The report, entitled "Gathering Strength, Gathering Storms," answers a set of 14 questions probing critical areas of AI development addressing the major risks and dangers of AI, its effects on society, its public perception and the future of the field. The report concludes that AI has made a major leap from the lab to people's lives in recent years, which increases the urgency to understand its potential negative effects. The questions were developed by the AI100 Standing Committee, chaired by Peter Stone of the University of Texas at Austin, consisting of a group of AI leaders with expertise in computer science, sociology, ethics, economics, and other disciplines.

artificial intelligence, chatbot, machine learning, (23 more...)

arXiv.org Artificial Intelligence

2210.15767

Country:

Asia (1.00)
North America > United States > California (0.46)
Europe > United Kingdom > England (0.45)
North America > United States > Texas > Travis County > Austin (0.24)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Personal (1.00)
Instructional Material (1.00)

Industry:

Transportation > Ground > Road (1.00)
Media > News (1.00)
Law > Civil Rights & Constitutional Law (1.00)
(28 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Communications > Networks (1.00)
(10 more...)

Add feedback

Discovering Personalized Semantics for Soft Attributes in Recommender Systems using Concept Activation Vectors

Göpfert, Christina, Chow, Yinlam, Hsu, Chih-wei, Vendrov, Ivan, Lu, Tyler, Ramachandran, Deepak, Boutilier, Craig

arXiv.org Artificial IntelligenceFeb-6-2022

Interactive recommender systems (RSs) allow users to express intent, preferences and contexts in a rich fashion, often using natural language. One challenge in using such feedback is inferring a user's semantic intent from the open-ended terms used to describe an item, and using it to refine recommendation results. Leveraging concept activation vectors (CAVs) [21], we develop a framework to learn a representation that captures the semantics of such attributes and connects them to user preferences and behaviors in RSs. A novel feature of our approach is its ability to distinguish objective and subjective attributes and associate different senses with different users. Using synthetic and real-world datasets, we show that our CAV representation accurately interprets users' subjective semantics, and can improve recommendations via interactive critiquing

artificial intelligence, cav, natural language, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/1122445.1122456

2202.0283

Country:

Europe (1.00)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.47)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback

Thompson Sampling with a Mixture Prior

Hong, Joey, Kveton, Branislav, Zaheer, Manzil, Ghavamzadeh, Mohammad, Boutilier, Craig

arXiv.org Machine LearningJun-10-2021

We study Thompson sampling (TS) in online decision-making problems where the uncertain environment is sampled from a mixture distribution. This is relevant to multi-task settings, where a learning agent is faced with different classes of problems. We incorporate this structure in a natural way by initializing TS with a mixture prior -- dubbed MixTS -- and develop a novel, general technique for analyzing the regret of TS with such priors. We apply this technique to derive Bayes regret bounds for MixTS in both linear bandits and tabular Markov decision processes (MDPs). Our regret bounds reflect the structure of the problem and depend on the number of components and confidence width of each component of the prior. Finally, we demonstrate the empirical effectiveness of MixTS in both synthetic and real-world experiments.

artificial intelligence, big data, latent state, (20 more...)

arXiv.org Machine Learning

2106.05608

Country: North America > United States (0.14)

Genre: Research Report (0.63)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback

RecSim NG: Toward Principled Uncertainty Modeling for Recommender Ecosystems

Mladenov, Martin, Hsu, Chih-Wei, Jain, Vihan, Ie, Eugene, Colby, Christopher, Mayoraz, Nicolas, Pham, Hubert, Tran, Dustin, Vendrov, Ivan, Boutilier, Craig

arXiv.org Artificial IntelligenceMar-14-2021

The development of recommender systems that optimize multi-turn interaction with users, and model the interactions of different agents (e.g., users, content providers, vendors) in the recommender ecosystem have drawn increasing attention in recent years. Developing and training models and algorithms for such recommenders can be especially difficult using static datasets, which often fail to offer the types of counterfactual predictions needed to evaluate policies over extended horizons. To address this, we develop RecSim NG, a probabilistic platform for the simulation of multi-agent recommender systems. RecSim NG is a scalable, modular, differentiable simulator implemented in Edward2 and TensorFlow. It offers: a powerful, general probabilistic programming language for agent-behavior specification; tools for probabilistic inference and latent-variable model learning, backed by automatic differentiation and tracing; and a TensorFlow-based runtime for running simulations on accelerated hardware. We describe RecSim NG and illustrate how it can be used to create transparent, configurable, end-to-end models of a recommender ecosystem, complemented by a small set of simple use cases that demonstrate how RecSim NG can help both researchers and practitioners easily develop and train novel algorithms for recommender systems.

deep learning, neural network, recsim ng, (21 more...)

arXiv.org Artificial Intelligence

2103.08057

Country:

Asia (0.93)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.50)

Industry:

Leisure & Entertainment > Games (0.93)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(3 more...)

Add feedback

Meta-Thompson Sampling

Kveton, Branislav, Konobeev, Mikhail, Zaheer, Manzil, Hsu, Chih-wei, Mladenov, Martin, Boutilier, Craig, Szepesvari, Csaba

arXiv.org Machine LearningFeb-11-2021

Efficient exploration in multi-armed bandits is a fundamental online learning problem. In this work, we propose a variant of Thompson sampling that learns to explore better as it interacts with problem instances drawn from an unknown prior distribution. Our algorithm meta-learns the prior and thus we call it Meta-TS. We propose efficient implementations of Meta-TS and analyze it in Gaussian bandits. Our analysis shows the benefit of meta-learning the prior and is of a broader interest, because we derive the first prior-dependent upper bound on the Bayes regret of Thompson sampling. This result is complemented by empirical evaluation, which shows that Meta-TS quickly adapts to the unknown prior.

big data, educational setting, thompson, (18 more...)

arXiv.org Machine Learning

2102.06129

Country: North America > Canada > Alberta (0.14)

Genre: Research Report (0.64)

Industry: Education > Educational Setting (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.67)

Add feedback

Non-Stationary Latent Bandits

Hong, Joey, Kveton, Branislav, Zaheer, Manzil, Chow, Yinlam, Ahmed, Amr, Ghavamzadeh, Mohammad, Boutilier, Craig

arXiv.org Artificial IntelligenceDec-1-2020

Users of recommender systems often behave in a non-stationary fashion, due to their evolving preferences and tastes over time. In this work, we propose a practical approach for fast personalization to non-stationary users. The key idea is to frame this problem as a latent bandit, where the prototypical models of user behavior are learned offline and the latent state of the user is inferred online from its interactions with the models. We call this problem a non-stationary latent bandit. We propose Thompson sampling algorithms for regret minimization in non-stationary latent bandits, analyze them, and evaluate them on a real-world dataset. The main strength of our approach is that it can be combined with rich offline-learned models, which can be misspecified, and are subsequently fine-tuned online using posterior sampling. In this way, we naturally combine the strengths of offline and online learning.

artificial intelligence, data mining, latent state, (18 more...)

arXiv.org Artificial Intelligence

2012.00386

Genre: Research Report (0.40)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.87)

Add feedback

Optimizing Long-term Social Welfare in Recommender Systems: A Constrained Matching Approach

Mladenov, Martin, Creager, Elliot, Ben-Porat, Omer, Swersky, Kevin, Zemel, Richard, Boutilier, Craig

arXiv.org Artificial IntelligenceAug-18-2020

Most recommender systems (RS) research assumes that a user's utility can be maximized independently of the utility of the other agents (e.g., other users, content providers). In realistic settings, this is often not true---the dynamics of an RS ecosystem couple the long-term utility of all agents. In this work, we explore settings in which content providers cannot remain viable unless they receive a certain level of user engagement. We formulate the recommendation problem in this setting as one of equilibrium selection in the induced dynamical system, and show that it can be solved as an optimal constrained matching problem. Our model ensures the system reaches an equilibrium with maximal social welfare supported by a sufficiently diverse set of viable providers. We demonstrate that even in a simple, stylized dynamical RS model, the standard myopic approach to recommendation---always matching a user to the best provider---performs poorly. We develop several scalable techniques to solve the matching problem, and also draw connections to various notions of user regret and fairness, arguing that these outcomes are fairer in a utilitarian sense.

artificial intelligence, optimization problem, provider, (13 more...)

arXiv.org Artificial Intelligence

2008.00104

Country: North America > United States (1.00)

Genre: Research Report (0.81)

Industry:

Media (0.67)
Information Technology (0.46)
Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback