AITopics | Harris, Keegan

Collaborating Authors

Harris, Keegan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Nearly-Optimal Bandit Learning in Stackelberg Games with Side Information

Balcan, Maria-Florina, Bernasconi, Martino, Castiglioni, Matteo, Celli, Andrea, Harris, Keegan, Wu, Zhiwei Steven

arXiv.org Artificial IntelligenceJan-31-2025

We study the problem of online learning in Stackelberg games with side information between a leader and a sequence of followers. In every round the leader observes contextual information and commits to a mixed strategy, after which the follower best-responds. We provide learning algorithms for the leader which achieve $O(T^{1/2})$ regret under bandit feedback, an improvement from the previously best-known rates of $O(T^{2/3})$. Our algorithms rely on a reduction to linear contextual bandits in the utility space: In each round, a linear contextual bandit algorithm recommends a utility vector, which our algorithm inverts to determine the leader's mixed strategy. We extend our algorithms to the setting in which the leader's utility function is unknown, and also apply it to the problems of bidding in second-price auctions with side information and online Bayesian persuasion with public and private states. Finally, we observe that our algorithms empirically outperform previous results on numerical simulations.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2502.00204

Country: Europe > Middle East > Cyprus (0.14)

Genre: Research Report (0.50)

Industry:

Leisure & Entertainment (0.46)
Education (0.34)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.35)

Add feedback

Should You Use Your Large Language Model to Explore or Exploit?

Harris, Keegan, Slivkins, Aleksandrs

arXiv.org Artificial IntelligenceJan-31-2025

We evaluate the ability of the current generation of large language models (LLMs) to help a decision-making agent facing an exploration-exploitation tradeoff. We use LLMs to explore and exploit in silos in various (contextual) bandit tasks. We find that while the current LLMs often struggle to exploit, in-context mitigations may be used to substantially improve performance for small-scale tasks. However even then, LLMs perform worse than a simple linear regression. On the other hand, we find that LLMs do help at exploring large action spaces with inherent semantics, by suggesting suitable candidates to explore.

arxiv preprint arxiv, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2502.00225

Country: North America > United States (0.92)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Banking & Finance > Trading (1.00)
Energy > Oil & Gas > Upstream (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Can large language models explore in-context?

Krishnamurthy, Akshay, Harris, Keegan, Foster, Dylan J., Zhang, Cyril, Slivkins, Aleksandrs

arXiv.org Artificial IntelligenceJul-12-2024

We investigate the extent to which contemporary Large Language Models (LLMs) can engage in exploration, a core capability in reinforcement learning and decision making. We focus on native performance of existing LLMs, without training interventions. We deploy LLMs as agents in simple multi-armed bandit environments, specifying the environment description and interaction history entirely in-context, i.e., within the LLM prompt. We experiment with GPT-3.5, GPT-4, and Llama2, using a variety of prompt designs, and find that the models do not robustly engage in exploration without substantial interventions: i) Across all of our experiments, only one configuration resulted in satisfactory exploratory behavior: GPT-4 with chain-of-thought reasoning and an externally summarized interaction history, presented as sufficient statistics; ii) All other configurations did not result in robust exploratory behavior, including those with chain-of-thought reasoning but unsummarized history. Although these findings can be interpreted positively, they suggest that external summarization -- which may not be possible in more complex settings -- is important for obtaining desirable behavior from LLM agents. We conclude that non-trivial algorithmic interventions, such as fine-tuning or dataset curation, may be required to empower LLM-based decision making agents in complex settings.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2403.15371

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

Regret Minimization in Stackelberg Games with Side Information

Harris, Keegan, Wu, Zhiwei Steven, Balcan, Maria-Florina

arXiv.org Artificial IntelligenceMay-23-2024

Algorithms for playing in Stackelberg games have been deployed in real-world domains including airport security, anti-poaching efforts, and cyber-crime prevention. However, these algorithms often fail to take into consideration the additional information available to each player (e.g. traffic patterns, weather conditions, network congestion), a salient feature of reality which may significantly affect both players' optimal strategies. We formalize such settings as Stackelberg games with side information, in which both players observe an external context before playing. The leader commits to a (context-dependent) strategy, and the follower best-responds to both the leader's strategy and the context. We focus on the online setting in which a sequence of followers arrive over time, and the context may change from round-to-round. In sharp contrast to the non-contextual version, we show that it is impossible for the leader to achieve good performance (measured by regret) in the full adversarial setting. Motivated by our impossibility result, we show that no-regret learning is possible in two natural relaxations: the setting in which the sequence of followers is chosen stochastically and the sequence of contexts is adversarial, and the setting in which the sequence of contexts is stochastic and the sequence of followers is chosen by an adversary.

artificial intelligence, machine learning, sequence, (17 more...)

arXiv.org Artificial Intelligence

2402.08576

Country: North America > United States (0.67)

Genre: Research Report (0.50)

Industry:

Transportation (1.00)
Information Technology > Security & Privacy (1.00)
Leisure & Entertainment (0.93)
(2 more...)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback

Incentive-Aware Synthetic Control: Accurate Counterfactual Estimation via Incentivized Exploration

Ngo, Daniel, Harris, Keegan, Agarwal, Anish, Syrgkanis, Vasilis, Wu, Zhiwei Steven

arXiv.org Artificial IntelligenceDec-26-2023

We consider a panel data setting in which one observes measurements of units over time, under different interventions. Our focus is on the canonical family of synthetic control methods (SCMs) which, after a pre-intervention time period when all units are under control, estimate counterfactual outcomes for test units in the post-intervention time period under control by using data from donor units who have remained under control for the entire post-intervention period. In order for the counterfactual estimate produced by synthetic control for a test unit to be accurate, there must be sufficient overlap between the outcomes of the donor units and the outcomes of the test unit. As a result, a canonical assumption in the literature on SCMs is that the outcomes for the test units lie within either the convex hull or the linear span of the outcomes for the donor units. However despite their ubiquity, such overlap assumptions may not always hold, as is the case when, e.g., units select their own interventions and different subpopulations of units prefer different interventions a priori. We shed light on this typically overlooked assumption, and we address this issue by incentivizing units with different preferences to take interventions they would not normally consider. Specifically, we provide a SCM for incentivizing exploration in panel data settings which provides incentive-compatible intervention recommendations to units by leveraging tools from information design and online learning. Using our algorithm, we show how to obtain valid counterfactual estimates using SCMs without the need for an explicit overlap assumption on the unit outcomes.

artificial intelligence, intervention, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2312.16307

Country: North America > United States > California (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback

Strategyproof Decision-Making in Panel Data Settings and Beyond

Harris, Keegan, Agarwal, Anish, Podimata, Chara, Wu, Zhiwei Steven

arXiv.org Artificial IntelligenceDec-21-2023

We consider the problem of decision-making using panel data, in which a decision-maker gets noisy, repeated measurements of multiple units (or agents). We consider a setup where there is a pre-intervention period, when the principal observes the outcomes of each unit, after which the principal uses these observations to assign a treatment to each unit. Unlike this classical setting, we permit the units generating the panel data to be strategic, i.e. units may modify their pre-intervention outcomes in order to receive a more desirable intervention. The principal's goal is to design a strategyproof intervention policy, i.e. a policy that assigns units to their utility-maximizing interventions despite their potential strategizing. We first identify a necessary and sufficient condition under which a strategyproof intervention policy exists, and provide a strategyproof mechanism with a simple closed form when one does exist. Along the way, we prove impossibility results for strategic multiclass classification, which may be of independent interest. When there are two interventions, we establish that there always exists a strategyproof mechanism, and provide an algorithm for learning such a mechanism. For three or more interventions, we provide an algorithm for learning a strategyproof mechanism if there exists a sufficiently large gap in the principal's rewards between different interventions. Finally, we empirically evaluate our model using real-world panel data collected from product sales over 18 months. We find that our methods compare favorably to baselines which do not take strategic interactions into consideration, even in the presence of model misspecification.

artificial intelligence, intervention, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2211.14236

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (1.00)

Industry: Education (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Algorithmic Persuasion Through Simulation: Information Design in the Age of Generative AI

Harris, Keegan, Immorlica, Nicole, Lucier, Brendan, Slivkins, Aleksandrs

arXiv.org Artificial IntelligenceNov-29-2023

How can an informed sender persuade a receiver, having only limited information about the receiver's beliefs? Motivated by research showing generative AI can simulate economic agents, we initiate the study of information design with an oracle. We assume the sender can learn more about the receiver by querying this oracle, e.g., by simulating the receiver's behavior. Aside from AI motivations such as general-purpose Large Language Models (LLMs) and problem-specific machine learning models, alternate motivations include customer surveys and querying a small pool of live users. Specifically, we study Bayesian Persuasion where the sender has a second-order prior over the receiver's beliefs. After a fixed number of queries to an oracle to refine this prior, the sender commits to an information structure. Upon receiving the message, the receiver takes a payoff-relevant action maximizing her expected utility given her posterior beliefs. We design polynomial-time querying algorithms that optimize the sender's expected utility in this Bayesian Persuasion game. As a technical contribution, we show that queries form partitions of the space of receiver beliefs that can be used to quantify the sender's knowledge.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2311.18138

Country:

North America > United States > Hawaii (0.14)
Europe > Middle East > Cyprus (0.14)

Genre:

Research Report (0.82)
Questionnaire & Opinion Survey (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.60)

Add feedback

Meta-Learning Adversarial Bandit Algorithms

Khodak, Mikhail, Osadchiy, Ilya, Harris, Keegan, Balcan, Maria-Florina, Levy, Kfir Y., Meir, Ron, Wu, Zhiwei Steven

arXiv.org Machine LearningNov-1-2023

We study online meta-learning with bandit feedback, with the goal of improving performance across multiple tasks if they are similar according to some natural similarity measure. As the first to target the adversarial online-within-online partial-information setting, we design meta-algorithms that combine outer learners to simultaneously tune the initialization and other hyperparameters of an inner learner for two important cases: multi-armed bandits (MAB) and bandit linear optimization (BLO). For MAB, the meta-learners initialize and set hyperparameters of the Tsallis-entropy generalization of Exp3, with the task-averaged regret improving if the entropy of the optima-in-hindsight is small. For BLO, we learn to initialize and tune online mirror descent (OMD) with self-concordant barrier regularizers, showing that task-averaged regret varies directly with an action space-dependent measure they induce. Our guarantees rely on proving that unregularized follow-the-leader combined with two levels of low-dimensional hyperparameter tuning is enough to learn a sequence of affine functions of non-Lipschitz and sometimes non-convex Bregman divergences bounding the regret of OMD.

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Machine Learning

2307.02295

Country:

North America > United States (0.45)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre:

Research Report (1.00)
Instructional Material (0.66)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Adaptive Principal Component Regression with Applications to Panel Data

Agarwal, Anish, Harris, Keegan, Whitehouse, Justin, Wu, Zhiwei Steven

arXiv.org Machine LearningOct-27-2023

Principal component regression (PCR) is a popular technique for fixed-design error-in-variables regression, a generalization of the linear regression setting in which the observed covariates are corrupted with random noise. We provide the first time-uniform finite sample guarantees for online (regularized) PCR whenever data is collected adaptively. Since the proof techniques for analyzing PCR in the fixed design setting do not readily extend to the online setting, our results rely on adapting tools from modern martingale concentration to the error-in-variables setting. As an application of our bounds, we provide a framework for experiment design in panel data settings when interventions are assigned adaptively. Our framework may be thought of as a generalization of the synthetic control and synthetic interventions frameworks, where data is collected via an adaptive intervention assignment policy.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Machine Learning

2307.01357

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Add feedback

Strategic Apple Tasting

Harris, Keegan, Podimata, Chara, Wu, Zhiwei Steven

arXiv.org Artificial IntelligenceOct-27-2023

Algorithmic decision-making in high-stakes domains often involves assigning decisions to agents with incentives to strategically modify their input to the algorithm. In addition to dealing with incentives, in many domains of interest (e.g. lending and hiring) the decision-maker only observes feedback regarding their policy for rounds in which they assign a positive decision to the agent; this type of feedback is often referred to as apple tasting (or one-sided) feedback. We formalize this setting as an online learning problem with apple-tasting feedback where a principal makes decisions about a sequence of $T$ agents, each of which is represented by a context that may be strategically modified. Our goal is to achieve sublinear strategic regret, which compares the performance of the principal to that of the best fixed policy in hindsight, if the agents were truthful when revealing their contexts. Our main result is a learning algorithm which incurs $O (\sqrt{T})$ strategic regret when the sequence of agents is chosen stochastically. We also give an algorithm capable of handling adversarially-chosen agents, albeit at the cost of $O(T^{(d+1)/(d+2)})$ strategic regret (where $d$ is the dimension of the context). Our algorithms can be easily adapted to the setting where the principal receives bandit feedback -- this setting generalizes both the linear contextual bandit problem (by considering agents with incentives) and the strategic classification problem (by allowing for partial feedback).

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2306.0625

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > United Kingdom > England (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Hawaii (0.14)

Genre: Research Report (1.00)

Industry: Education > Educational Setting > Online (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Data Science > Data Mining > Big Data (0.66)

Add feedback