Collaborating Authors

 Jordan, Michael I.


Reduced-Rank Multi-objective Policy Learning and Optimization

arXiv.org Machine Learning

Evaluating the causal impacts of possible interventions is crucial for informing decision-making, especially towards improving access to opportunity. Moreover, if causal effects are heterogeneous and predictable from covariates, personalized treatment decisions can improve individual outcomes and contribute to both efficiency and equity. In practice, however, causal researchers do not have a single outcome in mind a priori and often collect multiple outcomes of interest that are noisy estimates of the true target of interest. For example, in government-assisted social benefit programs, policymakers collect many outcomes to understand the multidimensional nature of poverty. The ultimate goal is to learn an optimal treatment policy that in some sense maximizes multiple outcomes simultaneously. To address such issues, we present a data-driven dimensionality-reduction methodology for multiple outcomes in the context of optimal policy learning with multiple objectives. We learn a low-dimensional representation of the true outcome from the observed outcomes using reduced rank regression. We develop a suite of estimators that use the model to denoise observed outcomes, including commonly used index weightings. These methods reduce estimation error in policy evaluation and optimization, as we demonstrate on a case study of real-world cash transfer and social intervention data. Reducing the variance of noisy social outcomes can improve the performance of algorithmic allocations.
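
To make the reduced-rank denoising step concrete, here is a minimal sketch (not the authors' code; the data, dimensions, and rank are hypothetical) of denoising multiple noisy outcomes with reduced rank regression: fit the outcomes on covariates by least squares, extract the top singular subspace of the fitted values, and project the observed outcomes onto that subspace.

```python
# Minimal sketch: denoise multiple noisy outcomes via reduced rank regression.
import numpy as np

rng = np.random.default_rng(0)
n, p, q, r = 500, 10, 8, 2          # samples, covariates, outcomes, assumed rank

X = rng.normal(size=(n, p))
B_true = rng.normal(size=(p, r)) @ rng.normal(size=(r, q))  # rank-r signal
Y = X @ B_true + rng.normal(scale=2.0, size=(n, q))         # noisy observed outcomes

B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)   # unrestricted least squares fit
F = X @ B_ols                                   # fitted values
U, s, Vt = np.linalg.svd(F, full_matrices=False)

# Denoised outcomes: project observed Y onto the learned r-dim outcome subspace.
Y_denoised = Y @ Vt[:r].T @ Vt[:r]
print(np.linalg.norm(Y - X @ B_true),           # error of raw outcomes
      np.linalg.norm(Y_denoised - X @ B_true))  # error after denoising (smaller)
```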


Collaborative Heterogeneous Causal Inference Beyond Meta-analysis

arXiv.org Machine Learning

Collaboration between different data centers is often challenged by heterogeneity across sites. To account for this heterogeneity, the state-of-the-art approach is to re-weight the covariate distributions in each site to match the distribution of the target population. Nevertheless, this method can easily fail when a given site does not cover the entire population. Moreover, it still relies on traditional meta-analysis after adjusting for the distribution shift. In this work, we propose a collaborative inverse propensity score weighting estimator for causal inference with heterogeneous data. Instead of adjusting for the distribution shift at each site separately, we use weighted propensity score models to collaboratively adjust for the distribution shift. Our method shows significant improvements over methods based on meta-analysis as heterogeneity increases. To account for vulnerable density estimation, we further discuss the double machine learning method and show the possibility of using nonparametric density estimation with $d<8$ and a flexible machine learning method to guarantee asymptotic normality. We also propose a federated learning algorithm that collaboratively trains the outcome model while preserving privacy. Using synthetic and real datasets, we demonstrate the advantages of our method.
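
As a point of reference for the collaborative estimator, the sketch below shows the plain inverse propensity score weighting (IPW) building block on synthetic multi-site data; the pooled variant only caricatures the paper's collaborative weighting, and all names and numbers are hypothetical.

```python
# Minimal sketch of the IPW building block across heterogeneous sites.
import numpy as np
from sklearn.linear_model import LogisticRegression

def ipw_ate(X, T, Y):
    """Standard IPW estimate of the average treatment effect."""
    e = LogisticRegression().fit(X, T).predict_proba(X)[:, 1]  # propensity scores
    e = np.clip(e, 0.01, 0.99)                                 # stabilize weights
    return np.mean(T * Y / e - (1 - T) * Y / (1 - e))

rng = np.random.default_rng(1)
sites = []
for shift in (0.0, 1.0, -1.0):                  # covariate shift across 3 sites
    X = rng.normal(loc=shift, size=(400, 3))
    T = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
    Y = 2.0 * T + X.sum(axis=1) + rng.normal(size=400)  # true ATE = 2
    sites.append((X, T, Y))

# Naive meta-analysis: average the per-site IPW estimates.
print("meta:", np.mean([ipw_ate(*s) for s in sites]))
# Pooled variant: fit a single propensity model on all sites jointly.
Xa, Ta, Ya = (np.concatenate(z) for z in zip(*sites))
print("pooled:", ipw_ate(Xa, Ta, Ya))
```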


Incentivized Learning in Principal-Agent Bandit Games

arXiv.org Machine Learning

This work considers a repeated principal-agent bandit game, where the principal can only interact with her environment through the agent. The principal and the agent have misaligned objectives and the choice of action is only left to the agent. However, the principal can influence the agent's decisions by offering incentives which add up to his rewards. The principal aims to iteratively learn an incentive policy to maximize her own utility.

Real-world decision-making problems, however, often present challenges that are not addressed in this simple optimization framework. These include the challenge of scarcity when there are multiple decision-makers, issues of misaligned objectives, and problems arising from information asymmetries and signaling. The economics literature addresses these issues through the design of game-theoretic mechanisms, including auctions and contracts (see, e.g., Myerson, 1989; Laffont & Martimort, 2009), aiming to achieve favorable outcomes despite agents' self-interest and limited information.
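
The repeated interaction described above is easy to simulate. The sketch below is not the authors' algorithm; the reward values and the bisection probe are hypothetical simplifications showing a best-responding agent and a principal searching for the minimal incentive that steers him to her preferred arm.

```python
# Minimal sketch of the principal-agent incentive protocol (illustrative only).
import numpy as np

mu_agent = np.array([0.9, 0.2, 0.5])      # agent's mean rewards (unknown to principal)
mu_principal = np.array([0.1, 0.8, 0.4])  # principal's mean rewards

def agent_choice(incentive):
    """The agent best-responds to the offered incentive vector."""
    return int(np.argmax(mu_agent + incentive))

target = int(np.argmax(mu_principal))     # arm the principal wants played
lo, hi = 0.0, 2.0                         # bisect the minimal incentive on that arm
for _ in range(30):
    mid = (lo + hi) / 2
    incentive = np.zeros(3)
    incentive[target] = mid
    if agent_choice(incentive) == target:
        hi = mid                          # incentive large enough; try smaller
    else:
        lo = mid
print(f"approx. minimal incentive for arm {target}: {hi:.4f}")
print(f"principal's net utility: {mu_principal[target] - hi:.4f}")
```

In the paper's setting the agent's rewards are only observed through noisy bandit feedback, so the principal must explore as well as probe; the sketch uses known means purely to isolate the incentive-search logic.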


A Specialized Semismooth Newton Method for Kernel-Based Optimal Transport

arXiv.org Artificial Intelligence

Kernel-based optimal transport (OT) estimators offer an alternative, functional estimation procedure to address OT problems from samples. Recent works suggest that these estimators are more statistically efficient than plug-in (linear programming-based) OT estimators when comparing probability measures in high dimensions (Vacher et al., 2021). Unfortunately, that statistical benefit comes at a very steep computational price: because their computation relies on the short-step interior-point method (SSIPM), which comes with a large iteration count in practice, these estimators quickly become intractable w.r.t. sample size $n$. To scale these estimators to larger $n$, we propose a nonsmooth fixed-point model for the kernel-based OT problem, and show that it can be efficiently solved via a specialized semismooth Newton (SSN) method: by exploiting the problem's structure, we show that the per-iteration cost of performing one SSN step can be significantly reduced in practice. We prove that our SSN method achieves a global convergence rate of $O(1/\sqrt{k})$ and a local quadratic convergence rate under standard regularity conditions. We show substantial speedups over SSIPM on both synthetic and real datasets.
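
To illustrate the general technique, the following sketch applies a semismooth Newton iteration to a small nonnegative least-squares problem rather than the paper's kernel OT formulation; the nonsmooth fixed-point residual and the generalized-Jacobian element mirror the structure described above, but the problem and constants are hypothetical stand-ins.

```python
# Minimal sketch of a semismooth Newton (SSN) method on the nonsmooth
# fixed-point equation F(x) = x - max(x - t*grad(x), 0) = 0 for NNLS.
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(30, 10))
b = rng.normal(size=30)
H = A.T @ A
t = 1.0 / np.linalg.norm(H, 2)            # step size 1/L

def F(x):
    return x - np.maximum(x - t * (H @ x - A.T @ b), 0.0)

x = np.zeros(10)
for k in range(50):
    r = F(x)
    if np.linalg.norm(r) < 1e-10:
        break
    d = (x - t * (H @ x - A.T @ b)) > 0   # active branch of the max
    # One element of the generalized (Clarke) Jacobian of F at x:
    J = np.eye(10) - np.diag(d.astype(float)) @ (np.eye(10) - t * H)
    # Undamped Newton step; a practical solver would add a line search.
    x = x - np.linalg.solve(J, r)
print(k, np.linalg.norm(F(x)))            # typically converges in a few steps
```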


Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF

arXiv.org Artificial Intelligence

A key ingredient in the roll-out of LLMs is the fine-tuning step, in which the models are brought into closer alignment with specific behavioral and normative goals. When not adequately fine-tuned, LLMs may exhibit undesirable and unpredictable behavior, including the fabrication of facts or the generation of biased and toxic content (Perez et al., 2022; Ganguli et al., 2022). The current approach to mitigating such problems is to make use of reinforcement learning based on human assessments. In particular, Reinforcement Learning with Human Feedback (RLHF) learns a reward function from pairwise or multi-wise human comparisons of model responses, and then fine-tunes the language model based on the learned reward function (Ziegler et al., 2019; Ouyang et al., 2022; Schulman et al., 2022). Following on from a supervised learning stage, a typical RLHF protocol involves two main steps. Reward learning: sample prompts from a prompt dataset and generate multiple responses for the same prompt.
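
The reward-learning step admits a compact illustration. Below is a minimal sketch of fitting a Bradley-Terry reward model to pairwise comparisons with a logistic loss, the standard formulation referenced above; the features are hypothetical stand-ins for a learned response representation.

```python
# Minimal sketch: Bradley-Terry reward learning from pairwise comparisons.
import numpy as np

rng = np.random.default_rng(4)
d, n_pairs = 16, 2000
theta_true = rng.normal(size=d)                    # "true" reward direction
phi_w = rng.normal(size=(n_pairs, d))              # features of response A
phi_l = rng.normal(size=(n_pairs, d))              # features of response B
# Labels follow the Bradley-Terry model: P(A preferred) = sigmoid(r(A) - r(B)).
p = 1 / (1 + np.exp(-(phi_w - phi_l) @ theta_true))
y = rng.binomial(1, p)                             # 1 if annotator prefers A

theta = np.zeros(d)
for _ in range(500):                               # gradient descent on logistic loss
    z = (phi_w - phi_l) @ theta
    grad = (phi_w - phi_l).T @ (1 / (1 + np.exp(-z)) - y) / n_pairs
    theta -= 0.5 * grad
print("cosine to true reward:",
      theta @ theta_true / (np.linalg.norm(theta) * np.linalg.norm(theta_true)))
```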


Accelerated First-Order Optimization under Nonlinear Constraints

arXiv.org Machine Learning

We exploit analogies between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms for constrained optimization. Unlike Frank-Wolfe or projected gradient methods, these algorithms avoid optimization over the entire feasible set at each iteration. We prove convergence to stationary points even in a nonconvex setting, and we derive accelerated rates for the convex setting in both continuous and discrete time. An important property of these algorithms is that constraints are expressed in terms of velocities instead of positions, which naturally leads to sparse, local, and convex approximations of the feasible set (even if the feasible set is nonconvex). Thus, the complexity tends to grow mildly in the number of decision variables and in the number of constraints, which makes the algorithms suitable for machine learning applications. We apply our algorithms to a compressed sensing and a sparse regression problem, showing that we can treat nonconvex $\ell^p$ constraints ($p<1$) efficiently while recovering state-of-the-art performance for $p=1$.
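
The velocity-constraint idea can be shown on a toy problem. The sketch below omits momentum for brevity (so it is the unaccelerated variant), and the objective, constraint, and constants are hypothetical; it replaces projection onto the feasible set with a cheap projection of the velocity onto a halfspace built from the linearized constraint.

```python
# Minimal sketch: constrain velocities, not positions, for {x : g(x) <= 0}.
import numpy as np

def f_grad(x): return x - np.array([2.0, 1.0])   # grad of 0.5*||x - (2,1)||^2
def g(x):      return np.sum(x**2) - 1.0         # constraint ||x||^2 <= 1
def g_grad(x): return 2.0 * x

x, dt, alpha = np.array([0.0, 0.0]), 0.05, 2.0
for _ in range(400):
    v = -f_grad(x)                        # unconstrained velocity: steepest descent
    a, c = g_grad(x), -alpha * g(x)       # local halfspace {v : a.v <= -alpha*g(x)}
    if a @ v > c:                         # project v onto the halfspace if violated
        v = v - (a @ v - c) / (a @ a + 1e-12) * a
    x = x + dt * v
print(x, g(x))    # approaches the boundary point nearest (2, 1), g(x) ~ 0
```

Only the gradient of the violated constraint at the current point enters each step, which is why the per-iteration subproblem stays a small convex projection even when the feasible set is nonconvex.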


Towards Optimal Statistical Watermarking

arXiv.org Machine Learning

We study statistical watermarking by formulating it as a hypothesis testing problem, a general framework that subsumes all previous statistical watermarking methods. Key to our formulation is a coupling of the output tokens and the rejection region, realized by pseudo-random generators in practice, which allows a non-trivial trade-off between the Type I and Type II errors. We characterize the Uniformly Most Powerful (UMP) watermark in this context. In the most common scenario where the output is a sequence of $n$ tokens, we establish matching upper and lower bounds on the number of i.i.d. tokens required to guarantee small Type I and Type II errors. Our rate scales as $\Theta(h^{-1} \log (1/h))$ with respect to the average entropy per token $h$, and thus greatly improves the $O(h^{-2})$ rate in previous works. For scenarios where the detector lacks knowledge of the model's distribution, we introduce the concept of model-agnostic watermarking and establish minimax bounds on the resultant increase in Type II error. Moreover, we formulate the robust watermarking problem, where the user is allowed to perform a class of perturbations on the generated text, and characterize the optimal Type II error of robust UMP tests via a linear programming problem. To the best of our knowledge, this is the first systematic statistical treatment of the watermarking problem with near-optimal rates in the i.i.d. setting, and it may be of interest for future work.
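
For intuition about the token/rejection-region coupling, here is one concrete pseudo-random coupling, the exponential-minimum ("Gumbel trick") watermark. This is an illustrative scheme, not the UMP construction from the paper; the key, vocabulary size, and PRG are hypothetical.

```python
# Minimal sketch: couple token sampling and detection via a shared PRG.
import numpy as np

V, KEY = 50, 12345                              # vocabulary size, shared secret key

def prg(key, step):
    """Pseudo-random uniforms shared by generator and detector."""
    return np.random.default_rng(hash((key, step)) % 2**32).random(V)

def generate(probs_seq, key):
    """Watermarked decoding: token_t = argmax_i u_i**(1/p_i), in log form."""
    return [int(np.argmax(np.log(prg(key, t)) / p))
            for t, p in enumerate(probs_seq)]

def detect_score(tokens, key):
    """A large sum of -log(1 - u_token) indicates the watermark."""
    return sum(-np.log(1.0 - prg(key, t)[tok]) for t, tok in enumerate(tokens))

rng = np.random.default_rng(5)
probs_seq = rng.dirichlet(np.ones(V) * 0.3, size=100)   # per-step model distributions
marked = generate(probs_seq, KEY)
unmarked = [int(rng.choice(V, p=p)) for p in probs_seq]
print("watermarked score:  ", detect_score(marked, KEY))    # well above n = 100
print("unwatermarked score:", detect_score(unmarked, KEY))  # close to n = 100
```

The watermarked output is distributed exactly like ordinary sampling for someone without the key, while the keyed detector sees token-aligned uniforms biased toward 1; the gap between the two scores grows with per-token entropy, in line with the entropy-dependent rates above.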


Operationalizing Counterfactual Metrics: Incentives, Ranking, and Information Asymmetry

arXiv.org Artificial Intelligence

From the social sciences to machine learning, it has been well documented that the metrics being optimized are not always aligned with social welfare. In healthcare, Dranove et al. (2003) showed that publishing surgery mortality metrics actually harmed the welfare of sicker patients by increasing provider selection behavior. We analyze the incentive misalignments that arise from such average treated outcome metrics, and show that the incentives driving treatment decisions would align with maximizing total patient welfare if the metrics (i) accounted for counterfactual untreated outcomes and (ii) considered total welfare instead of averaging over treated patients. Operationalizing this, we show how counterfactual metrics can be modified to behave reasonably in patient-facing ranking systems. Extending to realistic settings in which providers observe more about patients than the regulatory agencies do, we bound the decay in performance by the degree of information asymmetry between principal and agent. In doing so, our model connects principal-agent information asymmetry with unobserved heterogeneity in causal inference.
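
A toy computation (hypothetical numbers, not the paper's model) makes the misalignment tangible: ranking by average treated outcome rewards selecting healthy patients, while a counterfactual, total-welfare metric credits treating those who benefit most.

```python
# Toy contrast: average treated outcome vs. counterfactual total welfare.
import numpy as np

rng = np.random.default_rng(6)
n = 10_000
severity = rng.random(n)                 # sicker patients have higher severity
y0 = 1.0 - severity                      # untreated outcome
y1 = y0 + 0.8 * severity                 # treatment helps the sick the most

for name, treat in [("select healthy", severity < 0.5),
                    ("select sick   ", severity >= 0.5)]:
    avg_treated = y1[treat].mean()                       # classic reported metric
    welfare_gain = np.where(treat, y1 - y0, 0.0).sum()   # counterfactual welfare
    print(f"{name}: avg treated outcome={avg_treated:.3f}, "
          f"total welfare gain={welfare_gain:.0f}")
```

Selecting healthy patients wins on the average treated outcome metric yet produces roughly a third of the welfare gain, which is the selection behavior the counterfactual metric is designed to remove.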


A Quadratic Speedup in Finding Nash Equilibria of Quantum Zero-Sum Games

arXiv.org Artificial Intelligence

Recent developments in domains such as non-local games, quantum interactive proofs, and quantum generative adversarial networks have renewed interest in quantum game theory and, specifically, quantum zero-sum games. Central to classical game theory is the efficient algorithmic computation of Nash equilibria, which represent optimal strategies for both players. In 2008, Jain and Watrous proposed the first classical algorithm for computing equilibria in quantum zero-sum games, using the Matrix Multiplicative Weights Update (MMWU) method to achieve a convergence rate of $\mathcal{O}(d/\epsilon^2)$ iterations to $\epsilon$-Nash equilibria in the $4^d$-dimensional spectraplex. In this work, we propose a hierarchy of quantum optimization algorithms that generalize MMWU via an extra-gradient mechanism. Notably, within this proposed hierarchy, we introduce the Optimistic Matrix Multiplicative Weights Update (OMMWU) algorithm and establish its average-iterate convergence complexity as $\mathcal{O}(d/\epsilon)$ iterations to $\epsilon$-Nash equilibria. This quadratic speed-up relative to Jain and Watrous' original algorithm sets a new benchmark for computing $\epsilon$-Nash equilibria in quantum zero-sum games.
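
The MMWU/OMMWU updates are short enough to sketch directly. The example below runs optimistic matrix multiplicative weights on the spectraplex for a toy bilinear min-max problem $f(X, Y) = \mathrm{tr}(X C Y)$ rather than a full quantum game; the payoff matrix and step size are hypothetical, and the optimism consists of adding the most recent gradient once more as a prediction of the next one.

```python
# Minimal sketch: optimistic matrix multiplicative weights on the spectraplex.
import numpy as np

def spectraplex_exp(S):
    """Normalized matrix exponential exp(S)/tr(exp(S)) for symmetric S."""
    w, Q = np.linalg.eigh(S)
    E = Q @ np.diag(np.exp(w - w.max())) @ Q.T   # shift for numerical stability
    return E / np.trace(E)

rng = np.random.default_rng(7)
d, eta, T = 4, 0.3, 500
C = rng.normal(size=(d, d))
C = (C + C.T) / 2                                # payoff matrix for tr(X C Y)

GX = np.zeros((d, d)); GY = np.zeros((d, d))     # cumulative gradients
gx = np.zeros((d, d)); gy = np.zeros((d, d))     # last gradients (optimism term)
for _ in range(T):
    X = spectraplex_exp(-eta * (GX + gx))        # min player, optimistic step
    Y = spectraplex_exp(+eta * (GY + gy))        # max player
    gx = (C @ Y + Y @ C.T) / 2                   # symmetrized grad_X of tr(X C Y)
    gy = (C.T @ X + X @ C) / 2                   # symmetrized grad_Y of tr(X C Y)
    GX += gx; GY += gy
# Duality gap from best responses; it shrinks toward 0 near equilibrium.
print(np.linalg.eigvalsh(gy).max() - np.linalg.eigvalsh(gx).min())
```

Dropping the `gx`/`gy` optimism terms from the two `spectraplex_exp` calls recovers plain MMWU, whose gap decays at the slower $\mathcal{O}(1/\sqrt{T})$ rate.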


Improved Bayes Risk Can Yield Reduced Social Welfare Under Competition

arXiv.org Machine Learning

As the scale of machine learning models increases, trends such as scaling laws anticipate consistent downstream improvements in predictive accuracy. However, these trends take the perspective of a single model provider in isolation, while in reality providers often compete with each other for users. In this work, we demonstrate that competition can fundamentally alter the behavior of these scaling trends, even causing overall predictive accuracy across users to be non-monotonic or decreasing with scale. We define a model of competition for classification tasks, and use data representations as a lens for studying the impact of increases in scale. We find many settings where improving data representation quality (as measured by Bayes risk) decreases the overall predictive accuracy across users (i.e., social welfare) for a marketplace of competing model providers. Our examples range from closed-form formulas in simple settings to simulations with pretrained representations on CIFAR-10. At a conceptual level, our work suggests that favorable scaling trends for individual model providers need not translate to downstream improvements in social welfare in marketplaces with multiple model providers.
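
The competition model is straightforward to simulate in miniature. The sketch below is a hypothetical setup, not the paper's experiments: each user chooses whichever provider is more accurate for her, and social welfare is average accuracy under those choices, so welfare depends on the allocation of users rather than on either provider's standalone accuracy alone.

```python
# Minimal sketch of a marketplace of competing predictors.
import numpy as np

rng = np.random.default_rng(8)
n_users, n_pts = 200, 50
user_type = rng.random(n_users)                 # heterogeneous user populations

def provider_accuracy(user, bias):
    """Noisy per-user accuracy of a provider specialized around `bias`."""
    return float(np.mean(rng.random(n_pts) < 0.6 + 0.3 * (1 - abs(user - bias))))

# Two providers specialized toward different regions of the user space.
acc = np.array([[provider_accuracy(u, b) for b in (0.2, 0.8)] for u in user_type])
choice = acc.argmax(axis=1)                     # each user picks the better provider
welfare = acc[np.arange(n_users), choice].mean()
print("provider shares:", np.bincount(choice, minlength=2) / n_users)
print("social welfare (avg accuracy of chosen provider):", round(welfare, 3))
```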