Cesa-Bianchi, Nicolò
Of Dice and Games: A Theory of Generalized Boosting
Bressan, Marco, Brukhim, Nataly, Cesa-Bianchi, Nicolò, Esposito, Emmanuel, Mansour, Yishay, Moran, Shay, Thiessen, Maximilian
Cost-sensitive loss functions are crucial in many real-world prediction problems, where different types of errors are penalized differently; for example, in medical diagnosis, a false negative prediction can lead to worse consequences than a false positive prediction. However, traditional PAC learning theory has mostly focused on the symmetric 0-1 loss, leaving cost-sensitive losses largely unaddressed. In this work, we extend the celebrated theory of boosting to incorporate both cost-sensitive and multi-objective losses. Cost-sensitive losses assign costs to the entries of a confusion matrix, and are used to control the sum of prediction errors accounting for the cost of each error type. Multi-objective losses, on the other hand, simultaneously track multiple cost-sensitive losses, and are useful when the goal is to satisfy several criteria at once (e.g., minimizing false positives while keeping false negatives below a critical threshold). We develop a comprehensive theory of cost-sensitive and multi-objective boosting, providing a taxonomy of weak learning guarantees that distinguishes which guarantees are trivial (i.e., can always be achieved), which ones are boostable (i.e., imply strong learning), and which ones are intermediate, implying non-trivial yet not arbitrarily accurate learning. For binary classification, we establish a dichotomy: a weak learning guarantee is either trivial or boostable. In the multiclass setting, we describe a more intricate landscape of intermediate weak learning guarantees. Our characterization relies on a geometric interpretation of boosting, revealing a surprising equivalence between cost-sensitive and multi-objective losses.
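As a concrete illustration of the cost-sensitive losses described above (the function name, the cost matrix, and its values are ours, chosen purely for illustration), here is a minimal sketch in which the entries of a confusion-matrix cost assignment penalize false negatives more heavily than false positives:

```python
import numpy as np

def cost_sensitive_loss(y_true, y_pred, cost):
    # cost[i, j] is the price of predicting class j when the true class is i;
    # the loss averages these prices over the sample (illustrative semantics).
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return cost[y_true, y_pred].mean()

# Binary example echoing the medical-diagnosis scenario: a false negative
# (true 1, predicted 0) costs five times as much as a false positive.
cost = np.array([[0.0, 1.0],
                 [5.0, 0.0]])
print(cost_sensitive_loss([0, 1, 1, 0], [0, 0, 1, 1], cost))  # 1.5
```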
Distributed Online Optimization with Stochastic Agent Availability
Achddou, Juliette, Cesa-Bianchi, Nicolò, Qiu, Hao
Motivated by practical federated learning settings where clients may not always be available, we investigate a variant of distributed online optimization where agents are active with a known probability $p$ at each time step, and communication between neighboring agents can only take place if they are both active. We introduce a distributed variant of the FTRL algorithm and analyze its network regret, defined through the average of the instantaneous regret of the active agents. Our analysis shows that, for any connected communication graph $G$ over $N$ agents, the expected network regret of our …

In this work we focus on distributed online optimization (DOO), an online learning variant of distributed convex optimization in which each agent faces an adversarial sequence of convex loss functions (Hosseini et al., 2013). The goal of an agent is to minimize its regret with respect to a sequence of global loss functions, each obtained by summing the corresponding local losses of all agents. In both batch and online distributed optimization settings, the presence of the communication network, which limits the exchange of information to adjacent nodes, implies that agents must use some information-propagation technique to collect information about the global loss function.
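One natural way to formalize the network regret described above (the notation here is ours, not the paper's) is as the cumulative average instantaneous regret of the active agents: writing $A_t$ for the set of agents active at step $t$, $x_{t,i}$ for the prediction of agent $i$, and $\ell_t$ for the global loss,
\[
R_T \;=\; \sum_{t=1}^{T} \frac{1}{|A_t|} \sum_{i \in A_t} \bigl( \ell_t(x_{t,i}) - \ell_t(x^\star) \bigr),
\]
where $x^\star$ is the best fixed decision in hindsight.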
Market Making without Regret
Cesa-Bianchi, Nicolò, Cesari, Tommaso, Colomboni, Roberto, Foscari, Luigi, Pathak, Vinayak
We consider a sequential decision-making setting where, at every round $t$, a market maker posts a bid price $B_t$ and an ask price $A_t$ to an incoming trader (the taker) with a private valuation for one unit of some asset. If the trader's valuation is lower than the bid price, or higher than the ask price, then a trade (sell or buy) occurs. If a trade happens at round $t$, then letting $M_t$ be the market price (observed only at the end of round $t$), the maker's utility is $M_t - B_t$ if the maker bought the asset, and $A_t - M_t$ if they sold it. We characterize the maker's regret with respect to the best fixed choice of bid and ask pairs under a variety of assumptions (adversarial, i.i.d., and their variants) on the sequence of market prices and valuations. Our upper bound analysis unveils an intriguing connection relating market making to first-price auctions and dynamic pricing. Our main technical contribution is a lower bound for the i.i.d. case with Lipschitz distributions and independence between prices and valuations. The difficulty in the analysis stems from the unique structure of the reward and feedback functions, allowing an algorithm to acquire information by graduating the "cost of exploration" in an arbitrary way.
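The round structure above translates almost line by line into code; the following sketch (function and variable names are ours) makes the trade and utility rules explicit:

```python
def maker_round(bid, ask, valuation, market_price):
    # Trade rules from the abstract: the taker buys at the ask when their
    # valuation exceeds it, and sells at the bid when their valuation is
    # below it; utilities are settled against the end-of-round market price.
    if valuation > ask:       # taker buys: the maker sold the asset
        return ask - market_price
    if valuation < bid:       # taker sells: the maker bought the asset
        return market_price - bid
    return 0.0                # valuation inside the spread: no trade

print(maker_round(bid=9.0, ask=11.0, valuation=12.0, market_price=10.5))  # 0.5
```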
Improved Regret Bounds for Bandits with Expert Advice
Cesa-Bianchi, Nicolò, Eldowa, Khaled, Esposito, Emmanuel, Olkhovskaya, Julia
In this research note, we revisit the bandits with expert advice problem. Under a restricted feedback model, we prove a lower bound of order $\sqrt{K T \ln(N/K)}$ for the worst-case regret, where $K$ is the number of actions, $N>K$ the number of experts, and $T$ the time horizon. This matches a previously known upper bound of the same order and improves upon the best available lower bound of $\sqrt{K T (\ln N) / (\ln K)}$. For the standard feedback model, we prove a new instance-based upper bound that depends on the agreement between the experts and provides a logarithmic improvement compared to prior results.
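For a sense of the gap between the two lower bounds, a quick numerical check (the values of $K$ and $N$ are arbitrary illustrative choices):

```python
import math

K, N = 10, 1000                    # illustrative values with N > K
new = math.log(N / K)              # ln(N/K) ~ 4.61
old = math.log(N) / math.log(K)    # (ln N)/(ln K) = 3.0
# Larger is stronger for lower bounds: here sqrt(K*T*new) exceeds
# sqrt(K*T*old), so the new bound is strictly tighter at these values.
print(new, old)
```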
A Theory of Interpretable Approximations
Bressan, Marco, Cesa-Bianchi, Nicolò, Esposito, Emmanuel, Mansour, Yishay, Moran, Shay, Thiessen, Maximilian
Can a deep neural network be approximated by a small decision tree based on simple features? This question and its variants are behind the growing demand for machine learning models that are *interpretable* by humans. In this work we study such questions by introducing *interpretable approximations*, a notion that captures the idea of approximating a target concept $c$ by a small aggregation of concepts from some base class $\mathcal{H}$. In particular, we consider the approximation of a binary concept $c$ by decision trees based on a simple class $\mathcal{H}$ (e.g., of bounded VC dimension), and use the tree depth as a measure of complexity. Our primary contribution is the following remarkable trichotomy. For any given pair of $\mathcal{H}$ and $c$, exactly one of these cases holds: (i) $c$ cannot be approximated by $\mathcal{H}$ with arbitrary accuracy; (ii) $c$ can be approximated by $\mathcal{H}$ with arbitrary accuracy, but there exists no universal rate that bounds the complexity of the approximations as a function of the accuracy; or (iii) there exists a constant $\kappa$ that depends only on $\mathcal{H}$ and $c$ such that, for *any* data distribution and *any* desired accuracy level, $c$ can be approximated by $\mathcal{H}$ with a complexity not exceeding $\kappa$. This taxonomy stands in stark contrast to the landscape of supervised classification, which offers a complex array of distribution-free and universally learnable scenarios. We show that, in the case of interpretable approximations, even a slightly nontrivial a-priori guarantee on the complexity of approximations implies approximations with constant (distribution-free and accuracy-free) complexity. We extend our trichotomy to classes $\mathcal{H}$ of unbounded VC dimension and give characterizations of interpretability based on the algebra generated by $\mathcal{H}$.
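The notion can be illustrated empirically (this is only a toy experiment, not the paper's formal construction; the data, network, and depth are arbitrary choices of ours): fit a small decision tree to the labels produced by a trained network and measure how often the two agree.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
y = (X[:, 0] + X[:, 1] ** 2 > 1).astype(int)   # stand-in target concept c

net = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500).fit(X, y)

# Approximate the network with a depth-3 tree over the raw features
# (our stand-in for a simple base class H); depth plays the role of the
# complexity measure in the trichotomy above.
tree = DecisionTreeClassifier(max_depth=3).fit(X, net.predict(X))
agreement = (tree.predict(X) == net.predict(X)).mean()
print(f"tree/network agreement: {agreement:.2%}")
```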
Sparsity-Agnostic Linear Bandits with Adaptive Adversaries
Jin, Tianyuan, Jang, Kyoungseok, Cesa-Bianchi, Nicolò
We study stochastic linear bandits where, in each round, the learner receives a set of actions (i.e., feature vectors), from which it chooses an element and obtains a stochastic reward. The expected reward is a fixed but unknown linear function of the chosen action. We study sparse regret bounds that depend on the number $S$ of non-zero coefficients in the linear reward function. Previous works focused on the case where $S$ is known, or the action sets satisfy additional assumptions. In this work, we obtain the first sparse regret bounds that hold when $S$ is unknown and the action sets are adversarially generated. Our techniques combine online-to-confidence-set conversions with a novel randomized model selection approach over a hierarchy of nested confidence sets. When $S$ is known, our analysis recovers state-of-the-art bounds for adversarial action sets. We also show that a variant of our approach, using Exp3 to dynamically select the confidence sets, can be used to improve the empirical performance of stochastic linear bandits while enjoying a regret bound with optimal dependence on the time horizon.
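The interaction protocol can be sketched as follows (all constants and the placeholder policy are ours; the paper's algorithm would instead choose actions via the online-to-confidence-set conversions and model selection described above):

```python
import numpy as np

rng = np.random.default_rng(1)
d, S, T = 20, 3, 1000                  # toy dimension, sparsity, horizon
theta = np.zeros(d)                    # unknown S-sparse reward parameter
theta[rng.choice(d, size=S, replace=False)] = rng.normal(size=S)

total = 0.0
for t in range(T):
    actions = rng.normal(size=(10, d))   # action set revealed this round
    choice = actions[rng.integers(10)]   # placeholder: uniform choice; the paper's
                                         # algorithm selects via nested confidence sets
    total += choice @ theta + 0.1 * rng.normal()  # stochastic linear reward
print(total)
```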
Fair Online Bilateral Trade
Bachoc, François, Cesa-Bianchi, Nicolò, Cesari, Tommaso, Colomboni, Roberto
In online bilateral trade, a platform posts prices to incoming pairs of buyers and sellers that have private valuations for a certain good. If the price is lower than the buyers' valuation and higher than the sellers' valuation, then a trade takes place. Previous work focused on the platform perspective, with the goal of setting prices maximizing the gain from trade (the sum of sellers' and buyers' utilities). Gain from trade is, however, potentially unfair to traders, as they may receive highly uneven shares of the total utility. In this work we enforce fairness by rewarding the platform with the fair gain from trade, defined as the minimum between sellers' and buyers' utilities. After showing that any no-regret learning algorithm designed to maximize the sum of the utilities may fail badly with fair gain from trade, we present our main contribution: a complete characterization of the regret regimes for fair gain from trade when, after each interaction, the platform only learns whether each trader accepted the current price. Specifically, we prove the following regret bounds: $\Theta(\ln T)$ in the deterministic setting, $\Omega(T)$ in the stochastic setting, and $\tilde{\Theta}(T^{2/3})$ in the stochastic setting when sellers' and buyers' valuations are independent of each other. We conclude by providing tight regret bounds when, after each interaction, the platform is allowed to observe the true traders' valuations.
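The fair gain from trade admits a one-line implementation (names are ours); the example shows how an uneven split makes it much smaller than the usual gain from trade:

```python
def fair_gain_from_trade(price, seller_val, buyer_val):
    # Trade occurs when the posted price separates the two valuations;
    # the reward is the smaller of the two traders' utilities.
    if seller_val < price < buyer_val:
        return min(price - seller_val, buyer_val - price)
    return 0.0

# Gain from trade (sum of utilities) is 0.8, but the buyer keeps almost
# nothing, so the fair gain from trade is only 0.01.
print(fair_gain_from_trade(price=0.89, seller_val=0.1, buyer_val=0.9))
```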
Information Capacity Regret Bounds for Bandits with Mediator Feedback
Eldowa, Khaled, Cesa-Bianchi, Nicolò, Metelli, Alberto Maria, Restelli, Marcello
This work addresses the mediator feedback problem, a bandit game where the decision set consists of a number of policies, each associated with a probability distribution over a common space of outcomes. Upon choosing a policy, the learner observes an outcome sampled from its distribution and incurs the loss assigned to this outcome in the present round. We introduce the policy set capacity as an information-theoretic measure of the complexity of the policy set. Adopting the classical EXP4 algorithm, we provide new regret bounds depending on the policy set capacity in both the adversarial and the stochastic settings. For a selection of policy set families, we prove nearly matching lower bounds that scale similarly with the capacity. We also consider the case where the policies' distributions can vary between rounds, thus addressing the related bandits with expert advice problem, for which we improve upon prior results. Additionally, we prove a lower bound showing that exploiting the similarity between the policies is not possible in general under linear bandit feedback. Finally, for a full-information variant, we provide a regret bound scaling with the information radius of the policy set.
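A minimal sketch of the mediator feedback loop (the uniform policy choice is a placeholder for EXP4, and all constants are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n_policies, n_outcomes = 5, 8
# Each policy is a fixed distribution over the common outcome space.
policies = rng.dirichlet(np.ones(n_outcomes), size=n_policies)

total_loss = 0.0
for t in range(100):
    losses = rng.uniform(size=n_outcomes)    # this round's losses over outcomes
    i = rng.integers(n_policies)             # placeholder choice (EXP4 in the paper)
    outcome = rng.choice(n_outcomes, p=policies[i])  # sampled by the mediator
    total_loss += losses[outcome]            # learner sees the outcome and its loss
print(total_loss)
```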
Best-of-Both-Worlds Algorithms for Linear Contextual Bandits
Kuroki, Yuko, Rumi, Alberto, Tsuchiya, Taira, Vitale, Fabio, Cesa-Bianchi, Nicolò
Because of their relevance in practical applications, contextual bandits are a fundamental model of sequential decision-making with partial feedback. In particular, linear contextual bandits [Abe and Long, 1999, Auer, 2002], in which contexts are feature vectors and the loss is a linear function of the context, are among the most studied variants of contextual bandits. Traditionally, contextual bandits (and, in particular, their linear variant) have been investigated under stochastic assumptions on the generation of rewards. Namely, the loss of each action is a fixed and unknown linear function of the context to which some zero-mean noise is added. For this setting, efficient and nearly optimal algorithms, like OFUL [Abbasi-Yadkori et al., 2011] and a contextual variant of Thompson Sampling [Agrawal and Goyal, 2013], have been proposed in the past. Recently, Neu and Olkhovskaya [2020] introduced an adversarial variant of linear contextual bandits, where there are $K$ arms and the linear loss associated with each arm is adversarially chosen in each round. They prove an upper bound on the regret of order $\sqrt{dKT}$ disregarding logarithmic factors, where $d$ is the dimensionality of contexts and $T$ is the time horizon. A matching lower bound $\Omega(\sqrt{dKT})$ for this model is implied by the results of Zierahn et al. [2023]. The upper bound has been recently extended by Olkhovskaya et al. [2023], who show first- and second-order regret bounds respectively of the order of $K\sqrt{dL^*_T}$ …
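In symbols (our notation, not the papers'), the adversarial model just described assigns to each arm $a \in \{1,\dots,K\}$ in round $t$ the loss
\[
\ell_t(a) \;=\; \langle x_t, \theta_{t,a} \rangle,
\]
where $x_t \in \mathbb{R}^d$ is the context and the parameter vectors $\theta_{t,a}$ are chosen adversarially in every round.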
Sum-max Submodular Bandits
Pasteris, Stephen, Rumi, Alberto, Vitale, Fabio, Cesa-Bianchi, Nicolò
Many online decision-making problems correspond to maximizing a sequence of submodular functions. In this work, we introduce sum-max functions, a subclass of monotone submodular functions capturing several interesting problems, including best-of-$K$-bandits, combinatorial bandits, and the bandit versions of facility location, $M$-medians, and hitting sets. We show that all functions in this class satisfy a key property that we call pseudo-concavity. This allows us to prove $\big(1 - \frac{1}{e}\big)$-regret bounds for bandit feedback in the nonstochastic setting of the order of $\sqrt{MKT}$ (ignoring log factors), where $T$ is the time horizon and $M$ is a cardinality constraint. This bound, attained by a simple and efficient algorithm, significantly improves on the $\widetilde{O}\big(T^{2/3}\big)$ regret bound for online monotone submodular maximization with bandit feedback.
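As one concrete member of the class, the facility-location objective mentioned above can be written as a sum-max function (the weights below are random illustrative values of ours):

```python
import numpy as np

rng = np.random.default_rng(3)
w = rng.uniform(size=(6, 4))   # w[i, j]: value client i derives from facility j

def sum_max(S):
    # Each client is served by its best open facility; summing over clients
    # gives a monotone submodular objective of the open set S.
    return w[:, sorted(S)].max(axis=1).sum() if S else 0.0

print(sum_max({0}), sum_max({0, 2}))   # adding a facility never decreases value
```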