Decentralized and Uncoordinated Learning of Stable Matchings: A Game-Theoretic Approach
Etesami, S. Rasoul, Srikant, R.
We consider the problem of learning stable matchings with unknown preferences in a decentralized and uncoordinated manner, where "decentralized" means that players make decisions individually without the influence of a central platform, and "uncoordinated" means that players do not need to synchronize their decisions using pre-specified rules. First, we provide a game formulation for this problem with known preferences, where the set of pure Nash equilibria (NE) coincides with the set of stable matchings, and mixed NE can be rounded to a stable matching. Then, we show that for hierarchical markets, applying the exponential weight (EXP) learning algorithm to the stable matching game achieves logarithmic regret in a fully decentralized and uncoordinated fashion. Moreover, we show that EXP converges locally and exponentially fast to a stable matching in general markets. We also introduce another decentralized and uncoordinated learning algorithm that globally converges to a stable matching with arbitrarily high probability. Finally, we provide stronger feedback conditions under which it is possible to drive the market faster toward an approximate stable matching. Our proposed game-theoretic framework bridges the discrete problem of learning stable matchings with the problem of learning NE in continuous-action games.

Index Terms: Learning stable matchings, learning Nash equilibrium, weakly acyclic games, monotone games, decentralized learning, uncoordinated learning, online mirror descent, regret minimization.

Learning stable matchings is one of the fundamental problems in computer science, economics, and engineering that has received considerable attention over the past decades. Stable matchings provide a desirable notion of stability in two-sided matching markets where agents on each side of the market have preferences over the other side.
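The EXP dynamics referenced in the abstract are, at their core, standard exponential-weight updates run independently by each player over their own action set. A minimal single-player sketch of the update (the loss values, learning rate, and horizon below are illustrative, not taken from the paper):

```python
import math

def exp_weights_step(weights, losses, eta):
    """One exponential-weight (EXP) update: downweight actions by exp(-eta * loss)."""
    new_w = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(new_w)
    return [w / total for w in new_w]

# Toy run: 3 candidate actions, action 0 consistently incurs the lowest loss.
weights = [1.0, 1.0, 1.0]
for _ in range(200):
    weights = exp_weights_step(weights, [0.1, 0.5, 0.9], eta=0.2)
print(max(range(3), key=lambda i: weights[i]))  # mass concentrates on action 0
```

Because losses enter the exponent cumulatively, the probability mass on any suboptimal action decays geometrically in the loss gap, which is the mechanism behind the logarithmic-regret behavior the paper studies.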
Conjectural Online Learning with First-order Beliefs in Asymmetric Information Stochastic Games
Li, Tao, Hammar, Kim, Stadler, Rolf, Zhu, Quanyan
Asymmetric information stochastic games (\textsc{aisg}s) arise in many complex socio-technical systems, such as cyber-physical systems and IT infrastructures. Existing computational methods for \textsc{aisg}s are primarily offline and cannot adapt to equilibrium deviations. Further, current methods are limited to special classes of \textsc{aisg}s to avoid belief hierarchies. To address these limitations, we propose conjectural online learning (\textsc{col}), an online method for generic \textsc{aisg}s. \textsc{col} uses a forecaster-actor-critic (\textsc{fac}) architecture where subjective forecasts are used to conjecture the opponents' strategies within a lookahead horizon, and Bayesian learning is used to calibrate the conjectures. To adapt strategies to nonstationary environments, \textsc{col} uses online rollout with cost function approximation (actor-critic). We prove that the conjectures produced by \textsc{col} are asymptotically consistent with the information feedback in the sense of a relaxed Bayesian consistency. We also prove that the empirical strategy profile induced by \textsc{col} converges to the Berk-Nash equilibrium, a solution concept characterizing rationality under subjectivity. Experimental results from an intrusion response use case demonstrate \textsc{col}'s superiority over state-of-the-art reinforcement learning methods against nonstationary attacks.
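The Bayesian calibration step in \textsc{col} can be illustrated with a toy belief update over a finite set of conjectured opponent strategies (the conjectures, the observation stream, and the two-action setup below are hypothetical, not the paper's model):

```python
def bayes_update(belief, conjectures, obs):
    """Posterior over conjectures after observing opponent action `obs`."""
    post = [b * lik[obs] for b, lik in zip(belief, conjectures)]
    z = sum(post)
    return [p / z for p in post]

# Three conjectures, each a distribution (P(a=0), P(a=1)) over two actions;
# the true opponent plays action 1 with probability 0.8.
conjectures = [(0.9, 0.1), (0.5, 0.5), (0.2, 0.8)]
belief = [1 / 3, 1 / 3, 1 / 3]
observations = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]  # a plausible sample stream
for a in observations:
    belief = bayes_update(belief, conjectures, a)
print(max(range(3), key=lambda i: belief[i]))  # belief concentrates on conjecture 2
```

This mirrors the consistency result in the abstract: the belief asymptotically concentrates on the conjecture that best explains the observed information feedback.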
Dynamic Budget Throttling in Repeated Second-Price Auctions
Chen, Zhaohua, Wang, Chang, Wang, Qian, Pan, Yuqi, Shi, Zhuming, Cai, Zheng, Ren, Yukun, Zhu, Zhihua, Deng, Xiaotie
In today's online advertising markets, a crucial requirement for an advertiser is to control her total expenditure within a time horizon under some budget. Among various budget control methods, throttling has emerged as a popular choice, managing an advertiser's total expenditure by selecting only a subset of auctions to participate in. This paper provides a theoretical panorama of a single advertiser's dynamic budget throttling process in repeated second-price auctions. We first establish a lower bound on the regret and an upper bound on the asymptotic competitive ratio for any throttling algorithm, respectively, when the advertiser's values are stochastic and adversarial. Regarding the algorithmic side, we propose the OGD-CB algorithm, which guarantees a near-optimal expected regret with stochastic values. On the other hand, when values are adversarial, we prove that this algorithm also reaches the upper bound on the asymptotic competitive ratio. We further compare throttling with pacing, another widely adopted budget control method, in repeated second-price auctions. In the stochastic case, we demonstrate that pacing is generally superior to throttling for the advertiser, supporting the well-known result that pacing is asymptotically optimal in this scenario. However, in the adversarial case, we give an exciting result indicating that throttling is also an asymptotically optimal dynamic bidding strategy. Our results bridge the gaps in theoretical research of throttling in repeated auctions and comprehensively reveal the ability of this popular budget-smoothing strategy.
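A minimal sketch of the throttling idea, assuming an OGD-updated participation threshold and a hard budget guard (the update rule, parameters, and data below are illustrative, not the paper's OGD-CB algorithm):

```python
def throttle(values, prices, budget, eta=0.05):
    T = len(values)
    rho = budget / T          # target per-round spend
    lam, spend = 0.0, 0.0
    for v, p in zip(values, prices):
        # Throttle: skip auctions whose value falls below the threshold,
        # and never participate if winning could exceed the budget.
        participate = v >= lam and spend + p <= budget
        cost = p if (participate and v > p) else 0.0  # second price: pay p on winning
        spend += cost
        lam = max(0.0, lam + eta * (cost - rho))      # OGD step on the threshold
    return spend

spent = throttle([0.9, 0.2, 0.8, 0.1] * 50, [0.5, 0.1, 0.4, 0.05] * 50, budget=30.0)
print(spent <= 30.0)  # the hard guard keeps total spend within budget
```

The gradient step raises the threshold when spend runs ahead of the per-round budget and lowers it otherwise, so participation self-regulates toward the budget pace.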
On the Re-Solving Heuristic for (Binary) Contextual Bandits with Knapsacks
Ai, Rui, Chen, Zhaohua, Deng, Xiaotie, Pan, Yuqi, Wang, Chang, Yang, Mingwei
In the problem of (binary) contextual bandits with knapsacks (CBwK), the agent receives an i.i.d. context in each of the $T$ rounds and chooses an action, resulting in a random reward and a random consumption of resources that are related to an i.i.d. external factor. The agent's goal is to maximize the accumulated reward under the initial resource constraints. In this work, we combine the re-solving heuristic, which proved successful in revenue management, with distribution estimation techniques to solve this problem. We consider two different information feedback models, with full and partial information, which vary in the difficulty of getting a sample of the external factor. Under both information feedback settings, we achieve two-way results: (1) For general problems, we show that our algorithm gets an $\widetilde O(T^{\alpha_u} + T^{\alpha_v} + T^{1/2})$ regret against the fluid benchmark. Here, $\alpha_u$ and $\alpha_v$ reflect the complexity of the context and external factor distributions, respectively. This result is comparable to existing results. (2) When the fluid problem is linear programming with a unique and non-degenerate optimal solution, our algorithm leads to an $\widetilde O(1)$ regret. To the best of our knowledge, this is the first $\widetilde O(1)$ regret result in the CBwK problem regardless of information feedback models. We further use numerical experiments to verify our results.
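The re-solving idea can be illustrated in a toy one-resource setting with unit cost per acceptance, where the fluid problem reduces to an empirical quantile (a deliberate simplification, not the paper's algorithm):

```python
def resolve_threshold(history, remaining_budget, remaining_rounds):
    """Re-solve the toy fluid problem: accept roughly a
    remaining_budget / remaining_rounds fraction of future values."""
    if not history or remaining_rounds <= 0:
        return 0.0
    accept_frac = min(1.0, remaining_budget / remaining_rounds)
    srt = sorted(history)
    k = int(len(srt) * (1.0 - accept_frac))  # empirical (1 - frac)-quantile
    return srt[min(k, len(srt) - 1)]

def run(values, budget):
    history, reward, used = [], 0.0, 0
    for t, v in enumerate(values):
        # Key step: the threshold is re-solved every round from the
        # empirical distribution and the *remaining* budget.
        thr = resolve_threshold(history, budget - used, len(values) - t)
        if used < budget and v >= thr:
            reward, used = reward + v, used + 1
        history.append(v)
    return reward, used

reward, used = run([0.1, 0.9, 0.5, 0.8, 0.3, 0.7, 0.2, 0.6] * 25, budget=50)
print(used <= 50)
```

Re-solving with the remaining budget is what corrects early estimation errors; the paper's $\widetilde O(1)$ regret result concerns the much richer contextual LP version of this loop.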
Distributed Online Optimization with Long-Term Constraints
Yuan, Deming, Proutiere, Alexandre, Shi, Guodong
We consider distributed online convex optimization problems, where the distributed system consists of various computing units connected through a time-varying communication graph. In each time step, each computing unit selects a constrained vector, experiences a loss equal to an arbitrary convex function evaluated at this vector, and may communicate to its neighbors in the graph. The objective is to minimize the system-wide loss accumulated over time. We propose a decentralized algorithm with regret and cumulative constraint violation in $\mathcal{O}(T^{\max\{c,1-c\} })$ and $\mathcal{O}(T^{1-c/2})$, respectively, for any $c\in (0,1)$, where $T$ is the time horizon. When the loss functions are strongly convex, we establish improved regret and constraint violation upper bounds in $\mathcal{O}(\log(T))$ and $\mathcal{O}(\sqrt{T\log(T)})$. These regret scalings match those obtained by state-of-the-art algorithms and fundamental limits in the corresponding centralized online optimization problem (for both convex and strongly convex loss functions). In the case of bandit feedback, the proposed algorithms achieve a regret and constraint violation in $\mathcal{O}(T^{\max\{c,1-c/3 \} })$ and $\mathcal{O}(T^{1-c/2})$ for any $c\in (0,1)$. We numerically illustrate the performance of our algorithms for the particular case of distributed online regularized linear regression problems.
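The communicate-then-descend pattern underlying such algorithms can be sketched on a three-node complete graph with scalar decisions (the constraint-violation machinery of the paper is omitted; the losses, step size, and horizon are illustrative):

```python
def step(xs, grads, eta):
    """Consensus averaging with neighbors, then a projected gradient step."""
    m = sum(xs) / len(xs)  # complete-graph averaging over 3 nodes
    return [max(-1.0, min(1.0, m - eta * g)) for g in grads]

# Local losses f_i(x) = (x - c_i)^2, so the system-wide minimizer is mean(c) = 0.5.
targets = [0.0, 0.5, 1.0]
xs = [1.0, -1.0, 0.0]
for _ in range(300):
    grads = [2 * (x - c) for x, c in zip(xs, targets)]
    xs = step(xs, grads, eta=0.05)
print(xs)  # all nodes approach the network-wide optimum 0.5
```

Each node only ever sees its own loss gradient; the averaging step is what pulls the decisions toward the minimizer of the system-wide loss rather than each node's local one.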
Distributed Online Linear Regression
Yuan, Deming, Proutiere, Alexandre, Shi, Guodong
We study online linear regression problems in a distributed setting, where the data is spread over a network. In each round, each network node proposes a linear predictor, with the objective of fitting the \emph{network-wide} data. It then updates its predictor for the next round according to the received local feedback and information received from neighboring nodes. The predictions made at a given node are assessed through the notion of regret, defined as the difference between their cumulative network-wide square errors and those of the best off-line network-wide linear predictor. Various scenarios are investigated, depending on the nature of the local feedback (full information or bandit feedback), on the set of available predictors (the decision set), and the way data is generated (by an oblivious or adaptive adversary). We propose simple and natural distributed regression algorithms, involving, at each node and in each round, a local gradient descent step and a communication and averaging step where nodes aim at aligning their predictors to those of their neighbors. We establish regret upper bounds typically in ${\cal O}(T^{3/4})$ when the decision set is unbounded and in ${\cal O}(\sqrt{T})$ in case of bounded decision set.
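A toy two-node instance of the local-gradient-plus-averaging scheme described above, assuming scalar predictors and synthetic data generated from y = 2x (the node setup and data are hypothetical):

```python
def round_update(ws, data, eta):
    # Local square-loss gradient step on each node's own sample ...
    stepped = [w - eta * 2 * (w * a - y) * a for w, (a, y) in zip(ws, data)]
    # ... then neighbor averaging (two nodes, fully connected).
    avg = sum(stepped) / len(stepped)
    return [avg, avg]

ws = [0.0, 0.0]
for _ in range(2000):
    # Node 0 only ever sees x = 1.0 and node 1 only x = 2.0, yet averaging
    # lets both fit the network-wide least-squares line y = 2x.
    data = [(1.0, 2.0), (2.0, 4.0)]
    ws = round_update(ws, data, eta=0.01)
print(ws)  # both predictors approach the network-wide fit w = 2
```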
Online Newton Step Algorithm with Estimated Gradient
Liu, Binbin, Li, Jundong, Song, Yunquan, Liang, Xijun, Jian, Ling, Liu, Huan
They have been shown to be effective in handling large-scale and high-velocity streaming data and have become popular in the big data era Hoi et al. [2018, 2014]. In recent years, a number of effective online learning algorithms have been investigated and applied in a variety of high-impact domains, ranging from game theory and information theory to machine learning and data mining Ding et al. [2017], Shalev-Shwartz [2011], Wang et al. [2003]. Most previously proposed online learning algorithms fall into the well-established framework of online convex optimization Gordon [1999], Zinkevich [2003]. In terms of the optimization algorithms, online learning algorithms can be grouped into the following categories: (i) first-order algorithms, which aim to optimize the objective function using the first-order (sub)gradient, such as the well-known OGD algorithm Zinkevich [2003]; and (ii) second-order algorithms, which aim to exploit second-order information to speed up the convergence of the optimization, such as the ONS algorithm Hazan et al. [2007]. In online convex optimization, previous approaches are mainly based on first-order optimization, i.e., optimization using the first-order derivative of the cost function. The regret bound achieved by these algorithms is proportional to a polynomial of the number of rounds $T$. For example, Zinkevich [2003] showed that with the simple OGD algorithm, one can achieve a regret bound of $O(\sqrt{T})$. Later on, Hazan et al. [2007] introduced the ONS algorithm, which exploits the second-order derivative of the cost function and can be viewed as an online analogy of the Newton-Raphson method Ypma [1995] in the offline setting. Although the time complexity O(d
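A scalar sketch of the ONS update of Hazan et al. [2007] on quadratic losses over [-1, 1]; in d dimensions the scalar accumulator below becomes the matrix $A_t = A_{t-1} + g_t g_t^\top$, typically maintained via rank-one updates (the parameters here are illustrative):

```python
def ons(costs, beta=0.5, eps=0.1):
    """Online Newton Step on losses f_t(x) = (x - c_t)^2 over [-1, 1]."""
    x, A = 0.0, eps
    for c in costs:
        g = 2 * (x - c)             # gradient of the current loss
        A += g * g                  # accumulate second-order information
        x -= g / (beta * A)         # Newton-like step, scaled by 1/A
        x = max(-1.0, min(1.0, x))  # projection onto the decision set
    return x

print(ons([0.5] * 500))  # converges toward the minimizer 0.5
```

Unlike OGD's fixed step-size schedule, the effective step here shrinks adaptively with the accumulated squared gradients, which is what yields the logarithmic regret for exp-concave losses.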
Algorithmic Chaining and the Role of Partial Feedback in Online Nonparametric Learning
Cesa-Bianchi, Nicolò, Gaillard, Pierre, Gentile, Claudio, Gerchinovitz, Sébastien
We investigate contextual online learning with nonparametric (Lipschitz) comparison classes under different assumptions on losses and feedback information. For full information feedback and Lipschitz losses, we design the first explicit algorithm achieving the minimax regret rate (up to log factors). In a partial feedback model motivated by second-price auctions, we obtain algorithms for Lipschitz and semi-Lipschitz losses with regret bounds improving on the known bounds for standard bandit feedback. Our analysis combines novel results for contextual second-price auctions with a novel algorithmic approach based on chaining. When the context space is Euclidean, our chaining approach is efficient and delivers an even better regret bound.
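The coarse discretization idea that chaining refines can be sketched as exponential weights over an eps-grid: for 1-Lipschitz losses on [0, 1], the best grid point is within eps of the best action, so the grid expert is near-optimal (an illustrative simplification, not the paper's chaining algorithm):

```python
import math

def grid_exp_weights(loss_seq, eps=0.05, eta=2.0):
    """Full-information exponential weights over an eps-grid of [0, 1]."""
    grid = [i * eps for i in range(round(1 / eps) + 1)]
    w = [1.0] * len(grid)
    total_loss = 0.0
    for loss in loss_seq:
        z = sum(w)
        # Play the weighted-average action (valid for convex losses on [0, 1]).
        play = sum(wi * a for wi, a in zip(w, grid)) / z
        total_loss += loss(play)
        # Full information: update every grid point with its own loss.
        w = [wi * math.exp(-eta * loss(a)) for wi, a in zip(w, grid)]
    return total_loss

# 1-Lipschitz losses |x - 0.3|: the learner's average loss shrinks toward 0.
losses = [lambda x: abs(x - 0.3)] * 300
avg_loss = grid_exp_weights(losses) / 300
print(avg_loss < 0.1)
```

Chaining improves on this single-scale grid by combining nested grids at multiple resolutions, which is what removes the extraneous log factors and handles the partial-feedback models in the paper.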