AITopics | Mathematical & Statistical Methods

Mathematical & Statistical Methods

On solutions of the distributional Bellman equation

Gerstenberg, Julian, Neininger, Ralph, Spiegel, Denis

arXiv.org Machine Learning, Feb-15-2022

In distributional reinforcement learning not only expected returns but the complete return distributions of a policy are taken into account. The return distribution for a fixed policy is given as the solution of an associated distributional Bellman equation. In this note we consider general distributional Bellman equations and study existence and uniqueness of their solutions as well as tail properties of return distributions. We give necessary and sufficient conditions for existence and uniqueness of return distributions and identify cases of regular variation. We link distributional Bellman equations to multivariate affine distributional equations. We show that any solution of a distributional Bellman equation can be obtained as the vector of marginal laws of a solution to a multivariate affine distributional equation. This makes the general theory of such equations applicable to the distributional reinforcement learning setting.

distributional bellman equation

arXiv.org Machine Learning

2202.00081

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)

Probability and Statistics for Business and Data Science

#artificialintelligence, Feb-13-2022, 09:02:02 GMT

Welcome to Probability and Statistics for Business and Data Science! This course covers what you need to know about probability and statistics to succeed in business and the data science field. It is a practical course that covers both the theory of statistics and its application to real-world problems. Each section has example problems, in-course quizzes, and assessment tests. We'll start with the basics of data: how to examine it with measures of central tendency and dispersion, and how bivariate data sources can relate to each other.

business and data science, hypothesis testing, probability and statistic, (8 more...)

#artificialintelligence

Genre: Instructional Material (0.37)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)

Linear transformations and matrices - Master Data Science

#artificialintelligence, Feb-11-2022, 19:31:52 GMT

This post shows how one 2D plane can be transformed into another. Understanding these concepts is a crucial step toward some more advanced linear algebra and machine learning methods (e.g. ...). We introduce the linear transformation, which can be seen as a simple function, and show how matrix-vector multiplication connects to it.
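The connection between matrix-vector multiplication and a linear transformation can be sketched in a few lines. This is an illustrative example, not code from the linked post: the helper `transform` and the rotation matrix `ROTATE_90` are hypothetical names chosen here, and plain Python lists stand in for vectors and matrices.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def transform(matrix: Matrix, vector: Vector) -> Vector:
    """Return the matrix-vector product: the image of `vector` under the map."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


# Assumed example map: rotate the 2D plane by 90 degrees counter-clockwise.
# The columns of the matrix are the images of the basis vectors e1 and e2.
ROTATE_90: Matrix = [[0.0, -1.0],
                     [1.0, 0.0]]

print(transform(ROTATE_90, [1.0, 0.0]))  # [0.0, 1.0]
print(transform(ROTATE_90, [0.0, 1.0]))  # [-1.0, 0.0]

# Linearity check: T(a*u + b*v) equals a*T(u) + b*T(v).
u, v, a, b = [2.0, 3.0], [-1.0, 4.0], 2.0, 3.0
combined = [a * ui + b * vi for ui, vi in zip(u, v)]
lhs = transform(ROTATE_90, combined)
rhs = [a * tu + b * tv
       for tu, tv in zip(transform(ROTATE_90, u), transform(ROTATE_90, v))]
print(lhs == rhs)  # True
```

Reading the matrix column by column shows where each basis vector lands, which is often the quickest way to see what a given 2D transformation does.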

linear transformation, transformation, vector, (13 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.56)

Graphon-aided Joint Estimation of Multiple Graphs

Navarro, Madeline, Segarra, Santiago

arXiv.org Machine Learning, Feb-11-2022

We consider the problem of estimating the topology of multiple networks from nodal observations, where these networks are assumed to be drawn from the same (unknown) random graph model. We adopt a graphon as our random graph model, which is a nonparametric model from which graphs of potentially different sizes can be drawn. The versatility of graphons allows us to tackle the joint inference problem even for the cases where the graphs to be recovered contain different numbers of nodes and lack precise alignment across the graphs. Our solution is based on combining a maximum likelihood penalty with graphon estimation schemes and can be used to augment existing network inference methods. We validate our proposed approach by comparing its performance against competing ...

For instance, one would expect certain levels of similarities between the brain networks of different healthy individuals or between the same social network observed at different points in time. Prominent methods for multiple network inference include statistical approaches, primarily consisting of the joint estimation of Gaussian graphical models [13-17]. These methods typically involve modifications on the graphical lasso formulation with additional encouragement of structural similarity. Estimation of time-varying graphs is widely popular, as the relationship between graphs is typically straightforward to implement by considering that graph variation is smooth across time [18, 19]. The above methods for estimating multiple networks typically enforce similar structure, such as promoting similar sparsity patterns [20].

artificial intelligence, graph, machine learning, (18 more...)

arXiv.org Machine Learning

2202.05686

Country:

North America > United States (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

The Power of Adaptivity in SGD: Self-Tuning Step Sizes with Unbounded Gradients and Affine Variance

Faw, Matthew, Tziotis, Isidoros, Caramanis, Constantine, Mokhtari, Aryan, Shakkottai, Sanjay, Ward, Rachel

arXiv.org Machine Learning, Feb-11-2022

We study convergence rates of AdaGrad-Norm as an exemplar of adaptive stochastic gradient methods (SGD), where the step sizes change based on observed stochastic gradients, for minimizing non-convex, smooth objectives. Despite their popularity, the analysis of adaptive SGD lags behind that of non-adaptive methods in this setting. Specifically, all prior works rely on some subset of the following assumptions: (i) uniformly-bounded gradient norms, (ii) uniformly-bounded stochastic gradient variance (or even noise support), (iii) conditional independence between the step size and stochastic gradient. In this work, we show that AdaGrad-Norm exhibits an order-optimal convergence rate of $\mathcal{O}\left(\frac{\mathrm{poly}\log(T)}{\sqrt{T}}\right)$ after $T$ iterations under the same assumptions as optimally-tuned non-adaptive SGD (unbounded gradient norms and affine noise variance scaling), and crucially, without needing any tuning parameters. We thus establish that adaptive gradient methods exhibit order-optimal convergence in much broader regimes than previously understood.
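For readers unfamiliar with AdaGrad-Norm, the sketch below shows the commonly stated form of the update, in which a single scalar accumulates squared gradient norms and scales the step size. It is not taken from the paper: the function name `adagrad_norm`, the parameters `eta` and `b0`, the quadratic test objective, and the noise model (standard deviation growing with the gradient norm, loosely imitating affine variance) are all assumptions made for illustration.

```python
import math
import random
from typing import Callable, List

Vector = List[float]


def adagrad_norm(grad: Callable[[Vector], Vector], x0: Vector,
                 eta: float = 1.0, b0: float = 1e-2,
                 steps: int = 1000) -> Vector:
    """Run AdaGrad-Norm: one scalar step size shared by all coordinates."""
    x = list(x0)
    b_sq = b0 * b0  # running sum of squared stochastic-gradient norms
    for _ in range(steps):
        g = grad(x)
        b_sq += sum(gi * gi for gi in g)  # b_{t+1}^2 = b_t^2 + ||g_t||^2
        step = eta / math.sqrt(b_sq)      # step size adapts without tuning
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


# Assumed test problem: f(x) = 0.5 * ||x||^2, whose exact gradient is x.
rng = random.Random(0)


def noisy_grad(x: Vector) -> Vector:
    """Exact gradient plus Gaussian noise whose variance is affine in ||x||^2."""
    grad_norm_sq = sum(xi * xi for xi in x)
    sigma = math.sqrt(0.01 + 0.1 * grad_norm_sq)  # illustrative constants
    return [xi + rng.gauss(0.0, sigma) for xi in x]


x_final = adagrad_norm(noisy_grad, [5.0, -3.0], steps=2000)
print(math.sqrt(sum(xi * xi for xi in x_final)))  # distance to the minimizer
```

Because the accumulator only grows, the step size shrinks automatically as gradients are observed, which is the self-tuning behaviour the abstract refers to.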

algorithm, comp, gradient, (16 more...)

arXiv.org Machine Learning

2202.05791

Country: North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.95)

Peng

AAAI Conferences, Feb-8-2022, 12:39:31 GMT

Gaussian processes (GPs) provide a nonparametric representation of functions. However, classical GP inference suffers from high computational cost for big data. In this paper, we propose a new Bayesian approach, EigenGP, that learns both basis dictionary elements -- eigenfunctions of a GP prior -- and prior precisions in a sparse finite model. It is well known that, among all orthogonal basis functions, eigenfunctions can provide the most compact representation. Unlike other sparse Bayesian finite models where the basis function has a fixed form, our eigenfunctions live in a reproducing kernel Hilbert space as a finite linear combination of kernel functions. We learn the dictionary elements -- eigenfunctions -- and the prior precisions over these elements as well as all the other hyperparameters from data by maximizing the model marginal likelihood. We explore computational linear algebra to simplify the gradient computation significantly. Our experimental results demonstrate improved predictive performance of EigenGP over alternative sparse GP methods as well as relevance vector machines.

dictionary element, eigenfunction, representation, (5 more...)

AAAI Conferences

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.66)

Piacentini

AAAI Conferences, Feb-8-2022, 11:28:10 GMT

Compilation techniques in planning reformulate a problem into an alternative encoding for which efficient, off-the-shelf solvers are available. In this work, we present a novel mixed-integer linear programming (MILP) compilation for cost-optimal numeric planning with instantaneous actions. While recent works on the problem are restricted to actions that modify variables present in simple numeric conditions, our MILP formulation, in addition, handles linear conditions and linear action effects on numeric state variables. Such problems are particularly challenging due to the state-dependency of the action effects. Experiments show that our approach, in addition to being the state of the art for the more general problem class, is competitive with heuristic search-based planners on domains with only simple numeric conditions.

action effect, piacentini, simple numeric condition

AAAI Conferences

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.69)

Azad

AAAI Conferences, Feb-8-2022, 09:30:10 GMT

A live interactive narrative (LIN) is an experience where multiple players take on fictional roles and interact with real-world objects and actors to participate in a pre-authored narrative. Temporal properties of LINs are important to their viability and aesthetic quality and hence deserve special design consideration. In this paper, we tackle the largely overlooked problem of scheduling a multiplayer interactive narrative and propose the Live Interactive Narrative Scheduling Problem (LINSP), which handles reasoning under temporal uncertainty, resource scheduling, and non-linear plot choices. We present a mixed-integer linear programming formulation of the problem and empirically evaluate its scalability over large narrative instances.

azad, narrative

AAAI Conferences

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.72)

Spectral embedding and the latent geometry of multipartite networks

Modell, Alexander, Gallagher, Ian, Cape, Joshua, Rubin-Delanchy, Patrick

arXiv.org Machine Learning, Feb-8-2022

Spectral embedding finds vector representations of the nodes of a network, based on the eigenvectors of its adjacency or Laplacian matrix, and has found applications throughout the sciences. Many such networks are multipartite, meaning their nodes can be divided into partitions and nodes of the same partition are never connected. When the network is multipartite, this paper demonstrates that the node representations obtained via spectral embedding live near partition-specific low-dimensional subspaces of a higher-dimensional ambient space. For this reason we propose a follow-on step after spectral embedding, to recover node representations in their intrinsic rather than ambient dimension, proving uniform consistency under a low-rank, inhomogeneous random graph model. Our method naturally generalizes bipartite spectral embedding, in which node representations are obtained by singular value decomposition of the biadjacency or bi-Laplacian matrix.

graph, matrix, spectral, (13 more...)

arXiv.org Machine Learning

2202.03945

Country:

North America > United States (0.28)
Europe > United Kingdom > England > Bristol (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.35)

Probability and Statistics for Business and Data Science

#artificialintelligence, Jan-30-2022, 13:49:37 GMT

Probability for improved business decisions: Introduction, Combinatorics, Bayesian Inference, Distributions. Welcome to Probability and Statistics for Business and Data Science! This course covers what you need to know about probability and statistics to succeed in business and the data science field. It is a practical course that covers both the theory of statistics and its application to real-world problems. Each section has example problems, in-course quizzes, and assessment tests.

business and data science, probability and statistic, statistics

#artificialintelligence

Genre: Instructional Material (0.39)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.97)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.55)
