AITopics | determinantal

Reviews: Distributed estimation of the inverse Hessian by determinantal averaging

Neural Information Processing Systems, Jan-26-2025, 13:11:22 GMT

Additional Comments: • Overall, the article is well written and well structured. It has a clear contribution and significant theoretical justification. There are only a few typos and grammatical errors: Line 17: "tranformation" → "transformation"; Line 133: "its entries is of …" → "its entries are of …"; Line 146: "accross" → "across"; Line 150: "emprical" → "empirical". • In line 54, it is better to define in the theorem 2 along with . If there is no constraint on the number of estimators, this means choosing subsamples is with replacement. However, in line 158, the authors claim that their method is without replacement.

determinantal, estimation, inverse hessian, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.73)

Distributed estimation of the inverse Hessian by determinantal averaging

Neural Information Processing Systems, Oct-10-2024, 18:42:18 GMT

In distributed optimization and distributed numerical linear algebra, we often encounter an inversion bias: if we want to compute a quantity that depends on the inverse of a sum of distributed matrices, then the sum of the inverses does not equal the inverse of the sum. An example of this occurs in distributed Newton's method, where we wish to compute (or implicitly work with) the inverse Hessian multiplied by the gradient. In this case, locally computed estimates are biased, and so taking a uniform average will not recover the correct solution. To address this, we propose determinantal averaging, a new approach for correcting the inversion bias. This approach involves reweighting the local estimates of the Newton's step proportionally to the determinant of the local Hessian estimate, and then averaging them together to obtain an improved global estimate.
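As a small worked illustration of the inversion bias described above, consider the one-dimensional case, where a "Hessian" is a positive scalar and its determinant is the scalar itself. The two local values $h_1 = 1$ and $h_2 = 3$ are made up for this example and do not come from the paper.

```latex
\begin{align*}
\text{inverse of the average:}\quad
  & \left(\tfrac{1}{2}(h_1 + h_2)\right)^{-1} = \tfrac{1}{2}, \\
\text{uniform average of inverses:}\quad
  & \tfrac{1}{2}\left(h_1^{-1} + h_2^{-1}\right)
    = \tfrac{1}{2}\left(1 + \tfrac{1}{3}\right) = \tfrac{2}{3} \neq \tfrac{1}{2}, \\
\text{determinant-weighted average:}\quad
  & \frac{h_1 \cdot h_1^{-1} + h_2 \cdot h_2^{-1}}{h_1 + h_2}
    = \frac{1 + 1}{1 + 3} = \tfrac{1}{2}.
\end{align*}
```

In the scalar case the reweighting removes the bias exactly for any set of local values; for matrices the abstract claims a correction under the authors' sampling model, which this toy calculation does not attempt to reproduce.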

determinantal, estimation, inverse hessian, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Mathematics of Computing (0.65)
Information Technology > Artificial Intelligence (0.41)

Signal reconstruction using determinantal sampling

Belhadji, Ayoub, Bardenet, Rémi, Chainais, Pierre

arXiv.org Machine Learning, Oct-13-2023

We study the approximation of a square-integrable function from a finite number of evaluations on a random set of nodes according to a well-chosen distribution. This is particularly relevant when the function is assumed to belong to a reproducing kernel Hilbert space (RKHS). This work proposes to combine several natural finite-dimensional approximations based on two possible probability distributions of nodes. These distributions are related to determinantal point processes, and use the kernel of the RKHS to favor RKHS-adapted regularity in the random design. While previous work on determinantal sampling relied on the RKHS norm, we prove mean-square guarantees in $L^2$ norm. We show that determinantal point processes and mixtures thereof can yield fast convergence rates. Our results also shed light on how the rate changes as more smoothness is assumed, a phenomenon known as superconvergence. Besides, determinantal sampling generalizes i.i.d. sampling from the Christoffel function, which is standard in the literature. More importantly, determinantal sampling guarantees the so-called instance optimality property for a smaller number of function evaluations than i.i.d. sampling.

approximation, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

2310.09437

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Oceania > New Zealand (0.04)
North America > United States > Rocky Mountains (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Mathematics of Computing (0.93)

Distributed estimation of the inverse Hessian by determinantal averaging

Derezinski, Michal, Mahoney, Michael W.

Neural Information Processing Systems, Mar-19-2020, 01:16:54 GMT

In distributed optimization and distributed numerical linear algebra, we often encounter an inversion bias: if we want to compute a quantity that depends on the inverse of a sum of distributed matrices, then the sum of the inverses does not equal the inverse of the sum. An example of this occurs in distributed Newton's method, where we wish to compute (or implicitly work with) the inverse Hessian multiplied by the gradient. In this case, locally computed estimates are biased, and so taking a uniform average will not recover the correct solution. To address this, we propose determinantal averaging, a new approach for correcting the inversion bias. This approach involves reweighting the local estimates of the Newton's step proportionally to the determinant of the local Hessian estimate, and then averaging them together to obtain an improved global estimate.

determinantal, estimation, inverse hessian, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Mathematics of Computing (0.65)
Information Technology > Artificial Intelligence (0.45)

Distributed estimation of the inverse Hessian by determinantal averaging

Dereziński, Michał, Mahoney, Michael W.

arXiv.org Machine Learning, May-27-2019

In distributed optimization and distributed numerical linear algebra, we often encounter an inversion bias: if we want to compute a quantity that depends on the inverse of a sum of distributed matrices, then the sum of the inverses does not equal the inverse of the sum. An example of this occurs in distributed Newton's method, where we wish to compute (or implicitly work with) the inverse Hessian multiplied by the gradient. In this case, locally computed estimates are biased, and so taking a uniform average will not recover the correct solution. To address this, we propose determinantal averaging, a new approach for correcting the inversion bias. This approach involves reweighting the local estimates of the Newton's step proportionally to the determinant of the local Hessian estimate, and then averaging them together to obtain an improved global estimate. This method provides the first known distributed Newton step that is asymptotically consistent, i.e., it recovers the exact step in the limit as the number of distributed partitions grows to infinity. To show this, we develop new expectation identities and moment bounds for the determinant and adjugate of a random matrix. Determinantal averaging can be applied not only to Newton's method, but to computing any quantity that is a linear transformation of a matrix inverse, e.g., taking a trace of the inverse covariance matrix, which is used in data uncertainty quantification.
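The reweighting described in this abstract can be sketched numerically. The following is a minimal simulation, not the authors' implementation: the setup (a ridge-regularized Hessian `X.T @ X + lam * I`, local estimates built by keeping each data point independently with probability `p` and rescaling by `1 / p`), the function name `newton_step_estimates`, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def newton_step_estimates(X, g, p, m, lam, rng):
    """Simulate m local machines and combine their Newton-step estimates.

    Assumed setup (hypothetical, for illustration): each machine keeps every
    data point independently with probability p and forms the local Hessian
    estimate (1/p) * sum_i b_i x_i x_i^T + lam * I.
    """
    n, d = X.shape
    keep = rng.random((m, n)) < p                      # Bernoulli(p) masks, one row per machine
    H_hats = np.einsum("tn,ni,nj->tij", keep / p, X, X) + lam * np.eye(d)
    steps = np.linalg.inv(H_hats) @ g                  # local steps H_hat^{-1} g, shape (m, d)
    dets = np.linalg.det(H_hats)                       # local determinants, shape (m,)
    uniform = steps.mean(axis=0)                       # plain average of local steps
    determinantal = (dets[:, None] * steps).sum(axis=0) / dets.sum()
    return uniform, determinantal

rng = np.random.default_rng(0)
n, d, lam = 50, 3, 1.0
X = rng.standard_normal((n, d))
g = rng.standard_normal(d)

H = X.T @ X + lam * np.eye(d)                          # full Hessian of the assumed problem
exact = np.linalg.solve(H, g)                          # exact Newton step H^{-1} g

uniform, determinantal = newton_step_estimates(X, g, p=0.3, m=20000, lam=lam, rng=rng)
err_uniform = np.linalg.norm(uniform - exact) / np.linalg.norm(exact)
err_det = np.linalg.norm(determinantal - exact) / np.linalg.norm(exact)
print(f"relative error, uniform average:       {err_uniform:.4f}")
print(f"relative error, determinantal average: {err_det:.4f}")
```

With many simulated machines, the uniform average should settle at a biased value while the determinant-weighted average should approach the exact step, which is the behaviour the abstract calls asymptotic consistency. Inverting each local matrix explicitly is only reasonable here because `d` is tiny; a real system would solve linear systems instead.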

artificial intelligence, machine learning, matrix, (17 more...)

arXiv.org Machine Learning

1905.11546

Country: North America > United States > California (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Mathematics of Computing (0.76)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.66)

Determinantal Clustering Processes - A Nonparametric Bayesian Approach to Kernel Based Semi-Supervised Clustering

Shah, Amar, Ghahramani, Zoubin

arXiv.org Machine Learning, Sep-26-2013

Semi-supervised clustering is the task of clustering data points into clusters where only a fraction of the points are labelled. The true number of clusters in the data is often unknown and most models require this parameter as an input. Dirichlet process mixture models are appealing as they can infer the number of clusters from the data. However, these models do not deal with high dimensional data well and can encounter difficulties in inference. We present a novel nonparametric Bayesian kernel based method to cluster data points without the need to prespecify the number of clusters or to model complicated densities from which data points are assumed to be generated. The key insight is to use determinants of submatrices of a kernel matrix as a measure of how close together a set of points is. We explore some theoretical properties of the model and derive a natural Gibbs based algorithm with MCMC hyperparameter learning. The model is implemented on a variety of synthetic and real world data sets.
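The key insight stated in this abstract, that the determinant of a kernel submatrix reflects how tightly a set of points is grouped, can be seen in a few lines. This is only an illustration of that one idea, not the authors' clustering model: the RBF kernel, the helper names `rbf_kernel` and `subset_det`, and the six sample points are assumptions made for the example.

```python
import numpy as np

def rbf_kernel(points, lengthscale=1.0):
    """Gram matrix of a Gaussian (RBF) kernel; the kernel choice is an assumption."""
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))

def subset_det(K, idx):
    """Determinant of the submatrix of K indexed by the subset idx."""
    return np.linalg.det(K[np.ix_(idx, idx)])

# Three points packed closely together, then three points far apart (made-up data).
points = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [-5.0, 5.0], [5.0, -5.0],
])
K = rbf_kernel(points)

tight = subset_det(K, [0, 1, 2])    # nearly collinear kernel rows, determinant near 0
spread = subset_det(K, [3, 4, 5])   # nearly identity submatrix, determinant near 1
print(f"determinant for the tight group:  {tight:.6f}")
print(f"determinant for the spread group: {spread:.6f}")
```

A small determinant signals points that are close under the kernel, and a determinant near one signals points that are far apart, so the quantity can serve as a closeness score for candidate clusters.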

artificial intelligence, determinant, machine learning, (17 more...)

arXiv.org Machine Learning

1309.6862

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology: