Reviews: Maximizing Induced Cardinality Under a Determinantal Point Process
In the setting of DPP-based recommendation, the authors remark that, once a DPP has been learned, the sample with maximum DPP likelihood (the so-called "MAP recommendation") does not lead to meaningful recommendations. The contributions are as follows: the authors introduce an alternative utility function, the maximum induced cardinality (MIC), and explain how to optimize it approximately using submodular approximations. Algorithms optimizing the MIC criterion are compared on synthetic datasets.
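As a sketch (this uses the standard expected-size identity for $L$-ensembles; it is not a formula quoted from the review itself): restricting the learned kernel $L$ to a candidate recommendation set $Y$, the induced-cardinality objective can be written as

$$ f(Y) \;=\; \mathbb{E}_{A \sim \mathrm{DPP}(L_Y)}\big[\,|A|\,\big] \;=\; \operatorname{tr}\!\big(L_Y (L_Y + I)^{-1}\big) \;=\; \sum_i \frac{\lambda_i(L_Y)}{\lambda_i(L_Y)+1}, $$

which is the quantity the submodular approximations then aim to maximize over $Y$.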
Graph Convolutional Neural Networks with Diverse Negative Samples via Decomposed Determinant Point Processes
Duan, Wei, Xuan, Junyu, Qiao, Maoying, Lu, Jie
Graph convolutional networks (GCNs) have achieved great success in graph representation learning by extracting high-level features from nodes and their topology. Since GCNs generally follow a message-passing mechanism, each node aggregates information from its first-order neighbours to update its representation. As a result, the representations of nodes with edges between them should be positively correlated, and such nodes can be considered positive samples. However, there are many more non-neighbouring nodes in the whole graph, which provide diverse and useful information for the representation update. Two non-adjacent nodes usually have different representations, so they can be seen as negative samples. Besides the node representations, the structural information of the graph is also crucial for learning. In this paper, we use the quality-diversity decomposition of determinantal point processes (DPPs) to obtain diverse negative samples. When defining a distribution on diverse subsets of all non-neighbouring nodes, we incorporate both graph structure information and node representations. Since the DPP sampling process requires matrix eigenvalue decomposition, we propose a new shortest-path-based method to improve computational efficiency. Finally, we incorporate the obtained negative samples into the graph convolution operation. The ideas are evaluated empirically in experiments on node classification tasks. These experiments show that the newly proposed methods not only improve the overall performance of standard representation learning but also significantly alleviate the over-smoothing problem.
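As a hedged illustration of the quality-diversity decomposition the abstract refers to (in the standard Kulesza-Taskar form; the paper's decomposed variant and its graph-structure terms are not reproduced here, and the function name is made up), an $L$-ensemble over candidate negative samples can be built as $L = \mathrm{diag}(q)\, S\, \mathrm{diag}(q)$:

```python
import numpy as np

def quality_diversity_kernel(embeddings, quality):
    """Build an L-ensemble L = diag(q) S diag(q), where S is a similarity
    matrix over normalized node embeddings (the diversity part) and q holds
    per-node quality scores. This is the standard quality-diversity
    decomposition of DPPs; the paper's graph-structure terms are omitted."""
    phi = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    S = phi @ phi.T                     # cosine similarity between nodes
    q = np.asarray(quality, dtype=float)
    return q[:, None] * S * q[None, :]  # L_ij = q_i * S_ij * q_j

# Toy usage: 5 candidate non-neighbouring nodes with 8-d representations.
rng = np.random.default_rng(0)
L = quality_diversity_kernel(rng.normal(size=(5, 8)), rng.uniform(0.5, 1.5, 5))
```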
A Faster Sampler for Discrete Determinantal Point Processes
Barthelmé, Simon, Tremblay, Nicolas, Amblard, Pierre-Olivier
Discrete Determinantal Point Processes (DPPs) have a wide array of potential applications for subsampling datasets. They are, however, held back in some cases by the high cost of sampling. In the worst-case scenario, the sampling cost scales as $\mathcal{O}(n^3)$, where $n$ is the number of elements of the ground set. A popular workaround to this prohibitive cost is to sample DPPs defined by low-rank kernels. In such cases, the cost of standard sampling algorithms scales as $\mathcal{O}(np^2 + nm^2)$, where $m$ is the (average) number of samples of the DPP (usually $m \ll n$) and $p$ is the rank of the kernel used to define the DPP ($m \leq p \leq n$). The first term, $\mathcal{O}(np^2)$, comes from an SVD-like step. We focus here on the second term of this cost, $\mathcal{O}(nm^2)$, and show that it can be brought down to $\mathcal{O}(nm + m^3 \log m)$ without any loss of exactness in the sampling. In practice, we observe very substantial speedups compared to the classical algorithm as soon as $n > 1000$. The algorithm described here is a close variant of the standard algorithm for sampling continuous DPPs, and uses rejection sampling. In the specific case of projection DPPs, we also show that any additional sample can be drawn in time $\mathcal{O}(m^3 \log m)$. Finally, an interesting by-product of the analysis is that a realisation from a DPP is typically contained in a subset of size $\mathcal{O}(m \log m)$ formed using leverage-score i.i.d. sampling.
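A hedged sketch of the leverage-score by-product mentioned at the end of the abstract (the function name and the oversampling constant `c` are illustrative, not from the paper): items are drawn i.i.d. with probabilities proportional to the marginal inclusion probabilities $K_{ii}$.

```python
import numpy as np

def leverage_score_superset(L, c=4.0, rng=None):
    """Draw ~ c * m * log(m) i.i.d. items with probabilities proportional to
    the DPP's marginal inclusion probabilities K_ii (its leverage scores).
    Per the paper's by-product, a DPP realisation is typically contained in
    such a subset; the constant c is an illustrative choice."""
    rng = np.random.default_rng(rng)
    K = L @ np.linalg.inv(L + np.eye(L.shape[0]))  # marginal kernel K = L(L+I)^-1
    lev = np.diag(K)                               # leverage scores; lev.sum() = E|sample| = m
    m = lev.sum()
    size = int(np.ceil(c * m * np.log(max(m, 2.0))))
    draws = rng.choice(L.shape[0], size=size, p=lev / m)
    return np.unique(draws)
```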
Determinantal Point Processes Implicitly Regularize Semi-parametric Regression Problems
Fanuel, Michaël, Schreurs, Joachim, Suykens, Johan A. K.
Semi-parametric regression models are used in several applications that require comprehensibility without sacrificing accuracy. Examples are spline interpolation in geophysics, or non-linear time series problems, where the system includes, for instance, a linear and a non-linear component. We discuss here the use of finite Determinantal Point Process (DPP) sampling for approximating semi-parametric models in two cases. On the one hand, in the case of large training data sets, DPP sampling is used to reduce the number of model parameters. On the other hand, DPPs can determine experimental designs in the case of optimal interpolation models. Recently, Barthelmé, Tremblay, Usevich, and Amblard introduced a novel representation of finite DPPs. They formulated extended $L$-ensembles, which can conveniently represent, for instance, partial-projection DPPs, and suggested their use for optimal interpolation. With the help of this formalism, we derive a key identity illustrating the implicit regularization effect of determinantal sampling for semi-parametric regression and interpolation. Also, a novel projected Nyström approximation is defined and used to derive a bound on the expected risk for the corresponding approximation of semi-parametric regression. This work naturally extends similar results obtained for kernel ridge regression.
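For context (this is the standard Nyström formula, not the paper's projected variant, whose precise definition is given in the paper): the plain Nyström approximation of a kernel matrix $K$ from a sampled landmark set $C$ reads

$$ \widehat{K} \;=\; K_{:,C}\, K_{C,C}^{+}\, K_{C,:}, $$

where $K_{C,C}^{+}$ is the pseudo-inverse of the landmark block; the paper draws $C$ with a DPP and analyzes a projected variant of this approximation for semi-parametric regression.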
On the Relationship Between Probabilistic Circuits and Determinantal Point Processes
Zhang, Honghua, Holtzen, Steven, Van den Broeck, Guy
Scaling probabilistic models to large realistic problems and datasets is a key challenge in machine learning. Central to this effort is the development of tractable probabilistic models (TPMs): models whose structure guarantees efficient probabilistic inference algorithms. The current landscape of TPMs is fragmented: there exist various kinds of TPMs with different strengths and weaknesses. Two of the most prominent classes of TPMs are determinantal point processes (DPPs) and probabilistic circuits (PCs). This paper provides the first systematic study of their relationship. We propose a unified analysis and shared language for discussing DPPs and PCs. Then we establish theoretical barriers for the unification of these two families, and prove that there are cases where DPPs have no compact representation as a class of PCs. We close with a perspective on the central problem of unifying these tractable models.
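For reference (standard DPP facts, not results from this paper): an $L$-ensemble assigns to every subset $A$ of the ground set the tractable likelihood

$$ P_L(A) \;=\; \frac{\det(L_A)}{\det(L + I)}, $$

so marginals, conditionals, and the normalizing constant all reduce to determinants; the paper asks when this determinantal form of tractability can be matched by a compact probabilistic circuit.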
Asymptotic Equivalence of Fixed-size and Varying-size Determinantal Point Processes
Barthelmé, Simon, Amblard, Pierre-Olivier, Tremblay, Nicolas
Determinantal Point Processes (DPPs) are popular models for point processes with repulsion. They appear in numerous contexts, from physics to graph theory, and display appealing theoretical properties. On the more practical side of things, since DPPs tend to select sets of points that are some distance apart (repulsion), they have been advocated as a way of producing random subsets with high diversity. DPPs come in two variants: fixed-size and varying-size. A sample from a varying-size DPP is a subset of random cardinality, while in fixed-size "$k$-DPPs" the cardinality is fixed. The latter makes more sense in many applications, but unfortunately their computational properties are less attractive, since, among other things, inclusion probabilities are harder to compute. In this work we show that as the size of the ground set grows, $k$-DPPs and DPPs become equivalent, meaning that their inclusion probabilities converge. As a by-product, we obtain saddlepoint formulas for inclusion probabilities in $k$-DPPs. These turn out to be extremely accurate, and suffer less from numerical difficulties than exact methods do. Our results also suggest that $k$-DPPs and DPPs have equivalent maximum likelihood estimators. Finally, we obtain results on asymptotic approximations of elementary symmetric polynomials which may be of independent interest.
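As a reminder of the two variants being compared (standard definitions from Kulesza and Taskar, not new to this paper): for an $L$-ensemble with eigenvalues $\lambda_1,\dots,\lambda_n$,

$$ P_L(A) \;=\; \frac{\det(L_A)}{\det(L+I)}, \qquad P_L^k(A) \;=\; \frac{\det(L_A)}{e_k(\lambda_1,\dots,\lambda_n)} \quad \text{for } |A| = k, $$

where $e_k$ is the $k$-th elementary symmetric polynomial; the appearance of $e_k$ in the normalizer is what makes $k$-DPP inclusion probabilities harder, and it is where the saddlepoint approximations enter.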
Optimized Algorithms to Sample Determinantal Point Processes
Tremblay, Nicolas, Barthelmé, Simon, Amblard, Pierre-Olivier
In this technical report, we discuss several sampling algorithms for Determinantal Point Processes (DPPs). DPPs have recently gained broad interest in the machine learning and statistics literature as random point processes with negative correlation, i.e., ones that can generate a "diverse" sample from a set of items. They are parametrized by a matrix $\mathbf{L}$, called the $L$-ensemble, that encodes the correlations between items. The standard sampling algorithm proceeds in three phases: 1) eigendecomposition of $\mathbf{L}$; 2) an eigenvector sampling phase, where each of $\mathbf{L}$'s eigenvectors is kept independently via a Bernoulli variable parametrized by its associated eigenvalue; 3) a Gram-Schmidt-type orthogonalisation procedure over the sampled eigenvectors. In a naive implementation, the computational cost of the third step is on average $\mathcal{O}(N\mu^3)$, where $\mu$ is the average number of samples of the DPP. We give an algorithm which runs in $\mathcal{O}(N\mu^2)$ and is extremely simple to implement. If memory is a constraint, we also describe a dual variant with reduced memory costs. In addition, we discuss implementation details often missing in the literature.
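A minimal numpy sketch of the standard three-phase sampler described above (a naive implementation: the QR re-orthonormalization in phase 3 is precisely the $\mathcal{O}(N\mu^3)$ bottleneck the report improves on, and the optimized $\mathcal{O}(N\mu^2)$ variant is not reproduced here):

```python
import numpy as np

def sample_dpp(L, rng=None):
    """Naive spectral sampler for an L-ensemble DPP (the three standard phases)."""
    rng = np.random.default_rng(rng)
    N = L.shape[0]
    # Phase 1: eigendecomposition of the symmetric matrix L.
    lam, vecs = np.linalg.eigh(L)
    # Phase 2: keep eigenvector n with probability lam_n / (lam_n + 1).
    V = vecs[:, rng.random(N) < lam / (lam + 1.0)]
    # Phase 3: pick one item per remaining eigenvector; re-orthonormalize
    # after each pick (this Gram-Schmidt-type step costs O(N mu^3) overall).
    Y = []
    while V.shape[1] > 0:
        p = (V ** 2).sum(axis=1)          # P(pick i) proportional to row norm^2
        p = np.clip(p, 0, None)
        i = rng.choice(N, p=p / p.sum())
        Y.append(i)
        # Remove the component along coordinate i, then re-orthonormalize.
        j = np.argmax(np.abs(V[i, :]))    # a column with V[i, j] != 0
        Vj = V[:, j].copy()
        V = np.delete(V, j, axis=1)
        V -= np.outer(Vj, V[i, :] / Vj[i])  # zero out row i of remaining columns
        if V.shape[1] > 0:
            V, _ = np.linalg.qr(V)          # restore orthonormal columns
    return sorted(Y)
```

Usage is simply `Y = sample_dpp(L)` for a symmetric positive semi-definite $\mathbf{L}$; the sample size is random, with mean $\sum_n \lambda_n / (\lambda_n + 1)$.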
Graphical structure of conditional independencies in determinantal point processes
Determinantal point processes have recently been used as models in machine learning, and this has raised questions regarding characterizations of conditional independence. In this paper we investigate such characterizations. We describe some conditional independencies through conditions on the kernel of a determinantal point process, and show that many can be obtained using the graph induced by the kernel of the L-ensemble.

In recent years there have been several machine learning papers about applications of determinantal point processes (DPPs) [4], [7], [8], [9]. An overview of theory, recent applications and open problems in learning DPPs is given in a recent extensive survey [6] by Kulesza and Taskar. In a private communication with Ben Taskar, one of the questions from survey [6] (see §7.3) that remains for future research was brought to the author's attention:

- Is there a simple characterization of the conditional independence relations encoded by a DPP?

This question arises naturally with conditional independence structure models (see [12]) in mind, such as the graphical models (see [11]) that are often used. It turns out that, from the mathematical viewpoint, elegant characterizations, similar to those in graphical models, exist. This paper provides two main characterizations:

- the corresponding block in a Schur complement of the kernel has to be a 0-block (Theorem 16, Proposition 17);
- we can use the structure of the graph induced by the kernel of the L-ensemble to read off many conditional independencies of the process (Theorem 28, Proposition 30).
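As a hedged illustration of the flavor of the first characterization (the unconditional case is a standard DPP fact; the exact conditional statement is the paper's Theorem 16): for a DPP with symmetric marginal kernel $K$, two disjoint sets $A$ and $B$ are independent if and only if the off-diagonal block vanishes,

$$ K_{A,B} = 0, $$

and conditioning on a third set $C$ replaces $K$ by a Schur-complement-type kernel, so that the conditional independence $A \perp B \mid C$ again reads off as a 0-block.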
Becoming More Robust to Label Noise with Classifier Diversity
Smith, Michael R., Martinez, Tony
It is widely known in the machine learning community that class noise can be (and often is) detrimental to inducing a model of the data. Many current approaches use a single, often biased, measurement to determine if an instance is noisy. A biased measure may work well on certain data sets, but it can also be less effective on a broader set of data sets. In this paper, we present noise identification using classifier diversity (NICD) -- a method for deriving a less biased noise measurement and integrating it into the learning process. To lessen the bias of the noise measure, NICD selects a diverse set of classifiers (based on their predictions of novel instances) to determine which instances are noisy. We examine NICD as a technique for filtering, instance weighting, and selecting the base classifiers of a voting ensemble. We compare NICD with several other noise handling techniques that do not consider classifier diversity on a set of 54 data sets and 5 learning algorithms. NICD significantly increases the classification accuracy over the other considered approaches and is effective across a broad set of data sets and learning algorithms.
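A hedged sketch of the diversity idea (illustrative only: NICD's actual classifier-selection, weighting, and filtering procedures are in the paper, and all names below are made up): pick a prediction-diverse subset of classifiers, then flag instances that most of them misclassify.

```python
import numpy as np

def pick_diverse(pred_matrix, k):
    """Greedily pick k classifiers (rows = classifiers, columns = instances,
    entries = predicted labels) that maximize pairwise prediction
    disagreement. An illustrative stand-in for diversity-based selection."""
    n = pred_matrix.shape[0]
    disagree = np.array([[np.mean(pred_matrix[a] != pred_matrix[b])
                          for b in range(n)] for a in range(n)])
    chosen = [int(np.argmax(disagree.sum(axis=1)))]
    while len(chosen) < k:
        gains = [disagree[c, chosen].sum() if c not in chosen else -1.0
                 for c in range(n)]
        chosen.append(int(np.argmax(gains)))
    return chosen

def flag_noisy(pred_matrix, y, chosen, threshold=0.5):
    """Flag instances misclassified by more than `threshold` of the chosen
    classifiers (a simple filtering rule, not the paper's exact criterion)."""
    errs = (pred_matrix[chosen] != y[None, :]).mean(axis=0)
    return np.where(errs > threshold)[0]
```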