Non-asymptotic approximations for Pearson's chi-square statistic and its application to confidence intervals for strictly convex functions of the probability weights of discrete distributions
Bax, Eric, Ouimet, Frédéric
In this paper, we develop a non-asymptotic local normal approximation for multinomial probabilities. First, we use it to find non-asymptotic total variation bounds between the measures induced by uniformly jittered multinomials and the multivariate normals with the same means and covariances. From the total variation bounds, we also derive a comparison of the cumulative distribution functions and quantile coupling inequalities between Pearson's chi-square statistic (written as the normalized quadratic form of a multinomial vector) and its multivariate normal analogue. We apply our results to find confidence intervals for the negative entropy of discrete distributions. Our method can be applied more generally to find confidence intervals for strictly convex functions of the weights of discrete distributions.
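A minimal sketch of the statistic at the center of this abstract, assuming numpy and scipy are available; the distribution, sample size, and confidence level below are hypothetical, and the paper's non-asymptotic corrections and confidence-interval construction are not reproduced here:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)

# Hypothetical example: multinomial with d = 4 categories, n = 500 trials.
p = np.array([0.4, 0.3, 0.2, 0.1])
n = 500
counts = rng.multinomial(n, p)

# Pearson's chi-square statistic: the normalized quadratic form of the
# multinomial vector around its mean.
stat = np.sum((counts - n * p) ** 2 / (n * p))

# Classical asymptotic reference: chi-square with d - 1 degrees of freedom.
# The paper quantifies the gap to this approximation non-asymptotically.
threshold = chi2.ppf(0.95, df=len(p) - 1)
print(f"chi-square statistic: {stat:.3f}, 95% quantile: {threshold:.3f}")

# Plug-in negative entropy, a strictly convex function of the weights;
# the paper's method yields confidence intervals for such functionals.
p_hat = counts / n
nz = p_hat[p_hat > 0]
neg_entropy = np.sum(nz * np.log(nz))
print(f"plug-in negative entropy: {neg_entropy:.4f}")
```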
Selecting a number of voters for a voting ensemble
Bax, Eric
For a voting ensemble that selects an odd-sized subset of the ensemble classifiers at random for each example, applies them to the example, and returns the majority vote, we show that any number of voters may minimize the error rate over an out-of-sample distribution. The optimal number of voters depends on the out-of-sample distribution of the number of classifiers in error. To select a number of voters to use, first estimate that distribution, then infer the error rate for each number of voters; this yields lower-variance estimates than estimating those error rates directly.
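As an illustration of that selection procedure, the sketch below (assuming numpy and scipy; the ensemble size M and the Dirichlet-sampled distribution p_m are hypothetical stand-ins for an estimated distribution) infers the majority-vote error rate for each odd number of voters. Since voters are drawn without replacement, the number of erring voters among the k selected is hypergeometric.

```python
import numpy as np
from scipy.stats import hypergeom

def majority_error_rate(p_m, M, k):
    """Out-of-sample error rate of a k-voter majority (k odd), given
    p_m[m] = P(exactly m of the M ensemble classifiers err on a random
    example). Voters are drawn without replacement, so the number of
    erring voters among the k selected is hypergeometric."""
    need = (k + 1) // 2  # votes needed for an erroneous majority
    return sum(
        p_m[m] * hypergeom.sf(need - 1, M, m, k)  # P(>= need errers among k)
        for m in range(M + 1)
    )

# Hypothetical estimated distribution over m for M = 15 classifiers.
M = 15
rng = np.random.default_rng(0)
p_m = rng.dirichlet(np.ones(M + 1))

# Infer the error rate for each odd number of voters and pick the best.
rates = {k: majority_error_rate(p_m, M, k) for k in range(1, M + 1, 2)}
best_k = min(rates, key=rates.get)
print(best_k, rates[best_k])
```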
Speculate-Correct Error Bounds for k-Nearest Neighbor Classifiers
Bax, Eric, Weng, Lingjie, Tian, Xu
We introduce the speculate-correct method to derive error bounds for local classifiers. Using it, we show that k-nearest neighbor classifiers, in spite of their famously fractured decision boundaries, have exponential error bounds with range O(sqrt((k + ln n) / n)) for n in-sample examples.
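The dependence of the bound range on k and n can be read off directly; the snippet below evaluates sqrt((k + ln n) / n) for a few hypothetical values. It shows the stated scaling only, with the paper's constants omitted.

```python
from math import log, sqrt

def bound_range_scale(k, n):
    """Scaling of the speculate-correct error bound range for k-NN,
    O(sqrt((k + ln n) / n)); constants from the paper are omitted."""
    return sqrt((k + log(n)) / n)

for n in (1_000, 10_000, 100_000):
    print(n, [round(bound_range_scale(k, n), 4) for k in (1, 5, 25)])
```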
Ensemble Validation: Selectivity has a Price, but Variety is Free
Bax, Eric, Kooti, Farshad
If classifiers are selected from a hypothesis class to form an ensemble, then bounds on the average error rate over the selected classifiers include a component for selectivity, which grows as the fraction of hypothesis classifiers selected for the ensemble shrinks, and a component for variety, which grows with the size of the hypothesis class or the in-sample data set. We show that the selectivity component asymptotically dominates the variety component, meaning that variety is essentially free.
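The flavor of the trade-off can be illustrated with a generic Hoeffding-plus-union bound over all size-s subsets of G hypothesis classifiers, each validated on n examples. This is only a sketch consistent with the abstract's two components, not the paper's actual bound, and G, s, n, and delta below are hypothetical.

```python
from math import lgamma, log, sqrt

def log_binom(G, s):
    """ln C(G, s) via log-gamma, stable for large G."""
    return lgamma(G + 1) - lgamma(s + 1) - lgamma(G - s + 1)

def avg_error_bound_gap(G, s, n, delta=0.05):
    """Generic Hoeffding + union-over-subsets gap for the average error
    of s classifiers selected from G, each validated on n examples.
    NOT the paper's bound. Note ln C(G, s) / s ~ ln(eG/s): the
    per-classifier price grows as the selected fraction s/G shrinks."""
    return sqrt((log_binom(G, s) + log(1 / delta)) / (2 * s * n))

n = 10_000
for G in (100, 10_000, 1_000_000):
    full = avg_error_bound_gap(G, G, n)                   # select all: s = G
    picky = avg_error_bound_gap(G, max(G // 100, 1), n)   # select 1%
    print(G, round(full, 4), round(picky, 4))  # "full" shrinks with G
```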
Validation of Matching
Le, Ya, Bax, Eric, Barbieri, Nicola, Soriano, David Garcia, Mehta, Jitesh, Li, James
We introduce a technique to compute probably approximately correct (PAC) bounds on precision and recall for matching algorithms. The bounds require some verified matches, but those matches may be used to develop the algorithms. The bounds can be applied to network reconciliation or entity resolution algorithms, which identify nodes in different networks or values in a data set that correspond to the same entity. For network reconciliation, the bounds do not require knowledge of the network generation process.
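For intuition, a standard one-sided binomial (Clopper-Pearson) lower bound on precision from a random sample of verified matches can be computed as below, assuming scipy; the sample counts are hypothetical, and the paper's bounds may differ in detail.

```python
from scipy.stats import beta

def precision_lower_bound(correct, verified, delta=0.05):
    """Clopper-Pearson style lower confidence bound on precision from a
    random sample of verified matches: with probability >= 1 - delta,
    true precision >= this value. A standard binomial PAC bound shown
    as an illustration; the paper's bounds may differ in detail."""
    if correct == 0:
        return 0.0
    return beta.ppf(delta, correct, verified - correct + 1)

# Hypothetical: 180 of 200 verified matches were correct.
print(precision_lower_bound(180, 200))  # roughly 0.86
```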
Some Theory For Practical Classifier Validation
Bax, Eric, Le, Ya
We compare and contrast two approaches to validating a trained classifier while using all in-sample data for training. One is simultaneous validation over an organized set of hypotheses (SVOOSH), the well-known method that began with VC theory. The other is withhold and gap (WAG), which withholds a validation set, trains a holdout classifier on the remaining data, validates that classifier on the withheld data, and then adds the rate of disagreement between the holdout classifier and a classifier trained on all in-sample data, since that disagreement rate is an upper bound on the difference in their error rates. We show that complex hypothesis classes and limited training data can make WAG a favorable alternative.
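A minimal sketch of the WAG bound structure described above, with a Hoeffding term standing in for the validation bound; the error rates, validation-set size, and confidence level are hypothetical, and the paper may use tighter validation bounds.

```python
from math import log, sqrt

def wag_error_bound(val_error, n_val, disagreement, delta=0.05):
    """Withhold-and-gap (WAG) sketch: a Hoeffding bound on the holdout
    classifier's error from n_val validation examples, plus the observed
    rate of disagreement between the holdout classifier and the classifier
    trained on all in-sample data, which upper-bounds the difference in
    their error rates. A minimal illustration, not the paper's tightest form."""
    hoeffding = sqrt(log(1 / delta) / (2 * n_val))
    return val_error + hoeffding + disagreement

# Hypothetical: 8% validation error on 2,000 withheld examples,
# 3% disagreement between the holdout and full-data classifiers.
print(wag_error_bound(0.08, 2000, 0.03))
```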
Improved Error Bounds Based on Worst Likely Assignments
Bax, Eric
Error bounds based on worst likely assignments use permutation tests to validate classifiers. Worst likely assignments can produce effective bounds even for data sets with 100 or fewer training examples. This paper introduces a statistic for use in the permutation tests of worst likely assignments that improves error bounds, especially for accurate classifiers, which are typically the classifiers of interest.
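For background, the basic permutation-test mechanism looks like the generic sketch below, assuming numpy and using synthetic data; the paper's contribution is a specialized statistic for these tests, which is not reproduced here.

```python
import numpy as np

def permutation_p_value(labels, predictions, n_perm=10_000, seed=0):
    """Generic permutation test: p-value for the observed accuracy under
    the null that predictions are unrelated to labels. Worst likely
    assignments use permutation tests with a specialized statistic; this
    sketch shows only the basic mechanism."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    predictions = np.asarray(predictions)
    observed = np.mean(labels == predictions)
    count = sum(
        np.mean(rng.permutation(labels) == predictions) >= observed
        for _ in range(n_perm)
    )
    return (count + 1) / (n_perm + 1)

# Hypothetical small data set: 100 binary labels, ~80% predicted correctly.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=100)
pred = np.where(rng.random(100) < 0.8, y, 1 - y)
print(permutation_p_value(y, pred))
```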