Collaborating Authors

 Györfi, László


On rate-optimal classification from non-private and from private data

arXiv.org Machine Learning

In this paper we revisit the classical problem of classification, but impose privacy constraints. Under such constraints, the raw data $(X_1,Y_1),\ldots,(X_n,Y_n)$ cannot be directly observed, and all classifiers are functions of the randomised outcome of a suitable local differential privacy mechanism. The statistician is free to choose the form of this privacy mechanism, and here we add Laplace distributed noise to a discretisation of the location of each feature vector $X_i$ and to its label $Y_i$. The classification rule is the privatized version of the well-studied partitioning classification rule. In addition to the standard Lipschitz and margin conditions, a novel characteristic is introduced, by which the exact rate of convergence of the classification error probability is calculated, both for non-private and private data.
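As a rough illustration of what such a mechanism could look like, the sketch below (a minimal toy, not the paper's exact construction) privatises a one-dimensional partitioning rule: each user releases Laplace-perturbed one-hot cell indicators and label-weighted indicators, and the classifier predicts the sign of the noisy per-cell vote. The cell width, noise scale, and data-generating model are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def privatise(X, Y, edges, eps):
    """Each user releases a Laplace-perturbed one-hot cell indicator and a
    perturbed label-weighted indicator (a toy local-DP sketch)."""
    n, m = len(X), len(edges) - 1
    cells = np.clip(np.digitize(X, edges) - 1, 0, m - 1)
    onehot = np.zeros((n, m))
    onehot[np.arange(n), cells] = 1.0
    scale = 2.0 / eps   # illustrative scale; a careful analysis would split the budget
    Z = onehot + rng.laplace(scale=scale, size=(n, m))
    W = onehot * Y[:, None] + rng.laplace(scale=scale, size=(n, m))
    return Z, W

def private_partitioning_classifier(W):
    """Majority vote per cell based on the noisy label-weighted indicators."""
    votes = W.sum(axis=0)
    return lambda x_cells: np.sign(votes[x_cells])

# toy 1-d example: Y = sign(X), with 10% label noise
n, eps = 5000, 2.0
X = rng.uniform(-1, 1, n)
Y = np.sign(X) * np.where(rng.uniform(size=n) < 0.9, 1, -1)
edges = np.linspace(-1, 1, 21)            # 20 cells of equal width
Z, W = privatise(X, Y, edges, eps)
clf = private_partitioning_classifier(W)

X_test = rng.uniform(-1, 1, 2000)
cells_test = np.clip(np.digitize(X_test, edges) - 1, 0, 19)
err = np.mean(clf(cells_test) != np.sign(X_test))   # error against the Bayes labels
print(f"test error of the privatised partitioning rule: {err:.3f}")
```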


Lossless Transformations and Excess Risk Bounds in Statistical Inference

arXiv.org Machine Learning

We study the excess minimum risk in statistical inference, defined as the difference between the minimum expected loss in estimating a random variable from an observed feature vector and the minimum expected loss in estimating the same random variable from a transformation (statistic) of the feature vector. After characterizing lossless transformations, i.e., transformations for which the excess risk is zero for all loss functions, we construct a partitioning test statistic for the hypothesis that a given transformation is lossless and show that for i.i.d. data the test is strongly consistent. More generally, we develop information-theoretic upper bounds on the excess risk that uniformly hold over fairly general classes of loss functions. Based on these bounds, we introduce the notion of a delta-lossless transformation and give sufficient conditions for a given transformation to be universally delta-lossless. Applications to classification, nonparametric regression, portfolio strategies, information bottleneck, and deep learning are also surveyed.
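The excess-risk definition can be illustrated numerically under squared loss, where the optimal predictor given a discrete observation is the within-group conditional mean. The toy example below (not taken from the paper) uses Y = X1 XOR X2 and the lossy transformation T(X) = X1, for which the excess minimum risk equals 0.25.

```python
import numpy as np

rng = np.random.default_rng(1)

def min_risk_squared_loss(keys, Y):
    """Empirical minimum expected squared loss when predicting Y from the
    discrete observation encoded by `keys`: the best predictor is the
    conditional mean within each group."""
    risk, n = 0.0, len(Y)
    for k in np.unique(keys):
        grp = Y[keys == k]
        risk += ((grp - grp.mean()) ** 2).sum() / n
    return risk

# toy example: Y = X1 XOR X2, and T(X) = X1 discards the relevant coordinate X2
n = 100_000
X = rng.integers(0, 2, size=(n, 2))
Y = (X[:, 0] ^ X[:, 1]).astype(float)

keys_X = X[:, 0] * 2 + X[:, 1]      # encodes the full observation
keys_T = X[:, 0]                    # the transformation keeps only X1

excess = min_risk_squared_loss(keys_T, Y) - min_risk_squared_loss(keys_X, Y)
print(f"estimated excess minimum risk: {excess:.3f}  (theoretical value 0.25)")
```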


Repeated Observations for Classification

arXiv.org Artificial Intelligence

We study the problem of nonparametric classification with repeated observations. Let $\mathbf{X}$ be the $d$-dimensional feature vector and let $Y$ denote the label taking values in $\{1,\dots ,M\}$. In contrast to the usual setup with a large sample size $n$ and relatively low dimension $d$, this paper deals with the situation when, instead of observing a single feature vector $\mathbf{X}$, we are given $t$ repeated feature vectors $\mathbf{V}_1,\dots ,\mathbf{V}_t$. Some simple classification rules are presented such that the conditional error probabilities converge to zero at an exponential rate as $t\to\infty$. In the analysis, we investigate particular models such as robust detection by nominal densities, prototype classification, linear transformation, linear classification, and scaling.
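One way to see the improvement with growing $t$ is a toy prototype rule: average the $t$ repeated observations and pick the nearest class prototype. The prototypes, the Gaussian noise model, and the Monte-Carlo evaluation below are illustrative assumptions rather than the paper's exact rules.

```python
import numpy as np

rng = np.random.default_rng(2)

# hypothetical prototypes for M = 2 classes in d = 2 dimensions
prototypes = np.array([[0.0, 0.0], [1.0, 1.0]])

def classify_repeated(V):
    """Average the t repeated observations and pick the nearest prototype."""
    v_bar = V.mean(axis=0)
    return np.argmin(((prototypes - v_bar) ** 2).sum(axis=1))

def error_probability(t, sigma=1.0, trials=20_000):
    """Monte-Carlo error of the rule when V_j = prototype[Y] + N(0, sigma^2 I)."""
    errors = 0
    for _ in range(trials):
        y = rng.integers(0, 2)
        V = prototypes[y] + sigma * rng.normal(size=(t, 2))
        errors += classify_repeated(V) != y
    return errors / trials

for t in (1, 5, 20):
    print(f"t = {t:2d}  estimated error = {error_probability(t):.4f}")
```

The observed error drops rapidly as $t$ grows, consistent with an exponential rate of convergence.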


On the strong stability of ergodic iterations

arXiv.org Machine Learning

We revisit processes generated by iterated random functions driven by a stationary and ergodic sequence. Such a process is called strongly stable if a random initialization exists for which the process is stationary and ergodic, and for any other initialization the difference of the two processes converges to zero almost surely. Under some mild conditions on the corresponding recursive map, and without any condition on the driving sequence, we show the strong stability of such iterations. Several applications are surveyed, such as stochastic approximation and queueing. Furthermore, new results are deduced for Langevin-type iterations with dependent noise and for multitype branching processes.
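The stability statement is easy to visualise for a toy contractive recursion driven by a dependent AR(1) noise sequence: two copies started from different points but fed the same driving sequence coalesce. The particular map and noise below are illustrative assumptions, not the paper's general setting.

```python
import numpy as np

rng = np.random.default_rng(3)

def driving_sequence(n, rho=0.8):
    """A stationary, dependent AR(1) noise sequence; any stationary ergodic
    driving sequence would do, this choice is only illustrative."""
    xi = np.zeros(n)
    for k in range(1, n):
        xi[k] = rho * xi[k - 1] + rng.normal()
    return xi

def iterate(x0, xi, a=0.5):
    """Iterated random function x_{k+1} = F(x_k, xi_k) with the contractive
    recursive map F(x, z) = a*x + z (a toy Langevin-type step)."""
    x = np.empty(len(xi) + 1)
    x[0] = x0
    for k, z in enumerate(xi):
        x[k + 1] = a * x[k] + z
    return x

n = 50
xi = driving_sequence(n)                 # the same driving noise for both runs
path_a = iterate(10.0, xi)
path_b = iterate(-10.0, xi)
# strong stability: the difference of the two coupled paths vanishes
print("|difference| at k = 0, 10, 25, 50:",
      np.abs(path_a - path_b)[[0, 10, 25, 50]])
```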


Tree density estimation

arXiv.org Machine Learning

A natural strategy for mitigating the curse of dimensionality in estimating probability distributions is to employ a low-complexity family of approximating distributions. For discrete distributions, Chow and Liu [5] suggested a family of tree-based approximations and gave an efficient maximum-likelihood estimator based on Kruskal's optimal spanning tree algorithm [14]. We stress that this approach makes no structural assumptions about the sampling distribution, but rather constitutes a modeling choice. Consequently, in this paradigm, the goal is to approximate the optimal-tree distribution from the data, without any guarantees on how well the latter approximates the true sampling distribution. Extensions of the Chow-Liu approach to continuous distributions were studied by Bach and Jordan [1] and by Liu et al. [16] under various assumptions.
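A minimal sketch of the Chow-Liu construction for discrete data: weight every pair of variables by its empirical mutual information and run Kruskal's algorithm to extract a maximum-weight spanning tree. The plug-in mutual-information estimator and the chain-structured toy data are illustrative choices.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(4)

def empirical_mi(x, y):
    """Plug-in estimate of the mutual information (in nats) of two discrete samples."""
    n, mi = len(x), 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            p_ab = np.mean((x == a) & (y == b))
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (np.mean(x == a) * np.mean(y == b)))
    return mi

def chow_liu_tree(data):
    """Maximum-likelihood tree structure: Kruskal's algorithm on the complete
    graph weighted by pairwise empirical mutual information."""
    d = data.shape[1]
    edges = sorted(((empirical_mi(data[:, i], data[:, j]), i, j)
                    for i, j in combinations(range(d), 2)), reverse=True)
    parent = list(range(d))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]   # path halving
            u = parent[u]
        return u
    tree = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                         # keep the edge set acyclic
            parent[ri] = rj
            tree.append((i, j, w))
    return tree

# toy chain-structured binary data X0 -> X1 -> X2 (illustrative only)
n = 20_000
x0 = rng.integers(0, 2, n)
x1 = (x0 + (rng.random(n) < 0.1)) % 2        # noisy copy of x0
x2 = (x1 + (rng.random(n) < 0.1)) % 2        # noisy copy of x1
data = np.column_stack([x0, x1, x2])
print(chow_liu_tree(data))                   # expected edges: (0, 1) and (1, 2)
```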


Strongly universally consistent nonparametric regression and classification with privatised data

arXiv.org Machine Learning

In recent years there has been a surge of interest in data analysis methodology that is able to achieve strong statistical performance without compromising the privacy and security of individual data holders. This has often been driven by applications in modern technology, for example by Google (Erlingsson et al., 2014), Apple (Tang et al., 2017), and Microsoft (Ding et al., 2017), but the study goes at least as far back as Warner (1965) and is often used in the more traditional fields of clinical trials (Vu and Slavkovic, 2009, Dankar and El Emam, 2013) and census data (Machanavajjhala et al., 2008, Dwork, 2019). While there has long been an awareness that sensitive data must be anonymised, it has become apparent only relatively recently that simply removing names and addresses is insufficient in many cases (e.g.


Universal consistency and rates of convergence of multiclass prototype algorithms in metric spaces

arXiv.org Machine Learning

We study universal consistency and convergence rates of simple nearest-neighbor prototype rules for the problem of multiclass classification in metric spaces. We first show that a novel data-dependent partitioning rule, named Proto-NN, is universally consistent in any metric space that admits a universally consistent rule. Proto-NN is a significant simplification of OptiNet, a recently proposed compression-based algorithm that, to date, was the only algorithm known to be universally consistent in such a general setting. Practically, Proto-NN is simpler to implement and enjoys reduced computational complexity. We then proceed to study convergence rates of the excess error probability. We first obtain rates for the standard $k$-NN rule under a margin condition and a new generalized-Lipschitz condition. The latter is an extension of a recently proposed modified-Lipschitz condition from $\mathbb R^d$ to metric spaces. Similarly to the modified-Lipschitz condition, the new condition avoids any boundedness assumptions on the data distribution. While obtaining rates for Proto-NN is left open, we show that a second prototype rule that hybridizes between $k$-NN and Proto-NN achieves the same rates as $k$-NN while enjoying similar computational advantages as Proto-NN. We conjecture, however, that, like $k$-NN, this hybrid rule is not consistent in general.
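A hedged sketch of a generic nearest-prototype rule in an arbitrary metric space (an illustrative simplification, not the paper's Proto-NN): sample a small set of prototypes, label each by a majority vote over the training points it attracts, and classify a query by the label of its nearest prototype.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(5)

def fit_prototype_rule(X, y, n_protos, metric):
    """Nearest-prototype rule for an arbitrary metric: sample prototypes and
    label each one by a majority vote over the training points it attracts."""
    idx = rng.choice(len(X), size=n_protos, replace=False)
    protos = [X[i] for i in idx]
    assigned = [[] for _ in range(n_protos)]
    for xi, yi in zip(X, y):
        j = min(range(n_protos), key=lambda k: metric(xi, protos[k]))
        assigned[j].append(yi)
    labels = [Counter(a).most_common(1)[0][0] if a else 0 for a in assigned]
    def predict(x):
        j = min(range(n_protos), key=lambda k: metric(x, protos[k]))
        return labels[j]
    return predict

# toy 3-class problem in (R^2, Euclidean metric); any metric space would do
euclid = lambda a, b: float(np.linalg.norm(a - b))
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
y_train = rng.integers(0, 3, 3000)
X_train = centers[y_train] + rng.normal(size=(3000, 2))
predict = fit_prototype_rule(X_train, y_train, n_protos=30, metric=euclid)

y_test = rng.integers(0, 3, 500)
X_test = centers[y_test] + rng.normal(size=(500, 2))
err = np.mean([predict(x) != t for x, t in zip(X_test, y_test)])
print(f"test error of the nearest-prototype rule: {err:.3f}")
```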