
Collaborating Authors

 Valiant


Entropy Rate Estimation for Markov Chains with Large State Space

Neural Information Processing Systems

Entropy estimation is one of the prototypical problems in distribution property testing. To consistently estimate the Shannon entropy of a distribution on $S$ elements with independent samples, the optimal sample complexity scales sublinearly with $S$ as $\Theta(\frac{S}{\log S})$ as shown by Valiant and Valiant \cite{Valiant--Valiant2011}. Extending the theory and algorithms for entropy estimation to dependent data, this paper considers the problem of estimating the entropy rate of a stationary reversible Markov chain with $S$ states from a sample path of $n$ observations.
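
As a point of reference (not from the paper), the naive baseline that the sublinear-sample results improve upon is the plug-in estimator: substitute empirical frequencies into the Shannon entropy formula. A minimal sketch:

```python
from collections import Counter
import math

def plugin_entropy(samples):
    """Naive plug-in estimate of Shannon entropy (in bits):
    H_hat = -sum p_hat(x) * log2 p_hat(x) over observed symbols,
    where p_hat is the empirical distribution of the samples."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A fair coin has entropy 1 bit; the plug-in estimate recovers this
# exactly when both outcomes appear equally often.
print(plugin_entropy([0, 1] * 5000))
```

The plug-in estimator is biased downward when the alphabet size $S$ is comparable to the number of samples, which is why achieving the optimal $\Theta(\frac{S}{\log S})$ sample complexity requires more refined estimators.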


Learning CNF formulas from uniform random solutions in the local lemma regime

Feng, Weiming, Yang, Xiongxin, Yu, Yixiao, Zhang, Yiyao

arXiv.org Machine Learning

We study the problem of learning an $n$-variable $k$-CNF formula $Φ$ from its i.i.d. uniform random solutions, which is equivalent to learning a Boolean Markov random field (MRF) with $k$-wise hard constraints. Revisiting Valiant's algorithm (Commun. ACM'84), we show that it can exactly learn (1) $k$-CNFs with bounded clause intersection size under Lovász local lemma type conditions, from $O(\log n)$ samples; and (2) random $k$-CNFs near the satisfiability threshold, from $\widetilde{O}(n^{\exp(-\sqrt{k})})$ samples. These results significantly improve the previous $O(n^k)$ sample complexity. We further establish new information-theoretic lower bounds on sample complexity for both exact and approximate learning from i.i.d. uniform random solutions.
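
For orientation (this is the textbook elimination idea, not the paper's refined analysis): Valiant's algorithm starts from every width-$k$ clause over the $n$ variables and deletes any clause falsified by a satisfying assignment seen in the sample. Enumerating all $O(n^k)$ candidate clauses is what drives the previous sample complexity. A small illustrative sketch:

```python
from itertools import combinations, product

def learn_kcnf(samples, n, k):
    """Valiant-style elimination: enumerate every width-k clause over
    n Boolean variables, then delete any clause falsified by some
    sample (each sample is a satisfying assignment, a 0/1 tuple).
    The surviving clauses form a CNF consistent with all samples."""
    # A clause is a tuple of literals (var_index, negated?).
    clauses = set()
    for vars_ in combinations(range(n), k):
        for signs in product((False, True), repeat=k):
            clauses.add(tuple(zip(vars_, signs)))

    def satisfied(clause, x):
        # Literal (v, neg) is true when x[v] disagrees with its negation flag.
        return any(bool(x[v]) != neg for v, neg in clause)

    return {c for c in clauses if all(satisfied(c, x) for x in samples)}

# Solutions of (x0 OR x1); only that clause survives elimination.
print(learn_kcnf([(0, 1), (1, 0), (1, 1)], n=2, k=2))
```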


What Does It Really Mean to Learn?

The New Yorker

I read "Middlemarch" for the first time during my sophomore year of college. Why would Dorothea, a young and intelligent woman, marry that annoying old man? How could she be so stupid? No one else in the class seemed to get it, either, and this pushed our professor over the edge. "Of course you don't understand," he roared, swilling a Diet Coke.


The Perceptron Algorithm Is Fast for Non-Malicious Distributions

Neural Information Processing Systems

Within the context of Valiant's protocol for learning, the Perceptron algorithm is shown to learn an arbitrary half-space in time $O(n^2/\epsilon^3)$ if $D$, the probability distribution of examples, is taken uniform over the unit sphere $S^n$. Here $\epsilon$ is the accuracy parameter. This is surprisingly fast, as "standard" approaches involve solution of a linear programming problem involving $O(n/\epsilon)$ constraints in $n$ dimensions. A modification of Valiant's distribution-independent protocol for learning is proposed in which the distribution and the function to be learned may be chosen by adversaries; however, these adversaries may not communicate. It is argued that this definition is more reasonable and applicable to real-world learning than Valiant's.
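
For readers unfamiliar with the algorithm under analysis, the classic Perceptron update is very simple: whenever an example is misclassified by the current weight vector, add (or subtract) that example. A minimal sketch for half-spaces through the origin, with labels in {-1, +1}:

```python
def perceptron(samples, epochs=100):
    """Classic Perceptron: for each misclassified example (x, y),
    update w <- w + y * x. Converges in finitely many updates when
    the data are linearly separable through the origin."""
    dim = len(samples[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        updated = False
        for x, y in samples:
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                updated = True
        if not updated:  # a full pass with no mistakes: done
            break
    return w
```

The number of passes needed depends on the margin of the data, which is why the running time is analyzed as a function of the example distribution $D$.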


Agnostic PAC-Learning of Functions on Analog Neural Nets

Neural Information Processing Systems

There exist a number of negative results ([J], [BR], [KV]) about learning on neural nets in Valiant's model [V] for probably approximately correct learning ("PAC-learning"). These negative results are based on an asymptotic analysis where one lets the number of nodes in the neural net go to infinity. Hence this analysis is less adequate for the investigation of learning on a small fixed neural net. The latter type of learning problem gives rise to a different kind of asymptotic question: can the true error of the neural net be brought arbitrarily close to that of a neural net with "optimal" weights through sufficiently long training? In this paper we employ some new arguments in order to give a positive answer to this question in Haussler's rather realistic refinement of Valiant's model for PAC-learning ([H], [KSS]). In this more realistic model no a priori assumptions are required about the "learning target", noise is permitted in the training data, and the inputs and outputs are not restricted to Boolean values.


On the Evolvability of Monotone Conjunctions with an Evolutionary Mutation Mechanism

Diochnos, Dimitrios

Journal of Artificial Intelligence Research

Valiant (2009) introduced a framework for a quantitative approach to evolution, called evolvability. The idea is, roughly, that there is an ideal behavior in every environment and the feedback that the various organisms receive during evolution indicates how close their behavior is to ideal. Ultimately, evolvability aims at modeling and explaining mechanisms that allow near-optimal behavior of organisms while exploiting realistic computational resources. Due to a result by Feldman (2008), evolvability is equivalent to learning in the correlational statistical query (CSQ) model (Bshouty & Feldman, 2002). Thus, evolvability algorithms correspond to a special type of local search learning algorithms that fall under the umbrella of the probably approximately correct (PAC) model of learning (Valiant, 1984).