

Limitations of Membership Queries in Testable Learning

Lange, Jane, Qiao, Mingda

arXiv.org Artificial Intelligence

Membership queries (MQ) often yield speedups for learning tasks, particularly in the distribution-specific setting. We show that in the ``testable learning'' model of Rubinfeld and Vasilyan [RV23], membership queries cannot decrease the time complexity of testable learning algorithms beyond the complexity of sample-only distribution-specific learning. In the testable learning model, the learner must output a hypothesis whenever the data distribution satisfies a desired property, and if it outputs a hypothesis, the hypothesis must be near-optimal. We give a general reduction from sample-based ``refutation'' of boolean concept classes, as presented in [Vadhan17, KL18], to testable learning with queries (TL-Q). This yields lower bounds for TL-Q via the reduction from learning to refutation given in [KL18]. The result is that, relative to a concept class and a distribution family, no $m$-sample TL-Q algorithm can be super-polynomially more time-efficient than the best $m$-sample PAC learner. Finally, we define a class of ``statistical'' MQ algorithms that encompasses many known distribution-specific MQ learners, such as those based on influence estimation or subcube-conditional statistical queries. We show that TL-Q algorithms in this class imply efficient statistical-query refutation and learning algorithms. Thus, combined with known SQ dimension lower bounds, our results imply that these efficient membership query learners cannot be made testable.
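As a concrete and purely illustrative rendering of the two-sided guarantee described above, the following Python sketch shows the shape of a testable learner; the names tester_learner, test, and learn are hypothetical and not taken from the paper.

def tester_learner(samples, test, learn):
    # Completeness: if the data distribution satisfies the desired
    # property, test() must accept (with high probability).
    # Soundness: whenever a hypothesis is returned, it must be
    # near-optimal for the concept class, whatever the distribution.
    if not test(samples):   # distribution check, from data alone
        return None         # reject: no hypothesis is owed
    return learn(samples)   # accept: the output must be near-optimal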



Active Learning of Symbolic Automata Over Rational Numbers

Hagedorn, Sebastian, Muñoz, Martín, Riveros, Cristian, Icarte, Rodrigo Toro

arXiv.org Artificial Intelligence

Automata learning has many applications in artificial intelligence and software engineering. Central to these applications is the $L^*$ algorithm, introduced by Angluin. The $L^*$ algorithm learns deterministic finite-state automata (DFAs) in polynomial time when provided with a minimally adequate teacher. Unfortunately, the $L^*$ algorithm can only learn DFAs over finite alphabets, which limits its applicability. In this paper, we extend $L^*$ to learn symbolic automata whose transitions use predicates over rational numbers, i.e., over infinite and dense alphabets. Our result makes the $L^*$ algorithm applicable to new settings like (real) RGX and time series. Furthermore, our proposed algorithm is optimal in the sense that the number of queries it asks the teacher is at most linear in the number of transitions and in the representation size of the predicates.
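For readers new to $L^*$, here is a minimal Python sketch of the classical finite-alphabet loop, using the well-known variant that adds all suffixes of a counterexample as table columns. It is a sketch of Angluin-style learning only, not of the paper's symbolic extension; member and equivalent stand in for the minimally adequate teacher, and the hypothesis DFA is returned as a plain dictionary.

def lstar(alphabet, member, equivalent):
    # member(w) -> bool answers a membership query on the word w.
    # equivalent(dfa) -> None if the hypothesis is correct,
    # otherwise a counterexample word.
    S, E = [""], [""]   # prefix-closed row labels, suffix-closed column labels
    T = {}              # the observation table, filled via membership queries

    def row(s):
        return tuple(T[(s, e)] for e in E)

    def fill():
        for s in S + [s + a for s in S for a in alphabet]:
            for e in E:
                if (s, e) not in T:
                    T[(s, e)] = member(s + e)

    while True:
        fill()
        # Closedness: every one-letter extension must equal some row of S.
        rows_of_S = {row(s) for s in S}
        unclosed = next((s + a for s in S for a in alphabet
                         if row(s + a) not in rows_of_S), None)
        if unclosed is not None:
            S.append(unclosed)
            continue
        # Hypothesis DFA: states are the distinct rows of S.
        dfa = {"init": row(""),
               "accepting": {row(s) for s in S if T[(s, "")]},
               "delta": {(row(s), a): row(s + a) for s in S for a in alphabet}}
        cex = equivalent(dfa)
        if cex is None:
            return dfa
        for i in range(len(cex) + 1):   # add every suffix of the
            if cex[i:] not in E:        # counterexample as a new column
                E.append(cex[i:])

Each closedness failure adds a new state and each counterexample adds at least one distinguishing column, so the loop terminates after at most as many rounds as the minimal DFA has states.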


On the Hardness of Robust Classification

Pascale Gourdeau, Varun Kanade, Marta Kwiatkowska, James Worrell

Neural Information Processing Systems

We start with two negative results. We show that no non-trivial concept class can be robustly learned in the distribution-free setting against an adversary who can perturb just a single input bit.
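For reference, a common way to formalize this notion of robust learning (the ``exact in the ball'' formulation used in this line of work; the notation below is ours, not necessarily the paper's) measures the robust risk of a hypothesis $h$ against a target $c$ as

$$\mathrm{R}_\rho(h, c) = \Pr_{x \sim D}\left[\exists z \in B_\rho(x) : h(z) \neq c(z)\right],$$

where $B_\rho(x)$ is the Hamming ball of radius $\rho$ around $x$; the negative result above corresponds to $\rho = 1$, an adversary that may flip a single input bit.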


We will fix all minor comments and typos without explicitly addressing them in the rebuttal

Neural Information Processing Systems

We will fix all minor comments and typos without explicitly addressing them in the rebuttal. Practical Impact: Our primary aim in this work is indeed theoretical; the discussion of practical impact is in the Appendix, but we will add a reference to it in the main paper. PAC Terminology: We have assumed that readers will be familiar with standard terminology from PAC learning. Non-trivial Class: The definition of a non-trivial class appears just before the statement of Theorem 5 (in lines 182-183).



The Probably Approximately Correct Learning Model in Computational Learning Theory

Servedio, Rocco A.

arXiv.org Machine Learning

Leslie Valiant's 1984 paper "A Theory of the Learnable" [Val84], reproduced in this volume, has the unusual distinction of having changed the course of several scientific disciplines. Within theoretical computer science it was one of the key works giving rise to the field now known as computational learning theory, which may loosely be defined as the rigorous study of learning processes and phenomena from the computer science perspective of efficient algorithms and computational complexity. In the decades since the publication of [Val84], computational learning theory has grown into a rich field with strong connections to many other theoretical disciplines such as mathematical probability and statistics, information theory, decision theory and more. Beyond the realm of theory, Valiant's paper and the Probably Approximately Correct (PAC) model which he introduced in it have also had a great impact on the subsequent development of machine learning, a field which has already transformed many aspects of science and human society and seems certain to have an even greater influence in the future. This chapter gives an overview of the Probably Approximately Correct learning model that Valiant introduced in [Val84], explaining some of the major results and directions that the field has taken in the years since that work.
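For concreteness, the standard PAC criterion, stated here in generic notation (which may differ from the chapter's): an algorithm $A$ learns a concept class $\mathcal{C}$ if for every target $c \in \mathcal{C}$, every distribution $D$ over examples, and all $\epsilon, \delta \in (0, 1)$, given $m$ i.i.d. labeled examples $S \sim D^m$ it outputs a hypothesis satisfying

$$\Pr_{S \sim D^m}\left[\mathrm{err}_D(A(S)) \leq \epsilon\right] \geq 1 - \delta, \qquad \text{where } \mathrm{err}_D(h) = \Pr_{x \sim D}\left[h(x) \neq c(x)\right],$$

and efficient PAC learning additionally requires sample size and running time polynomial in $1/\epsilon$, $1/\delta$, and the relevant size parameters.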



On the Hardness of Learning Regular Expressions

Attias, Idan, Reyzin, Lev, Srebro, Nathan, Vardi, Gal

arXiv.org Artificial Intelligence

Despite the theoretical significance and wide practical use of regular expressions, the computational complexity of learning them has been largely unexplored. We study the computational hardness of improperly learning regular expressions in the PAC model and with membership queries. We show that PAC learning is hard even under the uniform distribution on the hypercube, and also prove hardness of distribution-free learning with membership queries. Furthermore, if regular expressions are extended with complement or intersection, we establish hardness of learning with membership queries even under the uniform distribution. We emphasize that these results do not follow from existing hardness results for learning DFAs or NFAs, since the descriptive complexity of regular languages can differ exponentially between DFAs, NFAs, and regular expressions.
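A classic illustration of the NFA-versus-DFA gap mentioned here: the language $L_n = \{w \in \{0,1\}^* : \text{the $n$-th symbol from the end of $w$ is } 1\}$ is accepted by an NFA with $n+1$ states (guess the position, then verify), yet any DFA for $L_n$ needs $2^n$ states, since it must remember the last $n$ symbols read.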

