Computational Learning Theory


Optimal learning with Bernstein Online Aggregation

arXiv.org Machine Learning

We introduce a new recursive aggregation procedure called Bernstein Online Aggregation (BOA). The exponential weights include an accuracy term and a second-order term that is a proxy for the quadratic variation, as in Hazan and Kale (2010). This second term stabilizes the procedure, which is optimal in several senses. We first obtain optimal regret bounds in the deterministic context. Then, an adaptive version is the first exponential weights algorithm to exhibit a second-order bound with excess losses of the kind first obtained by Gaillard et al. (2014). The second-order bounds in the deterministic context are extended to a general stochastic context using the cumulative predictive risk. This conversion provides the main result of the paper: an inequality of a novel type comparing the procedure with any deterministic aggregation procedure for an integrated criterion. We then obtain an observable estimate of the excess risk of the BOA procedure. To assess optimality, we finally consider the i.i.d. case for strongly convex and Lipschitz continuous losses, and we prove that the optimal rate of aggregation of Tsybakov (2003) is achieved. The batch version of the BOA procedure is thus the first adaptive explicit algorithm satisfying an optimal oracle inequality with high probability.
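
To make the weighting scheme concrete, here is a minimal Python sketch of a BOA-style update, assuming the multiplicative rule $\pi_{t+1,j} \propto \pi_{t,j} \exp(\eta r_{t,j} - \eta^2 r_{t,j}^2)$ with instantaneous regret $r_{t,j} = \langle \pi_t, \ell_t \rangle - \ell_{t,j}$; the fixed learning rate eta stands in for the adaptive tuning the abstract describes, and the function name boa_weights is ours, not the paper's.

```python
import numpy as np

def boa_weights(losses, eta):
    """BOA-style exponential weights with a second-order correction (sketch).

    losses: (T, N) array, losses[t, j] = loss of expert j at round t.
    eta: fixed learning rate (the paper's adaptive tuning is omitted).
    Returns the (T, N) array of weights used at each round.
    """
    T, N = losses.shape
    log_w = np.full(N, -np.log(N))         # uniform prior, kept in log space
    weights = np.empty((T, N))
    for t in range(T):
        pi = np.exp(log_w - log_w.max())   # normalize stably
        pi /= pi.sum()
        weights[t] = pi
        r = pi @ losses[t] - losses[t]     # instantaneous regret of each expert
        log_w += eta * r - (eta * r) ** 2  # accuracy term + second-order term
    return weights
```

The aggregated prediction at round t would be the weights[t]-weighted mixture of the experts' predictions; the squared term downweights experts whose regrets fluctuate strongly, which is the stabilizing effect the abstract refers to.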


Refined Error Bounds for Several Learning Algorithms

arXiv.org Machine Learning

This article studies the achievable guarantees on the error rates of certain learning algorithms, with particular focus on refining logarithmic factors. Many of the results are based on a general technique for obtaining bounds on the error rates of sample-consistent classifiers with monotonic error regions, in the realizable case. We prove bounds of this type expressed in terms of either the VC dimension or the sample compression size. This general technique also enables us to derive several new bounds on the error rates of general sample-consistent learning algorithms, as well as refined bounds on the label complexity of the CAL active learning algorithm. Additionally, we establish a simple necessary and sufficient condition for the existence of a distribution-free bound on the error rates of all sample-consistent learning rules, converging at a rate inversely proportional to the sample size. We also study learning in the presence of classification noise, deriving a new excess error rate guarantee for general VC classes under Tsybakov's noise condition, and establishing a simple and general necessary and sufficient condition for the minimax excess risk under bounded noise to converge at a rate inversely proportional to the sample size.
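
As orientation for the kind of refinement at stake (standard background, not notation taken from this paper): for a sample-consistent classifier $\hat{h}$ from a class of VC dimension $d$ in the realizable case, the classical distribution-free guarantee is, with probability at least $1-\delta$, $\mathrm{er}(\hat{h}) \le \frac{C}{n}\left(d \log\frac{n}{d} + \log\frac{1}{\delta}\right)$ for a universal constant $C$; refined bounds of the type studied here aim to remove the logarithmic factor, i.e. to reach $\mathrm{er}(\hat{h}) \le \frac{C}{n}\left(d + \log\frac{1}{\delta}\right)$, which is the minimax-optimal rate.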


How maggots are influencing the future of robotics

#artificialintelligence

What can software designers and ICT specialists learn from maggots? Quite a lot, it would appear. By understanding how complex learning processes work in simple organisms, EU-funded scientists hope to usher in an era of self-learning robots and predictive computing. Even with limited brain power, an organism can choose the right thing to do in response to external stimuli, something that current computational learning theory cannot fully account for. The EU-funded MINIMAL project, launched in 2014, has focused on the learning processes of a relatively simple animal: the fruit fly larva, or maggot.


The Mathematics of Machine Learning

#artificialintelligence

In the last few months, I have had several people contact me about their enthusiasm for venturing into the world of data science and using Machine Learning (ML) techniques to probe statistical regularities and build impeccable data-driven products. However, I've observed that some actually lack the necessary mathematical intuition and framework to get useful results. This is the main reason I decided to write this blog post. Recently, there has been an upsurge in the availability of easy-to-use machine and deep learning packages such as scikit-learn, Weka, TensorFlow, R-caret, etc. Machine Learning theory is a field that draws on statistics, probability, computer science, and algorithmics, arising from learning iteratively from data and finding hidden insights that can be used to build intelligent applications. Despite the immense possibilities of Machine and Deep Learning, a thorough mathematical understanding of many of these techniques is necessary for a good grasp of the inner workings of the algorithms and for getting good results.


We are outnumbered, yet strong, says Bitdefender's artificial...

#artificialintelligence

When it comes to artificial intelligence, people typically envision a sci-fi world where robots take over humanity as we know it. But artificial intelligence is already here, improving everyday technologies such as ecommerce, surveillance systems and many others. To shed some light on how AI is used in this industry, we've asked Cristina Vatamanu, malware researcher at Bitdefender's Antimalware Labs, to answer a few questions. For the past 6 years, Cristina has demonstrated strong expertise in reverse engineering, exploit analysis, threat analysis and automated systems. She is now pursuing a PhD on Machine Learning theory applied to malware detection systems at the "Gheorghe Asachi" Technical University in Iasi.


Conditional Sparse Linear Regression

arXiv.org Machine Learning

Machine learning and statistics typically focus on building models that capture the vast majority of the data, possibly ignoring a small subset of data as "noise" or "outliers." By contrast, here we consider the problem of jointly identifying a significant (but perhaps small) segment of a population in which there is a highly sparse linear regression fit, together with the coefficients for that fit. We contend that such tasks are of interest both because the models themselves may achieve better predictions in such special cases, and because they may aid our understanding of the data. We give algorithms for such problems under the sup norm, when this unknown segment of the population is described by a k-DNF condition and the regression fit is s-sparse, for constant k and s. For the variants of this problem in which the regression fit is not so sparse or the expected error is used, we also give a preliminary algorithm and highlight the question as a challenge for future work.
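
As a toy illustration of the problem statement (not the paper's algorithm, which handles general k-DNF conditions far more cleverly), the following Python sketch brute-forces single-conjunction conditions over Boolean attributes and size-s feature supports, fitting each candidate by a sup-norm (Chebyshev) linear program. All names (conditional_sparse_fit, chebyshev_fit) and the coverage parameter min_frac are ours.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def chebyshev_fit(A, y):
    """Minimize max_i |y_i - A[i] @ w| via a linear program in (w, t)."""
    n, d = A.shape
    c = np.r_[np.zeros(d), 1.0]                      # objective: minimize t
    G = np.r_[np.c_[A, -np.ones((n, 1))],            #  A w - t <= y
              np.c_[-A, -np.ones((n, 1))]]           # -A w - t <= -y
    h = np.r_[y, -y]
    res = linprog(c, A_ub=G, b_ub=h, bounds=[(None, None)] * d + [(0, None)])
    return res.x[:d], res.x[d]                       # coefficients, sup-norm error

def conditional_sparse_fit(B, X, y, k=1, s=2, min_frac=0.1):
    """Brute-force toy: search conjunctions of k Boolean attributes (a
    one-term k-DNF) and supports of s real features; keep the candidate
    with the smallest sup-norm error covering >= min_frac of the data."""
    n = len(y)
    best = None
    for term in itertools.combinations(range(B.shape[1]), k):
        mask = B[:, list(term)].all(axis=1)          # points satisfying the condition
        if mask.sum() < max(min_frac * n, s + 1):
            continue
        for supp in itertools.combinations(range(X.shape[1]), s):
            w, err = chebyshev_fit(X[mask][:, list(supp)], y[mask])
            if best is None or err < best[0]:
                best = (err, term, supp, w)
    return best  # (sup-norm error, condition, support, coefficients)
```

This enumeration is exponential in k and s, which is one reason the paper restricts them to constants; its contribution is handling the joint search over conditions and coefficients with guarantees, not the naive sweep shown here.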


Outnumbered, yet Strong: Artificial Intelligence as a Force Multiplier in Cyber-Security

#artificialintelligence

For the past 6 years, she has demonstrated strong expertise in reverse engineering, exploit analysis, threat analysis and automated systems. She has a graduate degree in Computer Science from the "Gheorghe Asachi" Technical University in Iasi and is now pursuing a PhD on Machine Learning theory applied to malware detection systems.


Artificial Intelligence and Machine Learning in Big Data and IoT: AI Powered Predictive Analytics Market Will Reach $18.5 Billion by 2021 - Research and Markets

#artificialintelligence

DUBLIN--(BUSINESS WIRE)--Research and Markets has announced the addition of the "Artificial Intelligence and Machine Learning in Big Data and IoT: The Market for Data Capture, Analytics, and Decision Making 2016 - 2021" report to their offering. More than 50% of enterprise IT organizations are experimenting with Artificial Intelligence (AI) in various forms such as Machine Learning, Deep Learning, Computer Vision, Image Recognition, Voice Recognition, Artificial Neural Networks, and more. AI is not a single technology but a convergence of various technologies, statistical models, algorithms, and approaches. Machine Learning is a sub-field of computer science that evolved from the study of pattern recognition and computational learning theory in AI. Every large corporation collects and maintains a huge amount of human-oriented data associated with its customers including their preferences, purchases, habits, and other personal information.


Minimum Description Length Principle in Supervised Learning with Application to Lasso

arXiv.org Machine Learning

The minimum description length (MDL) principle in supervised learning is studied. One of the most important theories for the MDL principle is Barron and Cover's theory (BC theory), which gives a mathematical justification of the MDL principle. The original BC theory, however, can be applied to supervised learning only approximately and in a limited way. Though Barron et al. recently succeeded in removing a similar approximation in the case of unsupervised learning, their idea cannot essentially be applied to supervised learning in general. To overcome this issue, an extension of BC theory to supervised learning is proposed. The derived risk bound has several advantages inherited from the original BC theory. First, the risk bound holds for finite sample sizes. Second, it requires remarkably few assumptions. Third, the risk bound has the form of the redundancy of the two-stage code for the MDL procedure. Hence, the proposed extension gives a mathematical justification of the MDL principle in supervised learning, just like the original BC theory. As an important example of application, new risk and (probabilistic) regret bounds for lasso with random design are derived. The derived risk bound holds for any finite sample size $n$ and feature number $p$, even if $n \ll p$, without requiring boundedness of the features, in contrast to past work. The behavior of the regret bound is investigated by numerical simulations. We believe that this is the first extension of BC theory to general supervised learning with random design without approximation.
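
For context on the two-stage code mentioned above (standard MDL background, not notation taken from this paper): given a model family $\mathcal{F}$ and a codelength function $L(\cdot)$ satisfying Kraft's inequality $\sum_{f \in \mathcal{F}} 2^{-L(f)} \le 1$, the two-stage MDL estimator for supervised data $(x^n, y^n)$ selects $\hat{f} = \arg\min_{f \in \mathcal{F}} \{ -\log p_f(y^n \mid x^n) + L(f) \}$, i.e., it minimizes the length of a code that first describes the model and then the labels given the model; BC-type results bound the statistical risk of $\hat{f}$ by the redundancy of this two-stage code.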