AITopics | Fagan, Francois

Collaborating Authors

Fagan, Francois

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Robust Implicit Backpropagation

Fagan, Francois, Iyengar, Garud

arXiv.org Machine LearningAug-7-2018

Arguably the biggest challenge in applying neural networks is tuning the hyperparameters, in particular the learning rate. The sensitivity to the learning rate is due to the reliance on backpropagation to train the network. In this paper we present the first application of Implicit Stochastic Gradient Descent (ISGD) to train neural networks, a method known in convex optimization to be unconditionally stable and robust to the learning rate. Our key contribution is a novel layer-wise approximation of ISGD which makes its updates tractable for neural networks. Experiments show that our method is more robust to high learning rates and generally outperforms standard backpropagation on a variety of tasks.

dataset, deep learning, neural network, (20 more...)

arXiv.org Machine Learning

1808.02433

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Add feedback

Unbiased scalable softmax optimization

Fagan, Francois, Iyengar, Garud

arXiv.org Machine LearningMar-22-2018

Recent neural network and language models rely on softmax distributions with an extremely large number of categories. Since calculating the softmax normalizing constant in this context is prohibitively expensive, there is a growing literature of efficiently computable but biased estimates of the softmax. In this paper we propose the first unbiased algorithms for maximizing the softmax likelihood whose work per iteration is independent of the number of classes and datapoints (and no extra work is required at the end of each epoch). We show that our proposed unbiased methods comprehensively outperform the state-of-the-art on seven real world datasets.

artificial intelligence, implicit sgd, machine learning, (16 more...)

arXiv.org Machine Learning

1803.08577

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

TripleSpin - a generic compact paradigm for fast machine learning computations

Choromanski, Krzysztof, Fagan, Francois, Gouy-Pailler, Cedric, Morvan, Anne, Sarlos, Tamas, Atif, Jamal

arXiv.org Machine LearningJun-6-2016

We present a generic compact computational framework relying on structured random matrices that can be applied to speed up several machine learning algorithms with almost no loss of accuracy. The applications include new fast LSH-based algorithms, efficient kernel computations via random feature maps, convex optimization algorithms, quantization techniques and many more. Certain models of the presented paradigm are even more compressible since they apply only bit matrices. This makes them suitable for deploying on mobile devices. All our findings come with strong theoretical guarantees. In particular, as a byproduct of the presented techniques and by using relatively new Berry-Esseen-type CLT for random vectors, we give the first theoretical guarantees for one of the most efficient existing LSH algorithms based on the $\textbf{HD}_{3}\textbf{HD}_{2}\textbf{HD}_{1}$ structured matrix ("Practical and Optimal LSH for Angular Distance"). These guarantees as well as theoretical results for other aforementioned applications follow from the same general theoretical principle that we present in the paper. Our structured family contains as special cases all previously considered structured schemes, including the recently introduced $P$-model. Experimental evaluation confirms the accuracy and efficiency of TripleSpin matrices.

artificial intelligence, matrix, optimization problem, (20 more...)

arXiv.org Machine Learning

1605.09046

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Education > Curriculum > Subject-Specific Education (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

Add feedback

Fast nonlinear embeddings via structured matrices

Choromanski, Krzysztof, Fagan, Francois

arXiv.org Machine LearningApr-25-2016

We present a new paradigm for speeding up randomized computations of several frequently used functions in machine learning. In particular, our paradigm can be applied for improving computations of kernels based on random embeddings. Above that, the presented framework covers multivariate randomized functions. As a byproduct, we propose an algorithmic approach that also leads to a significant reduction of space complexity. Our method is based on careful recycling of Gaussian vectors into structured matrices that share properties of fully random matrices. The quality of the proposed structured approach follows from combinatorial properties of the graphs encoding correlations between rows of these structured matrices. Our framework covers as special cases already known structured approaches such as the Fast Johnson-Lindenstrauss Transform, but is much more general since it can be applied also to highly nonlinear embeddings. We provide strong concentration results showing the quality of the presented paradigm.

artificial intelligence, machine learning, matrix, (17 more...)

arXiv.org Machine Learning

1604.07356

Country: North America > Canada > British Columbia (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.67)

Add feedback