AITopics | George, Thomas

George, Thomas

Mislabeled examples detection viewed as probing machine learning models: concepts, survey and extensive benchmark

George, Thomas, Nodet, Pierre, Bondu, Alexis, Lemaire, Vincent

arXiv.org Artificial Intelligence, Oct-21-2024

Mislabeled examples are ubiquitous in real-world machine learning datasets, which motivates the development of techniques for detecting them automatically. We show that most mislabeled-example detection methods can be viewed as probing trained machine learning models using a few core principles. We formalize a modular framework that encompasses these methods, parameterized by only 4 building blocks, and provide a Python library demonstrating that these principles can be implemented in practice. The focus is on classifier-agnostic concepts, with an emphasis on adapting methods developed for deep learning models to non-deep classifiers for tabular data. We benchmark existing methods on (artificial) Noisy Completely At Random (NCAR) as well as (realistic) Noisy Not At Random (NNAR) labeling noise from a variety of tasks with imperfect labeling rules. This benchmark yields new insights into existing methods in this setup and exposes their limitations.
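As a rough illustration of what "probing a model" can mean here, the sketch below scores each example by how strongly a simple classifier agrees with its observed label, then ranks the least-trusted examples first. It is a minimal sketch under stated assumptions and not the authors' library or framework: the leave-one-out k-nearest-neighbour probe, the function names `knn_trust_scores` and `rank_suspects`, and the toy data are all hypothetical choices made for this example.

```python
import math

def knn_trust_scores(X, y, k=3):
    """Leave-one-out trust score for each example: the fraction of its
    k nearest neighbours (excluding itself) that share its observed label."""
    scores = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Distances from example i to every other example.
        dists = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )
        neighbour_labels = [y[j] for _, j in dists[:k]]
        scores.append(neighbour_labels.count(yi) / k)
    return scores

def rank_suspects(X, y, k=3):
    """Indices sorted from least trusted to most trusted."""
    scores = knn_trust_scores(X, y, k)
    return sorted(range(len(y)), key=lambda i: scores[i])

# Toy data (hypothetical): two well-separated clusters.
# The example at index 2 sits in the first cluster but carries label 1.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.05),
     (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1)]
y = [0, 0, 1, 0, 1, 1, 1, 1]

suspects = rank_suspects(X, y)
print(suspects[0])  # index of the least-trusted example
```

Any classifier that outputs a per-example agreement signal could play the role of the probe; k-NN is used only because it needs no dependencies.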

artificial intelligence, inductive learning, machine learning, (20 more...)

2410.15772

Country: North America > United States (1.00)

Genre:

Research Report (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Information Technology > Security & Privacy (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Lazy vs hasty: linearization in deep networks impacts learning schedule based on example difficulty

George, Thomas, Lajoie, Guillaume, Baratin, Aristide

arXiv.org Artificial Intelligence, Nov-21-2022

Among attempts at giving a theoretical account of the success of deep neural networks, a recent line of work has identified a so-called 'lazy' training regime in which the network can be well approximated by its linearization around initialization. Here we investigate the comparative effect of the lazy (linear) and feature learning (non-linear) regimes on subgroups of examples based on their difficulty. Specifically, we show that easier examples are given more weight in feature learning mode, resulting in faster training compared to more difficult ones. In other words, the non-linear dynamics tends to sequentialize the learning of examples of increasing difficulty. We illustrate this phenomenon across different ways to quantify example difficulty, including c-score, label noise, and the presence of easy-to-learn spurious correlations. Our results reveal a new understanding of how deep networks prioritize resources across example difficulty.
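For reference, the linearization that defines the lazy regime is the first-order Taylor expansion of the network output in its parameters around their initial values. The notation below ($f$ for the network, $\theta_0$ for the parameters at initialization) is assumed for illustration and does not come from this listing:

```latex
f_{\mathrm{lin}}(x;\theta) \;=\; f(x;\theta_0) \;+\; \nabla_\theta f(x;\theta_0)^{\top}\,(\theta - \theta_0)
```

In the lazy regime, training $f$ stays close to training $f_{\mathrm{lin}}$, which is linear in $\theta$. The feature learning regime is the one in which this approximation breaks down during training.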

artificial intelligence, machine learning, regime, (16 more...)

2209.09658

Country: North America > United States (0.68)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

A Transfer Learning Pipeline for Educational Resource Discovery with Application in Leading Paragraph Generation

Li, Irene, George, Thomas, Fabbri, Alexander, Liao, Tammy, Chen, Benjamin, Kawamura, Rina, Zhou, Richard, Yan, Vanessa, Hingmire, Swapnil, Radev, Dragomir

arXiv.org Artificial Intelligence, Jan-6-2022

Effective human learning depends on a wide selection of educational materials that align with the learner's current understanding of the topic. While the Internet has revolutionized human learning or education, a substantial resource accessibility barrier still exists. Namely, the excess of online information can make it challenging to navigate and discover high-quality learning materials. In this paper, we propose the educational resource discovery (ERD) pipeline that automates web resource discovery for novel domains. The pipeline consists of three main steps: data collection, feature extraction, and resource classification. We start with a known source domain and conduct resource discovery on two unseen target domains via transfer learning. We first collect frequent queries from a set of seed documents and search on the web to obtain candidate resources, such as lecture slides and introductory blog posts. Then we introduce a novel pretrained information retrieval deep neural network model, query-document masked language modeling (QD-MLM), to extract deep features of these candidate resources. We apply a tree-based classifier to decide whether the candidate is a positive learning resource. The pipeline achieves F1 scores of 0.94 and 0.82 when evaluated on two similar but novel target domains. Finally, we demonstrate how this pipeline can benefit an application: leading paragraph generation for surveys. This is the first study that considers various web resources for survey generation, to the best of our knowledge. We also release a corpus of 39,728 manually labeled web resources and 659 queries from NLP, Computer Vision (CV), and Statistics (STATS).

artificial intelligence, machine learning, natural language, (21 more...)

2201.02312

Country:

North America > United States > Hawaii (0.14)
North America > Canada > Ontario (0.14)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Educational Setting > Online (0.47)
Education > Educational Technology (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Continual Learning in Deep Networks: an Analysis of the Last Layer

Lesort, Timothée, George, Thomas, Rish, Irina

arXiv.org Artificial Intelligence, Jun-3-2021

We study how different output layer types of a deep neural network learn and forget in continual learning settings. We describe the three factors affecting catastrophic forgetting in the output layer: (1) weight modifications, (2) interference, and (3) projection drift. Our goal is to provide more insights into how different types of output layers can address (1) and (2). We also propose potential solutions and evaluate them on several benchmarks. We show that the best-performing output layer type depends on the data distribution drifts or the amount of data available. In particular, in some cases where a standard linear layer would fail, it is sufficient to change the parametrization and get significantly better performance while still training with SGD. Our results and analysis shed light on the dynamics of the output layer in continual learning scenarios and help select the best-suited output layer for a given scenario.

deep learning, experiment, neural network, (19 more...)

2106.01834

Country:

Europe > Hungary (0.14)
Oceania > Australia (0.14)
North America > Canada > Quebec (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Implicit Regularization via Neural Feature Alignment

Baratin, Aristide, George, Thomas, Laurent, César, Hjelm, R Devon, Lajoie, Guillaume, Vincent, Pascal, Lacoste-Julien, Simon

arXiv.org Machine Learning, Oct-28-2020

We approach the problem of implicit regularization in deep learning from a geometrical viewpoint. We highlight a regularization effect induced by a dynamical alignment of the neural tangent features introduced by Jacot et al., along a small number of task-relevant directions. This can be interpreted as a combined mechanism of feature selection and model compression. By extrapolating a new analysis of Rademacher complexity bounds for linear models, we motivate and study a heuristic complexity measure that captures this phenomenon, in terms of sequences of tangent kernel classes along the optimization paths.
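To make the objects in this abstract concrete, the tangent features of a network $f(x;\theta)$ are its parameter gradients, and the tangent kernel is their inner product. The alignment score in the second line is one standard way to measure how well a kernel matrix $K$ on $n$ training points matches a label vector $y$; using it here is an assumption made for illustration, not a statement of the paper's exact measure:

```latex
\Phi_\theta(x) = \nabla_\theta f(x;\theta),
\qquad
K_\theta(x, x') = \Phi_\theta(x)^{\top}\,\Phi_\theta(x')

A(K, y) \;=\; \frac{\langle K,\, y y^{\top} \rangle_F}{\lVert K \rVert_F \,\lVert y y^{\top} \rVert_F}
\;=\; \frac{y^{\top} K\, y}{\lVert K \rVert_F \,\lVert y \rVert_2^{2}}
```

Unlike in the lazy regime, $K_\theta$ changes as $\theta$ is trained, which is what allows an alignment of this kind to increase during optimization.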

deep learning, neural network, tangent kernel, (18 more...)

2008.00938

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Revisiting Loss Modelling for Unstructured Pruning

Laurent, César, Ballas, Camille, George, Thomas, Ballas, Nicolas, Vincent, Pascal

arXiv.org Machine Learning, Jun-22-2020

By removing parameters from deep neural networks, unstructured pruning methods aim at cutting down memory footprint and computational cost, while maintaining prediction accuracy. In order to tackle this otherwise intractable problem, many of these methods model the loss landscape using first or second order Taylor expansions to identify which parameters can be discarded. We revisit loss modelling for unstructured pruning: we show the importance of ensuring locality of the pruning steps. We systematically compare first and second order Taylor expansions and empirically show that both can reach similar levels of performance. Finally, we show that better preserving the original network function does not necessarily transfer to better performing networks after fine-tuning, suggesting that only considering the impact of pruning on the loss might not be a sufficient objective to design good pruning criteria.
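The first and second order loss models mentioned above can be written out as follows. This is a generic sketch with assumed notation ($L$ for the loss, $g$ for its gradient, $H$ for its Hessian, $e_i$ for the $i$-th basis vector), not the paper's exact criteria:

```latex
L(\theta + \delta) - L(\theta) \;\approx\; g^{\top}\delta \;+\; \tfrac{1}{2}\,\delta^{\top} H\, \delta,
\qquad g = \nabla_\theta L(\theta),\quad H = \nabla_\theta^{2} L(\theta)

\text{Pruning parameter } i:\quad \delta = -\theta_i\, e_i
\;\;\Longrightarrow\;\;
\Delta L_i \;\approx\; -\,g_i\,\theta_i \;+\; \tfrac{1}{2}\,H_{ii}\,\theta_i^{2}
```

Keeping only the first term gives a first order saliency such as $\lvert g_i \theta_i \rvert$; keeping the second term as well gives a second order one. Both expansions are only accurate for small $\delta$, which is why the locality of each pruning step matters.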

deep learning, fine-tuning, neural network, (18 more...)

2006.12279

Country: North America > Canada > Quebec (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Fast Approximate Natural Gradient Descent in a Kronecker Factored Eigenbasis

George, Thomas, Laurent, César, Bouthillier, Xavier, Ballas, Nicolas, Vincent, Pascal

Neural Information Processing Systems, Feb-14-2020, 20:43:33 GMT

Optimization algorithms that leverage gradient covariance information, such as variants of natural gradient descent (Amari, 1998), offer the prospect of yielding more effective descent directions. For models with many parameters, the covariance matrix they are based on becomes gigantic, making them inapplicable in their original form. This has motivated research into both simple diagonal approximations and more sophisticated factored approximations such as KFAC (Heskes, 2000; Martens & Grosse, 2015; Grosse & Martens, 2016). In the present work we draw inspiration from both to propose a novel approximation that is provably better than KFAC and amenable to cheap partial updates. It consists in tracking a diagonal variance, not in parameter coordinates, but in a Kronecker-factored eigenbasis, in which the diagonal approximation is likely to be more effective. Experiments show improvements over KFAC in optimization speed for several deep network architectures.

artificial intelligence, gradient descent, machine learning, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.46)

Fast Approximate Natural Gradient Descent in a Kronecker Factored Eigenbasis

George, Thomas, Laurent, César, Bouthillier, Xavier, Ballas, Nicolas, Vincent, Pascal

Neural Information Processing Systems, Dec-31-2018

Optimization algorithms that leverage gradient covariance information, such as variants of natural gradient descent (Amari, 1998), offer the prospect of yielding more effective descent directions. For models with many parameters, the covariance matrix they are based on becomes gigantic, making them inapplicable in their original form. This has motivated research into both simple diagonal approximations and more sophisticated factored approximations such as KFAC (Heskes, 2000; Martens & Grosse, 2015; Grosse & Martens, 2016). In the present work we draw inspiration from both to propose a novel approximation that is provably better than KFAC and amenable to cheap partial updates. It consists in tracking a diagonal variance, not in parameter coordinates, but in a Kronecker-factored eigenbasis, in which the diagonal approximation is likely to be more effective. Experiments show improvements over KFAC in optimization speed for several deep network architectures.
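The idea of "tracking a diagonal variance in a Kronecker-factored eigenbasis" can be sketched for a single layer as follows. The notation is assumed for illustration: $F$ is the gradient covariance (Fisher) block of the layer, $A$ and $B$ are the two Kronecker factors used by KFAC, and $g$ is the layer's per-example gradient flattened into a vector.

```latex
\text{KFAC:}\quad F \;\approx\; A \otimes B
\;=\; (U_A \otimes U_B)\,(S_A \otimes S_B)\,(U_A \otimes U_B)^{\top},
\qquad A = U_A S_A U_A^{\top},\;\; B = U_B S_B U_B^{\top}

\text{Diagonal in the eigenbasis:}\quad F \;\approx\; (U_A \otimes U_B)\; S^{*}\,(U_A \otimes U_B)^{\top},
\qquad S^{*}_{ii} \;=\; \mathbb{E}\!\left[\big((U_A \otimes U_B)^{\top} g\big)_i^{2}\right]
```

The eigenbasis $U_A \otimes U_B$ is the same in both lines; only the diagonal scaling differs. Because $S^{*}$ is a vector of second moments, it can be re-estimated often without recomputing the eigendecompositions, which is what makes cheap partial updates possible.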

approximation, artificial intelligence, machine learning, (18 more...)

Country: North America > Canada > Quebec (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.85)

Fast Approximate Natural Gradient Descent in a Kronecker Factored Eigenbasis

George, Thomas, Laurent, César, Bouthillier, Xavier, Ballas, Nicolas, Vincent, Pascal

Neural Information Processing Systems, Dec-31-2018

approximation, artificial intelligence, neural network, (18 more...)

Country: North America > Canada > Quebec (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.86)

Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis

George, Thomas, Laurent, César, Bouthillier, Xavier, Ballas, Nicolas, Vincent, Pascal

arXiv.org Machine Learning, Jun-11-2018

Optimization algorithms that leverage gradient covariance information, such as variants of natural gradient descent (Amari, 1998), offer the prospect of yielding more effective descent directions. For models with many parameters, the covariance matrix they are based on becomes gigantic, making them inapplicable in their original form. This has motivated research into both simple diagonal approximations and more sophisticated factored approximations such as KFAC (Heskes, 2000; Martens & Grosse, 2015; Grosse & Martens, 2016). In the present work we draw inspiration from both to propose a novel approximation that is provably better than KFAC and amenable to cheap partial updates. It consists in tracking a diagonal variance, not in parameter coordinates, but in a Kronecker-factored eigenbasis, in which the diagonal approximation is likely to be more effective. Experiments show improvements over KFAC in optimization speed for several deep network architectures.

approximation, artificial intelligence, neural network, (18 more...)

1806.03884

Country: North America > Canada (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.85)
