AITopics | Sucholutsky, Ilia

Collaborating Authors

Sucholutsky, Ilia

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Analyzing Diffusion as Serial Reproduction

Marjieh, Raja, Sucholutsky, Ilia, Langlois, Thomas A., Jacoby, Nori, Griffiths, Thomas L.

arXiv.org Machine LearningSep-29-2022

Diffusion models are a class of generative models that learn to synthesize samples by inverting a diffusion process that gradually maps data into noise. While these models have enjoyed great success recently, a full theoretical understanding of their observed properties is still lacking, in particular, their weak sensitivity to the choice of noise family and the role of adequate scheduling of noise levels for good synthesis. By identifying a correspondence between diffusion models and a well-known paradigm in cognitive science known as serial reproduction, whereby human agents iteratively observe and reproduce stimuli from memory, we show how the aforementioned properties of diffusion models can be explained as a natural consequence of this correspondence. We then complement our theoretical analysis with simulations that exhibit these key features. Our work highlights how classic paradigms in cognitive science can shed light on state-of-the-art machine learning problems.

artificial intelligence, diffusion model, machine learning, (18 more...)

arXiv.org Machine Learning

2209.14821

Genre: Research Report (0.40)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Can Humans Do Less-Than-One-Shot Learning?

Malaviya, Maya, Sucholutsky, Ilia, Oktar, Kerem, Griffiths, Thomas L.

arXiv.org Artificial IntelligenceFeb-9-2022

Being able to learn from small amounts of data is a key characteristic of human intelligence, but exactly {\em how} small? In this paper, we introduce a novel experimental paradigm that allows us to examine classification in an extremely data-scarce setting, asking whether humans can learn more categories than they have exemplars (i.e., can humans do "less-than-one shot" learning?). An experiment conducted using this paradigm reveals that people are capable of learning in such settings, and provides several insights into underlying mechanisms. First, people can accurately infer and represent high-dimensional feature spaces from very little data. Second, having inferred the relevant spaces, people use a form of prototype-based categorization (as opposed to exemplar-based) to make categorical inferences. Finally, systematic, machine-learnable patterns in responses indicate that people may have efficient inductive biases for dealing with this class of data-scarce problems.

artificial intelligence, machine learning, participant, (18 more...)

arXiv.org Artificial Intelligence

2202.0467

Genre: Research Report > New Finding (1.00)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

One Line To Rule Them All: Generating LO-Shot Soft-Label Prototypes

Sucholutsky, Ilia, Kim, Nam-Hwui, Browne, Ryan P., Schonlau, Matthias

arXiv.org Machine LearningFeb-15-2021

Increasingly large datasets are rapidly driving up the computational costs of machine learning. Prototype generation methods aim to create a small set of synthetic observations that accurately represent a training dataset but greatly reduce the computational cost of learning from it. Assigning soft labels to prototypes can allow increasingly small sets of prototypes to accurately represent the original training dataset. Although foundational work on `less than one'-shot learning has proven the theoretical plausibility of learning with fewer than one observation per class, developing practical algorithms for generating such prototypes remains an unexplored territory. We propose a novel, modular method for generating soft-label prototypical lines that still maintains representational accuracy even when there are fewer prototypes than the number of classes in the data. In addition, we propose the Hierarchical Soft-Label Prototype k-Nearest Neighbor classification algorithm based on these prototypical lines. We show that our method maintains high classification accuracy while greatly reducing the number of prototypes required to represent a dataset, even when working with severely imbalanced and difficult data. Our code is available at https://github.com/ilia10000/SLkNN.

artificial intelligence, machine learning, prototype, (17 more...)

arXiv.org Machine Learning

2102.07834

Country: North America > Canada (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.54)

Add feedback

Optimal 1-NN Prototypes for Pathological Geometries

Sucholutsky, Ilia, Schonlau, Matthias

arXiv.org Machine LearningOct-31-2020

Using prototype methods to reduce the size of training datasets can drastically reduce the computational cost of classification with instance-based learning algorithms like the k-Nearest Neighbour classifier. The number and distribution of prototypes required for the classifier to match its original performance is intimately related to the geometry of the training data. As a result, it is often difficult to find the optimal prototypes for a given dataset, and heuristic algorithms are used instead. However, we consider a particularly challenging setting where commonly used heuristic algorithms fail to find suitable prototypes and show that the optimal prototypes can instead be found analytically. We also propose an algorithm for finding nearly-optimal prototypes in this setting, and use it to empirically validate the theoretical results.

artificial intelligence, machine learning, prototype, (17 more...)

arXiv.org Machine Learning

2011.00228

Genre: Research Report (0.83)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.68)

Add feedback

SecDD: Efficient and Secure Method for Remotely Training Neural Networks

Sucholutsky, Ilia, Schonlau, Matthias

arXiv.org Machine LearningSep-18-2020

We leverage what are typically considered the worst qualities of deep learning algorithms - high computational cost, requirement for large data, no explainability, high dependence on hyper-parameter choice, overfitting, and vulnerability to adversarial perturbations - in order to create a method for the secure and efficient training of remotely deployed neural networks over unsecured channels.

deep learning, neural network, target network, (19 more...)

arXiv.org Machine Learning

2009.09155

Country: North America > Canada > Ontario (0.15)

Genre: Research Report (0.51)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

Add feedback

'Less Than One'-Shot Learning: Learning N Classes From M

Sucholutsky, Ilia, Schonlau, Matthias

arXiv.org Machine LearningSep-17-2020

Deep neural networks require large training sets but suffer from high computational cost and long training times. Training on much smaller training sets while maintaining nearly the same accuracy would be very beneficial. In the few-shot learning setting, a model must learn a new class given only a small number of samples from that class. One-shot learning is an extreme form of few-shot learning where the model must learn a new class from a single example. We propose the `less than one'-shot learning task where models must learn $N$ new classes given only $M

artificial intelligence, machine learning, prototype, (15 more...)

arXiv.org Machine Learning

2009.08449

Country: North America > Canada > Ontario (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)

Add feedback

Improving Dataset Distillation

Sucholutsky, Ilia, Schonlau, Matthias

arXiv.org Artificial IntelligenceOct-6-2019

Dataset distillation is a method for reducing dataset sizes: the goal is to learn a small number of synthetic samples containing all the information of a large dataset. This has several benefits: speeding up model training in deep learning, reducing energy consumption, and reducing required storage space. Currently, each synthetic sample is assigned a single `hard' label, which limits the accuracies that models trained on distilled datasets can achieve. Also, currently dataset distillation can only be used with image data. We propose to simultaneously distill both images and their labels, and thus to assign each synthetic sample a `soft' label (a distribution of labels) rather than a single `hard' label. Our improved algorithm increases accuracy by 2-4% over the original dataset distillation algorithm for several image classification tasks. For example, training a LeNet model with just 10 distilled images (one per class) results in over 96% accuracy on the MNIST data. Using `soft' labels also enables distilled datasets to consist of fewer samples than there are classes as each sample can encode information for more than one class. For example, we show that LeNet achieves almost 92% accuracy on MNIST after being trained on just 5 distilled images. We also propose an extension of the dataset distillation algorithm that allows it to distill sequential datasets including texts. We demonstrate that text distillation outperforms other methods across multiple datasets. For example, we are able to train models to almost their original accuracy on the IMDB sentiment analysis task using just 20 distilled sentences.

accuracy, deep learning, neural network, (19 more...)

arXiv.org Artificial Intelligence

1910.02551

Country: North America > United States > Oregon (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Deep Learning for System Trace Restoration

Sucholutsky, Ilia, Narayan, Apurva, Schonlau, Matthias, Fischmeister, Sebastian

arXiv.org Machine LearningApr-10-2019

Most real-world datasets, and particularly those collected from physical systems, are full of noise, packet loss, and other imperfections. However, most specification mining, anomaly detection and other such algorithms assume, or even require, perfect data quality to function properly. Such algorithms may work in lab conditions when given clean, controlled data, but will fail in the field when given imperfect data. We propose a method for accurately reconstructing discrete temporal or sequential system traces affected by data loss, using Long Short-Term Memory Networks (LSTMs). The model works by learning to predict the next event in a sequence of events, and uses its own output as an input to continue predicting future events. As a result, this method can be used for data restoration even with streamed data. Such a method can reconstruct even long sequence of missing events, and can also help validate and improve data quality for noisy data. The output of the model will be a close reconstruction of the true data, and can be fed to algorithms that rely on clean data. We demonstrate our method by reconstructing automotive CAN traces consisting of long sequences of discrete events. We show that given even small parts of a CAN trace, our LSTM model can predict future events with an accuracy of almost 90%, and can successfully reconstruct large portions of the original trace, greatly outperforming a Markov Model benchmark. We separately feed the original, lossy, and reconstructed traces into a specification mining framework to perform downstream analysis of the effect of our method on state-of-the-art models that use these traces for understanding the behavior of complex systems.

deep learning, neural network, prediction, (22 more...)

arXiv.org Machine Learning

1904.05411

Country:

North America > United States (0.46)
Europe (0.46)

Genre: Research Report > Promising Solution (0.48)

Industry:

Information Technology > Security & Privacy (0.46)
Telecommunications > Networks (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

Add feedback