AITopics

2106.04627

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
(10 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.89)
Information Technology > Artificial Intelligence > Vision (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Mama, Rayhane, Tyndel, Marc S., Kadhim, Hashiam, Clifford, Cole, Thurairatnam, Ragavan

NWT: Towards natural audio-to-video generation with representation learning

arXiv.org Artificial IntelligenceJun-8-2021

In this work we introduce NWT, an expressive speech-to-video model. Unlike approaches that use domain-specific intermediate representations such as pose keypoints, NWT learns its own latent representations, with minimal assumptions about the audio and video content. To this end, we propose a novel discrete variational autoencoder with adversarial loss, dVAE-Adv, which learns a new discrete latent representation we call Memcodes. Memcodes are straightforward to implement, require no additional loss terms, are stable to train compared with other approaches, and show evidence of interpretability. To predict on the Memcode space, we use an autoregressive encoder-decoder model conditioned on audio. Additionally, our model can control latent attributes in the generated video that are not annotated in the data. We train NWT on clips from HBO's Last Week Tonight with John Oliver. NWT consistently scores above other approaches in Mean Opinion Score (MOS) on tests of overall video naturalness, facial naturalness and expressiveness, and lipsync quality. This work sets a strong baseline for generalized audio-to-video synthesis.

representation, stride 1, video, (14 more...)

2106.04283

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Europe > Poland > Masovia Province > Warsaw (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Hollenstein, Nora, Beinborn, Lisa

Relative Importance in Sentence Processing

arXiv.org Artificial IntelligenceJun-7-2021

Determining the relative importance of the elements in a sentence is a key factor for effortless natural language understanding. For human language processing, we can approximate patterns of relative importance by measuring reading fixations using eye-tracking technology. In neural language models, gradient-based saliency methods indicate the relative importance of a token for the target objective. In this work, we compare patterns of relative importance in English language processing by humans and models and analyze the underlying linguistic patterns. We find that human processing patterns in English correlate strongly with saliency-based importance in language models and not with attention-based importance. Our results indicate that saliency could be a cognitively more plausible metric for interpreting neural language models. The code is available on GitHub: https://github.com/beinborn/relative_importance

computational linguistic, proceedings, relative importance, (13 more...)

2106.03471

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy > Tuscany > Florence (0.05)
Asia > China > Hong Kong (0.04)
(9 more...)

Genre: Research Report > New Finding (0.34)

Industry: Education (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Agarwal, Ananye, Shenoy, Pradeep, Mausam, null

End-to-End Neuro-Symbolic Architecture for Image-to-Image Reasoning Tasks

arXiv.org Artificial IntelligenceJun-6-2021

Neural models and symbolic algorithms have recently been combined for tasks requiring both perception and reasoning. Neural models ground perceptual input into a conceptual vocabulary, on which a classical reasoning algorithm is applied to generate output. A key limitation is that such neural-to-symbolic models can only be trained end-to-end for tasks where the output space is symbolic. In this paper, we study neural-symbolic-neural models for reasoning tasks that require a conversion from an image input (e.g., a partially filled sudoku) to an image output (e.g., the image of the completed sudoku). While designing such a three-step hybrid architecture may be straightforward, the key technical challenge is end-to-end training -- how to backpropagate without intermediate supervision through the symbolic component. We propose NSNnet, an architecture that combines an image reconstruction loss with a novel output encoder to generate a supervisory signal, develops update algorithms that leverage policy gradient methods for supervision, and optimizes loss using a novel subsampling heuristic. We experiment on problem settings where symbolic algorithms are easily specified: a visual maze solving task and a visual Sudoku solver where the supervision is in image form. Experiments show high accuracy with significantly less data compared to purely neural approaches.

architecture, gradient, supervision, (16 more...)

2106.03121

Country:

Europe > Austria > Vienna (0.14)
North America > Canada > Quebec > Montreal (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(7 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Sudoku (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.68)

Vechtomova, Olga, Sahu, Gaurav, Kumar, Dhruv

LyricJam: A system for generating lyrics for live instrumental music

arXiv.org Artificial IntelligenceJun-3-2021

We describe a real-time system that receives a live audio stream from a jam session and generates lyric lines that are congruent with the live music being played. Two novel approaches are proposed to align the learned latent spaces of audio and text representations that allow the system to generate novel lyric lines matching live instrumental music. One approach is based on adversarial alignment of latent representations of audio and lyrics, while the other approach learns to transfer the topology from the music latent space to the lyric latent space. A user study with music artists using the system showed that the system was useful not only in lyric composition, but also encouraged the artists to improvise and find new musical expressions. Another user study demonstrated that users preferred the lines generated using the proposed methods to the lines generated by a baseline model.

lyric, lyric line, music, (16 more...)

2106.0196

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

arXiv.org Artificial IntelligenceJun-3-2021

Semi-supervised Conditional Density Estimation for Imputation and Classification of Incomplete Instances

Huang, Buliao

Incomplete instances with various missing attributes in many real-world scenes have brought challenges to the classification task. There are some missing values imputation methods to fill the missing values with substitute values before classification. However, the separation between imputation and classification may lead to inferior performance since label information are ignored during imputation. Moreover, these imputation methods tend to initialize these missing values with strong prior assumptions, while the unreliability of such initialization is rarely considered. To tackle these problems, a novel semi-supervised conditional normalizing flow (SSCFlow) is proposed in this paper. SSCFlow explicitly utilizes the observed labels to facilitate the imputation and classification simultaneously by employing a semi-supervised algorithm to estimate the conditional probability density of missing values. Moreover, SSCFlow takes the initialized missing values as corrupted initial imputation and iteratively reconstructs their latent representations with an overcomplete denoising autoencoder to approximate the true conditional probability density of missing values. Experiments have been conducted with real-world datasets to demonstrate the robustness and efficiency of the proposed algorithm.

classification, imputation, sscflow, (15 more...)

2106.01708

Country:

North America > United States > New York > New York County > New York City (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > Tennessee (0.04)
(6 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Wright, Matthew A., Gonzalez, Joseph E.

Transformers are Deep Infinite-Dimensional Non-Mercer Binary Kernel Machines

arXiv.org Machine LearningJun-2-2021

Despite their ubiquity in core AI fields like natural language processing, the mechanics of deep attention-based neural networks like the Transformer model are not fully understood. In this article, we present a new perspective towards understanding how Transformers work. In particular, we show that the "dot-product attention" that is the core of the Transformer's operation can be characterized as a kernel learning method on a pair of Banach spaces. In particular, the Transformer's kernel is characterized as having an infinite feature dimension. Along the way we consider an extension of the standard kernel learning problem to a binary setting, where data come from two input domains and a response is defined for every cross-domain pair. We prove a new representer theorem for these binary kernel machines with non-Mercer (indefinite, asymmetric) kernels (implying that the functions learned are elements of reproducing kernel Banach spaces rather than Hilbert spaces), and also prove a new universal approximation theorem showing that the Transformer calculation can learn any binary non-Mercer reproducing kernel Banach space pair. We experiment with new kernels in Transformers, and obtain results that suggest the infinite dimensionality of the standard Transformer kernel is partially responsible for its performance. This paper's results provide a new theoretical understanding of a very important but poorly understood model in modern machine~learning.

banach space, kernel, transformer, (16 more...)

2106.01506

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > Alameda County > Berkeley (0.14)
(16 more...)

Genre: Research Report (0.82)

Industry: Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine LearningMay-30-2021

BABA: Beta Approximation for Bayesian Active Learning

Woo, Jae Oh

This paper introduces a new acquisition function under the Bayesian active learning framework, namely BABA. It is motivated by previously well-established works BALD, and BatchBALD which capture the mutual information between the model parameters and the predictive outputs of the data. Our proposed measure, BABA, endeavors to quantify the normalized mutual information by approximating the stochasticity of predictive probabilities using Beta distributions. BABA outperforms the well-known family of acquisition functions, including BALD and BatchBALD. We demonstrate this by showing extensive experimental results obtained from MNIST and EMNIST datasets.

experiment, information, mutual information, (14 more...)

2105.14559

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > California > Santa Clara County > San Jose (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Shan, Haozhe, Bordelon, Blake

Rapid Feature Evolution Accelerates Learning in Neural Networks

arXiv.org Machine LearningMay-29-2021

Neural network (NN) training and generalization in the infinite-width limit are well-characterized by kernel methods with a neural tangent kernel (NTK) that is stationary in time. However, finite-width NNs consistently outperform corresponding kernel methods, suggesting the importance of feature learning, which manifests as the time evolution of NTKs. Here, we analyze the phenomenon of kernel alignment of the NTK with the target functions during gradient descent. We first provide a mechanistic explanation for why alignment between task and kernel occurs in deep linear networks. We then show that this behavior occurs more generally if one optimizes the feature map over time to accelerate learning while constraining how quickly the features evolve. Empirically, gradient descent undergoes a feature learning phase, during which top eigenfunctions of the NTK quickly align with the target function and the loss decreases faster than power law in time; it then enters a kernel gradient descent (KGD) phase where the alignment does not improve significantly and the training loss decreases in power law. We show that feature evolution is faster and more dramatic in deeper networks. We also found that networks with multiple output nodes develop separate, specialized kernels for each output channel, a phenomenon we termed kernel specialization. We show that this class-specific alignment is does not occur in linear networks.

alignment, kernel, neural network, (15 more...)

2105.14301

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Nødland, Bernt Ivar Utstøl

Measuring global properties of neural generative model outputs via generating mathematical objects

arXiv.org Machine LearningMay-28-2021

We train deep generative models on datasets of reflexive polytopes. This enables us to compare how well the models have picked up on various global properties of generated samples. Our datasets are complete in the sense that every single example, up to changes of coordinate, is included in the dataset. Using this property we also perform tests checking to what extent the models are merely memorizing the data. We also train models on the same dataset represented in two different ways, enabling us to measure which form is easiest to learn from. We use these experiments to show that deep generative models can learn to generate geometric objects with non-trivial global properties, and that the models learn some underlying properties of the objects rather than simply memorizing the data.

dataset, polytope, representation, (16 more...)

2105.13669

Country:

North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > New York (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.44)