AITopics | input capsule


input capsule


An Algorithm for Routing Capsules in All Domains

Heinsen, Franz A.

arXiv.org Artificial Intelligence, Nov-7-2019

Building on recent work on capsule networks, we propose a new, general-purpose form of "routing by agreement" that activates output capsules in a layer as a function of their net benefit to use and net cost to ignore input capsules from earlier layers. To illustrate the usefulness of our routing algorithm, we present two capsule networks that apply it in different domains: vision and language. The first network achieves new state-of-the-art accuracy of 99.1% on the smallNORB visual recognition task with fewer parameters and an order of magnitude less training than previous capsule models, and we find evidence that it learns to perform a form of "reverse graphics." The second network achieves new state-of-the-art accuracies on the root sentences of the Stanford Sentiment Treebank: 58.5% on fine-grained and 95.6% on binary labels with a single-task model that routes frozen embeddings from a pretrained transformer as capsules. In both domains, we train with the same regime. Code is available at https://github.com/glassroom/heinsen_routing along with replication instructions.

capsule, input capsule, output capsule, (15 more...)

arXiv:1911.00792

Country:

North America > United States > District of Columbia > Washington (0.04)
Europe > Italy > Tuscany > Florence (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)


Learning to compute inner consensus -- A noble approach to modeling agreement between Capsules

Faria, Gonçalo

arXiv.org Machine Learning, Sep-27-2019

The field now known as Deep Learning has expanded these ideas by creating models that stack multiple layers of Perceptrons. These Multilayer Perceptrons, commonly known as Neural Networks [7], achieve greater representational capacity than their precursor because computational complexity is added layer by layer. Owing to this compositional approach, they are especially hard-wired to learn a nested hierarchy of concepts [27]. As an approach to soft computing, Neural Networks stand in opposition to the precisely stated view of analytical algorithms, which, unlike the human mind, are not tolerant of imprecision, uncertainty, partial truth and approximation [5]. In conjunction with other Deep Learning models, they stand at the vanguard of Artificial Intelligence research, employed in tasks that were previously found computationally intractable.

capsule, compatibility probability, procedure, (15 more...)

arXiv:1909.12737

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Information Aggregation for Multi-Head Attention with Routing-by-Agreement

Li, Jian, Yang, Baosong, Dou, Zi-Yi, Wang, Xing, Lyu, Michael R., Tu, Zhaopeng

arXiv.org Artificial Intelligence, Apr-5-2019

Multi-head attention is appealing for its ability to jointly extract different types of information from multiple representation subspaces. Concerning the information aggregation, a common practice is to use a concatenation followed by a linear transformation, which may not fully exploit the expressiveness of multi-head attention. In this work, we propose to improve the information aggregation for multi-head attention with a more powerful routing-by-agreement algorithm. Specifically, the routing algorithm iteratively updates the proportion of how much a part (i.e. the distinct information learned from a specific subspace) should be assigned to a whole (i.e. the final output representation), based on the agreement between parts and wholes. Experimental results on linguistic probing tasks and machine translation tasks prove the superiority of the advanced information aggregation over the standard linear transformation.
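The iterative part-to-whole assignment described in this abstract can be illustrated with a minimal sketch. It assumes the simplest dynamic-routing variant (softmax coupling, dot-product agreement, a squashing nonlinearity); it is not the authors' implementation, and the names `route`, `squash`, `votes`, and `num_iters` are hypothetical, chosen here for illustration only.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def squash(v):
    """Rescale a vector so its norm lies in [0, 1), keeping its direction."""
    sq = sum(x * x for x in v)
    if sq == 0.0:
        return list(v)
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in v]

def route(votes, num_iters=3):
    """Routing-by-agreement sketch (assumed variant, not the paper's code).

    votes[i][j] is the vote vector that part i casts for whole j.
    Returns (wholes, coupling): wholes[j] is the representation of whole j
    from the last iteration, and coupling[i][j] is the proportion of part i
    assigned to whole j after the final agreement update.
    """
    n_parts, n_wholes, dim = len(votes), len(votes[0]), len(votes[0][0])
    logits = [[0.0] * n_wholes for _ in range(n_parts)]
    wholes = []
    for _ in range(num_iters):
        # Each part distributes itself over the wholes (rows sum to 1).
        coupling = [softmax(row) for row in logits]
        # Each whole is the squashed, coupling-weighted sum of its votes.
        wholes = [
            squash([sum(coupling[i][j] * votes[i][j][d] for i in range(n_parts))
                    for d in range(dim)])
            for j in range(n_wholes)
        ]
        # Agreement (dot product) between a vote and its whole raises the logit.
        for i in range(n_parts):
            for j in range(n_wholes):
                logits[i][j] += sum(a * b for a, b in zip(votes[i][j], wholes[j]))
    return wholes, [softmax(row) for row in logits]

# Toy input: parts 0 and 1 agree on whole 0 but cast opposite votes for whole 1.
example_votes = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, -1.0]],
    [[0.9, 0.1], [1.0, 0.0]],
]
example_wholes, example_coupling = route(example_votes)
```

In this toy input, the parts whose votes agree reinforce one another, so their coupling shifts toward the whole they agree on, while conflicting votes cancel out and receive no reinforcement.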

artificial intelligence, machine learning, natural language, (15 more...)

arXiv:1904.03100

Country:

Asia > China > Hong Kong (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Macao (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)


Dynamic Layer Aggregation for Neural Machine Translation with Routing-by-Agreement

Dou, Zi-Yi, Tu, Zhaopeng, Wang, Xing, Wang, Longyue, Shi, Shuming, Zhang, Tong

arXiv.org Artificial Intelligence, Feb-15-2019

With the promising progress of deep neural networks, layer aggregation has been used to fuse information across layers in various fields, such as computer vision and machine translation. However, most of the previous methods combine layers in a static fashion in that their aggregation strategy is independent of specific hidden states. Inspired by recent progress on capsule networks, in this paper we propose to use routing-by-agreement strategies to aggregate layers dynamically. Specifically, the algorithm learns the probability of a part (individual layer representations) assigned to a whole (aggregated representations) in an iterative way and combines parts accordingly. We implement our algorithm on top of the state-of-the-art neural machine translation model TRANSFORMER and conduct experiments on the widely-used WMT14 English-German and WMT17 Chinese-English translation datasets. Experimental results across language pairs show that the proposed approach consistently outperforms the strong baseline model and a representative static aggregation model.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv:1902.05770

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
