AITopics | omninet

omninet

S-Omninet: Structured Data Enhanced Universal Multimodal Learning Architecture

Xue, Ye, Klabjan, Diego, Utke, Jean

arXiv.org Artificial Intelligence, Jul-1-2023

Multimodal multitask learning has attracted increasing interest in recent years. Single-modal models have been advancing rapidly and have achieved astonishing results on various tasks across multiple domains. Multimodal learning offers opportunities for further improvement by integrating data from multiple modalities. Many methods have been proposed to learn from a specific type of multimodal data, such as vision and language data, but only a few are designed to handle several modalities and tasks at a time. In this work, we extend and improve Omninet, an architecture capable of handling multiple modalities and tasks at a time, by introducing cross-cache attention, integrating patch embeddings for vision inputs, and supporting structured data. The proposed Structured-data-enhanced Omninet (S-Omninet) is a universal model that learns effectively from structured data of various dimensions together with unstructured data through cross-cache attention, which enables interactions among spatial, temporal, and structured features. We also enhance the spatial representations in the spatial cache with patch embeddings. We evaluate the proposed model on several multimodal datasets and demonstrate a significant improvement over the baseline, Omninet.
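As a rough illustration of the patch embeddings mentioned in the abstract, the sketch below splits a 2-D grid of pixel values into non-overlapping square patches and flattens each one. It is a minimal sketch, not the authors' implementation: the function name `patchify`, the row-major patch order, and the omission of the learned linear projection that normally follows the flattening step are all assumptions made for illustration.

```python
def patchify(image, patch):
    """Split a 2-D grid (list of rows) into flattened, non-overlapping patches.

    Hypothetical helper for illustration only. A real patch embedding would
    also apply a learned linear projection to each flattened patch.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be divisible by the patch size")

    patches = []
    # Walk the grid patch by patch, top-to-bottom then left-to-right.
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            flat = [
                image[row][col]
                for row in range(top, top + patch)
                for col in range(left, left + patch)
            ]
            patches.append(flat)
    return patches


# A 4x4 "image" with values 0..15 yields four 2x2 patches of 4 values each.
demo_image = [[row * 4 + col for col in range(4)] for row in range(4)]
demo_patches = patchify(demo_image, 2)
```

Each flattened patch would then act as one spatial token, which is what allows a vision input to be treated as a sequence.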

artificial intelligence, machine learning, natural language, (22 more...)

arXiv:2307.00226

Genre: Research Report (0.50)

Industry: Health & Medicine > Health Care Technology (0.46)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

OmniNet: Omnidirectional Representations from Transformers

Tay, Yi, Dehghani, Mostafa, Aribandi, Vamsi, Gupta, Jai, Pham, Philip, Qin, Zhen, Bahri, Dara, Juan, Da-Cheng, Metzler, Donald

arXiv.org Artificial Intelligence, Mar-1-2021

This paper proposes Omnidirectional Representations from Transformers (OmniNet). In OmniNet, instead of maintaining a strictly horizontal receptive field, each token is allowed to attend to all tokens in the entire network. This process can also be interpreted as a form of extreme or intensive attention mechanism whose receptive field spans the entire width and depth of the network. To this end, the omnidirectional attention is learned via a meta-learner, which is essentially another self-attention-based model. To mitigate the computational cost of full-receptive-field attention, we leverage efficient self-attention models such as kernel-based attention (Choromanski et al.), low-rank attention (Wang et al.), and/or Big Bird (Zaheer et al.) as the meta-learner. Extensive experiments are conducted on autoregressive language modeling (LM1B, C4), Machine Translation, Long Range Arena (LRA), and Image Recognition. The experiments show that OmniNet achieves considerable improvements across these tasks, including state-of-the-art performance on LM1B, WMT'14 En-De/En-Fr, and Long Range Arena. Moreover, using omnidirectional representations in Vision Transformers leads to significant improvements on image recognition tasks in both few-shot learning and fine-tuning setups.
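The idea of letting each token attend to all tokens in the entire network can be sketched in a few lines. The example below is a toy illustration under stated assumptions, not the paper's method: it uses plain scaled dot-product attention with no learned projections in place of the efficient meta-learner, takes the final layer's tokens as queries, and the function names `attend` and `omnidirectional_attention` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return output, weights


def omnidirectional_attention(layer_states):
    """Let every final-layer token attend to the tokens of ALL layers.

    layer_states: list of layers, each a list of token vectors.
    Assumption: identity projections stand in for learned query/key/value maps.
    """
    # Pool tokens from the full depth and width of the network.
    all_tokens = [token for layer in layer_states for token in layer]
    queries = layer_states[-1]
    return [attend(query, all_tokens, all_tokens) for query in queries]


# Two layers of three 2-d tokens: each query sees 2 * 3 = 6 keys.
demo_layers = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]],
]
demo_results = omnidirectional_attention(demo_layers)
```

With L layers of N tokens, each query scores L x N keys, so the cost grows with both depth and sequence length. That growth is the motivation the abstract gives for using an efficient self-attention model as the meta-learner.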

omnidirectional representation, omninet, representation, (13 more...)

arXiv:2103.01075

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.55)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.50)

OmniNet: If Ben's Omnitrix had a better Machine Learning/Artificial Intelligence inbuilt?

#artificialintelligence, Aug-9-2020, 07:10:05 GMT

I am a big fan of the Ben 10 series, and I have always wondered why Ben's Omnitrix fails to change him into the alien he chooses (this is largely due to the weak AI system built into the watch). To help Ben, we will devise "OmniNet", a neural network capable of predicting an appropriate alien for a given situation. As discussed on the show, the Omnitrix is basically a server that connects to the planet Primus to harness the DNA of around 10,000 aliens! If I were the engineer of the device, I would certainly add one or more features to the watch. Why: the Omnitrix/Ultimatrix is a special case, as it is not aware of its surrounding environment.

machine learning artificial intelligence inbuilt, omninet, omnitrix, (13 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

OmniNet: A unified architecture for multi-modal multi-task learning

Pramanik, Subhojeet, Agrawal, Priyanka, Hussain, Aman

arXiv.org Machine Learning, Jul-17-2019

The Transformer is a widely used neural network architecture, especially for language understanding. We introduce an extended and unified architecture that can be used for tasks involving a variety of modalities such as images, text, and videos. We propose a spatio-temporal cache mechanism that enables learning the spatial dimension of the input in addition to the hidden states corresponding to the temporal input sequence. The proposed architecture further enables a single model to support tasks with multiple input modalities as well as asynchronous multi-task learning; we therefore refer to it as OmniNet. For example, a single instance of OmniNet can concurrently learn to perform part-of-speech tagging, image captioning, visual question answering, and video activity recognition. We demonstrate that training these four tasks together yields a model about three times smaller while retaining performance, compared with training them individually. We also show that using this neural network pre-trained on some modalities assists in learning an unseen task. This illustrates the generalization capacity of the self-attention mechanism on the spatio-temporal cache present in OmniNet.
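One way to picture the spatio-temporal cache is as two growing lists of feature vectors: one indexed by time step and one holding per-position spatial features. The sketch below is a toy model built on assumptions, not the authors' code: the class name `SpatioTemporalCache`, the mean pooling used to summarize each time step, and the rule that inputs with a single position per step (such as text tokens) skip the spatial cache are all choices made here for illustration.

```python
class SpatioTemporalCache:
    """Toy cache holding temporal and spatial feature vectors side by side."""

    def __init__(self):
        self.temporal = []  # one vector per time step, across all inputs
        self.spatial = []   # one vector per spatial position, across all inputs

    def add(self, features):
        """Store one encoded input.

        features: list over time steps; each step is a list of vectors,
        one per spatial position (a single position for text tokens).
        """
        for step in features:
            if len(step) > 1:
                # Assumption: only inputs with real spatial extent are cached spatially.
                self.spatial.extend(step)
            dim = len(step[0])
            # Assumption: mean pooling summarizes the positions of a time step.
            pooled = [sum(vec[i] for vec in step) / len(step) for i in range(dim)]
            self.temporal.append(pooled)


cache = SpatioTemporalCache()
# An "image": one time step with four spatial positions.
cache.add([[[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 2.0]]])
# A "sentence": three time steps with a single position each.
cache.add([[[5.0, 5.0]], [[6.0, 6.0]], [[7.0, 7.0]]])
```

A decoder could then attend over both lists, which is the sense in which one model can consume images, text, and video through a shared interface.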

artificial intelligence, machine learning, natural language, (19 more...)

arXiv:1907.07804

Country:

Europe (0.68)
Asia (0.46)
North America > United States (0.29)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
