On the Convergence of Smooth Regularized Approximate Value Iteration Schemes

Neural Information Processing Systems

In practical settings, reinforcement learning (RL) algorithms face the challenge of maximizing the cumulative reward given a finite sample of environment transitions and an inexact representation of the policy and value function. This gives rise to errors that propagate across learning iterations and, combined, can result in divergence. Recently, state-of-the-art RL algorithms have been successful in solving complex environments and, hence, in overcoming these inaccuracies and their accumulation.
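The abstract does not specify the regularizer, but a minimal sketch of the general idea, assuming a tabular MDP and an entropy-style softmax (log-sum-exp) backup as the smooth regularizer, might look as follows; the transition tensor, reward matrix, temperature, and iteration count are all illustrative assumptions rather than the paper's scheme:

```python
import numpy as np

def smooth_bellman_backup(V, P, R, gamma=0.99, tau=0.1):
    """One smoothed backup: V(s) = tau * log(sum_a exp(Q(s, a) / tau))."""
    Q = R + gamma * P @ V                       # Q[s, a] = R[s, a] + gamma * E[V(s')]
    m = Q.max(axis=1, keepdims=True)            # stabilized log-sum-exp over actions
    return (m + tau * np.log(np.exp((Q - m) / tau).sum(axis=1, keepdims=True))).ravel()

# Toy usage: iterate the smoothed backup on a random MDP. Because log-sum-exp
# is non-expansive, the smoothed operator stays a gamma-contraction, so this
# exact (sample-free) iteration converges to a unique fixed point; the paper's
# question is what happens once sampling and approximation errors enter.
rng = np.random.default_rng(0)
n_states, n_actions = 5, 3
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a, s']
R = rng.normal(size=(n_states, n_actions))
V = np.zeros(n_states)
for _ in range(500):
    V = smooth_bellman_backup(V, P, R)
```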


GPS-IDS: An Anomaly-based GPS Spoofing Attack Detection Framework for Autonomous Vehicles

Abrar, Murad Mehrab, Islam, Raian, Satam, Shalaka, Shao, Sicong, Hariri, Salim, Satam, Pratik

arXiv.org Artificial Intelligence

Autonomous Vehicles (AVs) rely heavily on sensors and communication networks like the Global Positioning System (GPS) to navigate autonomously. Prior research has indicated that networks like GPS are vulnerable to cyber-attacks such as spoofing and jamming, posing serious risks like navigation errors and system failures. These threats are expected to intensify with the widespread deployment of AVs, making it crucial to detect and mitigate such attacks. This paper proposes the GPS Intrusion Detection System, or GPS-IDS, an Anomaly Behavior Analysis (ABA)-based intrusion detection framework to detect GPS spoofing attacks on AVs. The framework uses a novel physics-based vehicle behavior model in which a GPS navigation model is integrated into the conventional dynamic bicycle model for accurate AV behavior representation. Temporal features derived from this behavior model are analyzed using machine learning to distinguish normal from abnormal navigation behavior. The performance of the GPS-IDS framework is evaluated on the AV-GPS-Dataset, a real-world dataset collected by the team using an AV testbed. The dataset has been publicly released for the global research community. To the best of our knowledge, this dataset is the first of its kind and will serve as a useful resource to address such security challenges.
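As a rough illustration of the physics-based detection idea, here is a minimal sketch assuming a kinematic bicycle model (a simplification of the dynamic bicycle model the paper integrates with a GPS navigation model); the wheelbase, time step, and example coordinates are hypothetical:

```python
import numpy as np

def bicycle_step(x, y, yaw, v, steer, accel, dt=0.1, wheelbase=2.5):
    """Propagate the vehicle pose one step with a kinematic bicycle model."""
    x += v * np.cos(yaw) * dt
    y += v * np.sin(yaw) * dt
    yaw += v / wheelbase * np.tan(steer) * dt
    v += accel * dt
    return x, y, yaw, v

def gps_residual(pred_xy, gps_xy):
    """Temporal anomaly feature: gap between the model prediction and the GPS fix."""
    return float(np.hypot(pred_xy[0] - gps_xy[0], pred_xy[1] - gps_xy[1]))

# Toy usage: a spoofed fix far from any physically reachable position yields a
# large residual, which a downstream classifier can flag as abnormal behavior.
x, y, yaw, v = 0.0, 0.0, 0.0, 10.0                # 10 m/s, heading along +x
x, y, yaw, v = bicycle_step(x, y, yaw, v, steer=0.0, accel=0.0)
print(gps_residual((x, y), (1.0, 0.05)))          # plausible fix -> ~0.05 m
print(gps_residual((x, y), (25.0, 40.0)))         # spoofed fix   -> ~46 m
```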


Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification

Zhu, Wentao

arXiv.org Artificial Intelligence

Audio and video are the two most common modalities on mainstream media platforms, e.g., YouTube. To learn from multimodal videos effectively, in this work we propose a novel audio-video recognition approach termed the Audio-Video Transformer (AVT), which leverages the effective spatio-temporal representation of the video Transformer to improve action recognition accuracy. For multimodal fusion, simply concatenating multimodal tokens in a cross-modal Transformer requires large computational and memory resources; instead, we reduce the cross-modality complexity through an audio-video bottleneck Transformer. To improve the learning efficiency of the multimodal Transformer, we integrate self-supervised objectives, i.e., audio-video contrastive learning, audio-video matching, and masked audio and video learning, into AVT training, which maps diverse audio and video representations into a common multimodal representation space. We further propose a masked audio segment loss to learn semantic audio activities in AVT. Extensive experiments and ablation studies on three public datasets and two in-house datasets consistently demonstrate the effectiveness of the proposed AVT. Specifically, AVT outperforms its previous state-of-the-art counterparts on Kinetics-Sounds by 8%. AVT also surpasses one of the previous state-of-the-art video Transformers [25] by 10% on VGGSound by leveraging the audio signal. Compared to one of the previous state-of-the-art multimodal methods, MBT [32], AVT is 1.3% more efficient in terms of FLOPs and improves the accuracy by 3.8% on Epic-Kitchens-100.
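To make the bottleneck idea concrete, the following is a minimal NumPy sketch of fusion through a few shared bottleneck tokens rather than full cross-modal token concatenation; the single-head attention, token counts, and dimensions are illustrative assumptions, not AVT's actual architecture:

```python
import numpy as np

def attend(q, kv):
    """Single-head scaled dot-product attention: queries q over keys/values kv."""
    w = q @ kv.T / np.sqrt(q.shape[1])
    w = np.exp(w - w.max(axis=1, keepdims=True))
    return (w / w.sum(axis=1, keepdims=True)) @ kv

def bottleneck_fusion(video_tok, audio_tok, bottleneck):
    """One fusion round: bottlenecks read both streams, then each modality reads them back."""
    b = attend(bottleneck, np.vstack([video_tok, audio_tok]))  # gather both modalities
    return attend(video_tok, b), attend(audio_tok, b)          # redistribute via bottleneck

rng = np.random.default_rng(0)
video = rng.normal(size=(196, 64))   # e.g. 14x14 video patch tokens
audio = rng.normal(size=(98, 64))    # audio spectrogram tokens
neck = rng.normal(size=(4, 64))      # a handful of learned bottleneck tokens
video2, audio2 = bottleneck_fusion(video, audio, neck)
# Cross-modal attention now costs (196 + 98) * 4 weights instead of 196 * 98,
# which is where the computational and memory savings come from.
```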


Anticipative Video Transformer

Girdhar, Rohit, Grauman, Kristen

arXiv.org Artificial Intelligence

We propose the Anticipative Video Transformer (AVT), an end-to-end attention-based video modeling architecture that attends to the previously observed video in order to anticipate future actions. We train the model jointly to predict the next action in a video sequence, while also learning frame feature encoders that are predictive of successive future frames' features. Compared to existing temporal aggregation strategies, AVT has the advantage of both maintaining the sequential progression of observed actions and capturing long-range dependencies, both critical for the anticipation task. Through extensive experiments, we show that AVT obtains the best reported performance on four popular action anticipation benchmarks: EpicKitchens-55, EpicKitchens-100, EGTEA Gaze+, and 50-Salads, including outperforming all submissions to the EpicKitchens-100 CVPR'21 challenge.
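A minimal sketch of the anticipative setup described above, assuming a single causally masked self-attention layer, a linear next-action head, and a mean-squared future-feature loss; all dimensions, the class count, and the exact loss forms are illustrative assumptions rather than the paper's architecture:

```python
import numpy as np

def causal_attention(x):
    """Each frame attends only to itself and earlier frames."""
    t, d = x.shape
    w = x @ x.T / np.sqrt(d)
    w = np.where(np.tril(np.ones((t, t))) > 0, w, -np.inf)   # causal mask
    w = np.exp(w - w.max(axis=1, keepdims=True))
    return (w / w.sum(axis=1, keepdims=True)) @ x

def anticipation_losses(frames, W_action, next_action):
    h = causal_attention(frames)
    # Next-action loss: classify the action that follows the last observed frame.
    logits = h[-1] @ W_action
    p = np.exp(logits - logits.max())
    p /= p.sum()
    action_loss = -np.log(p[next_action])
    # Future-feature loss: the output at frame t should predict frame t+1's features.
    feature_loss = np.mean((h[:-1] - frames[1:]) ** 2)
    return action_loss + feature_loss

rng = np.random.default_rng(0)
frames = rng.normal(size=(8, 64))        # 8 observed frame features
W_action = rng.normal(size=(64, 10))     # 10 hypothetical action classes
print(anticipation_losses(frames, W_action, next_action=3))
```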


Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization

Korbar, Bruno, Tran, Du, Torresani, Lorenzo

Neural Information Processing Systems

There is a natural correlation between the visual and auditory elements of a video. In this work we leverage this connection to learn general and effective models for both audio and video analysis from self-supervised temporal synchronization. We demonstrate that a calibrated curriculum learning scheme, a careful choice of negative examples, and the use of a contrastive loss are critical ingredients for obtaining powerful multi-sensory representations from models optimized to discern the temporal synchronization of audio-video pairs. Without further fine-tuning, the resulting audio features achieve performance superior or comparable to the state of the art on established audio classification benchmarks (DCASE2014 and ESC-50). At the same time, our visual subnet provides a very effective initialization for improving the accuracy of video-based action recognition models: compared to learning from scratch, our self-supervised pretraining yields a remarkable gain of +19.9% in action recognition accuracy on UCF101 and a boost of +17.7% on HMDB51.
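The synchronization objective can be sketched as a margin-based contrastive loss over audio-video pairs, pulling in-sync pairs together and pushing misaligned ones apart; the margin value, embedding size, and toy data below are illustrative assumptions:

```python
import numpy as np

def sync_contrastive_loss(v_emb, a_pos, a_neg, margin=1.0):
    """Mean of d(v, a_pos)^2 + max(0, margin - d(v, a_neg))^2 over the batch."""
    d_pos = np.linalg.norm(v_emb - a_pos, axis=1)   # distance to in-sync audio
    d_neg = np.linalg.norm(v_emb - a_neg, axis=1)   # distance to shifted audio
    return float(np.mean(d_pos ** 2 + np.maximum(0.0, margin - d_neg) ** 2))

# Toy usage: aligned audio embeddings sit near their video embeddings, while
# temporally shifted audio from the same clip serves as a hard negative.
rng = np.random.default_rng(0)
v = rng.normal(size=(4, 128))                     # video clip embeddings
a_sync = v + 0.1 * rng.normal(size=(4, 128))      # in-sync audio, near the video
a_shift = rng.normal(size=(4, 128))               # out-of-sync audio
print(sync_contrastive_loss(v, a_sync, a_shift))
```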