
MAViL: Masked Audio-Video Learners, Po-Yao Huang, Chaitanya Ryali

Neural Information Processing Systems

We present Masked Audio-Video Learners (MAViL) to learn audio-visual representations with three complementary forms of self-supervision: (1) reconstructing masked raw audio and video inputs, (2) intra-modal and inter-modal contrastive learning with masking, and (3) self-training to predict aligned and contextualized audio-video representations learned from the first two objectives. Empirically, MAViL achieves state-of-the-art audio-video classification performance on AudioSet (53.3 mAP) and VGGSound (67.1% accuracy), surpassing recent self-supervised models and supervised models that utilize external labeled data. Notably, pre-training with MAViL not only enhances performance in multimodal classification and retrieval tasks, but it also improves the representations of each modality in isolation, without relying on information from the other modality during uni-modal fine-tuning or inference.
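
For intuition, here is a minimal sketch (not MAViL's actual architecture) of the inter-modal contrastive objective combined with masking: paired audio and video clips are encoded from a random subset of visible tokens, and a symmetric InfoNCE loss pulls matching pairs together while pushing mismatched pairs apart. The masking ratio, pooling, and embedding sizes below are illustrative stand-ins.

# Hedged sketch: symmetric InfoNCE between pooled audio and video embeddings,
# with random token masking as in masked pre-training. Illustrative only.
import torch
import torch.nn.functional as F

def info_nce(audio_emb, video_emb, tau=0.07):
    """Symmetric contrastive loss over a batch of paired audio/video embeddings."""
    a = F.normalize(audio_emb, dim=-1)          # (B, D)
    v = F.normalize(video_emb, dim=-1)          # (B, D)
    logits = a @ v.t() / tau                    # (B, B) similarity matrix
    targets = torch.arange(a.size(0), device=a.device)
    # Matching audio/video clips are positives; all other pairs are negatives.
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

def random_mask(tokens, mask_ratio=0.8):
    """Keep a random subset of tokens: (B, N, D) -> (B, N_keep, D)."""
    B, N, D = tokens.shape
    n_keep = max(1, int(N * (1.0 - mask_ratio)))
    idx = torch.rand(B, N, device=tokens.device).argsort(dim=1)[:, :n_keep]
    return torch.gather(tokens, 1, idx.unsqueeze(-1).expand(-1, -1, D))

# Toy usage with random features standing in for patch/token embeddings.
audio_tokens = torch.randn(4, 64, 256)
video_tokens = torch.randn(4, 128, 256)
a_emb = random_mask(audio_tokens).mean(dim=1)   # pooled visible-token embedding
v_emb = random_mask(video_tokens).mean(dim=1)
loss = info_nce(a_emb, v_emb)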



f50a6c02a3fc5a3a5d4d9391f05f3efc-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewers for their feedback. We're glad that all reviewers agree that the paper is well-written and that side effect avoidance is an important AI safety problem. We will include results for DQN and for additional auxiliary reward functions. Unfortunately, neither approach is remotely viable in SafeLife. We estimate that there are billions of reachable states in any given SafeLife level. We share their interest in this prospect.


Supplementary Materials for MLP-Mixer: An all-MLP Architecture for Vision, Lucas Beyer

Neural Information Processing Systems

A.1 Modifying the token-mixing MLPs

We ablated a number of ideas trying to improve the token-mixing MLPs for Mixer models of various scales pre-trained on JFT-300M.

Untying the parameters: by default, a single token-mixing MLP is shared across all channels. Instead, we could introduce C separate MLPs with independent weights, effectively multiplying the number of parameters by C. We did not observe any noticeable improvements.

Grouping the channels together: token-mixing MLPs take S-dimensional vectors as inputs. Every such vector contains values of a single feature across S different spatial locations. In other words, token-mixing MLPs operate by looking at only one channel at a time.
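
As a reference point, here is a minimal sketch of a token-mixing MLP in this parameter-sharing form: the same MLP, acting across the S spatial locations, is applied to each of the C channels independently. The hidden width and activation are illustrative choices, not the exact Mixer configuration.

# Hedged sketch of a shared token-mixing MLP (mixes spatial locations,
# one channel at a time). Sizes and activation are illustrative.
import torch
import torch.nn as nn

class TokenMixingMLP(nn.Module):
    def __init__(self, num_tokens, hidden_dim):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_tokens, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, num_tokens),
        )

    def forward(self, x):
        # x: (batch, S tokens, C channels). Transpose so the MLP sees, for each
        # channel, the S values of that single feature across spatial locations.
        y = x.transpose(1, 2)          # (batch, C, S)
        y = self.mlp(y)                # weights shared across all C channels
        return y.transpose(1, 2)       # back to (batch, S, C)

x = torch.randn(2, 196, 512)           # e.g. 196 patches, 512 channels
out = TokenMixingMLP(num_tokens=196, hidden_dim=256)(x)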



MSA Generation with Seqs2Seqs Pretraining: Advancing Protein Structure Predictions

Neural Information Processing Systems

Deep learning models like AlphaFold2 (Jumper et al., 2021) have revolutionized protein structure prediction, achieving unprecedented accuracy. However, the dependence on robust multiple sequence alignments (MSAs) continues to pose a challenge, especially for proteins that lack a wealth of homologous sequences. To overcome this limitation, we introduce MSA-Generator, a self-supervised generative protein language model. Trained on a sequence-to-sequence task using an automatically constructed dataset, MSA-Generator employs protein-specific attention mechanisms to harness large-scale protein databases, generating virtual MSAs that enrich existing ones and boost prediction accuracy. Our experiments on CASP14 and CASP15 benchmarks reveal significant improvements in LDDT scores, particularly for complex and challenging sequences, enhancing the performance of both AlphaFold2 and RoseTTAFold. The code is released at https://github.com/lezhang7/MSAGen.
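
A minimal sketch of the enrich-then-predict idea described above, with a hypothetical generate_virtual_sequences placeholder standing in for the actual pretrained seq2seq generator (names and interface here are assumptions, not MSA-Generator's real API):

# Hedged sketch: a generative model proposes homolog-like sequences that are
# appended to a shallow MSA before structure prediction. Placeholder only.
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def generate_virtual_sequences(msa, n_new):
    # Placeholder: in the real system this would be a pretrained seq2seq
    # protein language model conditioned on the existing alignment.
    length = len(msa[0])
    return ["".join(random.choice(AMINO_ACIDS) for _ in range(length)) for _ in range(n_new)]

def enrich_msa(msa, n_new=16):
    """Return the original (shallow) MSA plus generated virtual sequences."""
    return msa + generate_virtual_sequences(msa, n_new)

shallow_msa = ["MKTAYIAKQR", "MKTAYLAKQR"]      # toy aligned sequences
enriched = enrich_msa(shallow_msa)
# `enriched` would then be passed to a structure predictor such as AlphaFold2.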


A Metalearned Neural Circuit for Nonparametric Bayesian Inference

Neural Information Processing Systems

Most applications of machine learning to classification assume a closed set of balanced classes. This is at odds with the real world, where class occurrence statistics often follow a long-tailed power-law distribution, rarely revealing the entire problem domain in a single sample. Nonparametric Bayesian models naturally capture this phenomenon, but have significant practical barriers to widespread adoption, namely implementation complexity and computational inefficiency. To address this, we present a method for extracting the inductive bias from a nonparametric Bayesian model and transferring it to an artificial neural network. By simulating data with a nonparametric Bayesian prior, we can metalearn a sequence model that performs inference over an unlimited set of classes. After training, this "neural circuit" has distilled the corresponding inductive bias and can successfully perform sequential inference over an open set of classes. Our experimental results show that the metalearned neural circuit achieves comparable or better performance than particle filter-based methods that explicitly perform Bayesian nonparametric inference while being faster and simpler to use.
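
As an illustration of the data-simulation step, here is a minimal sketch that samples label sequences from a Chinese Restaurant Process prior, one common nonparametric Bayesian prior; the concentration parameter and sequence length are illustrative, and pairing the labels with sampled observations would yield training data for the sequence model.

# Hedged sketch: sample class labels from a Chinese Restaurant Process prior.
import random

def sample_crp_labels(seq_len, alpha=1.0):
    """Sample a label sequence from a Chinese Restaurant Process."""
    labels = []
    counts = []                                  # customers per "table" (class)
    for n in range(seq_len):
        # A new class is opened with probability alpha / (n + alpha).
        if random.random() < alpha / (n + alpha):
            counts.append(1)
            labels.append(len(counts) - 1)
        else:
            # Otherwise join an existing class with probability proportional to its size.
            k = random.choices(range(len(counts)), weights=counts)[0]
            counts[k] += 1
            labels.append(k)
    return labels

# Each simulated sequence can introduce an unbounded number of classes.
print(sample_crp_labels(20))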


Personalized Federated Learning with Moreau Envelopes: Supplementary Materials, Nguyen H. Tran

Neural Information Processing Systems

In this appendix we provide proofs for the theorems and lemmas in the paper "Personalized Federated Learning with Moreau Envelopes", as well as additional experimental settings and results.
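
For convenience, the standard Moreau-envelope construction that the pFedMe objective builds on is restated below in generic notation (the paper's own notation may differ slightly): each client i minimizes a regularized proximal version of its loss f_i around the global model w, and the server averages the resulting envelopes.

% Standard Moreau-envelope construction (generic notation, for reference).
F_i(w) \;=\; \min_{\theta_i \in \mathbb{R}^d}
  \Big\{ f_i(\theta_i) + \frac{\lambda}{2}\,\lVert \theta_i - w \rVert^2 \Big\},
\qquad
\min_{w \in \mathbb{R}^d} \; F(w) \;=\; \frac{1}{N}\sum_{i=1}^{N} F_i(w).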



Generative Neural Fields by Mixtures of Neural Implicit Functions

Neural Information Processing Systems

We propose a novel approach to learning generative neural fields represented by linear combinations of implicit basis networks. Our algorithm learns the basis networks, in the form of implicit neural representations, and their coefficients in a latent space, by either meta-learning or an auto-decoding paradigm. The proposed method easily enlarges the capacity of generative neural fields by increasing the number of basis networks, while keeping the network used at inference small through weighted model averaging. Consequently, sampling instances from the model is efficient in terms of latency and memory footprint. Moreover, we customize a denoising diffusion probabilistic model for a target task to sample latent mixture coefficients, which allows our final model to generate unseen data effectively. Experiments show that our approach achieves competitive generation performance on diverse benchmarks for images, voxel data, and NeRF scenes without sophisticated designs for specific modalities and domains.
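
A minimal sketch of the weighted-model-averaging idea described above: K basis networks with identical architecture are collapsed into a single small network by averaging their parameters with per-instance mixture coefficients, so inference runs only one network. The layer sizes, K, and the softmax over coefficients are illustrative assumptions, not the paper's exact configuration.

# Hedged sketch: collapse K basis implicit networks into one network via
# coefficient-weighted parameter averaging. Illustrative sizes only.
import torch
import torch.nn as nn

def make_basis_net():
    return nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 3))  # coords -> RGB

basis_nets = [make_basis_net() for _ in range(8)]            # K = 8 basis networks

def mix_networks(nets, coeffs):
    """Build one network whose weights are the coefficient-weighted average of the basis nets."""
    mixed = make_basis_net()
    with torch.no_grad():
        for name, param in mixed.named_parameters():
            stacked = torch.stack([dict(n.named_parameters())[name] for n in nets])  # (K, ...)
            w = coeffs.view(-1, *([1] * (stacked.dim() - 1)))  # broadcast coefficients
            param.copy_((w * stacked).sum(dim=0))
    return mixed

coeffs = torch.softmax(torch.randn(8), dim=0)                 # per-instance mixture coefficients
field = mix_networks(basis_nets, coeffs)
rgb = field(torch.rand(1024, 2))                              # query the neural field at 2-D coords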