ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning
Capitalizing on large pre-trained models for various downstream tasks of interest has recently emerged as a promising approach. Due to the ever-growing model size, the standard full fine-tuning based task adaptation strategy becomes prohibitively costly in terms of model training and storage. This has led to a new research direction in parameter-efficient transfer learning. However, existing attempts typically focus on downstream tasks from the same modality (e.g., image understanding) as the pre-trained model. This creates a limit because for some specific modalities (e.g., video understanding), such a strong pre-trained model with sufficient knowledge is less available or unavailable. In this work, we investigate such a novel cross-modality transfer learning setting, namely parameter-efficient image-to-video transfer learning. To solve this problem, we propose a new Spatio-Temporal Adapter (ST-Adapter) for parameter-efficient fine-tuning per video task. With a built-in spatio-temporal reasoning capability in a compact design, ST-Adapter enables a pre-trained image model without temporal knowledge to reason about dynamic video content at a small (~8%) per-task parameter cost, requiring approximately 20 times fewer updated parameters than previous work. Extensive experiments on video action recognition tasks show that our ST-Adapter can match or even outperform the strong full fine-tuning strategy and state-of-the-art video models, whilst enjoying the advantage of parameter efficiency.
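As a rough illustration of the adapter design described above, the following PyTorch sketch inserts a linear down-projection, a depthwise 3D convolution for spatio-temporal mixing, and a linear up-projection as a residual branch on frozen ViT features. The bottleneck width, kernel size, and token layout here are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of a spatio-temporal adapter, assuming ViT patch tokens
# reshaped to a (B, T, H, W, C) video layout. Sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class STAdapter(nn.Module):
    def __init__(self, dim=768, bottleneck=192, kernel=(3, 3, 3)):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)          # project to a compact bottleneck
        self.conv = nn.Conv3d(bottleneck, bottleneck,   # depthwise 3D conv mixes
                              kernel,                   # information across time
                              padding=tuple(k // 2 for k in kernel),
                              groups=bottleneck)        # depthwise => very few params
        self.up = nn.Linear(bottleneck, dim)            # project back to ViT width

    def forward(self, x):                               # x: (B, T, H, W, C) tokens
        z = self.down(x)
        z = z.permute(0, 4, 1, 2, 3)                    # to (B, C', T, H, W) for Conv3d
        z = F.gelu(self.conv(z))
        z = z.permute(0, 2, 3, 4, 1)                    # back to (B, T, H, W, C')
        return x + self.up(z)                           # residual: backbone stays frozen

x = torch.randn(2, 8, 14, 14, 768)                      # 8 frames of 14x14 ViT tokens
print(STAdapter()(x).shape)                             # torch.Size([2, 8, 14, 14, 768])
```

Only the adapter parameters are trained, which is what keeps the per-task cost small relative to full fine-tuning.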
Appendix for Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark
A Examples in Wukong Dataset
A diverse range of concepts is included.
Figure 2: The word cloud generated from texts in the Wukong dataset. For example, "月" means month; "日" is day; "做" is do; and "一个" means one.
Figure 1 shows some examples in our dataset. These image-text pairs involve many types of content, e.g., social news, sporting events, and product introductions.
SAMURAI: Shape And Material from Unconstrained Real-world Arbitrary Image collections
Inverse rendering of an object under entirely unknown capture conditions is a fundamental challenge in computer vision and graphics. Neural approaches such as NeRF have achieved photorealistic results on novel view synthesis, but they require known camera poses. Solving this problem with unknown camera poses is highly challenging as it requires joint optimization over shape, radiance, and pose. This problem is exacerbated when the input images are captured in the wild with varying backgrounds and illuminations. Standard pose estimation techniques fail on such in-the-wild image collections because very few correspondences can be estimated across images. Furthermore, NeRF cannot relight a scene under arbitrary illumination, as it operates on radiance (the product of reflectance and illumination). We propose a joint optimization framework to estimate the shape, BRDF, and per-image camera pose and illumination. Our method works on in-the-wild online image collections of an object and produces relightable 3D assets for several use-cases such as AR/VR. To our knowledge, our method is the first to tackle this severely unconstrained task with minimal user interaction.
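To make the joint-optimization idea concrete, here is a toy sketch in which per-image pose and illumination parameters receive gradients alongside a scene network. `TinyScene` is a hypothetical stand-in, not SAMURAI's actual shape/BRDF architecture or loss.

```python
# Toy sketch: per-image camera pose and illumination codes are learnable
# parameters optimized jointly with a (stand-in) neural scene model.
import torch
import torch.nn as nn

class TinyScene(nn.Module):
    """Stand-in for a shape+BRDF field; maps (ray, pose, illum) to RGB."""
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(6 + 6 + 16, 64), nn.ReLU(),
                                 nn.Linear(64, 3))

    def forward(self, rays, pose, illum):
        cond = torch.cat([pose, illum]).expand(rays.shape[0], -1)
        return torch.sigmoid(self.mlp(torch.cat([rays, cond], dim=-1)))

scene = TinyScene()
poses = nn.Parameter(torch.zeros(5, 6))       # per-image 6-DoF pose (axis-angle + t)
illums = nn.Parameter(torch.zeros(5, 16))     # per-image illumination code
opt = torch.optim.Adam(list(scene.parameters()) + [poses, illums], lr=1e-3)

rays, target, idx = torch.randn(128, 6), torch.rand(128, 3), 2
for _ in range(100):
    opt.zero_grad()
    loss = ((scene(rays, poses[idx], illums[idx]) - target) ** 2).mean()
    loss.backward()                           # gradients also flow into pose and light
    opt.step()
```

The point of the sketch is only the optimization structure: because pose and illumination are differentiable parameters per image, a photometric loss can refine them together with shape and reflectance instead of requiring correspondence-based pose estimation up front.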
Coresets for Wasserstein Distributionally Robust Optimization Problems
Ruomin Huang, Jiawei Huang, Hu Ding
Wasserstein distributionally robust optimization (WDRO) is a popular model to enhance the robustness of machine learning with ambiguous data. However, the complexity of WDRO can be prohibitive in practice since solving its "minimax" formulation requires a great amount of computation. Recently, several fast WDRO training algorithms for some specific machine learning tasks (e.g., logistic regression) have been developed. However, the research on designing efficient algorithms for general large-scale WDROs is still quite limited, to the best of our knowledge. Coresets are an important tool for compressing large datasets, and thus have been widely applied to reduce the computational complexity of many optimization problems.
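For context, the "minimax" formulation mentioned above is the standard WDRO objective: minimize the worst-case expected loss over all distributions within a Wasserstein ball around the empirical distribution. Below it is the generic multiplicative coreset guarantee; the exact guarantee proved in the paper may differ.

```latex
% Standard WDRO objective: worst case over a Wasserstein ball of radius
% \varepsilon around the empirical distribution \hat{P}_n.
\min_{\theta \in \Theta} \;
  \sup_{Q \,:\, W_p(Q, \hat{P}_n) \le \varepsilon}
  \mathbb{E}_{z \sim Q}\!\left[ \ell(\theta, z) \right]

% Generic coreset guarantee: a small weighted subset (S, w) whose objective
% F(\theta; S, w) uniformly approximates the full-data objective.
(1 - \epsilon)\, F(\theta; \hat{P}_n)
  \;\le\; F(\theta; S, w)
  \;\le\; (1 + \epsilon)\, F(\theta; \hat{P}_n)
  \qquad \forall\, \theta \in \Theta .
```

Because the guarantee holds uniformly over all models, the expensive minimax problem can be solved on the small weighted set in place of the full dataset.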
Robust Contrastive Multi-view Clustering against Dual Noisy Correspondence
Recently, contrastive multi-view clustering (MvC) has emerged as a promising avenue for analyzing data from heterogeneous sources, typically leveraging the off-the-shelf instances as positives and randomly sampled ones as negatives. In practice, however, this paradigm unavoidably suffers from the Dual Noisy Correspondence (DNC) problem, where noise compromises the construction of both positive and negative pairs. Specifically, the complexity of data collection and transmission might mistake some unassociated pairs for positives (namely, false positive correspondence), while the intrinsic one-to-many contrast nature of contrastive MvC would sample some intra-cluster samples as negatives (namely, false negative correspondence). To handle this daunting problem, we propose a novel method, dubbed Contextually-spectral based correspondence refinery (CANDY). CANDY dexterously exploits inter-view similarities as context to uncover false negatives. Furthermore, it employs a spectral-based module to denoise correspondence, alleviating the negative influence of false positives. Extensive experiments on five widely-used multi-view benchmarks, in comparison with eight competitive multi-view clustering methods, verify the effectiveness of our method in addressing the DNC problem.
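The sketch below illustrates the false-negative side of the DNC problem in a cross-view contrastive loss: off-diagonal pairs whose inter-view similarity is suspiciously high are treated as likely intra-cluster pairs and masked out of the denominator. The simple threshold rule is a stand-in for CANDY's actual contextual/spectral refinery.

```python
# Sketch: use inter-view similarity as context to discount likely false
# negatives in a contrastive multi-view loss. Threshold rule is illustrative.
import torch
import torch.nn.functional as F

def contrastive_loss(z1, z2, tau=0.5, fn_thresh=0.9):
    """z1, z2: (N, d) embeddings of the same N instances in two views."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sim = z1 @ z2.t() / tau                        # (N, N) cross-view similarities
    # Context: off-diagonal pairs that look too similar are probably
    # intra-cluster, i.e. false negatives -- drop them from the denominator.
    with torch.no_grad():
        fn_mask = (sim * tau > fn_thresh) & ~torch.eye(len(z1), dtype=torch.bool)
    logits = sim.masked_fill(fn_mask, float('-inf'))
    labels = torch.arange(len(z1))                 # diagonal pairs are positives
    return F.cross_entropy(logits, labels)

loss = contrastive_loss(torch.randn(32, 64), torch.randn(32, 64))
```

Handling false positives (corrupted diagonal pairs) is the harder half of DNC, which is where the paper's spectral denoising module comes in; the sketch above does not attempt that part.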