AITopics | Europe

Collaborating Authors

Europe

REGen: Multimodal Retrieval-Embedded Generation for Long-to-Short Video Editing

Neural Information Processing SystemsJun-16-2026, 22:16:58 GMT

Short videos are an effective tool for promoting contents and improving knowledge accessibility. While existing extractive video summarization methods struggle to produce a coherent narrative, existing abstractive methods cannot'quote' from the input videos, i.e., inserting short video clips in their outputs. In this work, we explore new video editing models for generating shorts that feature a coherent narrative with embedded video insertions extracted from a long input video. We propose a novel retrieval-embedded generation (REG) framework that allows a large language model to quote multimodal resources while maintaining a coherent narrative. Our proposed REGen system first generates the output story script with quote placeholders using a finetuned large language model, and then uses a multimodal retrieval model to replace the quote placeholders by selecting a video clip that best supports the narrative from a pool of candidate quotable video clips. We examine the proposed method on the task of documentary teaser generation, where short interview insertions are commonly used to support the narrative of a documentary. Our objective evaluations show that the proposed method can effectively insert short video clips while maintaining a coherent narrative. In a subjective survey, we show that our proposed method outperforms existing abstractive and extractive approaches in terms of coherence, alignment, and realism in documentary teaser generation.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

Asia (0.46)
North America > United States (0.28)
Europe (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Leisure & Entertainment (1.00)
Education (0.93)
Energy > Power Industry (0.60)
Media > Film (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Add feedback

Shapley-Based Data Valuation for Weighted k-Nearest Neighbors

Neural Information Processing SystemsJun-16-2026, 22:11:45 GMT

Data valuation quantifies the impact of individual data points on model performance, and Shapley values provide a principled approach to this important task due to their desirable axiomatic properties, albeit with high computational complexity. Recent breakthroughs have enabled fast computation of exact Shapley values for unweighted k-nearest neighbor (kNN) classifiers. However, extending this to weighted kNN models has remained a significant open challenge. The state-of-theart methods either require quadratic time complexity or resort to approximation via sampling. In this paper, we show that a conceptually simple but overlooked approach -- data duplication -- can be applied to this problem, yielding a natural variant of weighted kNN-Shapley. However, a straightforward application of the data-duplication idea leads to increased data size and prohibitive computational and memory costs. We develop an efficient algorithm that avoids materializing the duplicated dataset by exploiting the structural properties of weighted kNN models, reducing the complexity to near-linear time in the original data size. Besides, we establish theoretical foundations for this approach through axiomatic characterization of the resulting values, and empirically validate the effectiveness and efficiency of our method.

artificial intelligence, machine learning, ztest, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Europe (0.28)
Asia (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)

Add feedback

Why England Player Djed Spence Is Wearing a Mask at the World Cup

TIME - TechJun-16-2026, 21:44:04 GMT

Follow this section to personalize your feed and get instant alerts. Follow Go to your personalized feed WHY FOLLOW? Smart Alerts: Get notified about major news as it happens. Follow this tag to personalize your feed and get instant alerts. Follow Go to your personalized feed WHY FOLLOW?

artificial intelligence, open follow modal personalized content, spence, (10 more...)

TIME - Tech

Country:

Europe > United Kingdom > England (0.45)
North America > United States (0.32)

Industry: Leisure & Entertainment > Sports > Soccer (0.98)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.44)

Add feedback

Matchings Under Biased and Correlated Evaluations

Neural Information Processing SystemsJun-16-2026, 21:29:52 GMT

We study a two-institution stable matching model in which candidates from two distinct groups are evaluated using partially correlated signals that are groupbiased. This extends prior work (which assumes institutions evaluate candidates in an identical manner) to a more realistic setting in which institutions rely on overlapping, but independently processed, criteria. These evaluations could consist of a variety of informative tools such as standardized tests, shared recommendation systems, or AI-based assessments with local noise. Two key parameters govern evaluations: the bias parameter β (0,1], which models systematic disadvantage faced by one group, and the correlation parameter γ [0,1], which captures the alignment between institutional rankings. We study the representation ratio R(β,γ), i.e., the ratio of disadvantaged to advantaged candidates selected by the matching process in this setting.

artificial intelligence, institution 1, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States (0.67)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry:

Law (0.92)
Education > Educational Setting (0.67)
Information Technology > Services (0.45)
Education > Assessment & Standards > Student Performance (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.45)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.34)

Add feedback

Scalable Signature Kernel Computations via Local Neumann Series Expansions

Neural Information Processing SystemsJun-16-2026, 21:19:59 GMT

The signature kernel [10] is a recent state-of-the-art tool for analyzing highdimensional sequential data, valued for its theoretical guarantees and strong empirical performance. In this paper, we present a novel method for efficiently computing the signature kernel of long, high-dimensional time series via adaptively truncated recursive local power series expansions. Building on the characterization of the signature kernel as the solution of a Goursat PDE [17], our approach employs tilewise Neumann-series expansions to derive rapidly converging power series approximations of the signature kernel that are locally defined on subdomains and propagated iteratively across the entire domain of the Goursat solution by exploiting the geometry of the time series. Algorithmically, this involves solving a system of interdependent Goursat PDEs via adaptively truncated local power series expansions and recursive propagation of boundary conditions along a directed graph in a topological ordering.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: Europe (0.46)

Genre: Research Report > Experimental Study (1.00)

Industry: Banking & Finance (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback

Adversarial Paraphrasing: AUniversal Attack for Humanizing AI-Generated Text

Neural Information Processing SystemsJun-16-2026, 21:12:49 GMT

The increasing capabilities of Large Language Models (LLMs) have raised concerns about their misuse in AI-generated plagiarism and social engineering. While various AI-generated text detectors have been proposed to mitigate these risks, many remain vulnerable to simple evasion techniques such as paraphrasing. However, recent detectors have shown greater robustness against such basic attacks. In this work, we introduce Adversarial Paraphrasing, a training-free attack framework that universally humanizes any AI-generated text to evade detection more effectively. Our approach leverages an off-the-shelf instruction-following LLM to paraphrase AI-generated content under the guidance of an AI text detector, producing adversarial examples that are specifically optimized to bypass detection. Extensive experiments show that our attack is both broadly effective and highly transferable across several detection systems. For instance, compared to simple paraphrasing attack--which, ironically, increases the true positive at 1% false positive (T@1%F) by 8.57% on RADAR and 15.03% on Fast-DetectGPT--adversarial paraphrasing, guided by OpenAI-RoBERTa-Large, reduces T@1%F by 64.49% on RADAR and a striking 98.96% on Fast-DetectGPT. Across a diverse set of detectors--including neural network-based, watermark-based, and zero-shot approaches--our attack achieves an average T@1%F reduction of 87.88% under the guidance of OpenAI-RoBERTa-Large. We also analyze the tradeoff between text quality and attack success to find that our method can significantly reduce detection rates, with mostly a slight degradation in text quality. Our adversarial setup highlights the need for more robust and resilient detection strategies in the light of increasingly sophisticated evasion techniques.

detector, large language model, machine learning, (20 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom (0.68)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Consumer Health (1.00)
Government > Regional Government (1.00)
(11 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.45)

Add feedback

Rao-Blackwell Gradient Estimators for Equivariant Denoising Diffusion

Neural Information Processing SystemsJun-16-2026, 21:11:31 GMT

In domains such as molecular and protein generation, physical systems exhibit inherent symmetries that are critical to model. Two main strategies have emerged for learning invariant distributions: designing equivariant network architectures and using data augmentation to approximate equivariance. While equivariant architectures preserve symmetry by design, they often involve greater complexity and pose optimization challenges. Data augmentation, on the other hand, offers flexibility but may fall short in fully capturing symmetries. Our framework enhances both approaches by reducing training variance and providing a provably lower-variance gradient estimator.

machine learning, natural language, qgt, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Europe > Germany (0.28)
Europe > Austria (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Show-o2: Improved Native Unified Multimodal Models

Neural Information Processing SystemsJun-16-2026, 21:03:28 GMT

This paper presents improved native unified multimodal models, i.e., Show-o2, that leverage autoregressive modeling and flow matching. Built upon a 3D causal variational autoencoder space, unified visual representations are constructed through a dual-path of spatial (-temporal) fusion, enabling scalability across image and video modalities while ensuring effective multimodal understanding and generation. Based on a language model, autoregressive modeling and flow matching are natively applied to the language head and flow head, respectively, to facilitate text token prediction and image/video generation. A two-stage training recipe is designed to effectively learn and scale to larger models. The resulting Show-o2 models demonstrate versatility in handling a wide range of multimodal understanding and generation tasks across diverse modalities, including text, images, and videos. Code and models are released at https://github.com/showlab/Show-o.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Learning Shared Representations from Unpaired Data

Neural Information Processing SystemsJun-16-2026, 19:58:14 GMT

Learning shared representations is a primary area of multimodal representation learning. The current approaches to achieve a shared embedding space rely heavily on paired samples from each modality, which are significantly harder to obtain than unpaired ones. In this work, we demonstrate that shared representations can be learned almost exclusively from unpaired data. Our arguments are grounded in the spectral embeddings of the random walk matrices constructed independently from each unimodal representation. Empirical results in computer vision and natural language processing domains support its potential, revealing the effectiveness of unpaired data in capturing meaningful cross-modal relations, demonstrating high capabilities in retrieval tasks, generation, arithmetics, zero-shot, and cross-domain classification. This work, to the best of our knowledge, is the first to demonstrate these capabilities almost exclusively from unpaired samples, giving rise to a crossmodal embedding that could be viewed as universal, i.e., independent of the specific modalities of the data.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (0.67)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Embeddings as Probabilistic Equivalence in Logic Programs

Neural Information Processing SystemsJun-16-2026, 19:56:51 GMT

The integration of logic programs with embedding models resulted in a class of neurosymbolic frameworks that jointly learn symbolic rules and representations for the symbols in the logic (constant or predicate). The key idea that enabled this integration was the differentiable relaxation of unification, the algorithm for variable instantiation during inference in logic programs. Unlike unification, its relaxed counterpart exploits the similarity between symbols in the embedding space to decide when two symbols are semantically equivalent. We show that this similarity between symbols violates the transitive law of equivalence, leading to undesirable side effects in learning and inference. To alleviate those side effects, we are the first to revamp the well-known possible world semantics of probabilistic logic programs into new semantics called equivalence semantics. In our semantics, a probabilistic logic program induces a probability distribution over all possible equivalence relations between symbols, instead of a probability distribution over all possible subsets of probabilistic facts. We propose a factorization of the equivalence distribution using latent random variables and characterize its expressivity. Additionally, we propose both exact and approximate techniques for reasoning in our semantics. Experiments on well-known benchmarks show that the equivalence semantics leads to neurosymbolic models with up to 42% higher results than state-of-the-art baselines.

artificial intelligence, fuzzy logic, logic & formal reasoning, (17 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.67)

Add feedback