Social networks, online video outweigh traditional media in 2026

The Japan Times

News consumers around the world are now turning more to social media and video platforms than traditional outlets for information, a report said Tuesday, warning that old-style business models are under threat. The year 2026 marks "a significant milestone: for the first time, social media and video network consumption is now ahead of other news sources as the most widely used source of news globally," at 54%, wrote Jim Egan, lead author of the report from the Reuters Institute for the Study of Journalism. The annual report from the institute, attached to the University of Oxford, is a closely watched tracker of trends reshaping the news media. Researchers based their findings on online surveys of almost 100,000 people in 48 countries, run earlier this year by pollster YouGov. This year's edition found 54% of respondents said they got news from social media or video platforms in the week before the survey -- rising to 56% if AI chatbots like ChatGPT were included.


2ea18fdc667e0ef2ad82b2b4d65147ad-Paper-Conference.pdf

Neural Information Processing Systems

Digitizing the physical world into accurate simulation-ready virtual environments offers significant opportunities in a variety of fields such as augmented and virtual reality, gaming, and robotics.


UtilGen: Utility-Centric Generative Data Augmentation with Dual-Level Task Adaptation

Neural Information Processing Systems

Data augmentation using generative models has emerged as a powerful paradigm for enhancing performance in computer vision tasks. However, most existing augmentation approaches primarily focus on optimizing intrinsic data attributes - such as fidelity and diversity - to generate visually high-quality synthetic data, while often neglecting task-specific requirements. Yet, it is essential for data generators to account for the needs of downstream tasks, as training data requirements can vary significantly across different tasks and network architectures. To address these limitations, we propose UTILGEN, a novel utility-centric data augmentation framework that adaptively optimizes the data generation process to produce task-specific, high-utility training data via downstream task feedback. Specifically, we first introduce a weight allocation network to evaluate the task-specific utility of each synthetic sample. Guided by these evaluations, UTILGEN iteratively refines the data generation process using a dual-level optimization strategy to maximize the synthetic data utility: (1) model-level optimization tailors the generative model to the downstream task, and (2) instance-level optimization adjusts generation policies - such as prompt embeddings and initial noise - at each generation round. Extensive experiments on eight benchmark datasets of varying complexity and granularity demonstrate that UTILGEN consistently achieves superior performance, with an average accuracy improvement of 3.87% over previous SOTA. Further analysis of data influence and distribution reveals that UTILGEN produces more impactful and task-relevant synthetic data, validating the effectiveness of the paradigm shift from visual characteristics-centric to task utility-centric data augmentation.


ShoeFit: A New Dataset and Dual-image-stream DiT Framework for Virtual Footwear Try-On

Neural Information Processing Systems

Virtual footwear try-on (VFTON), a critical yet underexplored area in virtual try-on (VTON), aims to synthesize faithful try-on results given diverse footwear and model images while maintaining 3D consistency and texture authenticity. Unlike conventional garment-focused VTON methods, VFTON presents unique challenges due to (1) Data Scarcity, which arises from the difficulty of perfectly matching product shoes with models wearing the identical ones, (2) Viewpoint Misalignment, where the target foot pose and source shoe views are always misaligned, leading to incomplete texture information and detail distortion, and (3) Background-induced Color Distortion, where complex material of footwear interacts with environmental lighting, causing unintended color contamination.


Is PRM Necessary? Problem-Solving RL Implicitly Induces PRM Capability in LLMs

Neural Information Processing Systems

The development of reasoning capabilities represents a critical frontier in large language models (LLMs) research, where reinforcement learning (RL) and process reward models (PRMs) have emerged as predominant methodological frameworks. Contrary to conventional wisdom, empirical evidence from DeepSeek-R1 demonstrates that pure RL training focused on mathematical problem-solving can progressively enhance reasoning abilities without PRM integration, challenging the perceived necessity of process supervision. In this study, we conduct a systematic investigation of the relationship between RL training and PRM capabilities. Our findings demonstrate that problem-solving proficiency and process supervision capabilities represent complementary dimensions of reasoning that co-evolve synergistically during pure RL training. In particular, current PRMs underperform simple baselines like majority voting when applied to state-of-the-art models such as DeepSeek-R1 and QwQ-32B.
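The majority-voting baseline mentioned above can be illustrated with a minimal sketch. The function name `majority_vote` and the representation of sampled answers as strings are assumptions made for illustration; the abstract does not specify how the baseline was implemented.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent final answer among sampled solutions.

    `answers` is a list of final answers (hypothetical format: strings),
    one per sampled reasoning chain. Ties go to the answer seen first.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    # max() returns the first maximal key in insertion order on ties.
    return max(counts, key=counts.get)


# Four sampled solutions to the same problem; two agree on "42".
print(majority_vote(["42", "41", "42", "7"]))
```

No learned reward model is involved here: the baseline only counts agreement among final answers, which is why it serves as a simple point of comparison for PRM-based selection.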


Conformal Prediction Beyond the Seen: A Missing Mass Perspective for Uncertainty Quantification in Generative Models

Neural Information Processing Systems

Uncertainty quantification (UQ) is essential for safe deployment of generative AI models such as large language models (LLMs), especially in high-stakes applications. Conformal prediction (CP) offers a principled uncertainty quantification framework, but classical methods focus on regression and classification, relying on geometric distances or softmax scores, tools that presuppose structured outputs. We depart from this paradigm by studying CP in a query-only setting, where prediction sets must be constructed solely from finite queries to a black-box generative model, introducing a new trade-off between coverage, test-time query budget, and informativeness. We introduce Conformal Prediction with Query Oracle (CPQ), a framework characterizing the optimal interplay between these objectives. Our finite-sample algorithm is built on two core principles: one governs the optimal query policy, and the other defines the optimal mapping from queried samples to prediction sets.
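For contrast with the query-only setting, the classical softmax-based approach that the abstract refers to can be sketched as split conformal prediction for classification. This is not CPQ; the function names, the nonconformity score (one minus the softmax probability), and the toy numbers are assumptions chosen for illustration.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Split-conformal quantile of calibration nonconformity scores.

    Each score is assumed to be 1 - softmax probability of the true label.
    Returns the ceil((n + 1) * (1 - alpha))-th smallest score, or infinity
    when the calibration set is too small for the requested coverage.
    """
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")
    return sorted(cal_scores)[k - 1]


def prediction_set(probs, threshold):
    """Keep every label whose nonconformity score is within the threshold."""
    return {label for label, p in probs.items() if 1.0 - p <= threshold}


cal_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
q = conformal_threshold(cal_scores, alpha=0.2)
print(prediction_set({"a": 0.7, "b": 0.25, "c": 0.05}, q))
```

The sketch depends on having a probability for every candidate label, which is exactly the structured-output assumption that does not hold when a generative model can only be queried for samples.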


DecompNet: Enhancing Time Series Forecasting Models with Implicit Decomposition

Neural Information Processing Systems

Based on this idea, we propose a powerful decomposition-based enhancement framework, namely DecompNet. Our method converts time series decomposition into an implicit process, giving a time series model decomposition-related knowledge during inference even though the model does not actually decompose the input time series. Thus, DecompNet enables a model to inherit the performance gains brought by time series decomposition without introducing any additional inference cost, enhancing model performance while retaining better efficiency. Experimentally, DecompNet exhibits promising enhancement capability and compelling framework generality. In particular, it can also enhance the performance of the latest state-of-the-art models, pushing the performance limit of time series forecasting. Through comprehensive comparisons, DecompNet also shows superior performance and efficiency, making the decomposition-based enhancement framework surpass the well-recognized normalization-based frameworks for the first time.
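As background, the explicit decomposition that such frameworks aim to avoid at inference time is often a moving-average split into trend and remainder. The sketch below shows that conventional step only; it is not DecompNet's implicit mechanism, and the function names, window size, and edge padding are assumptions for illustration.

```python
def moving_average(series, window):
    """Centered moving average; endpoints are repeated to pad the edges."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    return [sum(padded[i:i + window]) / window for i in range(len(series))]


def decompose(series, window=5):
    """Split a series into a smooth trend and the remainder (series - trend)."""
    trend = moving_average(series, window)
    remainder = [x - t for x, t in zip(series, trend)]
    return trend, remainder


trend, remainder = decompose([1.0, 2.0, 3.0, 4.0, 5.0], window=3)
print(trend)
print(remainder)
```

Running this step on every input adds work at inference time, which is the cost the abstract says an implicit approach avoids.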


Dimension-free Score Matching and Time Bootstrapping for Diffusion Models

Neural Information Processing Systems

Diffusion models generate samples by estimating the score function of the target distribution at various noise levels. The model is trained using samples drawn from the target distribution, progressively adding noise. Previous sample complexity bounds have a polynomial dependence on the dimension d, apart from log(|H|), where H is the hypothesis class. In this work, we establish the first (nearly) dimension-free sample complexity bounds, modulo any dependence due to log(|H|), for learning these score functions, achieving a double exponential improvement in dimension over prior results. A key aspect of our analysis is to use a single function approximator to jointly estimate scores across noise levels, a critical feature in practice which enables generalization across timesteps. We introduce a novel martingale-based error decomposition and sharp variance bounds, enabling efficient learning from dependent data generated by Markov processes, which may be of independent interest. Building on these insights, we propose Bootstrapped Score Matching (BSM), a variance reduction technique that utilizes previously learned scores to improve accuracy at higher noise levels. These results provide crucial insights into the efficiency and effectiveness of diffusion models for generative modeling.
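For orientation, the standard denoising score matching objective with a single network shared across noise levels can be written as follows. The additive (variance-exploding) noising process and the symbols $\sigma_t$, $s_\theta$, and $\varepsilon$ are assumptions for illustration; the paper's exact parameterization and its bootstrapped variant may differ.

```latex
% Assumed forward process: Gaussian noise added to a clean sample x_0
x_t = x_0 + \sigma_t \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, I_d)

% Conditional score of the noised sample
\nabla_{x_t} \log p(x_t \mid x_0) = -\frac{x_t - x_0}{\sigma_t^2} = -\frac{\varepsilon}{\sigma_t}

% Denoising score matching loss, with one network s_\theta(\cdot, t) for all noise levels
\mathcal{L}(\theta) = \mathbb{E}_{t,\, x_0,\, \varepsilon}
  \left[ \left\| s_\theta(x_t, t) + \frac{\varepsilon}{\sigma_t} \right\|_2^2 \right]
```

Because $s_\theta$ takes $t$ as an input, the same parameters are trained at every noise level, which corresponds to the joint estimation across timesteps that the abstract highlights.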


Precise Diffusion Inversion: Towards Novel Samples and Few-Step Models

Neural Information Processing Systems

The diffusion inversion problem seeks to recover the latent generative trajectory of a diffusion model given a real image. Faithful inversion is critical for ensuring consistency in diffusion-based image editing. Prior works formulate this task as a fixed-point problem and solve it using numerical methods. However, achieving both accuracy and efficiency remains challenging, especially for few-step models and novel samples. In this paper, we propose PreciseInv, a general-purpose test-time optimization framework that enables fast and faithful inversion in as few as two inference steps.
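The fixed-point formulation used by prior work can be illustrated generically: find x such that x = g(x) by repeated application of g. The sketch below uses a scalar toy map rather than a diffusion model, and the function name, tolerance, and iteration cap are assumptions; it is not PreciseInv and not any specific prior inversion method.

```python
import math


def fixed_point(g, x0, tol=1e-10, max_iter=1000):
    """Iterate x <- g(x) until successive iterates differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        x_next = g(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")


# Toy example: solve x = cos(x), a contraction near its fixed point.
root = fixed_point(math.cos, 1.0)
print(root, math.cos(root))
```

In an inversion setting each evaluation of g would involve a network call, so the number of iterations needed directly affects cost, which is the accuracy-versus-efficiency tension the abstract describes.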


Efficient semantic uncertainty quantification in language models via diversity-steered sampling

Neural Information Processing Systems

Accurately estimating semantic aleatoric and epistemic uncertainties in large language models (LLMs) is particularly challenging in free-form question answering (QA), where obtaining stable estimates often requires many expensive generations. We introduce a diversity-steered sampler that discourages semantically redundant outputs during decoding, covers both autoregressive and masked diffusion paradigms, and yields substantial sample-efficiency gains. The key idea is to inject a continuous semantic-similarity penalty into the model's proposal distribution using a natural language inference (NLI) model lightly fine-tuned on partial prefixes or intermediate diffusion states. We debias downstream uncertainty estimates with importance reweighting and shrink their variance with control variates. Across four QA benchmarks, our method matches or surpasses baselines while covering more semantic clusters with the same number of samples. Being modular and requiring no gradient access to the base LLM, the framework promises to serve as a drop-in enhancement for uncertainty estimation in risk-sensitive model deployments.
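The debiasing step can be illustrated with a generic self-normalized importance sampling estimator: samples drawn from a steered proposal are reweighted by the ratio of target to proposal probability. The function name and the toy probabilities are assumptions for illustration; the paper's actual estimator, including its control variates, is not reproduced here.

```python
def self_normalized_estimate(values, target_probs, proposal_probs):
    """Estimate the target-distribution mean of `values` from proposal samples.

    values[i] is a quantity computed on sample i, target_probs[i] its
    probability under the original model, and proposal_probs[i] its
    probability under the steered sampler that actually produced it.
    """
    if not (len(values) == len(target_probs) == len(proposal_probs)):
        raise ValueError("inputs must have the same length")
    if any(q <= 0 for q in proposal_probs):
        raise ValueError("proposal probabilities must be positive")
    weights = [p / q for p, q in zip(target_probs, proposal_probs)]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


# The proposal under-samples the first output, so it receives a larger weight.
print(self_normalized_estimate([1.0, 0.0], [0.5, 0.5], [0.25, 0.75]))
```

When the proposal equals the target, all weights are equal and the estimate reduces to the plain sample mean, so the correction only acts where the steering changed the sampling distribution.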