
You Only Communicate Once: One-shot Federated Low-Rank Adaptation of MLLM

Neural Information Processing Systems

Multimodal Large Language Models (MLLMs) with Federated Learning (FL) can quickly adapt to privacy-sensitive tasks, but face significant challenges such as high communication costs and increased attack risks, due to their reliance on multi-round communication. To address this, One-shot FL (OFL) has emerged, aiming to complete adaptation in a single client-server communication. However, existing adaptive ensemble OFL methods still need more than one round of communication, because correcting heterogeneity-induced local bias relies on aggregated global supervision, so they do not achieve true one-shot communication. In this work, we make the first attempt to achieve true one-shot communication for MLLMs under OFL, by investigating whether implicit (i.e., initial rather than aggregated) global supervision alone can effectively correct local training bias. Our key finding from the empirical study is that imposing directional supervision on local training substantially mitigates client conflicts and local bias. Building on this insight, we propose YOCO, in which directional supervision with sign-regularized LoRA B enforces global consistency, while sparsely regularized LoRA A preserves client-specific adaptability. Experiments demonstrate that YOCO cuts communication to 0.03% of multi-round FL while surpassing those methods in several multimodal scenarios and consistently outperforming all one-shot competitors.


$O(\sqrt{T})$ Static Regret and Instance Dependent Constraint Violation for Constrained Online Convex Optimization

Neural Information Processing Systems

The constrained version of the standard online convex optimization (OCO) framework, called COCO, is considered, where on every round, a convex cost function and a convex constraint function are revealed to the learner after it chooses the action for that round. The objective is to simultaneously minimize the static regret and the cumulative constraint violation (CCV). An algorithm is proposed that guarantees a static regret of $O(\sqrt{T})$ and a CCV of $\min\{V, O(\sqrt{T}\log T)\}$, where $V$ depends on the distance between the consecutively revealed constraint sets, the shape of the constraint sets, the dimension of the action space, and the diameter of the action space. When the constraint sets have additional structure, $V = O(1)$. Compared with the state-of-the-art results (static regret of $O(\sqrt{T})$ and CCV of $O(\sqrt{T}\log T)$), which are universal, the new CCV bound is instance dependent; it is derived by exploiting the geometric properties of the constraint sets.
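To make the two performance measures concrete, the sketch below computes static regret and CCV on a toy one-dimensional problem. Everything specific here is an assumption made for illustration: the action set [-1, 1], the quadratic costs f_t(x) = (x - a_t)^2, the constraints g_t(x) = x - b_t <= 0 with every b_t >= -1, and the penalised projected gradient step in `run_ogd`. That baseline is a generic stand-in, not the algorithm proposed in the paper.

```python
import math


def clip(x, lo, hi):
    """Project x onto the interval [lo, hi]."""
    return max(lo, min(hi, x))


def run_ogd(a, b, penalty=1.0):
    """Penalised projected online gradient descent (illustrative baseline only).

    The action x_t is committed before f_t and g_t are revealed.
    """
    T = len(a)
    eta = 1.0 / math.sqrt(T)  # step size on the usual 1/sqrt(T) scale
    x = 0.0
    actions = []
    for t in range(T):
        actions.append(x)
        grad = 2.0 * (x - a[t])      # gradient of f_t(x) = (x - a_t)^2
        if x - b[t] > 0.0:           # subgradient of max(g_t(x), 0)
            grad += penalty
        x = clip(x - eta * grad, -1.0, 1.0)
    return actions


def static_regret(actions, a, b):
    """Cost of the played actions minus the cost of the best fixed action
    that satisfies every constraint (assumes min(b) >= -1)."""
    upper = min(1.0, min(b))
    x_star = clip(sum(a) / len(a), -1.0, upper)
    played = sum((x - at) ** 2 for x, at in zip(actions, a))
    best = sum((x_star - at) ** 2 for at in a)
    return played - best


def ccv(actions, b):
    """Cumulative constraint violation: sum of max(g_t(x_t), 0)."""
    return sum(max(x - bt, 0.0) for x, bt in zip(actions, b))
```

Calling `run_ogd(a, b)` and passing the returned actions to `static_regret` and `ccv` gives the two quantities that the bounds in the abstract refer to.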


MS-GS: Multi-Appearance Sparse-View 3D Gaussian Splatting in the Wild

Neural Information Processing Systems

In-the-wild photo collections often contain limited volumes of imagery and exhibit multiple appearances, e.g., taken at different times of day or seasons, posing significant challenges to scene reconstruction and novel view synthesis. Although recent adaptations of Neural Radiance Field (NeRF) and 3D Gaussian Splatting (3DGS) have improved in these areas, they tend to oversmooth and are prone to overfitting. In this paper, we present MS-GS, a novel framework designed with Multi-appearance capabilities in Sparse-view scenarios using 3DGS. To address the lack of support due to sparse initializations, our approach is built on the geometric priors elicited from monocular depth estimations. The key lies in extracting and utilizing local semantic regions with a Structure-from-Motion (SfM) point-anchored algorithm for reliable alignment and geometry cues. Then, to introduce multi-view constraints, we propose a series of geometry-guided supervision steps at virtual views in pixel and feature levels to encourage 3D consistency and reduce overfitting. We also introduce a dataset and an in-the-wild experiment setting to set up more realistic benchmarks. We demonstrate that MS-GS achieves photorealistic renderings under various challenging sparse-view and multi-appearance conditions, and outperforms existing approaches significantly across different datasets.


Who will win the World Cup? Mathematician's 11 models predict four possible champions (but NOT England!)

Daily Mail - Science & tech

England's World Cup journey begins tonight, but a mathematician warns that fans shouldn't get their hopes up. Dr Ari Joury, a particle physicist and founder of AI firm Wangari, created 11 different models to predict who will win this year's tournament. These digital tipsters crowned four different champions between them, but not a single one picked England. Seven models backed Spain, two singled out Argentina as the likeliest winner, while France and the Netherlands were each the favourite of one prediction system.


Text-Aware Real-World Image Super-Resolution via Diffusion Model with Joint Segmentation Decoders

Neural Information Processing Systems

The introduction of generative models has significantly advanced image super-resolution (SR) in handling real-world degradations. However, they often incur fidelity-related issues, particularly distorting textual structures. In this paper, we introduce a novel diffusion-based SR framework, namely TADiSR, which integrates text-aware attention and joint segmentation decoders to recover not only natural details but also the structural fidelity of text regions in degraded real-world images. Moreover, we propose a complete pipeline for synthesizing high-quality images with fine-grained full-image text masks, combining realistic foreground text regions with detailed background content. Extensive experiments demonstrate that our approach substantially enhances text legibility in super-resolved images, achieving state-of-the-art performance across multiple evaluation metrics and exhibiting strong generalization to real-world scenarios. Our code is available here.


ReDit: Reward Dithering for Improved LLM Policy Optimization

Neural Information Processing Systems

DeepSeek-R1 has successfully enhanced the reasoning capabilities of Large Language Models (LLMs) through its rule-based reward system. While it's a "perfect" reward system that effectively mitigates reward hacking, such reward functions are often discrete. Our experimental observations suggest that discrete rewards can lead to gradient anomalies, unstable optimization, and slow convergence. To address this issue, we propose ReDit (Reward Dithering), a method that dithers the discrete reward signal by adding simple random noise. With this perturbed reward, exploratory gradients are continuously provided throughout the learning process, enabling smoother gradient updates and accelerating convergence.
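As a rough illustration of the dithering idea, the sketch below adds zero-mean Gaussian noise to a batch of discrete rewards. The function names, the noise scale `sigma`, and the group-normalised advantage used to show the effect are assumptions made for this example; the abstract only states that simple random noise is added to the discrete reward.

```python
import random


def dither_rewards(rewards, sigma=0.05, seed=None):
    """Return a copy of `rewards` with zero-mean Gaussian noise added.

    `sigma` is a hypothetical noise scale chosen for illustration.
    """
    rng = random.Random(seed)
    return [r + rng.gauss(0.0, sigma) for r in rewards]


def group_advantages(rewards):
    """Mean-centred, std-normalised advantages within one group of samples.

    Returns all zeros when every reward is identical, which is the case
    where a discrete reward gives the policy no learning signal.
    """
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    if var == 0.0:
        return [0.0] * n
    std = var ** 0.5
    return [(r - mean) / std for r in rewards]


# Every sampled answer in this group received the same discrete reward.
raw = [1.0, 1.0, 1.0, 1.0]
flat = group_advantages(raw)                         # no signal at all
noisy = group_advantages(dither_rewards(raw, seed=0))  # small, varied signal
```

Under these assumptions, the undithered group yields zero advantages for every sample, while the dithered group yields small non-zero values, which is one way to read the abstract's claim that perturbed rewards keep gradients flowing.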


Smooth Quadratic Prediction Markets

Neural Information Processing Systems

When agents trade in a Duality-based Cost Function prediction market, they collectively implement the learning algorithm Follow-The-Regularized-Leader [Abernethy et al., 2013]. We ask whether other learning algorithms could be used to inspire the design of prediction markets. By decomposing and modifying the Duality-based Cost Function Market Maker's (DCFMM) pricing mechanism, we propose a new prediction market, called the Smooth Quadratic Prediction Market, that incentivizes agents to collectively implement general steepest gradient descent. Relative to the DCFMM, the Smooth Quadratic Prediction Market has a better worst-case monetary loss for AD securities while preserving axiom guarantees such as the existence of instantaneous price, information incorporation, expressiveness, no arbitrage, and a form of incentive compatibility. To motivate the application of the Smooth Quadratic Prediction Market, we independently examine agents' trading behavior under two realistic constraints: bounded budgets and buy-only securities. Finally, we provide an introductory analysis of an approach to facilitate adaptive liquidity using the Smooth Quadratic Prediction Market. Our results suggest future designs where the price update rule is separate from the fee structure, yet guarantees are preserved.
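For readers unfamiliar with cost-function market makers, the sketch below implements the logarithmic market scoring rule (LMSR), a standard example of the cost-function family that the abstract takes as its baseline. It is not the Smooth Quadratic Prediction Market; the function names and the liquidity value `b = 10.0` are assumptions chosen for this example.

```python
import math


def lmsr_cost(q, b=10.0):
    """LMSR cost C(q) = b * log(sum_i exp(q_i / b)), computed stably."""
    m = max(q)
    return m + b * math.log(sum(math.exp((qi - m) / b) for qi in q))


def lmsr_prices(q, b=10.0):
    """Instantaneous prices: the gradient of C, i.e. a softmax of q / b."""
    m = max(q)
    weights = [math.exp((qi - m) / b) for qi in q]
    total = sum(weights)
    return [w / total for w in weights]


def trade_cost(q, delta, b=10.0):
    """Amount a trader pays to move outstanding shares from q to q + delta."""
    new_q = [qi + di for qi, di in zip(q, delta)]
    return lmsr_cost(new_q, b) - lmsr_cost(q, b)


# Two-outcome market with no shares sold yet: both prices start at 0.5.
q = [0.0, 0.0]
start_prices = lmsr_prices(q)
# Buying one share of outcome 0 costs slightly more than 0.5
# and pushes that outcome's price above 0.5.
cost_one_share = trade_cost(q, [1.0, 0.0])
```

The prices always sum to one, and buying one share of every outcome costs exactly one unit, which is the no-arbitrage behaviour listed among the axioms above.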


REOBench: Benchmarking Robustness of Earth Observation Foundation Models

Neural Information Processing Systems

Earth observation foundation models have shown strong generalization across multiple Earth observation tasks, but their robustness under real-world perturbations remains underexplored. To bridge this gap, we introduce REOBench, the first comprehensive benchmark for evaluating the robustness of Earth observation foundation models across six tasks and twelve types of image corruptions, including both appearance-based and geometric perturbations. To ensure realistic and fine-grained evaluation, our benchmark focuses on high-resolution optical remote sensing images, which are widely used in critical applications such as urban planning and disaster response. We conduct a systematic evaluation of a broad range of models trained using masked image modeling, contrastive learning, and vision-language pre-training paradigms. Our results reveal that existing Earth observation foundation models experience significant performance degradation when exposed to input corruptions. The severity of degradation varies across tasks, model architectures, backbone sizes, and types of corruption, with performance drop varying from less than 1% to over 25%. Vision-language models show enhanced robustness, particularly in multimodal tasks. REOBench underscores the vulnerability of current Earth observation foundation models to real-world corruptions and provides actionable insights for developing more robust and reliable models. Code and data are publicly available at https://github.com/lx709/REOBench.


Learning Simple Interpolants for Linear Integer Arithmetic

Neural Information Processing Systems

Craig interpolation plays a central role in formal verification tasks such as model checking, invariant generation, and abstraction refinement. In the domain of linear integer arithmetic (LIA), interpolants are crucial for deriving inductive invariants that characterize unreachable or safe program states, enabling scalable and precise reasoning about software and hardware correctness. Despite progress in interpolation algorithms, generating concise and interpretable interpolants remains a key challenge. We propose a lightweight learning-based approach to generating simple interpolants for LIA. Our model learns to lazily sample input problems directly and is complementary to existing logical methods. We show that when Z3 is guided by our learned model, the complexity of the interpolants it produces can be reduced by up to 47.3%. For older solvers, the reduction rate can reach up to 69.1%.


RepLDM: Reprogramming Pretrained Latent Diffusion Models for High-Quality, High-Efficiency, High-Resolution Image Generation

Neural Information Processing Systems

While latent diffusion models (LDMs), such as Stable Diffusion, are designed for high-resolution (HR) image generation, they often struggle with significant structural distortions when generating images at resolutions higher than their training one. Instead of relying on extensive retraining, a more resource-efficient approach is to reprogram the pretrained model for HR image generation; however, existing methods often result in poor image quality and long inference time. We introduce RepLDM, a novel reprogramming framework for pretrained LDMs that enables high-quality, high-efficiency, high-resolution image generation; see Fig. 1. RepLDM consists of two stages: (i) an attention guidance stage, which generates a latent representation of a higher-quality training-resolution image using a novel training-free self-attention mechanism to enhance the structural consistency; and (ii) a progressive upsampling stage, which progressively performs upsampling in pixel space to mitigate the severe artifacts caused by latent space upsampling.