Technology
Prior-Guided Diffusion Planning for Offline Reinforcement Learning
Diffusion models have recently gained prominence in offline reinforcement learning due to their ability to effectively learn high-performing, generalizable policies from static datasets. Diffusion-based planners facilitate long-horizon decision-making by generating high-quality trajectories through iterative denoising, guided by return-maximizing objectives. However, existing guided sampling strategies such as Classifier Guidance, Classifier-Free Guidance, and Monte Carlo Sample Selection either produce suboptimal multi-modal actions, struggle with distributional drift, or incur prohibitive inference-time costs. To address these challenges, we propose Prior Guidance (PG), a novel guided sampling framework that replaces the standard Gaussian prior of a behavior-cloned diffusion model with a learnable distribution, optimized via a behavior-regularized objective. PG directly generates high-value trajectories without costly reward optimization of the diffusion model itself, and eliminates the need to sample multiple candidates at inference for sample selection. We present an efficient training strategy that applies behavior regularization in latent space, and empirically demonstrate that PG outperforms state-of-the-art diffusion policies and planners across diverse long-horizon offline RL benchmarks. Our code is available at https://github.com/ku-dmlab/PG.
FlowerTune: A Cross-Domain Benchmark for Federated Fine-Tuning of Large Language Models
Large Language Models (LLMs) have achieved state-of-the-art results across diverse domains, yet their development remains reliant on vast amounts of publicly available data, raising concerns about data scarcity and the lack of access to domain-specific, sensitive information. Federated Learning (FL) presents a compelling framework to address these challenges by enabling decentralized fine-tuning of pre-trained LLMs without sharing raw data. However, the compatibility and performance of pre-trained LLMs in FL settings remain largely underexplored. We introduce the FlowerTune LLM Leaderboard, a first-of-its-kind benchmarking suite designed to evaluate federated fine-tuning of LLMs across four diverse domains: general NLP, finance, medical, and coding. Each domain includes federated instruction-tuning datasets and domain-specific evaluation metrics. Our results, obtained through a collaborative, open-source and community-driven approach, provide the first comprehensive comparison across 26 pre-trained LLMs with different aggregation and fine-tuning strategies under federated settings, offering actionable insights into model performance, resource constraints, and domain adaptation. This work lays the foundation for developing privacy-preserving, domain-specialized LLMs for real-world applications.
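As a rough illustration of what an aggregation strategy does in federated fine-tuning, the sketch below averages per-client parameter updates weighted by each client's number of training examples, in the style of federated averaging. The abstract does not say which aggregation strategies the benchmark compares, so this choice is an assumption, and the function name `federated_average` and the parameter key `adapter.weight` are hypothetical.

```python
from typing import Dict, List

# One client's trainable parameters: name -> flat list of values.
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights], num_examples: List[int]) -> Weights:
    """Example-weighted average of client parameters (federated-averaging style).

    Assumption: every client reports the same parameter names and shapes.
    Only these parameters leave the client; the raw data never does.
    """
    if not client_weights or len(client_weights) != len(num_examples):
        raise ValueError("need one example count per client")
    total = sum(num_examples)
    if total <= 0:
        raise ValueError("total number of examples must be positive")

    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, num_examples)) / total
            for i in range(length)
        ]
    return averaged


# Hypothetical usage: two clients holding 1 and 3 training examples.
clients = [{"adapter.weight": [1.0, 2.0]}, {"adapter.weight": [3.0, 6.0]}]
global_weights = federated_average(clients, [1, 3])
```

Weighting by example count lets clients with more data pull the shared model further; other strategies may weight or clip updates differently.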
Object State Recognition [Figure: initial state, transitioning state, end state. LLM prompt: "Please provide the initial, transitioning, and end states for slicing a lemon"]
Recognizing the physical states of objects and their transformations within videos is crucial for structured video understanding and enabling robust real-world applications, such as robotic manipulation. However, pretrained vision-language models often struggle to capture these nuanced dynamics and their temporal context, and specialized object state recognition frameworks may not generalize to unseen actions or objects. We introduce SAGE (State-Action Graph Embeddings), a novel framework that offers a unified model of physical state transitions by decomposing states into fine-grained, language-described visual concepts that are sharable across different objects and actions. SAGE initially leverages Large Language Models to construct a State-Action Graph, which is then multimodally refined using Vision-Language Models. Extensive experiments show that our method significantly outperforms baselines and generalizes effectively to unseen objects and actions in open-world settings. SAGE improves the prior state-of-the-art by as much as 14.6% on novel state recognition with less than 5% of its inference time.
Ranking-based Preference Optimization for Diffusion Models from Implicit User Feedback
Direct preference optimization (DPO) methods have shown strong potential in aligning text-to-image diffusion models with human preferences by training on paired comparisons. These methods improve training stability by avoiding the REINFORCE algorithm but still struggle with challenges such as accurately estimating image probabilities due to the non-linear nature of the sigmoid function and the limited diversity of offline datasets. In this paper, we introduce Diffusion Denoising Ranking Optimization (Diffusion-DRO), a new preference learning framework grounded in inverse reinforcement learning. Diffusion-DRO removes the dependency on a reward model by casting preference learning as a ranking problem, thereby simplifying the training objective into a denoising formulation and overcoming the non-linear estimation issues found in prior methods. Moreover, Diffusion-DRO uniquely integrates offline expert demonstrations with online policy-generated negative samples, enabling it to effectively capture human preferences while addressing the limitations of offline data. Comprehensive experiments show that Diffusion-DRO delivers improved generation quality across a range of challenging and unseen prompts, outperforming state-of-the-art baselines in both quantitative metrics and user studies.
Israel launches fresh strikes on Lebanon despite Trump criticism
Israeli forces have carried out new strikes in southern Lebanon, state media say, despite renewed criticism from US President Donald Trump of Israel's actions in the country. Israeli drone strikes injured several people in Mansouri and Aaziyyeh on Wednesday, while jets attacked Nabatieh al-Fawqa and Kfar Tebnit, Lebanon's National News Agency reported. Israel's military has not commented, but it did say five soldiers were injured in a drone attack in Lebanon by the Iran-backed armed group Hezbollah. Mediator Pakistan has said the deal between the US and Iran to end the war includes Lebanon. On Tuesday, Trump said Israel's prime minister needed to be more responsible with respect to Lebanon.
Will it take a 'Chernobyl-scale disaster' for us to regulate cyber weapons of mass destruction? Stuart Russell
'The CEOs are telling us, "We're on track to create superhuman intelligence, which has a good chance of causing human extinction."' The AI company Anthropic has been making major headlines recently. Its trillion-dollar IPO plan and its blood feud with secretary of defense Pete Hegseth have attracted much attention, but two other events may be even more consequential.
Interactive. Violent. Gross. Inside Fishtank, the Unhinged Future of Reality TV
WIRED goes on location--and on camera--with the cult hit. On March 16, 2026, at 5:45 pm in a leafy suburb of Atlanta called Sandy Springs, police pound on the door of a neglected French Country-style mansion, rifles at the ready, bodycams rolling. Minutes earlier, a distress call came from someone claiming to be hiding from a gunman in the mansion's downstairs bathroom. The dispatcher heard a gunshot ring out in the distance, then the line disconnected. "Open the door!" an officer yells. A calm young man with a mullet and woolly eyebrows steps out, hands raised. The police ask him who else is in the house. "Just my friends," he replies, as seven other young people, men and women, silently file out behind him, less evidently relaxed. They remain outside while two officers search the house. Inside the mansion there are no immediate signs of a massacre, but the decor alone arouses suspicion. All of the windows are frosted over, so only a chilly light leaks in. The place is a mess, and the walls are adorned with lurid, seemingly AI-generated art: a frowning baby holding an assault rifle, a rubber ducky bobbing in a mug of what looks like black coffee, a lidless and levitating eyeball crying into a martini glass. The rooms are painted primary colors, grass green and cherry red, like a kindergarten class. A vape dangles from a doorframe by a chain, suspended at mouth level. The pantry is practically empty. The bedroom is a dormitory featuring seven identical twin beds. No one is hiding in the bathroom. The call, it seems, was a prank. The police return to the driveway and ask, "What is it that you guys are doing here?" "We're just livestreaming," says a man in a camo hat named Matt. "You guys don't have any firearms or anything inside the house?" There are guns in the house, Matt says, for self-defense. Fans of their livestream can be obsessive, he explains, and tend to have perverse ideas about jokes. The officer asks to see their weapons, and they go downstairs. 
The room is cluttered with ergonomic swivel chairs, desks strewn with takeout containers and energy drinks, two flatscreen TVs, and a dozen computer monitors.
Fair Matroid Selection
We investigate the problem of sequentially selecting elements of an unknown matroid in an online manner to form an independent set, with the goal of maximizing the minimum probability of acceptance across all elements, a property we define as f-fairness. Under adversarial arrival orders, we design an α(ln k + 1)-fair algorithm, where α is the arboricity of the matroid and k is the rank, a result that is nearly optimal. For laminar matroids, we develop a (2α - 1)-fair algorithm, which is optimal up to constant factors, achieved through a novel online coloring scheme. In the random arrival order setting, we achieve a (4 + o(1))α-fair algorithm for graphic matroids, matching the optimal result up to constant factors, relying on a novel technique for learning a degeneracy ordering using a sampled subset of edges. We further generalize our result to p-matchoids, obtaining a β(p ln k + 1)-fair algorithm for the adversarial arrival model, where β is the optimal offline fairness. Notably, all our results can be extended to a setting with no prior knowledge of the matroid with only a logarithmic increase in the fairness factor.
Riemannian Flow Matching for Brain Connectivity Matrices via Pullback Geometry
Generating realistic brain connectivity matrices is key to analyzing population heterogeneity in brain organization, understanding disease, and augmenting data in challenging classification problems. Functional connectivity matrices lie in constrained spaces--such as the set of symmetric positive definite or correlation matrices--that can be modeled as Riemannian manifolds. However, using Riemannian tools typically requires redefining core operations (geodesics, norms, integration), making generative modeling computationally inefficient. In this work, we propose DIFFEOCFM, an approach that enables conditional flow matching (CFM) on matrix manifolds by exploiting pullback metrics induced by global diffeomorphisms on Euclidean spaces. We show that Riemannian CFM with such metrics is equivalent to applying standard CFM after data transformation. This equivalence allows efficient vector field learning, and fast sampling with standard ODE solvers.
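The stated equivalence (Riemannian CFM under a pullback metric equals standard CFM after a data transformation) can be illustrated with a minimal sketch. It assumes the matrix logarithm as the global diffeomorphism from symmetric positive definite (SPD) matrices to symmetric matrices; the abstract does not name the maps the authors use, so this choice and the helper names `spd_log`, `sym_exp`, and `cfm_pair` are assumptions made for illustration only.

```python
import numpy as np


def spd_log(m: np.ndarray) -> np.ndarray:
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(m)
    return (vecs * np.log(vals)) @ vecs.T


def sym_exp(s: np.ndarray) -> np.ndarray:
    """Matrix exponential of a symmetric matrix; the result is always SPD."""
    vals, vecs = np.linalg.eigh(s)
    return (vecs * np.exp(vals)) @ vecs.T


def cfm_pair(x0: np.ndarray, x1: np.ndarray, t: float):
    """Standard (Euclidean) CFM training pair computed in log coordinates.

    Returns the point on the straight line between the transformed endpoints
    at time t, and the constant target velocity a vector field would regress.
    """
    z0, z1 = spd_log(x0), spd_log(x1)
    z_t = (1.0 - t) * z0 + t * z1
    return z_t, z1 - z0


# Hypothetical usage: interpolate between two small SPD matrices.
x0 = np.eye(2)
x1 = np.array([[2.0, 1.0], [1.0, 2.0]])
z_half, velocity = cfm_pair(x0, x1, 0.5)
x_half = sym_exp(z_half)  # mapped back, the midpoint is again SPD
```

Because all interpolation happens in a flat space, an ordinary ODE solver can integrate the learned field there, and a single call to `sym_exp` returns each sample to the manifold.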