Goto

Collaborating Authors

 Technology


Overleaf Example

Neural Information Processing Systems

Vision-Language Models (VLMs) acquire real-world knowledge and general reasoning ability through Internet-scale image-text corpora. They can augment robotic systems with scene understanding and task planning, and assist visuomotor policies that are trained on robot trajectory data. We explore the reverse paradigm -- using rich, real, multi-modal robot trajectory data to enhance and evaluate VLMs.


Unifying Attention Heads and Task Vectors via Hidden State Geometry in In-Context Learning

Neural Information Processing Systems

The unusual properties of in-context learning (ICL) have prompted investigations into the internal mechanisms of large language models. Prior work typically focuses on either special attention heads or task vectors at specific layers, but lacks a unified framework linking these components to the evolution of hidden states across layers that ultimately produce the model's output. In this paper, we propose such a framework for ICL primarily in classification tasks by analyzing two geometric factors that govern performance: the separability and alignment of query hidden states. A fine-grained analysis of layer-wise dynamics reveals a striking two-stage mechanism--separability emerges in early layers, while alignment develops in later layers. Ablation studies further show that Previous Token Heads drive separability, while Induction Heads and task vectors enhance alignment. Our findings thus bridge the gap between attention heads and task vectors, offering a unified account of ICL's underlying mechanisms.1


ImageSentinel: Protecting Visual Datasets from Unauthorized Retrieval-Augmented Image Generation

Neural Information Processing Systems

The widespread adoption of Retrieval-Augmented Image Generation (RAIG) has raised significant concerns about the unauthorized use of private image datasets. While these systems have shown remarkable capabilities in enhancing generation quality through reference images, protecting visual datasets from unauthorized use in such systems remains a challenging problem. Traditional digital watermarking approaches face limitations in RAIG systems, as the complex feature extraction and recombination processes fail to preserve watermark signals during generation. To address these challenges, we propose ImageSentinel, a novel framework for protecting visual datasets in RAIG. Our framework synthesizes sentinel images that maintain visual consistency with the original dataset. These sentinels enable protection verification through randomly generated character sequences that serve as retrieval keys. To ensure seamless integration, we leverage vision-language models to generate the sentinel images. Experimental results demonstrate that ImageSentinel effectively detects unauthorized dataset usage while preserving generation quality for authorized applications.


TrajAgent: An LLM-Agent Framework for Trajectory Modeling via Large-and-Small Model Collaboration

Neural Information Processing Systems

Trajectory modeling, which includes research on trajectory data pattern mining and future prediction, has widespread applications in areas such as life services, urban transportation, and public administration. Numerous methods have been proposed to address specific problems within trajectory modeling. However, the heterogeneity of data and the diversity of trajectory tasks make effective and reliable trajectory modeling an important yet highly challenging endeavor, even for domain experts. In this paper, we propose TrajAgent, a agent framework powered by large language models (LLMs), designed to facilitate robust and efficient trajectory modeling through automation modeling. This framework leverages and optimizes diverse specialized models to address various trajectory modeling tasks across different datasets effectively. In TrajAgent, we first develop UniEnv, an execution environment with a unified data and model interface, to support the execution and training of various models. Building on UniEnv, we introduce an agentic workflow designed for automatic trajectory modeling across various trajectory tasks and data. Furthermore, we introduce collaborative learning schema between LLM-based agents and small speciallized models, to enhance the performance of the whole framework effectively. Extensive experiments on four tasks using four real-world datasets demonstrate the effectiveness of TrajAgent in automated trajectory modeling, achieving a performance improvement of 2.38%-69.91%


Continuous Thought Machines

Neural Information Processing Systems

Biological brains demonstrate complex neural activity, where neural dynamics are critical to how brains process information. Most artificial neural networks ignore the complexity of individual neurons .


LinPrim: Linear Primitives for Differentiable Volumetric Rendering

Neural Information Processing Systems

Volumetric rendering has become central to modern novel view synthesis methods, which use differentiable rendering to optimize 3D scene representations directly from observed views. While many recent works build on NeRF [18] or 3DGaussians [13], we explore an alternative volumetric scene representation. More specifically, we introduce two new scene representations based on linear primitives--octahedra and tetrahedra--both of which define homogeneous volumes bounded by triangular faces. To optimize these primitives, we present a differentiable rasterizer that runs efficiently on GPUs, allowing end-to-end gradientbased optimization while maintaining real-time rendering capabilities. Through experiments on real-world datasets, we demonstrate comparable performance to state-of-the-art volumetric methods while requiring fewer primitives to achieve similar reconstruction fidelity. Our findings deepen the understanding of 3D representations by providing insights into the fidelity and performance characteristics of transparent polyhedra and suggest that adopting novel primitives can expand the available design space. 1


This compact 100W charger hits its lowest price yet at 33

PCWorld

When you purchase through links in our articles, we may earn a small commission. Ugreen's Nexode 100W charger is down to $33.23 from $54.99, the lowest price we've seen for this four-port charger. One thing worth noting, though: in recent months, it has topped out closer to $43, so the real discount isn't quite as Amazon suggests. The big appeal of this charger is just how much power it delivers from such a small device. Thanks to GaN technology, you get a full 100W of power from a charger that fits into the palm of your hand.


RoPECraft: Training-Free Motion Transfer with Trajectory-Guided RoPE Optimization on Diffusion Transformers

Neural Information Processing Systems

We propose RoPECraft, a training-free video motion transfer method for diffusion transformers that operates solely by modifying their rotary positional embeddings (RoPE). We first extract dense optical flow from a reference video, and utilize the resulting motion offsets to warp the complex-exponential tensors of RoPE, effectively encoding motion into the generation process. These embeddings are then further optimized during denoising time steps via trajectory alignment between the predicted and target velocities using a flow-matching objective. To keep the output faithful to the text prompt and prevent duplicate generations, we incorporate a regularization term based on the phase components of the reference video's Fourier transform, projecting the phase angles onto a smooth manifold to suppress highfrequency artifacts. Experiments on benchmarks reveal that RoPECraft outperforms all recently published methods, both qualitatively and quantitatively.


Enhancing Contrastive Learning with Variable Similarity

Neural Information Processing Systems

Contrastive learning has achieved remarkable success in self-supervised learning by pretraining a generalizable feature representation based on the augmentation invariance. Most existing approaches assume that different augmented views of the same instance (i.e., the positive pairs) remain semantically invariant. However, the augmentation results with varying extent may introduce semantic discrepancies or even content distortion, and thus the conventional (pseudo) supervision from augmentation invariance may lead to misguided learning objectives. In this paper, we propose a novel method called Contrastive Learning with Variable Similarity (CLVS) to accurately characterize the intrinsic similarity relationships between different augmented views. Our method dynamically adjusts the similarity based on the augmentation extent, and it ensures that strongly augmented views are always assigned lower similarity scores than weakly augmented ones. We provide a theoretical analysis to guarantee the effectiveness of the variable similarity in improving model generalizability. Extensive experiments demonstrate the superiority of our approach, achieving gains of 2.1% on ImageNet-100 and 1.4% on ImageNet-1k compared with the state-of-the-art methods.


The animal with a face 'not even a mother would love': Elusive goblin shark is seen alive in its natural habitat for the first time

Daily Mail - Science & tech

Former Olympian seen in handcuffs as Trump threatens'years in jail' and more arrests after vandals SABOTAGE Reflecting Pool with'corrosive and destructive chemicals' Angelina Jolie's son Pax, 22, surfaces in LA after bombshell revelation about his relationship to Brad Pitt Mortifying truth about Clavicular's'botched' nose job: Infertile influencer's'trans' admission to friends... as insider reveals what's said behind closed doors - and twisted secrets that'll leave fans floored Keir Starmer'will announce as early as Monday that he is quitting as Prime Minister' after spending weekend locked in tense talks about his future with his wife Victoria at Chequers Inside America's new fattest town: Burgers are the size of your head, gyms lie empty and custom mobility scooters carry 800lb loads... as we investigate why Ozempic just DOESN'T work Call me cynical, but the real reason Gruesome Twosome Harry and Meghan are returning to the UK is just so obvious... and highly humiliating: MAUREEN CALLAHAN I lost 50lb without jabs using this easy but overlooked method. But I still felt dowdy - until I discovered these expert anti-ageing fashion and beauty tips. Giorgia Meloni rips'senseless' attacks from Trump as Italian Prime Minister refuses to back down amid G7 feud No one can see the real reason Jelly Roll divorced Bunnie XO. Wyndham Clark's stunning girlfriend pays tribute to polarizing golfer as he stands on the brink of US Open glory TV star mom, 46, who appeared on'quitting everything to change your life' show died in fire at luxury Caribbean beach resort that sent 1,700 tourists running for their lives Embattled Alexi Lalas makes controversial World Cup declaration amid tension with Fox colleagues: 'Makes you look like a weak poser' Scientists propose radical new theory of consciousness - and claim it doesn't depend on flesh and blood Stingy fast food giant named America's favorite restaurant AGAIN... and experts think they know why Blake Lively runs errands in frumpy outfit after reconciling with ex-BFF Taylor Swift... miles away from reported'bachelorette party' Grace Kelly's lookalike granddaughter, 27, wows in bikini snaps...as she packs on the PDA during beach getaway The animal with a face'not even a mother would love': Elusive goblin shark is seen alive in its natural habitat for the first time A goblin shark has been seen alive in its natural habitat - not just once, but twice. Scientists were reviewing footage captured at a seamount near Jarvis Island in 2019 when they spotted the elusive animal.