Media
InFlux: A Benchmark for Self-Calibration of Dynamic Intrinsics of Video Cameras
Accurately tracking camera intrinsics is crucial for achieving 3D understanding from 2D video. However, most 3D algorithms assume that camera intrinsics stay constant throughout a video, which is often untrue for real-world, in-the-wild videos. A major obstacle in this field is a lack of dynamic camera intrinsics benchmarks: existing benchmarks typically offer limited diversity in scene content and intrinsics variation, and none provide per-frame intrinsic changes for consecutive video frames. In this paper, we present Intrinsics in Flux (InFlux), a real-world benchmark that provides per-frame ground truth intrinsics annotations for videos with dynamic intrinsics. Compared to prior benchmarks, InFlux captures a wider range of intrinsic variations and scene diversity, featuring 143K+ annotated frames from 386 high-resolution indoor and outdoor videos with dynamic camera intrinsics. To ensure accurate per-frame intrinsics, we build a comprehensive lookup table of calibration experiments and extend the Kalibr toolbox to improve its accuracy and robustness. Using our benchmark, we evaluate existing baseline methods for predicting camera intrinsics and find that most struggle to achieve accurate predictions on videos with dynamic intrinsics. For the dataset, code, videos, and submission, please visit https://influx.cs.princeton.edu/.
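To make the notion of dynamic intrinsics concrete, the following is a minimal sketch, not taken from the InFlux code, of standard pinhole projection with a different intrinsics matrix per frame. The focal lengths, principal point, and function names are illustrative assumptions; lens distortion is ignored.

```python
import numpy as np

def intrinsics_matrix(fx, fy, cx, cy):
    """Build a 3x3 pinhole intrinsics matrix K (no skew, no distortion)."""
    return np.array([[fx, 0.0, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]])

def project(K, point_cam):
    """Project a 3D point in camera coordinates to pixel coordinates."""
    uvw = K @ point_cam
    return uvw[:2] / uvw[2]

# A fixed 3D point seen while the (hypothetical) lens zooms in frame by frame.
point = np.array([0.5, 0.25, 2.0])
per_frame_focal = [1000.0, 1200.0, 1500.0]  # assumed values, in pixels

pixels = [project(intrinsics_matrix(f, f, 960.0, 540.0), point)
          for f in per_frame_focal]
for f, uv in zip(per_frame_focal, pixels):
    print(f"focal={f:.0f}px -> pixel={uv}")
```

The same 3D point lands on a different pixel in each frame even though neither the camera nor the point moves, which is why an algorithm that assumes one constant K can misinterpret zoom as motion or depth change.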
Looking Into the Water by Unsupervised Learning of the Surface Shape
We address the problem of looking into the water from the air, where we seek to remove image distortions caused by refractions at the water surface. Our approach is based on modeling the different water surface structures at various points in time, assuming the underlying image is constant. To this end, we propose a model that consists of two neural-field networks. The first network predicts the height of the water surface at each spatial position and time, and the second network predicts the image color at each position. Using both networks, we reconstruct the observed sequence of images and can therefore use unsupervised training.
Over-reliance on chatbots can diminish critical-thinking skills, study finds
TECHNOLOGY IT ARTIFICIAL INTELLIGENCE CHATGPT Illustration picture shows the ChatGPT artificial intelligence software, which generates human-like conversation, Friday 03 February 2023 in Lierde. A new study from the Massachusetts Institute of Technology is the latest research to find that relying too much on chatbots can diminish critical-thinking skills, and potentially decrease our ability to discern misinformation for ourselves. As AI tools are becoming more sophisticated and accessible, manipulated images and misleading headlines are becoming more common. AI can be part of the solution, and has proved useful in helping users identify fake content - but there's a cost to using it this way, the new research suggests.
The Best Art TVs
After you're done bingeing your favorite movies, these art televisions are designed to liven up your wall. I have watched so many times I've lost count. For years, the Andrew Wyeth painting took a prominent place in my living room. Art televisions--the category of TV pioneered by Samsung's Frame and now rapidly expanding with models from many of the major TV producers--combine my passion for movies and shows with an even greater interest in art and photography. When it comes to their performance as televisions, even the best art TVs don't have quite the same punchy colors and speedy refresh rates found on similarly priced standard televisions. However, when the movie is finished, art TVs look a lot better in a room, displaying art and photos on a matte screen with a pristine clarity in a space otherwise wasted by a black box. Art televisions are typically just a little more expensive than a normal 4K TV.
Humanoid robot is spotted BEGGING on a street in China - claiming it has 'no money to recharge'
While many people worry that robots are coming to take their jobs, one unlucky bot seems to have fallen on hard times.
Diffusion Adaptive Text Embedding for Text-to-Image Diffusion Models
Text-to-image diffusion models rely on text embeddings from a pre-trained text encoder, but these embeddings remain fixed across all diffusion timesteps, limiting their adaptability to the generative process. We propose Diffusion Adaptive Text Embedding (DATE), which dynamically updates text embeddings at each diffusion timestep based on intermediate perturbed data. We formulate an optimization problem and derive an update rule that refines the text embeddings at each sampling step to improve alignment and preference between the mean predicted image and the text. This allows DATE to dynamically adapt the text conditions to the reverse-diffused images throughout diffusion sampling without requiring additional model training. Through theoretical analysis and empirical results, we show that DATE maintains the generative capability of the model while providing superior text-image alignment over fixed text embeddings across various tasks, including multi-concept generation and text-guided image editing.
CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching
Conditional generative modeling aims to learn a conditional data distribution from samples containing data-condition pairs. For this, diffusion and flow-based methods have attained compelling results. These methods use a learned (flow) model to transport an initial standard Gaussian noise that ignores the condition to the conditional data distribution. The model is hence required to learn both mass transport and conditional injection. To ease the demand on the model, we propose Condition-Aware Reparameterization for Flow Matching (CAR-Flow) - a lightweight, learned shift that conditions the source, the target, or both distributions. By relocating these distributions, CAR-Flow shortens the probability path the model must learn, leading to faster training in practice. On low-dimensional synthetic data, we visualize and quantify the effects of CAR-Flow. On higher-dimensional natural image data (ImageNet-256), equipping SiT-XL/2 with CAR-Flow reduces FID from 2.07 to 1.68, while introducing less than 0.6% additional parameters.
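The intuition that relocating the source distribution shortens the path can be illustrated with a toy 2D sketch. This is not the CAR-Flow implementation: the shift here is a fixed, hand-picked per-class mean standing in for the learned shift, the linear interpolation path is the usual flow-matching choice assumed for illustration, and all names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two classes whose data cluster far from the origin (assumed toy setup).
class_means = np.array([[-4.0, 0.0], [4.0, 0.0]])

def sample_pair(y, shift_source):
    """Draw a (source, target) pair for class y."""
    z = rng.standard_normal(2)                            # condition-agnostic Gaussian noise
    x1 = class_means[y] + 0.1 * rng.standard_normal(2)    # target sample for class y
    # Stand-in for the learned shift: move the source toward the class mean.
    x0 = z + class_means[y] if shift_source else z
    return x0, x1

def interpolate(x0, x1, t):
    """Linear path x_t = (1 - t) x0 + t x1 and its constant velocity target."""
    return (1.0 - t) * x0 + t * x1, x1 - x0

def mean_path_length(shift_source, n=2000):
    """Average norm of the velocity target, i.e. the straight-line path length."""
    lengths = []
    for i in range(n):
        x0, x1 = sample_pair(i % 2, shift_source)
        _, velocity = interpolate(x0, x1, 0.5)
        lengths.append(np.linalg.norm(velocity))
    return float(np.mean(lengths))

baseline = mean_path_length(shift_source=False)
shifted = mean_path_length(shift_source=True)
print(f"unshifted source: {baseline:.2f}, condition-aware source: {shifted:.2f}")
```

With the condition-aware source, the velocity field only has to account for the residual between noise and data around each class mean, rather than also carrying mass from the origin to the class location.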
VividFace: A Robust and High-Fidelity Video Face Swapping Framework
Video face swapping has seen increasing adoption in diverse applications, yet existing methods primarily trained on static images struggle to address temporal consistency and complex real-world scenarios. To overcome these limitations, we propose the first video face swapping framework, VividFace, a robust and high-fidelity diffusion-based framework. VividFace employs a novel hybrid training strategy that leverages abundant static image data alongside temporal video sequences, enabling it to effectively model temporal coherence and identity consistency in videos. Central to our approach is a carefully designed diffusion model integrated with a specialized VAE, capable of processing image-video hybrid data efficiently. To further enhance identity and pose disentanglement, we introduce and release the Attribute-Identity Disentanglement Triplet (AIDT) dataset, comprising a large-scale collection of triplets where each set contains three face images--two sharing the same pose and two sharing the same identity. Augmented comprehensively with occlusion scenarios, AIDT significantly boosts the robustness of VividFace against occlusions.
Precise Information Control in Long-Form Text Generation
A central challenge in language models (LMs) is faithfulness hallucination: the generation of information unsubstantiated by input context. To study this problem, we propose Precise Information Control (PIC), a new task formulation that requires models to generate long-form outputs grounded in a provided set of short self-contained statements, without adding any unsupported ones. PIC includes a full setting that tests a model's ability to include exactly all input claims, and a partial setting that requires the model to selectively incorporate only relevant claims. We present PIC-Bench, a benchmark of eight long-form generation tasks (e.g., summarization, biography generation) adapted to the PIC setting, where LMs are supplied with well-formed, verifiable input claims. Our evaluation of a range of open and proprietary LMs on PIC-Bench reveals that, surprisingly, state-of-the-art LMs still hallucinate against user-provided input in over 70% of generations. To alleviate this lack of faithfulness, we introduce a post-training framework that uses a weakly supervised preference data construction method to train an 8B PIC-LM with stronger PIC ability--improving from 69.1% to 91.0% F1 in the full PIC setting. When integrated into end-to-end factual generation pipelines, PIC-LM improves exact match recall by 17.1% on ambiguous QA with retrieval, and factual precision by 30.5% on a birthplace fact-checking task, underscoring the potential of precisely grounded generation.
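One plausible reading of the claim-level F1 reported for the full PIC setting is sketched below. It is an assumption, not the PIC-Bench scoring code: exact string matching stands in for whatever claim extraction and verification the benchmark actually uses, and the function name and example claims are hypothetical.

```python
def pic_scores(input_claims, output_claims):
    """Claim-level precision, recall, and F1 under exact-match verification."""
    given = set(input_claims)
    produced = set(output_claims)
    supported = given & produced  # output claims that are backed by the input
    # Precision penalizes unsupported additions; recall penalizes omitted input claims.
    precision = len(supported) / len(produced) if produced else 0.0
    recall = len(supported) / len(given) if given else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

given = ["born in 1950", "studied physics", "moved to Paris", "won an award"]
produced = ["born in 1950", "studied physics", "taught at a university"]

precision, recall, f1 = pic_scores(given, produced)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this example one output claim is unsupported and two input claims are missing, so both precision and recall fall below 1; a perfect score in the full setting would require the output to contain exactly the input claims.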