Dive


Learning To Dive In Branch And Bound

Neural Information Processing Systems

Primal heuristics are important for solving mixed integer linear programs, because they find feasible solutions that facilitate branch and bound search. A prominent group of primal heuristics are diving heuristics. They iteratively modify and resolve linear programs to conduct a depth-first search from any node in the search tree. Existing divers rely on generic decision rules that fail to exploit structural commonality between similar problem instances that often arise in practice. Therefore, we propose L2Dive to learn specific diving heuristics with graph neural networks: We train generative models to predict variable assignments and leverage the duality of linear programs to make diving decisions based on the model's predictions. L2Dive is fully integrated into the open-source solver SCIP. We find that L2Dive outperforms standard divers to find better feasible solutions on a range of combinatorial optimization problems. For real-world applications from server load balancing and neural network verification, L2Dive improves the primal-dual integral by up to 7% (35%) on average over a tuned (default) solver baseline and reduces average solving time by 20% (29%).
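The abstract describes diving decisions driven by a generative model's predicted variable assignments. The sketch below is a hypothetical illustration of that idea, not the actual L2Dive/SCIP interface: among binary variables still fractional in the LP relaxation, it fixes the one whose predicted assignment probability is most confident. The function names and inputs (`lp_values`, `preds`) are assumptions for illustration only.

```python
# Hypothetical sketch of one prediction-guided diving step.
# `lp_values` are LP-relaxation values of binary variables; `preds` are
# model-predicted assignment probabilities. Both are illustrative inputs.

def select_diving_variable(lp_values, preds, eps=1e-6):
    """Pick the fractional variable with the most confident prediction,
    together with the bound (0 or 1) to fix it to."""
    best = None
    for j, (x, p) in enumerate(zip(lp_values, preds)):
        if x < eps or x > 1 - eps:
            continue  # already integral in the LP solution
        confidence = abs(p - 0.5)  # distance from "undecided"
        if best is None or confidence > best[0]:
            best = (confidence, j, 1 if p >= 0.5 else 0)
    if best is None:
        return None  # LP solution is integral: a feasible leaf was found
    _, j, value = best
    return j, value

# Variable 2 is fractional (0.4) and the model leans toward 1:
print(select_diving_variable([0.0, 1.0, 0.4], [0.1, 0.9, 0.8]))  # (2, 1)
```

In an actual dive, the chosen variable would be bounded to the returned value, the LP resolved, and the step repeated until the relaxation becomes integral or infeasible.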


Learning Disentangled Representations of Videos with Missing Data

Neural Information Processing Systems

Missing data poses significant challenges while learning representations of video sequences. We present Disentangled Imputed Video autoEncoder (DIVE), a deep generative model that imputes and predicts future video frames in the presence of missing data. Specifically, DIVE introduces a missingness latent variable, disentangles the hidden video representations into static and dynamic appearance, pose, and missingness factors for each object, and imputes each object trajectory where data is missing. On a moving MNIST dataset with various missing scenarios, DIVE outperforms state-of-the-art baselines by a substantial margin. We also present comparisons on a real-world MOTSChallenge pedestrian dataset, which demonstrates the practical value of our method in a more realistic setting.


Dive into 2025's most stunning deep-sea wildlife encounters

Popular Science

Breakthroughs, discoveries, and DIY tips sent every weekday. There are plenty of annual recap lists circulating around this time of year, but few of them involve the amount of work put in by California's Monterey Bay Aquarium Research Institute (MBARI). Over the past year, researchers guided remotely operated vehicles more than 3,000 feet down to survey the vast biodiversity within some of the oceans' deepest and darkest regions. The data and footage collected during these trips will help experts fill in the gaps towards understanding the planet's hardest-to-reach ecosystems. To celebrate the past 12 months of discoveries, MBARI released a video highlighting some of 2025's most stunning, strange, and mysterious creature sightings.



Difference Vector Equalization for Robust Fine-tuning of Vision-Language Models

Suzuki, Satoshi, Yamaguchi, Shin'ya, Takeda, Shoichiro, Yamane, Taiga, Makishima, Naoki, Kawata, Naotaka, Ihori, Mana, Tanaka, Tomohiro, Orihashi, Shota, Masumura, Ryo

arXiv.org Artificial Intelligence

Contrastive pre-trained vision-language models, such as CLIP, demonstrate strong generalization abilities in zero-shot classification by leveraging embeddings extracted from image and text encoders. This paper aims to robustly fine-tune these vision-language models on in-distribution (ID) data without compromising their generalization abilities in out-of-distribution (OOD) and zero-shot settings. Current robust fine-tuning methods tackle this challenge by reusing contrastive learning, which was used in pre-training, for fine-tuning. However, we found that these methods distort the geometric structure of the embeddings, which plays a crucial role in the generalization of vision-language models, resulting in limited OOD and zero-shot performance. To address this, we propose Difference Vector Equalization (DiVE), which preserves the geometric structure during fine-tuning. The idea behind DiVE is to constrain difference vectors, each of which is obtained by subtracting the embeddings extracted from the pre-trained and fine-tuning models for the same data sample. By constraining the difference vectors to be equal across various data samples, we effectively preserve the geometric structure. Therefore, we introduce two losses: average vector loss (AVL) and pairwise vector loss (PVL). AVL preserves the geometric structure globally by constraining difference vectors to be equal to their weighted average. PVL preserves the geometric structure locally by ensuring a consistent multimodal alignment. Our experiments demonstrate that DiVE effectively preserves the geometric structure, achieving strong results across ID, OOD, and zero-shot metrics.
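The core object in the abstract is the difference vector, the fine-tuned embedding minus the pre-trained embedding for the same sample. The sketch below illustrates that idea and an average-vector-style loss in plain Python; the unweighted mean and squared penalty are simplifying assumptions, not the paper's exact AVL formulation, and the toy lists stand in for real CLIP embeddings.

```python
# Illustrative sketch of the difference-vector idea, with plain Python
# lists standing in for encoder embeddings.

def diff_vectors(pre, fine):
    """d_i = fine_i - pre_i for each sample's embedding."""
    return [[f - p for f, p in zip(fv, pv)] for fv, pv in zip(fine, pre)]

def avl(pre, fine):
    """Average-vector-style loss: penalize deviation of each difference
    vector from the (here unweighted) mean difference vector."""
    d = diff_vectors(pre, fine)
    n, dim = len(d), len(d[0])
    mean = [sum(v[k] for v in d) / n for k in range(dim)]
    return sum((v[k] - mean[k]) ** 2 for v in d for k in range(dim)) / n

# If every embedding shifts by the same vector, the loss is zero: all
# pairwise relations among embeddings (the geometric structure) survive.
pre = [[0.0, 0.0], [1.0, 2.0]]
fine = [[0.5, 0.5], [1.5, 2.5]]
print(avl(pre, fine))  # 0.0
```

The design intuition is that equal difference vectors correspond to a pure translation of the embedding space, which leaves distances and angles between embeddings unchanged.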


OceanGate's 'Titan' went on 7 dives with a damaged hull before implosion

Popular Science

The United States National Transportation Safety Board (NTSB) recently concluded its investigation into the OceanGate submersible disaster. According to the summary report released on October 15, an already weakened hull caused the deep-sea tourist vessel to implode while it was en route to visit the wreckage of the RMS Titanic in June 2023, killing all five passengers on board. Investigators found that the submersible's exterior featured 'multiple anomalies' as early as 2022, and that the damage was not sustained shortly before its final voyage: the vessel went on seven dives with a compromised hull.



Dense Video Understanding with Gated Residual Tokenization

Zhang, Haichao, Chai, Wenhao, He, Shwai, Li, Ang, Fu, Yun

arXiv.org Artificial Intelligence

High temporal resolution is essential for capturing fine-grained details in video understanding. However, current video large language models (VLLMs) and benchmarks mostly rely on low-frame-rate sampling, such as uniform sampling or keyframe selection, discarding dense temporal information. This compromise avoids the high cost of tokenizing every frame, which otherwise leads to redundant computation and linear token growth as video length increases. While this trade-off works for slowly changing content, it fails for tasks like lecture comprehension, where information appears in nearly every frame and requires precise temporal alignment. To address this gap, we introduce Dense Video Understanding (DVU), which enables high-FPS video comprehension by reducing both tokenization time and token overhead. Existing benchmarks are also limited, as their QA pairs focus on coarse content changes. We therefore propose DIVE (Dense Information Video Evaluation), the first benchmark designed for dense temporal reasoning. To make DVU practical, we present Gated Residual Tokenization (GRT), a two-stage framework: (1) Motion-Compensated Inter-Gated Tokenization uses pixel-level motion estimation to skip static regions during tokenization, achieving sub-linear growth in token count and compute. (2) Semantic-Scene Intra-Tokenization Merging fuses tokens across static regions within a scene, further reducing redundancy while preserving dynamic semantics. Experiments on DIVE show that GRT outperforms larger VLLM baselines and scales positively with FPS. These results highlight the importance of dense temporal information and demonstrate that GRT enables efficient, scalable high-FPS video understanding.
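The gating idea in the abstract (tokenize only what has changed) can be illustrated with a toy frame-level sketch. This is a simplification: real GRT gates at the region level with motion-compensated estimation, and `gated_tokenize` with its stand-in "token" is a hypothetical name, not the paper's interface.

```python
# Toy sketch of motion gating: tokenize a frame only when it differs
# enough from the last tokenized frame. Frames are flat lists of pixel
# values; storing the frame itself stands in for real tokenization.

def gated_tokenize(frames, threshold=1.0):
    tokens, last = [], None
    for i, frame in enumerate(frames):
        if last is None:
            motion = float("inf")  # always tokenize the first frame
        else:
            motion = sum(abs(a - b) for a, b in zip(frame, last)) / len(frame)
        if motion > threshold:  # gate: skip near-static frames
            tokens.append((i, tuple(frame)))
            last = frame
    return tokens

# Three identical frames followed by a change: only frames 0 and 3 pass
# the gate, so token count grows sub-linearly for static content.
frames = [[0, 0], [0, 0], [0, 0], [5, 5]]
print([i for i, _ in gated_tokenize(frames)])  # [0, 3]
```

Raising the FPS of a static scene then adds frames but almost no tokens, which is the sub-linear scaling behavior the abstract emphasizes.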


Is the AI bubble about to burst – and send the stock market into freefall?

Phillip Inman

The Guardian

There are growing fears of an imminent stock market crash – one that will transform from a dip to a dive when euphoric headlines about the wonders of artificial intelligence begin to wane. Shares in US tech stocks have fallen in recent weeks and the prospect is that a flood of negative numbers will become the norm before the month is out. It could be 2000 all over again, and just like the bursting of the dotcom bubble it may be ugly, with investors junking businesses that once looked good on paper but now resemble a huge liability. Jerome Powell, the Federal Reserve chair, is one of the policymakers tasked with keeping the wolf from the door. Speaking on Friday at the annual Jackson Hole gathering of central bank governors in Wyoming, he tried to calm nerves.


108-year-old submarine wreck seen in stunning detail in new footage

Popular Science

In 1917, two US submarines collided off the coast of San Diego and the submarine USS F-1 sank to the bottom of the Pacific Ocean, along with 19 crew members aboard. The horrible accident, whose wreckage was discovered in 1975, represents the US Naval Submarine Force's first wartime submarine loss. Now, researchers from Woods Hole Oceanographic Institution have captured new footage of the 1,300-foot-deep underwater archaeological site. "They were technical dives requiring specialized expertise and equipment," Anna Michel, a co-lead of the expedition and chief scientist at the National Deep Submergence Facility, said in a statement. "We were careful and methodical in surveying these historical sites so that we could share these stunning images, while also maintaining the reverence these sites deserve."