 pedestal


Towards Understanding Camera Motions in Any Video

Lin, Zhiqiu, Cen, Siyuan, Jiang, Daniel, Karhade, Jay, Wang, Hewei, Mitra, Chancharik, Ling, Tiffany, Huang, Yuhan, Liu, Sifan, Chen, Mingyu, Zawar, Rushikesh, Bai, Xue, Du, Yilun, Gan, Chuang, Ramanan, Deva

arXiv.org Artificial Intelligence

We introduce CameraBench, a large-scale dataset and benchmark designed to assess and improve camera motion understanding. CameraBench consists of ~3,000 diverse internet videos, annotated by experts through a rigorous multi-stage quality control process. One of our contributions is a taxonomy of camera motion primitives, designed in collaboration with cinematographers. We find, for example, that some motions like "follow" (or tracking) require understanding scene content like moving subjects. We conduct a large-scale human study to quantify human annotation performance, revealing that domain expertise and tutorial-based training can significantly enhance accuracy. For example, a novice may confuse zoom-in (a change of intrinsics) with translating forward (a change of extrinsics), but can be trained to differentiate the two. Using CameraBench, we evaluate Structure-from-Motion (SfM) and Video-Language Models (VLMs), finding that SfM models struggle to capture semantic primitives that depend on scene content, while VLMs struggle to capture geometric primitives that require precise estimation of trajectories. We then fine-tune a generative VLM on CameraBench to achieve the best of both worlds and showcase its applications, including motion-augmented captioning, video question answering, and video-text retrieval. We hope our taxonomy, benchmark, and tutorials will drive future efforts towards the ultimate goal of understanding camera motions in any video.
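The zoom-in versus translate-forward confusion mentioned above can be illustrated with a toy pinhole camera. The sketch below is not taken from CameraBench; the function names, points, and focal lengths are all illustrative assumptions. It shows that changing the focal length (intrinsics) magnifies every point by the same factor, whereas moving the camera forward (extrinsics) magnifies near points more than far ones, which is the parallax cue an annotator can learn to look for.

```python
# Toy pinhole camera: zoom (intrinsics) vs. forward translation (extrinsics).
# All names and numbers are illustrative assumptions, not CameraBench code.

def project(point, focal_length, camera_z=0.0):
    """Project a 3D point (x, y, z) through a pinhole camera that sits at
    (0, 0, camera_z) and looks along the +z axis."""
    x, y, z = point
    depth = z - camera_z
    if depth <= 0:
        raise ValueError("point must be in front of the camera")
    return (focal_length * x / depth, focal_length * y / depth)


def magnification(point, focal_length, camera_z):
    """Ratio of the projected x coordinate after the camera change to the
    projected x coordinate for a reference camera (focal length 1, at z=0)."""
    before = project(point, 1.0, 0.0)[0]
    after = project(point, focal_length, camera_z)[0]
    return after / before


near = (1.0, 0.0, 2.0)   # same lateral offset, depth 2
far = (1.0, 0.0, 10.0)   # same lateral offset, depth 10

# Zoom in: double the focal length, camera stays put.
zoom_near = magnification(near, 2.0, 0.0)
zoom_far = magnification(far, 2.0, 0.0)

# Translate forward: keep the focal length, move the camera 1 unit along +z.
dolly_near = magnification(near, 1.0, 1.0)
dolly_far = magnification(far, 1.0, 1.0)

print("zoom :", zoom_near, zoom_far)
print("dolly:", dolly_near, dolly_far)
```

In this toy setup the zoom magnifies both points equally, while the forward move magnifies the near point noticeably more than the far one.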


LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation

Ye, Xi, Yin, Fangcong, He, Yinghui, Zhang, Joie, Yen, Howard, Gao, Tianyu, Durrett, Greg, Chen, Danqi

arXiv.org Artificial Intelligence

Existing benchmarks for evaluating long-context language models (LCLMs) primarily focus on long-context recall, requiring models to produce short responses based on a few critical snippets while processing thousands of irrelevant tokens. We introduce LongProc (Long Procedural Generation), a new benchmark that requires both the integration of highly dispersed information and long-form generation. LongProc consists of six diverse procedural generation tasks, such as extracting structured information from HTML pages into a TSV format and executing complex search procedures to create travel plans. These tasks challenge LCLMs by testing their ability to follow detailed procedural instructions, synthesize and reason over dispersed information, and generate structured, long-form outputs (up to 8K tokens). Furthermore, as these tasks adhere to deterministic procedures and yield structured outputs, they enable reliable rule-based evaluation. We evaluate 17 LCLMs on LongProc across three difficulty levels, with maximum numbers of output tokens set at 500, 2K, and 8K. Notably, while all tested models claim a context window size above 32K tokens, open-weight models typically falter on 2K-token tasks, and closed-source models like GPT-4o show significant degradation on 8K-token tasks. Further analysis reveals that LCLMs struggle to maintain long-range coherence in long-form generations. These findings highlight critical limitations in current LCLMs and suggest substantial room for improvement. Data and code available at: https://princeton-pli.github.io/LongProc
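To make the idea of rule-based evaluation on structured output concrete, here is a minimal sketch of a row-level F1 score between a generated TSV and a reference TSV. The abstract does not specify LongProc's actual metrics, so the scoring rule, the function names `parse_tsv` and `row_f1`, and the sample data are all assumptions made for illustration.

```python
# Hypothetical rule-based scorer for TSV outputs (not LongProc's own code).

def parse_tsv(text):
    """Turn TSV text into a set of rows, each a tuple of stripped cells.
    Blank lines are ignored; row order and duplicates do not matter."""
    rows = set()
    for line in text.strip().splitlines():
        cells = tuple(cell.strip() for cell in line.split("\t"))
        if any(cells):
            rows.add(cells)
    return rows


def row_f1(predicted, reference):
    """Row-level F1: a predicted row counts only if it matches a reference
    row exactly, cell for cell."""
    pred_rows = parse_tsv(predicted)
    ref_rows = parse_tsv(reference)
    hits = len(pred_rows & ref_rows)
    if hits == 0:
        return 0.0
    precision = hits / len(pred_rows)
    recall = hits / len(ref_rows)
    return 2 * precision * recall / (precision + recall)


reference = "Alice\t2021\tParis\nBob\t2019\tRome\nCara\t2020\tOslo"
predicted = "Alice\t2021\tParis\nBob\t2019\tRome\nCara\t2020\tLima"

score = row_f1(predicted, reference)
print(round(score, 3))
```

Because the comparison is an exact match on deterministic output, the score needs no human or model judge, which is the property the abstract attributes to its tasks.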


Drones examine Japan's damaged Fukushima nuclear reactor for the first time

FOX News

U.S. Ambassador to Japan Rahm Emanuel visited a Fukushima coastal city to support the local fishing industry after China and South Korea raised the alarm over the water discharge that began from the Fukushima Daiichi nuclear plant. Images taken by miniature drones from deep inside a badly damaged reactor at the Fukushima nuclear plant show displaced control equipment and misshapen materials but leave many questions unanswered, underscoring the daunting task of decommissioning the plant. The 12 photos released by the plant's operator are the first from inside the main structural support called the pedestal in the hardest-hit No. 1 reactor's primary containment vessel, an area directly under the reactor's core. Officials had long hoped to reach the area to examine the core and melted nuclear fuel which dripped there when the plant's cooling systems were damaged by a massive earthquake and tsunami in 2011. Earlier attempts with robots were unable to reach the area.


EuroPED-NN: Uncertainty aware surrogate model

Alvarez, A. Panera, Ho, A., Jarvinen, A., Saarelma, S., Wiesen, S., Contributors, JET

arXiv.org Artificial Intelligence

This work successfully generates uncertainty-aware surrogate models, via the Bayesian neural network with noise contrastive prior (BNN-NCP) technique, of the EuroPED plasma pedestal model using data from the JET-ILW pedestal database and subsequent model evaluations. Together, these make up EuroPED-NN. The BNN-NCP technique is proven to be a good fit for uncertainty-aware surrogate models: it matches the output of a regular neural network, provides prediction confidence as uncertainties, and highlights out-of-distribution (OOD) regions using the surrogate model uncertainties. This provides critical insights into model robustness and reliability. EuroPED-NN has been physically validated, first by analyzing electron density $n_e\!\left(\psi_{\text{pol}}=0.94\right)$ with respect to increasing plasma current, $I_p$, and second by validating the $\Delta-\beta_{p,ped}$ relation associated with the EuroPED model, affirming the robustness of the underlying physics learned by the surrogate model.


TCL 6-Series 2022 Model R655 Review: The Best Value TV Right Now

WIRED

There's nothing like something you can easily unbox, plug in, and turn on without cracking a manual or downloading a PDF. So why am I still so obsessed with TCL's 6-Series? In the past I've said that the 6-Series was the best TV for most people based largely on how well the screen looks for your dollars. It wasn't for TCL's looks or sleek interfaces and apps--there were, and are, plenty of other mid-tier options from Vizio, Hisense, and others that get that job done right. But the latest 6-Series now wins out for sheer physical simplicity. It comes with a center pedestal stand, and you barely have to touch a settings menu.


22 Best Cyber Monday Soundbar and TV Deals (2022): Samsung, Vizio, LG, and More

WIRED

It's a great time to upgrade your home theater thanks to some excellent Cyber Monday TV and soundbar deals. If you've yet to take the plunge to a modern 4K TV, or you are still listening to your favorite shows and movies through those tinny built-in TV speakers, there are massive reasons to upgrade. Modern home theater technology now has better backlighting, sharper resolution, and immersive surround sound for less money than ever before. Go on, convert your living room into a mini cinema. Updated Monday, November 28: We've added two new TV deals on sets from LG and Sony and moved a group of dead TV deals to the bottom of the article, just in case you want to check if they're back in stock. We've also updated prices and retailers throughout.


Reusable neural skill embeddings for vision-guided whole body movement and object manipulation

Merel, Josh, Tunyasuvunakool, Saran, Ahuja, Arun, Tassa, Yuval, Hasenclever, Leonard, Pham, Vu, Erez, Tom, Wayne, Greg, Heess, Nicolas

arXiv.org Artificial Intelligence

Both in simulation settings and robotics, there is an ambition to produce flexible control systems that can enable complex bodies to perform dynamic locomotion and natural object manipulation. In previous work, we developed a framework to train locomotor skills and reuse these skills for whole-body visuomotor tasks. Here, we extend this line of work to tasks involving whole body movement as well as visually guided manipulation of objects. This setting poses novel challenges in terms of task specification, exploration, and generalization. We develop an integrated approach consisting of a flexible motor primitive module, demonstrations, an instructed training regime as well as curricula in the form of task variations. We demonstrate the utility of our approach for solving challenging whole body tasks that require joint locomotion and manipulation, and characterize its behavioral robustness. We also provide a high-level overview video, see https://youtu.be/t0RDGSnE3cM .


Toshiba unveils robot to probe melted Fukushima nuclear...

Daily Mail - Science & tech

Toshiba unveiled a remote-controlled robot with tongs on Monday that it hopes will be able to probe the inside of one of the three damaged reactors at Japan's tsunami-hit Fukushima nuclear plant and grip chunks of highly radioactive melted fuel. The device is designed to slide down an extendable 11-meter (36-foot) long pipe and touch melted fuel inside the Unit 2 reactor's primary containment vessel. The reactor was built by Toshiba and GE. An earlier probe carrying a camera captured images of pieces of melted fuel in the reactor last year, and robotic probes in the two other reactors have detected traces of damaged fuel, but the exact location, contents and other details remain largely unknown. Toshiba unveiled the device, which carries tongs that come out of a long telescopic pipe, for an internal probe in one of three damaged reactor chambers at Japan's tsunami-hit Fukushima nuclear plant, this time to touch chunks of melted fuel. Toshiba's energy systems unit said experiments with the new probe planned in February are key to determining the proper equipment and technologies needed to remove the fuel debris, the most challenging part of the decommissioning process, which is expected to take decades.


Data Science, AI and Hype cycles

#artificialintelligence

When more than 50% of new roles in industry are driven towards a specific skill set, when projections from various recruiting companies show the world running short of certain skilled people, and when employers are scrambling to find certain types of resources in the market and are willing to pay a premium to get them on board, it is a clear sign that we're in a hype cycle. The skill here is Data Science, and the resources are Data Scientists. Those who have been in the industry long enough can recognize this. A decade back, industry was going crazy for a similar role, Business Analysts, who are now found a dime a dozen in the market (apologies if I've hurt someone's sentiments, but you can't escape the truth). I know certain organizations where preference is being given to Data Modelers, Data Analysts and Data Scientists instead of sitting Business Analysts.