teaser


Leveraging Digitized Newspapers to Collect Summarization Data in Low-Resource Languages

Dahan, Noam, Kidron, Omer, Stanovsky, Gabriel

arXiv.org Artificial Intelligence

High-quality summarization data remains scarce in under-represented languages. However, historical newspapers, made available through recent digitization efforts, offer an abundant source of untapped, naturally annotated data. In this work, we present a novel method for collecting naturally occurring summaries via Front-Page Teasers, in which editors summarize full-length articles. We show that this phenomenon is common across seven diverse languages and supports multi-document summarization. To scale data collection, we develop an automatic process suited to varying levels of linguistic resources. Finally, we apply this process to a Hebrew newspaper, producing HEBTEASESUM, the first dedicated multi-document summarization dataset in Hebrew.
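The abstract does not detail the automatic collection pipeline, so purely as a hypothetical illustration, the sketch below matches a front-page teaser to the article(s) it summarizes by lexical overlap. The function names and the threshold are assumptions for this toy, not the authors' method; returning several indices mirrors the multi-document case where one teaser covers multiple articles.

```python
def tokens(text):
    # Lowercase word tokens; a real pipeline would use language-specific tokenization.
    return set(text.lower().split())

def overlap_score(teaser, article):
    # Fraction of teaser tokens that also appear in the article (precision-like).
    t, a = tokens(teaser), tokens(article)
    return len(t & a) / len(t) if t else 0.0

def match_teaser(teaser, articles, threshold=0.5):
    # Indices of articles plausibly summarized by the teaser; more than one
    # index corresponds to a multi-document teaser.
    return [i for i, art in enumerate(articles)
            if overlap_score(teaser, art) >= threshold]

articles = [
    "the city council approved the new budget after a long debate",
    "local team wins the championship in overtime thriller",
]
teaser = "council approves budget after debate"
print(match_teaser(teaser, articles))  # → [0]
```

A production system would replace raw token overlap with stemming or embeddings, since morphologically rich languages like Hebrew rarely repeat exact surface forms.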


Te Ahorré Un Click: A Revised Definition of Clickbait and Detection in Spanish News

Mordecki, Gabriel, Moncecchi, Guillermo, Couto, Javier

arXiv.org Artificial Intelligence

We revise the definition of clickbait, which lacks current consensus, and argue that the creation of a curiosity gap is the key concept that distinguishes clickbait from related phenomena such as sensationalism and headlines that do not deliver what they promise or that diverge from the article. We therefore propose a new definition: clickbait is a technique for generating headlines and teasers that deliberately omit part of the information with the goal of raising readers' curiosity, capturing their attention and enticing them to click. We introduce a new approach to creating clickbait detection datasets by refining the concept's limits and the annotation criteria, minimizing subjectivity in the decision as much as possible. Following this approach, we create and release TA1C (for Te Ahorré Un Click, Spanish for Saved You A Click), the first open-source dataset for clickbait detection in Spanish. It consists of 3,500 tweets from 18 well-known media sources, manually annotated with a Fleiss' κ inter-annotator agreement of 0.825. We implement strong baselines that achieve an F1-score of 0.84.
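The abstract reports inter-annotator agreement as Fleiss' κ of 0.825. As a reminder of what that statistic measures, here is a minimal self-contained Fleiss' kappa implementation over a toy annotation matrix (the toy counts are illustrative, not drawn from TA1C):

```python
def fleiss_kappa(counts):
    # counts[i][j]: number of raters assigning item i to category j.
    N = len(counts)        # number of items
    n = sum(counts[0])     # raters per item (assumed constant across items)
    k = len(counts[0])     # number of categories
    # Marginal proportion of assignments to each category.
    p_j = [sum(counts[i][j] for i in range(N)) / (N * n) for j in range(k)]
    # Per-item observed agreement.
    P_i = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts]
    P_bar = sum(P_i) / N                 # mean observed agreement
    P_e = sum(p * p for p in p_j)        # chance agreement
    return (P_bar - P_e) / (1 - P_e)

# Toy example: 4 tweets, 3 annotators, labels {clickbait, not-clickbait}.
counts = [
    [3, 0],  # unanimous clickbait
    [0, 3],  # unanimous not-clickbait
    [2, 1],  # split decision
    [3, 0],
]
print(round(fleiss_kappa(counts), 3))  # → 0.625
```

Values near 0.8, as reported for TA1C, indicate substantial agreement despite the inherent subjectivity of clickbait judgments.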


Correspondence-Free Multiview Point Cloud Registration via Depth-Guided Joint Optimisation

Zhou, Yiran, Wang, Yingyu, Huang, Shoudong, Zhao, Liang

arXiv.org Artificial Intelligence

Multiview point cloud registration is a fundamental task for constructing globally consistent 3D models. Existing approaches typically rely on feature extraction and data association across multiple point clouds; however, these processes make it challenging to obtain a globally optimal solution in complex environments. In this paper, we introduce a novel correspondence-free multiview point cloud registration method. Specifically, we represent the global map as a depth map and leverage raw depth information to formulate a non-linear least-squares optimisation that jointly estimates the poses of the point clouds and the global map. Unlike traditional feature-based bundle adjustment methods, which rely on explicit feature extraction and data association, our method bypasses these challenges by associating multi-frame point clouds with a global depth map through their corresponding poses. This data association is implicitly incorporated and dynamically refined during the optimisation process. Extensive evaluations on real-world datasets demonstrate that our method outperforms state-of-the-art approaches in accuracy, particularly in challenging environments where feature extraction and data association are difficult.
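To give a flavour of joint pose-and-map estimation in the simplest possible setting (a loose 1-D analogue, not the paper's actual depth-map formulation), the toy below jointly recovers per-frame offsets (stand-ins for poses) and a shared global map from each frame's raw readings via alternating closed-form least-squares updates, with no pairwise feature matching between frames:

```python
def joint_estimate(observations, iters=50):
    # observations[k][j]: frame k's reading at map cell j, modelled as
    # map[j] + offset[k]. Alternate between solving offsets with the map
    # fixed and solving the map with offsets fixed; each step is a
    # closed-form least-squares update.
    K, J = len(observations), len(observations[0])
    offsets = [0.0] * K
    gmap = [0.0] * J
    for _ in range(iters):
        for k in range(K):
            offsets[k] = sum(observations[k][j] - gmap[j] for j in range(J)) / J
        offsets = [t - offsets[0] for t in offsets]  # gauge fix: frame 0 at origin
        for j in range(J):
            gmap[j] = sum(observations[k][j] - offsets[k] for k in range(K)) / K
    return offsets, gmap

true_map = [1.0, 2.0, 4.0]
true_offsets = [0.0, 0.5, -0.25]
obs = [[m + t for m in true_map] for t in true_offsets]
offsets, gmap = joint_estimate(obs)
print([round(t, 3) for t in offsets])  # → [0.0, 0.5, -0.25]
print([round(m, 3) for m in gmap])     # → [1.0, 2.0, 4.0]
```

The real method optimises 6-DoF poses against a depth-map representation with a non-linear solver, but the structure is analogous: the map and the poses are unknowns of one joint problem, and the data association between frames and map is implicit in the shared parameterization.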


Block-busted: why homemade Minecraft movies are the real hits

The Guardian

By any estimation, Minecraft is impossibly successful. The bestselling video game ever, as of last December it had 204 million monthly active players. Since it was first released in 2011, it has generated over $3bn (£2.3bn) in revenue. What's more, its players have always been eager to demonstrate their fandom outside the boundaries of the game itself. In 2021, YouTube calculated that videos related to the game – tutorials, walk-throughs, homages, parodies – had collectively been viewed 1tn times. In short, it is a phenomenon.


The Morning After: Everything Samsung announced this week (and future devices teased)

Engadget

Welcome to a new newsletter, with a bit of a new direction. While our mid-week edition tackles news specifics, this end-of-the-week missive combines the biggest news with more context, more things to read and watch, recommendations, easter eggs, inside baseball and stuff that interests our readers, alongside the breaking news, reviews and features you expect from Engadget. We'd love your feedback on what you'd like to see covered in these meatier editions -- hit me up at tma(at)engadget.com. Luckily for me, we kick things off with Samsung's big Unpacked event, launching three new phones and teasing two -- yes, two! -- more coming soon. We collated everything Samsung announced here, including prices and launch dates (February 8 -- I'll save you a click), but it was largely a fallow year for Galaxy S hardware, barring a substantially more powerful chip.


TeaserGen: Generating Teasers for Long Documentaries

Xu, Weihan, Liang, Paul Pu, Kim, Haven, McAuley, Julian, Berg-Kirkpatrick, Taylor, Dong, Hao-Wen

arXiv.org Artificial Intelligence

Teasers are an effective tool for promoting content in the entertainment, commercial and educational fields. However, creating an effective teaser for a long video is challenging, as it requires long-range multimodal modeling of the input video while maintaining audiovisual alignment, managing scene changes and preserving factual accuracy in the output teaser. Progress along this research direction has been hindered by the lack of a publicly available dataset. In this work, we present DocumentaryNet, a collection of 1,269 documentaries paired with their teasers, featuring multimodal data streams of video, speech, music, sound effects and narration. With DocumentaryNet, we propose a new two-stage system for generating teasers from long documentaries. The proposed TeaserGen system first generates the teaser narration from the transcribed narration of the documentary using a pretrained large language model, and then selects the most relevant visual content to accompany the generated narration using language-vision models. For narration-video matching, we explore two approaches: a pretraining-based model that uses pretrained contrastive language-vision models, and a deep sequential model that learns the mapping between narration and visuals. Our experimental results show that the pretraining-based approach is more effective at identifying relevant visual content than directly trained deep autoregressive models.
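The second stage's narration-video matching can be pictured as nearest-neighbour retrieval in a shared embedding space. The sketch below is a hypothetical, simplified version of that retrieval step: the tiny hand-written vectors stand in for pretrained contrastive language-vision embeddings, and each narration sentence is paired with its most similar shot by cosine similarity.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def match_narration_to_shots(narration_embs, shot_embs):
    # For each narration-sentence embedding, return the index of the most
    # similar shot embedding -- the retrieval step of a two-stage system.
    return [max(range(len(shot_embs)), key=lambda j: cosine(n, shot_embs[j]))
            for n in narration_embs]

# Toy 3-D embeddings standing in for pretrained language-vision features.
narration_embs = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2]]
shot_embs = [[0.0, 0.9, 0.1], [1.0, 0.0, 0.1], [0.1, 0.1, 1.0]]
print(match_narration_to_shots(narration_embs, shot_embs))  # → [1, 0]
```

The paper's deep sequential alternative would instead learn this mapping over whole sequences, which can enforce temporal coherence that independent per-sentence retrieval cannot.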


A Minecraft Movie trailer gives us our first look at Jason Momoa and Jack Black ahead of its 2025 release

Engadget

It took a decade, but we finally have a teaser for the live-action A Minecraft Movie. The first look comes courtesy of a video released by Warner Bros. today that clocks in at just over one minute -- but, hey, we'll take it. The film studio has confirmed its previous target of April 4, 2025, and is moving forward with a theater-only release. Yes, once upon a time, it had release dates for May 2019 and March 2022, but the existence of a teaser makes us feel a little more hopeful (gullible?). After a series of directors joined and left the project, A Minecraft Movie is led by filmmaker Jared Hess.


What do you see FIRST? Brain teaser reveals the most respected part of your personality

Daily Mail - Science & tech

A new brain teaser reveals the most respected aspects of your personality. The puzzle features images that can be interpreted in different ways, depending on your personal experiences, traits and mental state. Hidden in the picture are a lion, a panther and a bunch of dandelions, and the one you see first means you are either a natural-born leader, a problem solver or someone with a strong sense of conviction. Did you see the lion, the panther or the dandelions first?


Can YOU spot the second horse? Only people with high IQs can solve the brainteaser in 10 seconds

Daily Mail - Science & tech

A new brain teaser claims that only highly intelligent people can spot a second horse in the majestic animal's painted coat. The picture features a full-grown horse standing in a field and asks viewers to use creative thinking to solve the brainteaser. Spotting the second horse requires quick thinking, and if you can find it in 10 seconds or less, your level of intelligence is higher than that of people who take longer. Solving the puzzle isn't so much about looking and simply seeing it, but about finding a different way to look at it. Set a timer for 10 seconds and try to find the second horse on the brown and white side of the stallion.


Google Project Astra hands-on: Full of potential, but it's going to be a while

Engadget

At I/O 2024, Google's teaser for Project Astra gave us a glimpse at where AI assistants are going in the future. It's a multi-modal feature that combines the smarts of Gemini with the kind of image recognition abilities you get in Google Lens, as well as powerful natural language responses. However, while the promo video was slick, after trying it out in person it's clear there's a long way to go before something like Astra lands on your phone. So here are three takeaways from our first experience with Google's next-gen AI. Currently, most people interact with digital assistants using their voice, so Astra's multi-modality (i.e. using sight and sound in addition to text/speech to communicate with an AI) is relatively novel.