YouTube video
Semantic Visual Navigation by Watching YouTube Videos
Semantic cues and statistical regularities in real-world environment layouts can improve efficiency for navigation in novel environments. This paper learns and leverages such semantic cues for navigating to objects of interest in novel environments, by simply watching YouTube videos. This is challenging because YouTube videos don't come with labels for actions or goals, and may not even showcase optimal behavior. Our method tackles these challenges through the use of Q-learning on pseudo-labeled transition quadruples (image, action, next image, reward). We show that such off-policy Q-learning from passive data is able to learn meaningful semantic cues for navigation. These cues, when used in a hierarchical navigation policy, lead to improved efficiency at the ObjectGoal task in visually realistic simulations. We observe a relative improvement of 15-83% over end-to-end RL, behavior cloning, and classical methods, while using minimal direct interaction.
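The abstract's core idea, off-policy Q-learning over pseudo-labeled transition quadruples, can be sketched in miniature. Below is a toy tabular version in Python, assuming video frames have already been encoded as discrete state indices; the `q_learning_from_passive_data` function and the three-state corridor are illustrative stand-ins, not the paper's implementation.

```python
import numpy as np

def q_learning_from_passive_data(transitions, n_states, n_actions,
                                 gamma=0.99, lr=0.1, epochs=50):
    """Fit Q(s, a) by repeatedly replaying a fixed, passively collected
    dataset of (state, action, next_state, reward) tuples."""
    Q = np.zeros((n_states, n_actions))
    for _ in range(epochs):
        for s, a, s_next, r in transitions:
            target = r + gamma * Q[s_next].max()  # Bellman backup
            Q[s, a] += lr * (target - Q[s, a])    # TD update
    return Q

# Toy three-state corridor: action 1 moves right, action 0 moves left;
# stepping into state 2 (the frame where the goal object appears) yields
# reward 1, mirroring the paper's detection-derived reward labels.
transitions = [(0, 1, 1, 0.0), (1, 1, 2, 1.0), (1, 0, 0, 0.0)]
Q = q_learning_from_passive_data(transitions, n_states=3, n_actions=2)
```

The dataset is fixed in advance, which is what makes the learning off-policy: no environment interaction happens during training, matching the abstract's claim of learning from passive data.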
From Keywords to Clusters: AI-Driven Analysis of YouTube Comments to Reveal Election Issue Salience in 2024
Simoes, Raisa M., Kelly, Timoteo, Simoes, Eduardo J., Rao, Praveen
Abstract: This paper aims to explore two competing data science methodologies to attempt answering the question, "Which issues contributed most to voters' choice in the 2024 presidential election?" The methodologies involve novel empirical evidence driven by artificial intelligence (AI) techniques. By using two distinct methods based on natural language processing and clustering analysis to mine over eight thousand user comments on election-related YouTube videos from one right-leaning journal, the Wall Street Journal, and one left-leaning journal, the New York Times, during pre-election week, we quantify the frequency of selected issue areas among user comments to infer which issues were most salient to potential voters in the seven days preceding the November 5th election. Empirically, we primarily demonstrate that immigration and democracy were the most frequently and consistently invoked issues in user comments on the analyzed YouTube videos, followed by the issue of identity politics, while inflation was significantly less frequently referenced. These results corroborate certain findings of post-election surveys but also refute the supposed importance of inflation as an election issue. This indicates that variations on opinion mining, with their analysis of raw user data online, can be more revealing than polling and surveys for analyzing election outcomes.

Keywords: artificial intelligence; opinion mining; clustering; vote choice; cleavages

1. Introduction

The Democrats lost both houses of Congress and the Presidency to Republicans in the 2024 election, with former president Donald Trump winning all seven swing states and the national popular vote, despite most pre-election polls giving Vice President Kamala Harris and President Trump a roughly equal chance of winning.
Most post-election punditry and analysis in the legacy press and alternative media has attributed the Democrats' large loss to two main issues: inflation [59] and immigration [30]. However, a growing contingent of analysts has also attributed the election outcome to the Democratic party's association with cultural issues purportedly distant from the median voter's preferences, such as those alternatively aggregated under the concept of "identity" or "woke" politics [54, 56]. To this point, three post-election studies illustrate how voters associated Democrats with left-of-center ideas that were ostensibly distant from most voters' priorities. Survey research from the think tank Third Way demonstrates that Democrats, and thus Kamala Harris, were largely perceived as "too liberal" [15], while a study from More In Common polling over 5,000 Americans concluded that while inflation was the top concern for every major demographic group across both parties, Americans misperceived LGBT/transgender policies as the top policy priority for Democrats [37].
- Europe > Germany (0.14)
- Europe > France (0.14)
- North America > United States > Missouri (0.04)
- Research Report > New Finding (0.93)
- Research Report > Experimental Study (0.68)
MrBeast says AI advance is scary for YouTube creators
MrBeast: AI means it's 'scary times' for YouTube creators The world's biggest YouTuber, MrBeast, says the rapid advance of generative artificial intelligence (AI) is scary for the millions of creators currently making content for a living. AI tools that can create fully formed videos from simple text prompts by users have made rapid advances in recent years. On social media, MrBeast, real name Jimmy Donaldson, asked what would happen to people like him when AI videos are just as good as normal videos. Fears about the impact AI will have on the jobs market are widespread - but particularly acute in the creative industries. In the film and video game industries, there has been extensive industrial action over the use of AI.
- South America (0.16)
- North America > Central America (0.16)
- Oceania > Australia (0.06)
- Leisure & Entertainment > Games > Computer Games (0.56)
- Media > Film (0.36)
- North America > United States > Illinois (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.74)
- Information Technology > Artificial Intelligence > Natural Language (0.68)
Content and Engagement Trends in COVID-19 YouTube Videos: Evidence from the Late Pandemic
Thakur, Nirmalya, Hartel, Madeline D, Boden, Lane Michael, Enriquez, Dallas, Ricks, Boston Joyner
This work analyzed approximately 10,000 COVID-19-related YouTube videos published between January 2023 and October 2024 to evaluate how temporal, lexical, linguistic, and structural factors influenced engagement during the late pandemic period. Publishing activity showed consistent weekday effects: in the first window, average views peaked on Mondays at 92,658; in the second, on Wednesdays at 115,479; and in the third, on Fridays at 84,874, reflecting a shift in audience attention toward mid- and late week. Lexical analysis of video titles revealed recurring high-frequency keywords related to COVID-19 and YouTube features, including COVID, coronavirus, shorts, and live. Frequency analysis revealed sharp spikes, with COVID appearing in 799 video titles in August 2024, while engagement analysis showed that videos with "shorts" in the title attracted very high views, peaking at 2.16 million average views per video in June 2023. Sentiment analysis of English-language video descriptions showed a weak correlation with views in the raw data (Pearson r = 0.0154, p = 0.2987), but stronger correlations emerged once outliers were addressed, with Spearman r = 0.110 (p < 0.001) and Pearson r = 0.0925 (p < 0.001). Category-level analysis of video durations revealed contrasting outcomes: long videos focusing on people and blogs averaged 209,114 views, short entertainment videos averaged 288,675 views, and medium-to-long news and politics videos averaged 51,309 and 59,226 views, respectively. These results demonstrate that engagement patterns of COVID-19-related videos on YouTube during the late pandemic followed distinct characteristics driven by publishing schedules, title vocabulary, topics, and genre-specific duration effects.
- North America > United States > South Dakota > Pennington County > Rapid City (0.05)
- Europe > Switzerland > Basel-City > Basel (0.04)
- Asia > Singapore (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
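The abstract's contrast between raw and outlier-adjusted correlations can be illustrated on synthetic data. The sketch below uses made-up sentiment scores and view counts (not the study's data) to show how a single viral outlier suppresses the raw Pearson correlation, while the rank-based Spearman coefficient and an outlier-trimmed Pearson recover the underlying association.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(0)

# Synthetic data: a weak positive link between description sentiment
# and views, plus one viral outlier with neutral sentiment.
sentiment = rng.uniform(-1, 1, 500)
views = 1000 + 200 * sentiment + rng.normal(0, 300, 500)
sentiment[0] = 0.0
views[0] = 2_160_000  # a single viral "shorts"-style video

r_raw, _ = pearsonr(sentiment, views)     # distorted toward zero
rho, _ = spearmanr(sentiment, views)      # rank-based, outlier-robust
mask = views < np.percentile(views, 99)   # trim extreme view counts
r_trim, _ = pearsonr(sentiment[mask], views[mask])
```

This mirrors the pattern reported in the abstract: a near-zero raw Pearson r, but clearly larger Spearman and trimmed-Pearson values once the heavy-tailed view distribution is accounted for.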
Review for NeurIPS paper: Semantic Visual Navigation by Watching YouTube Videos
Weaknesses: The first weakness of this work is the lack of analysis of the overall video-to-experience framework. Each component in this pipeline can introduce error(s) and assumption(s) that must be carefully considered and analyzed. It would greatly aid this work to include discussion of the assumptions taken on by each component, provide discussion about error introduced by each component, and discuss alternative components (and why the chosen ones were used over them). As an example, for the inverse dynamics model: What are the "handful of environments" that are used to train the inverse dynamics model? How different are they from the evaluation setting?
Review for NeurIPS paper: Semantic Visual Navigation by Watching YouTube Videos
This paper proposes to leverage (mostly real-estate) unlabelled YouTube videos of egocentric navigation in indoor environments, to train the Q value function network for the high-level part of a hierarchical RL policy for goal-driven indoor robot navigation. The lower-level part relies on depth-based obstacle avoidance and planning in 2D maps. The method works in an unsupervised way by relying on two ways of augmenting the egocentric navigation video dataset: 1) extract action labels from motion classifiers and 2) extract semantic goal labels from object detection. It uses these two to 3) build experience replay tuples of (previous image, action, next image, goal) and then train the goal-conditional value function using Q-Learning. The high-level policy predicts Q values for navigating a topological graph.
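The three steps the review enumerates can be sketched as a small pseudo-labeling loop. In the sketch below, `predict_action` (standing in for the motion classifier / inverse dynamics model) and `detect_goal` (standing in for the object detector) are hypothetical placeholders, not the paper's actual components.

```python
def make_replay_tuples(frames, predict_action, detect_goal, goal="chair"):
    """Turn consecutive video frames into (frame, action, next_frame,
    reward) replay tuples using pseudo-labels from the two models."""
    tuples = []
    for t in range(len(frames) - 1):
        action = predict_action(frames[t], frames[t + 1])            # 1) action label
        reward = 1.0 if goal in detect_goal(frames[t + 1]) else 0.0  # 2) goal label
        tuples.append((frames[t], action, frames[t + 1], reward))    # 3) replay tuple
    return tuples

# Toy stand-ins: every transition is pseudo-labeled "forward", and a
# chair is detected only in the final frame of the clip.
frames = ["f0", "f1", "f2"]
tuples = make_replay_tuples(
    frames,
    predict_action=lambda f, g: "forward",
    detect_goal=lambda f: ["chair"] if f == "f2" else [],
)
```

The resulting tuples are exactly the quadruples on which the goal-conditional value function is then trained with Q-learning.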
Towards Safer Social Media Platforms: Scalable and Performant Few-Shot Harmful Content Moderation Using Large Language Models
Bonagiri, Akash, Li, Lucen, Oak, Rajvardhan, Babar, Zeerak, Wojcieszak, Magdalena, Chhabra, Anshuman
The prevalence of harmful content on social media platforms poses significant risks to users and society, necessitating more effective and scalable content moderation strategies. Current approaches rely on human moderators, supervised classifiers, and large volumes of training data, and often struggle with scalability, subjectivity, and the dynamic nature of harmful content (e.g., violent content, dangerous challenge trends, etc.). To bridge these gaps, we utilize Large Language Models (LLMs) to undertake few-shot dynamic content moderation via in-context learning. Through extensive experiments on multiple LLMs, we demonstrate that our few-shot approaches can outperform existing proprietary baselines (Perspective and OpenAI Moderation) as well as prior state-of-the-art few-shot learning methods, in identifying harm. We also incorporate visual information (video thumbnails) and assess if different multimodal techniques improve model performance. Our results underscore the significant benefits of employing LLM based methods for scalable and dynamic harmful content moderation online.
- North America > United States > New York (0.04)
- North America > United States > Florida (0.04)
- North America > United States > California > Yolo County > Davis (0.04)
- Health & Medicine (1.00)
- Media > News (0.46)
- Information Technology > Services (0.46)
- Education > Educational Setting (0.46)
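Few-shot moderation via in-context learning, as described in the abstract above, amounts to prepending labeled exemplars to the classification query. The sketch below builds such a prompt; the exemplars and two-way label set are invented for illustration, and the actual LLM call is omitted since the paper's prompts and models are not reproduced here.

```python
# Hypothetical few-shot exemplars for in-context harm classification.
FEW_SHOT_EXAMPLES = [
    ("Try the blackout challenge, hold your breath until you pass out", "harmful"),
    ("Top 10 cozy soup recipes for fall", "benign"),
]

def build_prompt(comment: str) -> str:
    """Assemble an in-context classification prompt for an LLM."""
    lines = ["Classify each comment as 'harmful' or 'benign'.", ""]
    for text, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Comment: {text}\nLabel: {label}\n")
    lines.append(f"Comment: {comment}\nLabel:")
    return "\n".join(lines)

prompt = build_prompt("You should try this dangerous stunt at home")
# `prompt` would then be sent to the moderation LLM, whose completion
# is read off as the predicted label.
```

Because the exemplars live in the prompt rather than in model weights, the label set can be updated as harmful trends evolve, which is the scalability argument the abstract makes against supervised classifiers.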
Gemini AI is coming to Google TV devices in 2025, making them easier to talk to
This week at CES, Google presented an early look at new software and hardware upgrades coming to Google TV devices. The new features include the integration of Gemini, Google's AI model, to the Google Assistant, as well as a new ambient experience. New smart TVs with Google TV will also gain far-field mics and proximity sensors to support the new software perks. If you've used a Google TV or Google streaming device, you may have already used the "hey Google" prompt to search for shows to watch. With the addition of Gemini, those "conversations" should now feel more natural.
- North America > United States > Nevada > Clark County > Las Vegas (0.06)
- North America > United States > Illinois > Cook County > Chicago (0.06)