scout


Yahoo is adding generative AI to its search engine

Engadget

Yahoo has a new AI-powered answer engine, dubbed Yahoo Scout. The tool, available now in beta, is powered by Anthropic's Claude and is integrated across the company's products. Yahoo says Scout synthesizes information from the web, as well as Yahoo's own data and content, when constructing responses to users' natural-language search queries. The interface will include interactive digital media, structured lists and tables, and visible source links aimed at making answers easier to verify.


AlphaBeta is not as good as you think: a simple random games model for a better analysis of deterministic game-solving algorithms

Boige, Raphaël, Boumaza, Amine, Scherrer, Bruno

arXiv.org Artificial Intelligence

Deterministic game-solving algorithms are conventionally analyzed in light of their average-case complexity against a distribution of random game-trees, where leaf values are independently sampled from a fixed distribution. This simplified model enables uncluttered mathematical analysis, revealing two key properties: root value distributions asymptotically collapse to a single fixed value for finite-valued trees, and all reasonable algorithms achieve global optimality. However, these findings are artifacts of the model's design: its long-criticized independence assumption strips games of structural complexity, producing trivial instances where no algorithm faces meaningful challenges. To address this limitation, we introduce a simple probabilistic model that incrementally constructs game-trees using a fixed level-wise conditional distribution. By enforcing ancestor dependencies, a critical structural feature of real-world games, our framework generates problems with adjustable difficulty while retaining some form of analytical tractability. For several algorithms, including AlphaBeta and Scout, we derive recursive formulas characterizing their average-case complexities under this model. These allow us to rigorously compare algorithms on deep game-trees, where Monte-Carlo simulations are no longer feasible. While asymptotically all algorithms seem to converge to an identical branching factor (a result analogous to that of independence-based models), deep finite trees reveal stark differences: AlphaBeta incurs a significantly larger constant multiplicative factor compared to algorithms like Scout, leading to a substantial practical slowdown. Our framework sheds new light on classical game-solving algorithms, offering rigorous evidence and analytical tools to advance the understanding of these methods under a richer, more challenging, and yet tractable model.
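The comparison hinges on counting how many leaves each algorithm evaluates. As a rough illustration of both ideas, an ancestor-dependent random tree and leaf counting, here is a toy Python sketch; the tree sizes, the ±0.1 conditional noise, and the clamping are illustrative choices, not the paper's actual model:

```python
import random

def build_tree(depth, branching, parent_value=0.5):
    """Build a random game tree. In the classical independence model,
    leaf values are i.i.d.; here each child is instead conditioned on
    its parent (a toy version of a level-wise conditional model)."""
    if depth == 0:
        # leaf value correlated with its ancestor chain, clamped to [0, 1]
        return min(1.0, max(0.0, parent_value + random.uniform(-0.1, 0.1)))
    return [build_tree(depth - 1, branching,
                       parent_value + random.uniform(-0.1, 0.1))
            for _ in range(branching)]

def alphabeta(node, alpha=-float("inf"), beta=float("inf"),
              maximizing=True, counter=None):
    """Plain alpha-beta; `counter` tallies visited leaves."""
    if not isinstance(node, list):          # leaf
        counter[0] += 1
        return node
    value = -float("inf") if maximizing else float("inf")
    for child in node:
        v = alphabeta(child, alpha, beta, not maximizing, counter)
        if maximizing:
            value = max(value, v)
            alpha = max(alpha, value)
        else:
            value = min(value, v)
            beta = min(beta, value)
        if alpha >= beta:                   # prune remaining siblings
            break
    return value

random.seed(0)
tree = build_tree(depth=8, branching=3)
count = [0]
root = alphabeta(tree, counter=count)
print(count[0], 3 ** 8)  # leaves visited by alpha-beta vs. full minimax
```

Scout (null-window search) would be instrumented the same way; the paper's contribution is deriving such counts analytically, via recursive formulas, rather than by simulation.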


Remembering Unequally: Global and Disciplinary Bias in LLM-Generated Co-Authorship Networks

Kalhor, Ghazal, Mashhadi, Afra

arXiv.org Artificial Intelligence

Ongoing breakthroughs in Large Language Models (LLMs) are reshaping search and recommendation platforms at their core. While this shift unlocks powerful new scientometric tools, it also exposes critical fairness and bias issues that could erode the integrity of the information ecosystem. Additionally, as LLMs become more integrated into web-based searches for scholarly tools, their ability to generate summarized research work based on memorized data introduces new dimensions to these challenges. The extent of memorization in LLMs can impact the accuracy and fairness of the co-authorship networks they produce, potentially reflecting and amplifying existing biases within the scientific community and across different regions. This study critically examines the impact of LLM memorization on LLM-generated co-authorship networks. To this end, we assess memorization effects across three prominent models (DeepSeek R1, Llama 4 Scout, and Mixtral 8x7B), analyzing how memorization-driven outputs vary across academic disciplines and world regions. While our global analysis reveals a consistent bias favoring highly cited researchers, this pattern is not uniformly observed. Certain disciplines, such as Clinical Medicine, and regions, including parts of Africa, show more balanced representation, pointing to areas where LLM training data may reflect greater equity. These findings underscore both the risks and opportunities in deploying LLMs for scholarly discovery.


SCOUT: A Lightweight Framework for Scenario Coverage Assessment in Autonomous Driving

Yildiz, Anil, Thornton, Sarah M., Hildebrandt, Carl, Roy-Singh, Sreeja, Kochenderfer, Mykel J.

arXiv.org Artificial Intelligence

Assessing scenario coverage is crucial for evaluating the robustness of autonomous agents, yet existing methods rely on expensive human annotations or computationally intensive Large Vision-Language Models (LVLMs). These approaches are impractical for large-scale deployment due to cost and efficiency constraints. To address these shortcomings, we propose SCOUT (Scenario Coverage Oversight and Understanding Tool), a lightweight surrogate model designed to predict scenario coverage labels directly from an agent's latent sensor representations. SCOUT is trained through a distillation process, learning to approximate LVLM-generated coverage labels while eliminating the need for continuous LVLM inference or human annotation. By leveraging precomputed perception features, SCOUT avoids redundant computations and enables fast, scalable scenario coverage estimation. We evaluate our method across a large dataset of real-life autonomous navigation scenarios, demonstrating that it maintains high accuracy while significantly reducing computational cost. Our results show that SCOUT provides an effective and practical alternative for large-scale coverage analysis. While its performance depends on the quality of LVLM-generated training labels, SCOUT represents a major step toward efficient scenario coverage oversight in autonomous systems.
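The abstract does not specify SCOUT's model class, so the distillation setup can only be sketched under assumptions: below, the LVLM's coverage labels are faked by a noisy linear rule, the precomputed perception features are random vectors, and the surrogate is a plain logistic regression trained on the teacher's labels.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: precomputed perception features for N driving scenarios,
# and binary coverage labels produced offline by an expensive LVLM
# (faked here by a noisy linear rule).
N, D = 500, 16
features = rng.normal(size=(N, D))
true_w = rng.normal(size=D)
lvlm_labels = (features @ true_w + rng.normal(scale=0.5, size=N) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Lightweight surrogate: logistic regression distilled from the LVLM
# labels, trained by plain gradient descent.
w = np.zeros(D)
lr = 0.1
for _ in range(300):
    p = sigmoid(features @ w)
    w -= lr * features.T @ (p - lvlm_labels) / N

# At deployment the cheap surrogate replaces LVLM calls entirely.
preds = (sigmoid(features @ w) > 0.5).astype(float)
accuracy = (preds == lvlm_labels).mean()
print(f"agreement with LVLM labels: {accuracy:.2f}")
```

In the real system the features would come from the agent's perception stack and the teacher labels from a single offline LVLM pass; only the distillation pattern carries over from this sketch.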


The Good, the Bad, and the Sampled: a No-Regret Approach to Safe Online Classification

Baharav, Tavor Z., Dragazis, Spyros, Pacchiano, Aldo

arXiv.org Machine Learning

We study the problem of sequentially testing individuals for a binary disease outcome whose true risk is governed by an unknown logistic model. At each round, a patient arrives with feature vector $x_t$, and the decision maker may either pay to administer a (noiseless) diagnostic test, revealing the true label, or skip testing and predict the patient's disease status based on their feature vector and prior history. Our goal is to minimize the total number of costly tests required while guaranteeing that the fraction of misclassifications does not exceed a prespecified error tolerance $\alpha$, with probability at least $1-\delta$. To address this, we develop a novel algorithm that interleaves label-collection and distribution estimation to estimate both $\theta^{*}$ and the context distribution $P$, and computes a conservative, data-driven threshold $\tau_t$ on the logistic score $|x_t^\top\theta|$ to decide when testing is necessary. We prove that, with probability at least $1-\delta$, our procedure does not exceed the target misclassification rate, and requires only $O(\sqrt{T})$ excess tests compared to the oracle baseline that knows both $\theta^{*}$ and the patient feature distribution $P$. This establishes the first no-regret guarantees for error-constrained logistic testing, with direct applications to cost-sensitive medical screening. Simulations corroborate our theoretical guarantees, showing that in practice our procedure efficiently estimates $\theta^{*}$ while retaining safety guarantees, and does not require too many excess tests.
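The paper's data-driven threshold $\tau_t$ comes with formal guarantees that a toy simulation cannot reproduce, but the test-or-predict loop itself is easy to sketch. Everything below (the fixed margin tau = 2, the periodic refit, the problem dimensions) is an illustrative assumption, not the authors' algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)
d, T = 3, 1000
theta_star = np.array([1.5, -2.0, 1.0])   # unknown to the algorithm

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, iters=100, lr=0.5):
    """Plain gradient-descent logistic fit on the tested patients so far."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

tau = 2.0                 # fixed margin; the paper's tau_t is data-driven
X_seen, y_seen = [], []
tests = skipped = errors = 0
theta_hat = np.zeros(d)

for t in range(T):
    x = rng.normal(size=d)
    y = float(rng.random() < sigmoid(x @ theta_star))   # realized label
    score = x @ theta_hat
    if tests < 30 or abs(score) < tau:
        # uncertain region: pay for a diagnostic test and observe the label
        tests += 1
        X_seen.append(x); y_seen.append(y)
        if tests % 25 == 0:                             # periodic refit
            theta_hat = fit_logistic(np.array(X_seen), np.array(y_seen))
    else:
        # confident region: predict from the score, no test
        skipped += 1
        errors += float((score > 0) != (y == 1.0))

print(f"tests: {tests}/{T}, error rate on untested rounds: "
      f"{errors / max(1, skipped):.3f}")
```

With a large enough margin, rounds that skip the test carry a high logistic confidence, so the empirical error rate on untested rounds stays small; the paper's contribution is choosing $\tau_t$ from data so this holds with probability $1-\delta$ while testing only $O(\sqrt{T})$ more often than the oracle.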


Optimal Dispersion Under Asynchrony

Pattanayak, Debasish, Kshemkalyani, Ajay D., Kumar, Manish, Molla, Anisur Rahaman, Sharma, Gokarna

arXiv.org Artificial Intelligence

The problem of dispersion of mobile agents, studied extensively in recent distributed computing literature, not only takes its inspiration from biological phenomena, such as damselfish establishing non-overlapping territories on coral reefs [6] or neural crest cells migrating and distributing themselves across the developing embryo [28], but also has practical applications, such as placing a fleet of small autonomous robots (agents) under shelves (nodes) in fulfillment centers [41]. It is also closely connected to other coordination tasks such as exploration, scattering, load balancing, and self-deployment [5, 10, 12, 14, 16, 39]. The dispersion problem involves k ≤ n mobile agents initially placed arbitrarily on the nodes of an n-node anonymous graph of maximum degree Δ. The goal for the agents is to autonomously relocate such that each agent is on a distinct node of the graph (see Figure 1). The objective is to design algorithms that simultaneously optimize time and memory complexities. Time complexity is the total time required to achieve dispersion starting from any initial configuration. Memory complexity is the maximum number of bits stored in the persistent memory of each agent throughout the execution. We stress that graph nodes are memory-less and cannot store any information.
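Many dispersion algorithms are built on a depth-first traversal in which co-located agents move together and one agent settles at each newly discovered free node. The centralized toy below sketches only that idea; real dispersion algorithms are distributed, asynchronous, and memory-constrained, and the graph here is an invented example:

```python
from collections import defaultdict

def disperse_dfs(adj, start, k):
    """Toy centralized sketch: k agents start together on `start` and walk
    a DFS of the graph, settling one agent at each newly discovered free
    node. Real dispersion algorithms do this distributedly, with each
    agent holding only a small number of bits of persistent memory."""
    settled = {}                        # node -> agent id
    remaining = list(range(k))          # agents still moving
    stack, visited = [start], set()
    while stack and remaining:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        settled[node] = remaining.pop(0)   # settle one agent here
        stack.extend(adj[node])
    return settled

# Example: 6 agents dispersing on a 3x2 grid graph (nodes 0..5).
adj = defaultdict(list)
def add_edge(u, v):
    adj[u].append(v); adj[v].append(u)
for u, v in [(0, 1), (1, 2), (3, 4), (4, 5), (0, 3), (1, 4), (2, 5)]:
    add_edge(u, v)

placement = disperse_dfs(adj, start=0, k=6)
print(placement)  # each of the 6 nodes hosts exactly one agent
```

The interesting complexity questions arise precisely where this sketch cheats: agents must coordinate without global state, without node memory, and under an adversarial asynchronous scheduler.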


SCOUT: Teaching Pre-trained Language Models to Enhance Reasoning via Flow Chain-of-Thought

Li, Guanghao, Jiang, Wenhao, Chen, Mingfeng, Li, Yan, Yu, Hao, Dong, Shuting, Ren, Tao, Tang, Ming, Yuan, Chun

arXiv.org Artificial Intelligence

Chain of Thought (CoT) prompting improves the reasoning performance of large language models (LLMs) by encouraging step-by-step thinking. However, CoT-based methods depend on intermediate reasoning steps, which limits scalability and generalization. Recent work explores recursive reasoning, where LLMs reuse internal layers across iterations to refine latent representations without explicit CoT supervision. While promising, these approaches often require costly pretraining and lack a principled framework for how reasoning should evolve across iterations. We address this gap by introducing Flow Chain of Thought (Flow CoT), a reasoning paradigm that models recursive inference as a progressive trajectory of latent cognitive states. Flow CoT frames each iteration as a distinct cognitive stage, deepening reasoning across iterations without relying on manual supervision. To realize this, we propose SCOUT (Stepwise Cognitive Optimization Using Teachers), a lightweight fine-tuning framework that enables Flow CoT-style reasoning without the need for pretraining. SCOUT uses progressive distillation to align each iteration with a teacher of appropriate capacity, and a cross-attention-based retrospective module that integrates outputs from previous iterations while preserving the model's original computation flow. Experiments across eight reasoning benchmarks show that SCOUT consistently improves both accuracy and explanation quality, achieving up to 1.8% gains under fine-tuning. Qualitative analyses further reveal that SCOUT enables progressively deeper reasoning across iterations, refining both belief formation and explanation granularity. These results not only validate the effectiveness of SCOUT, but also demonstrate the practical viability of Flow CoT as a scalable framework for enhancing reasoning in LLMs.


PitcherNet helps researchers throw strikes with AI analysis

AIHub

University of Waterloo researchers have developed new artificial intelligence (AI) technology that can accurately analyze pitcher performance and mechanics using low-resolution video of baseball games. The system, developed for the Baltimore Orioles by the Waterloo team, plugs holes in much more elaborate and expensive technology already installed in most stadiums that host Major League Baseball (MLB), whose teams have increasingly tapped into data analytics in recent years. Those systems, produced by a company called Hawk-Eye Innovations, use multiple special cameras in each park to catch players in action, but the data they yield is typically available to the home team that owns the stadium those games are played in. To add away games to their analytics operation, as well as use smartphone video taken by scouts in minor league and college games, the Orioles asked video and AI experts at Waterloo for help about three years ago. The Waterloo researchers convert video of a pitcher's performance into a two-dimensional model that PitcherNet's AI algorithm can then analyze.


Meta introduces Llama 4 with two new AI models available now, and two more on the way

Engadget

Meta has released the first two models from its multimodal Llama 4 suite: Llama 4 Scout and Llama 4 Maverick. Maverick is "the workhorse" of the two and excels at image and text understanding for "general assistant and chat use cases," the company said in a blog post, while the smaller model Scout could tackle things like "multi-document summarization, parsing extensive user activity for personalized tasks, and reasoning over vast codebases." The company also introduced Llama 4 Behemoth, an upcoming model it says is "among the world's smartest LLMs," and CEO Mark Zuckerberg said we'll be hearing about a fourth model, Llama 4 Reasoning, "in the next month." Both Maverick and Scout are available to download now from the Llama website and Hugging Face, and they've been added to Meta AI, including for WhatsApp, Messenger and Instagram DMs. Scout has 17 billion active parameters with 16 experts, Meta says.


The Dream Hotel by Laila Lalami review – what if AI could read our minds?

The Guardian

Arriving home at Los Angeles international airport, Sara Hussein is asked by immigration and customs officers to step aside, then taken to an interview room. The fundamentals of this scene are familiar – you've probably watched something like it in a film, or dreamed about it happening to you; perhaps it already has. But Sara lives in a new world, several decades in the future, and she is being arrested because Scout, the state's AI security system, has flagged something irregular inside her mind. Sara seems unexceptional: she's a museum archivist, married and mother to young twins. She once had an argument with her husband Elias after he impulsively part-exchanged the family Toyota for a Volvo.