Media
Ground-Truth Subgraphs for Better Training and Evaluation of Knowledge Graph Augmented LLMs
Cattaneo, Alberto, Luschi, Carlo, Justus, Daniel
Retrieval of information from graph-structured knowledge bases represents a promising direction for improving the factuality of LLMs. While various solutions have been proposed, a comparison of methods is difficult due to the lack of challenging QA datasets with ground-truth targets for graph retrieval. We present SynthKGQA, an LLM-powered framework for generating high-quality Knowledge Graph Question Answering datasets from any Knowledge Graph, providing the full set of ground-truth facts in the KG to reason over questions. We show how, in addition to enabling more informative benchmarking of KG retrievers, the data produced with SynthKGQA also allows us to train better models. We apply SynthKGQA to Wikidata to generate GTSQA, a new dataset designed to test zero-shot generalization abilities of KG retrievers with respect to unseen graph structures and relation types, and benchmark popular solutions for KG-augmented LLMs on it.
TreeRare: Syntax Tree-Guided Retrieval and Reasoning for Knowledge-Intensive Question Answering
Zhang, Boyi, Liu, Zhuo, He, Hangfeng
In real practice, questions are typically complex and knowledge-intensive, requiring Large Language Models (LLMs) to recognize the multifaceted nature of the question and reason across multiple information sources. Iterative and adaptive retrieval, where LLMs decide when and what to retrieve based on their reasoning, has been shown to be a promising approach to resolve complex, knowledge-intensive questions. However, the performance of such retrieval frameworks is limited by the accumulation of reasoning errors and misaligned retrieval results. To overcome these limitations, we propose TreeRare (Syntax Tree-Guided Retrieval and Reasoning), a framework that utilizes syntax trees to guide information retrieval and reasoning for question answering. Following the principle of compositionality, TreeRare traverses the syntax tree in a bottom-up fashion, and in each node, it generates subcomponent-based queries and retrieves relevant passages to resolve localized uncertainty. A subcomponent question answering module then synthesizes these passages into concise, context-aware evidence. Finally, TreeRare aggregates the evidence across the tree to form a final answer. Experiments across five question answering datasets involving ambiguous or multi-hop reasoning demonstrate that TreeRare achieves substantial improvements over existing state-of-the-art methods.
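The bottom-up traversal described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` class, the `collect_evidence` function, and the `retrieve` and `summarize` callables are hypothetical names introduced here, and the retriever and subcomponent answering module are stand-ins that a real system would back with a search index and an LLM.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Node:
    """One node of a question's syntax tree (hypothetical structure)."""
    text: str
    children: List["Node"] = field(default_factory=list)


def collect_evidence(
    node: Node,
    retrieve: Callable[[str], List[str]],
    summarize: Callable[[str, List[str], List[Any]], Any],
) -> List[Any]:
    """Visit the tree in post-order and return one piece of evidence per node.

    Children are resolved before their parent, so each node sees the
    evidence already gathered from its whole subtree.
    """
    subtree_evidence: List[Any] = []
    for child in node.children:
        subtree_evidence.extend(collect_evidence(child, retrieve, summarize))

    # Assumption: the node's own text serves as the retrieval query.
    passages = retrieve(node.text)
    evidence = summarize(node.text, passages, subtree_evidence)
    return subtree_evidence + [evidence]


if __name__ == "__main__":
    tree = Node("root", [Node("a"), Node("b", [Node("c")])])
    notes = collect_evidence(
        tree,
        retrieve=lambda query: [f"passage about {query}"],
        summarize=lambda query, passages, prior: (query, len(prior)),
    )
    print(notes)
```

A final answer step would then read the returned list, whose last entry belongs to the root and has seen the evidence from every other node.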
Turing Test 2.0: The General Intelligence Threshold
With the rise of artificial intelligence (A.I.) and large language models like ChatGPT, a new race for achieving artificial general intelligence (A.G.I.) has started. While many speculate how and when A.I. will achieve A.G.I., there is no clear agreement on how A.G.I. can be detected in A.I. models, even when popular tools like the Turing test (and its modern variations) are used to measure their intelligence. In this work, we discuss why traditional methods like the Turing test do not suffice for measuring or detecting A.G.I. and provide a new, practical method that can be used to decide if a system (computer or any other) has reached or surpassed A.G.I. To achieve this, we make two new contributions. First, we present a clear definition for general intelligence (G.I.) and set a G.I. Threshold (G.I.T.) that can be used to distinguish between systems that achieve A.G.I. and systems that do not. Second, we present a new framework on how to construct tests that can detect if a system has achieved G.I. in a simple, comprehensive, and clear-cut fail/pass way. We call this novel framework the Turing test 2.0. We then demonstrate real-life examples of applying tests that follow our Turing test 2.0 framework on modern A.I. models.
Cloudflare Has Blocked 416 Billion AI Bot Requests Since July 1
Cloudflare CEO Matthew Prince claims the internet infrastructure company's efforts to block AI crawlers are already seeing big results. As the large language models powering generative AI tools slurp up ever more data across the web, Cloudflare cofounder and CEO Matthew Prince said at WIRED's Big Interview event in San Francisco on Thursday that the internet infrastructure company has blocked more than 400 billion AI bot requests for its customers since July 1. The action comes after the company announced a Content Independence Day in July--an initiative with prominent publishers and AI firms to block AI crawlers by default on content creators' work unless the AI companies pay for access. Since July 2024, Cloudflare has offered customers tools to block AI bots from scraping their content. Cloudflare told WIRED that the number of AI bots blocked since July 1, 2025 is 416 billion.
Where Does the Buck Stop on Killing Boat Strike Survivors?
The "Kill Them All" Edition: US officials debate who to blame for the military killing of shipwrecked alleged drug smugglers; Democrats celebrate despite losing a special election in Tennessee; and the future of self-driving cars.
Anthropic's Daniela Amodei Believes the Market Will Reward Safe AI
The Trump administration might think regulation is killing the AI industry, but Anthropic president Daniela Amodei disagrees. The Trump administration may think regulation is crippling the AI industry, but one of the industry's biggest players doesn't agree. At WIRED's Big Interview event on Thursday, Anthropic president and cofounder Daniela Amodei told WIRED editor at large Steven Levy that even though Trump's AI and crypto czar, David Sacks, may have tweeted that her company is "running a sophisticated regulatory capture strategy based on fear-mongering," she's convinced her company's commitment to calling out the potential dangers of AI is making the industry stronger.
What Happens When Your Coworkers Are AI Agents
In this episode of Uncanny Valley, we talk to writer Evan Ratliff about how he created a small startup made entirely of AI employees--and what his findings reveal about the reality of an agentic future. This year, AI agents have been at the forefront of tech companies' ambitions. OpenAI's Sam Altman has often talked about a possible billion-dollar company being spun up with just one human and an army of AI agents. And so last summer, journalist Evan Ratliff decided to try to become that unicorn himself--by creating HarumoAI, a small startup that's made up of AI employees and executives. Hosts Michael Calore and Lauren Goode sit down with Evan to discuss how it's going, and the current promises and realities of AI agents. Write to us at uncannyvalley@wired.com. Hey, Lauren, how are you doing? It was so fantastic that I had a hard time coming back, honestly. And I saw a lot of really beautiful art. Not a bad place to go for vacation, I have to say. I've heard this before, I confirmed it. And after seeing so much incredible art and just people doing stuff with their hands and tangible goods, I was like, I don't want to go back to the world of AI. I didn't want to go back to sitting in a coffee shop and hearing everyone pitching their AI startups and driving on the 101 and seeing the billboards. I was just like, What? No, keep me in the land of burrata and Caravaggio. Well, Lauren, I'm sorry to tell you that you came back on the show just in time to talk about AI agents. It's something that we've talked about a lot this year and our listeners have heard about it a lot, and we're not sick of talking about it.
Israel to remain in Eurovision Song Contest
Spain and the Netherlands will boycott next year's Eurovision Song Contest, after Israel was allowed to compete. They were among a number of countries that had called for Israel to be excluded over the humanitarian toll of the war in Gaza, and accusations of unfair voting practices. Despite calls for a vote on Israel's participation, members instead approved a new set of rules intended to protect the integrity of the contest. In a statement, Dutch broadcaster Avrotros said that "participation under the current circumstances is incompatible with the public values that are essential to us." Spanish broadcaster RTVE added: "The board of directors of RTVE agreed last September that Spain would withdraw from Eurovision if Israel was part of it."