Business Law
PHANTOM: A Benchmark for Hallucination Detection in Financial Long-Context QA
While Large Language Models (LLMs) show great promise, their tendency to hallucinate poses significant risks in high-stakes domains like finance, especially when used for regulatory reporting and decision-making. Existing hallucination detection benchmarks fail to capture the complexities of financial benchmarks, which require high numerical precision, nuanced understanding of the language of finance, and the ability to handle long-context documents. To address this, we introduce PHANTOM, a novel benchmark dataset for evaluating hallucination detection in long-context financial QA. Our approach first generates a seed dataset of high-quality "query-answer-document (chunk)" triplets with either hallucinated or correct answers, which are validated by human annotators and subsequently expanded to capture various context lengths and information placements. We demonstrate how PHANTOM allows fair comparison of hallucination detection models and provides insights into LLM performance, offering a valuable resource for improving hallucination detection in financial applications. Further, our benchmarking results highlight the severe challenges out-of-the-box models face in detecting real-world hallucinations on long-context data, and establish some promising directions towards alleviating these challenges by fine-tuning open-source LLMs using PHANTOM.
'Creepy' Listening Tool for Targeted Ads Didn't Actually Work, FTC Says
Three firms will pay nearly $1 million for selling "Active Listening" technology that they claimed tapped people's phones for advertising. The FTC alleges the "tech" was just pricey email lists. The Federal Trade Commission announced on Thursday that Cox Media Group (CMG) and two other marketing companies, MindSift LLC and 1010 Digital Works, have agreed to collectively pay nearly $1 million to settle allegations that they deceived their customers, which were other businesses, by claiming that they could help target ads based on audio recordings collected from consumers' smart devices via a marketing service called Active Listening. In a statement to WIRED, a spokesperson for CMG says, "We are pleased to have this matter resolved. Our local marketing team relied on marketing materials provided to us by a third-party vendor about their product. We withdrew the materials expeditiously and stopped further use of the product."
Top Google scientist says EU data measures pose privacy risk for users
A top Google scientist warned EU antitrust regulators that their proposal requiring the company to share search engine data with rivals risked exposing users' private information. BRUSSELS - A top Google scientist sent a warning to EU antitrust regulators on Tuesday that their proposal requiring the company to share search engine data with rivals such as OpenAI risked exposing users' private information, the sternest rebuke yet in a tussle over Google's lucrative business model. The European Commission, which acts as the EU competition enforcer, has in recent years cracked down on Big Tech via a slew of legislation to ensure that users have more choices and that smaller rivals have room to compete. However, that has triggered the ire of the U.S. government. Sergei Vassilvitskii, with the title of distinguished scientist at Google since 2012 and regarded as a leader in his field, will meet EU antitrust officials on Wednesday to voice his concerns and propose a broader approach with better guardrails.
EU warns Meta over blocking rival AI chatbots on WhatsApp
Meta AI is essentially the only AI assistant now available on WhatsApp. The EU could take interim measures against WhatsApp as it investigates AI providers' access to the app. On Monday, the EU's regulatory arm announced its preliminary view that Meta, WhatsApp's parent company, violated antitrust laws by blocking third-party AI assistants from operating on WhatsApp. The European Commission is concerned that Meta's actions will limit competitors from entering the AI assistant market. "We must protect effective competition in this vibrant field, which means we cannot allow dominant tech companies to illegally leverage their dominance to give themselves an unfair advantage," Teresa Ribera, executive vice-president for Clean, Just and Competitive Transition, said in a statement. Ribera continued: "AI markets are developing at rapid pace, so we also need to be swift in our action."