 Industry


Stolen iPhones fuel scary passcode scam

FOX News



ClinBench: A Standardized Multi-Domain Framework for Evaluating Large Language Models in Clinical Information Extraction

Neural Information Processing Systems

Large Language Models (LLMs) offer substantial promise for clinical natural language processing (NLP); however, a lack of standardized benchmarking methodologies limits their objective evaluation and practical translation. To address this gap, we introduce ClinBench, an open-source, multi-model, multi-domain benchmarking framework. ClinBench is designed for the rigorous evaluation of LLMs on important structured information extraction tasks (e.g., tumor staging, histologic diagnoses, atrial fibrillation, and social determinants of health) from unstructured clinical notes. The framework standardizes the evaluation pipeline by: (i) operating on consistently structured input datasets; (ii) employing dynamic, YAML-based prompting for uniform task definition; and (iii) enforcing output validation via JSON schemas, supporting robust comparison across diverse LLM architectures. We demonstrate ClinBench through a large-scale study of 11 prominent LLMs (e.g., GPT-4o series, LLaMA3 variants, Mixtral) across three clinical domains using configurations of public datasets (TCGA for lung cancer, MIMIC-IV-ECG for atrial fibrillation, and MIMIC notes for SDOH). Our results reveal significant performance-efficiency trade-offs. For example, when averaged across the four benchmarked clinical extraction tasks, GPT-3.5-turbo
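The prompt-then-validate pipeline described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from ClinBench: the schema contents, the field names, and the functions `build_prompt` and `validate_output`. A plain string template stands in for the YAML prompt definition, and a hand-written check on required keys and allowed values stands in for a full JSON Schema validator, so that the example needs only the standard library.

```python
import json

# Hypothetical schema for one extraction task. The field names and allowed
# values are illustrative assumptions, not ClinBench's actual schema.
AFIB_SCHEMA = {
    "required": {"atrial_fibrillation": str, "evidence": str},
    "enums": {"atrial_fibrillation": {"present", "absent", "unknown"}},
}

# Stand-in for a YAML prompt definition: one template shared by all models.
PROMPT_TEMPLATE = (
    "Task: {task}\n"
    "Return only a JSON object with keys: {keys}.\n"
    "Note:\n{note}"
)


def build_prompt(task, schema, note):
    """Fill the shared template so every model sees the same task wording."""
    keys = ", ".join(sorted(schema["required"]))
    return PROMPT_TEMPLATE.format(task=task, keys=keys, note=note)


def validate_output(raw, schema):
    """Parse a model's raw reply and return (parsed, errors)."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(parsed, dict):
        return None, ["top-level value is not an object"]

    errors = []
    # Every required key must be present with the expected type.
    for key, expected_type in schema["required"].items():
        if key not in parsed:
            errors.append(f"missing key: {key}")
        elif not isinstance(parsed[key], expected_type):
            errors.append(f"wrong type for {key}")
    # Categorical fields must use one of the allowed labels.
    for key, allowed in schema["enums"].items():
        value = parsed.get(key)
        if isinstance(value, str) and value not in allowed:
            errors.append(f"unexpected value for {key}: {value}")
    return parsed, errors
```

Replies that fail validation can then be counted as errors for every model in the same way, which is what makes scores comparable across architectures.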


France-Germany jet plans crash: Can Europe end reliance on US for security?

Al Jazeera

France and Germany announced this week that they are ditching a landmark project to jointly develop a sixth-generation fighter jet. French President Emmanuel Macron confirmed on Monday that the project is being terminated, in what is being seen as a major blow to efforts to boost defence cooperation between European Union states, a key issue amid uncertainty cast by United States President Donald Trump over the readiness of the US to help defend its NATO allies. Since 2019, the US president has been flirting with the idea of obtaining Greenland. His remarks about his desire for the island, a self-governing territory which is part of the Kingdom of Denmark, built to a crescendo at the start of this year, with European leaders signalling their displeasure with the idea and Trump even threatening additional trade tariffs on those countries standing in his way.


Apple says Siri AI won't suck up to you

Engadget

Craig Federighi, Apple's SVP of engineering, said the new Siri will resist attempts at romance. Siri AI will fend off users' attempts at romancing it, according to Federighi. As MacRumors has reported, Federighi clarified in a podcast interview, alongside Apple marketing chief Greg Joswiak, that the upgraded Siri for iOS 27 won't suck up to you like other AIs. "Quite the opposite because as you may know, if you use many of the existing chat bots, they're really focused on engagement to a large degree. They kind of wanna pull you in," Federighi replied when asked about the possibility of Siri becoming a user's AI partner.


Why Real-Life Disclosure Day Will Look Nothing Like Steven Spielberg's New Movie

WIRED

Previous landmark scientific discoveries like the Higgs boson provide a better template for what it will take to confirm whether aliens have made contact with Earth. Steven Spielberg's new film imagines the moment 8 billion humans find out that we are not alone in the universe. The movie, which opens in US theaters on June 12, is a fictional account of the government cover-up and subsequent "disclosure" of evidence that aliens have contacted Earth. The UFO community has been chasing that type of cinematic big reveal for 80 years. But monumental scientific discoveries, like the detection of the Higgs boson in 2012 and the confirmation of gravitational waves in 2016, are probably a better guide to how real-world disclosure will play out: through long-running research and with verifiable results.


The AI PC era has a benchmarking problem

PCWorld

PCWorld highlights how AI-focused hardware like Nvidia's RTX Spark creates challenges for traditional PC benchmarking methods that may no longer adequately assess performance. Current benchmarks struggle to evaluate devices designed for hybrid computing, where workloads split between local hardware and cloud services. The industry needs new benchmarking approaches that answer whether AI PCs are right for individual users' specific needs.


Google sues Chinese scammers using Gemini AI for fraud

Engadget

The company is also promoting legislation to fight the potential of AI to create 'massive' scams. Google sued a Chinese cybercrime network for using its Gemini AI to perpetrate a massive scam operation, the company announced. The search giant has coordinated with the FBI, along with carriers AT&T, T-Mobile and Verizon, to dismantle the operation. Google is also advocating for updated laws to deal with AI-driven attacks, saying the technology has the potential to supercharge threats. "This is our first coordinated effort and lawsuit and that speaks to the breadth of impact that this particular scam has," Google's general counsel DeLaine Prado told The New York Times in an interview.


Adversarial generalization of unfolding (model-based) networks

Neural Information Processing Systems

Unfolding networks are interpretable networks that emerge from iterative algorithms, incorporate prior knowledge of data structure, and are designed to solve inverse problems like compressed sensing, which deals with recovering data from noisy, missing observations. Compressed sensing finds applications in critical domains, from medical imaging to cryptography, where adversarial robustness is crucial to prevent catastrophic failures. However, a solid theoretical understanding of the performance of unfolding networks in the presence of adversarial attacks is still lacking. In this paper, we study the adversarial generalization of unfolding networks when perturbed with $l_2$-norm constrained attacks, generated by the fast gradient sign method. In particular, we choose a family of state-of-the-art overparameterized unfolding networks and deploy a new framework to estimate their adversarial Rademacher complexity. Given this estimate, we provide adversarial generalization error bounds for the networks under study, which are tight with respect to the attack level. To our knowledge, this is the first theoretical analysis of the adversarial generalization of unfolding networks. We further present a series of experiments on real-world data, with results that consistently corroborate our derived theory across all datasets. Finally, we observe that the family's overparameterization can be exploited to promote adversarial robustness, shedding light on how to efficiently robustify neural networks.
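For orientation, the fast gradient sign method in its common textbook form, together with a gradient-normalized variant that satisfies an $l_2$-norm constraint, can be written as below. The notation is assumed here and does not come from the abstract: $h$ is the network, $\mathcal{L}$ the loss, $(x, y)$ an input and its target, and $\epsilon$ the attack level. The paper's exact attack may differ from this sketch.

$$
x^{\mathrm{adv}}_{\infty} = x + \epsilon \, \mathrm{sign}\big(\nabla_x \mathcal{L}(h(x), y)\big),
\qquad
x^{\mathrm{adv}}_{2} = x + \epsilon \, \frac{\nabla_x \mathcal{L}(h(x), y)}{\big\| \nabla_x \mathcal{L}(h(x), y) \big\|_2}
$$

In the second form the perturbation has $l_2$ norm exactly $\epsilon$ whenever the gradient is nonzero, which is one way an attack level can enter a generalization bound.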


3D-Agent: A Tri-Modal Multi-Agent Responsive Framework for Comprehensive 3D Object Annotation

Neural Information Processing Systems

Driven by applications in autonomous driving, robotics, and augmented reality, 3D object annotation is a critical task that, compared to 2D annotation, poses additional challenges such as spatial complexity, occlusion, and viewpoint inconsistency.


Sheeran Loopers Looper X Review: Create Your One-Person Tour

WIRED

Musician Ed Sheeran created his own line of loopers so anyone can record and layer riffs in a loop to become a one-man band. Routing options are robust and intuitive. Pedals are tactile and easy to press. "Mode" switch customization is clutch. Touchscreen makes setup a breeze.