I Walked More Than Six Hours to the World Cup Stadium

TIME - Tech



Massive Effigy of Elon Musk Raised Over Times Square to Protest Grok

WIRED

Activists raised a 40-foot-tall inflatable Elon Musk in Manhattan to draw attention to the risk he allegedly poses to investors. It was surrounded by black banners with statements alleging "Grok makes AI child porn" and "SpaceX owns Grok," referring to the Musk-owned AI chatbot whose image-generation tool was used to create a flood of sexualized images of minors earlier this year. Masked attendants stood nearby, handing out flyers with additional information, but they would not speak with the media. The demonstration was helmed by Safe AI Now (SAIN), which describes itself as "a coalition of faith leaders, family advocates, child development experts, online safety organizations, educators, legal professionals, technologists, and concerned citizens," ahead of SpaceX's initial public offering on Friday. The location was strategically chosen: right in front of the Nasdaq and the offices of JP Morgan, one of the banks participating in the IPO.


OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics

Neural Information Processing Systems

Robust unlearning is crucial for safely deploying large language models (LLMs) in environments where data privacy, model safety, and regulatory compliance must be ensured. Yet the task is inherently challenging, partly due to difficulties in reliably measuring whether unlearning has truly occurred. Moreover, fragmentation in current methodologies and inconsistent evaluation metrics hinder comparative analysis and reproducibility. To unify and accelerate research efforts, we introduce OpenUnlearning, a standardized and extensible framework designed explicitly for benchmarking both LLM unlearning methods and metrics. OpenUnlearning integrates 13 state-of-the-art unlearning algorithms and 16 diverse evaluations across 3 leading benchmarks (TOFU, MUSE, and WMDP) and also enables analyses of forgetting behaviors across 450+ publicly released checkpoints. Leveraging OpenUnlearning, we propose a novel meta-evaluation benchmark focused specifically on assessing the faithfulness and robustness of evaluation metrics themselves. We also benchmark diverse unlearning methods and provide a comparative analysis against an extensive evaluation suite. Overall, we establish a clear, community-driven pathway toward rigorous development in LLM unlearning research.


Dimensionality Mismatch Between Brains and Artificial Neural Networks

Neural Information Processing Systems

Biological and artificial vision systems both rely on hierarchical architectures, yet it remains unclear how their representational geometry evolves across processing stages, and what functional consequences may arise from potential differences. In this work, we systematically quantify and compare the linear and nonlinear dimensionality of human brain activity (fMRI) and artificial neural networks (ANNs) during natural image viewing. In the human ventral visual stream, both dimensionality measures increase along the visual hierarchy, supporting the emergence of semantic and abstract representations. For linear dimensionality, most ANNs show a similar increase, but only for pooled features, emphasizing the importance of appropriate feature readouts in brain-model comparisons. In contrast, nonlinear dimensionality shows a collapse in the later layers of ANNs, pointing at a mismatch in representational geometry between the human and artificial visual systems. This mismatch may have functional consequences: while high-dimensional brain representations support flexible generalization to abstract features, ANNs appear to lose this capacity in later layers, where their representations become overly compressed. Overall, our findings propose dimensionality alignment as a benchmark for building more flexible and biologically grounded vision models.


Goblin shark filmed in its native habitat for the first time

Popular Science

One of these mysterious sharks was spotted 2,300 feet deeper than scientists expected. The goblin shark was first described in 1898.


Reinforcement Learning for Out-of-Distribution Reasoning in LLMs: An Empirical Study on Diagnosis-Related Group Coding

Neural Information Processing Systems

Diagnosis-Related Group (DRG) codes are essential for hospital reimbursement and operations but require labor-intensive assignment. Large Language Models (LLMs) struggle with DRG coding due to the out-of-distribution (OOD) nature of the task: pretraining corpora rarely contain private clinical or billing data. We introduce DRG-Sapphire, which uses large-scale reinforcement learning (RL) for automated DRG coding from clinical notes. Built on Qwen2.5-7B and trained with Group Relative Policy Optimization (GRPO) using rule-based rewards, DRG-Sapphire introduces a series of RL enhancements to address domain-specific challenges not seen in previous mathematical tasks. Our model achieves state-of-the-art accuracy on the MIMIC-IV benchmark and generates physician-validated reasoning for DRG assignments, significantly enhancing explainability.


A³E: Towards Compositional Model Editing

Neural Information Processing Systems

Model editing has become a de facto practice to address hallucinations and outdated knowledge of large language models (LLMs). However, existing methods are predominantly evaluated in isolation, i.e., one edit at a time, failing to consider a critical scenario of compositional model editing, where multiple edits must be integrated and jointly utilized to answer real-world multifaceted questions. For instance, in medical domains, if one edit informs LLMs that COVID-19 causes fever and another that it causes loss of taste, a qualified compositional editor should enable LLMs to answer the question "What are the symptoms of COVID-19?"


Grok Is Still Hosting Sexualized Deepfakes of Famous Women

WIRED

A WIRED investigation found dozens of "nudified" deepfake images and videos on Grok's website, including nonconsensual depictions of celebrities and at least one prominent US politician. Elon Musk's Grok chatbot is apparently still being used to produce and host nonconsensual explicit images and videos of women, months after Musk's artificial intelligence firm xAI said it would introduce restrictions to stop the creation of potentially harmful sexualized deepfakes. The revelations come as SpaceX, xAI's parent company, prepares to go public on Friday in one of the largest IPOs of all time. The Grok Imagine generative AI system has been used to create and host images and videos depicting celebrities and at least one politician being held against their will by a giant man, portraying women performing sex acts, and allowing full nudity, a WIRED analysis of public creations found. While some of the images and videos are fully AI-generated or in animated styles, others are photorealistic and show plausible real-world scenarios.


The White House UFC Event Is Costing $60 Million

TIME - Tech



Canadian mother sues OpenAI, alleging ChatGPT led her daughter to kill herself

The Guardian

The lawsuit seeks damages and a court order requiring OpenAI to automatically terminate ChatGPT conversations about self-harm. Suit filed in US alleges chatbot told Alice Carrier, 24, 'maybe this is just the end' as she struggled with suicidal thoughts. A Canadian mother sued OpenAI and its CEO, Sam Altman, in US court on Thursday, alleging that ChatGPT encouraged her daughter to kill herself. The lawsuit is the latest in a slew accusing the company of failing to address dangerous conversations between users and the company's chatbot. Kristie Carrier said in a lawsuit filed in San Francisco state court that her daughter, Alice, told ChatGPT about her suicidal ideations more than a dozen times leading up to her death but that OpenAI's safety systems never flagged the conversations for human review or terminated them. "ChatGPT took on the persona of a confidant, a best friend, a therapist at times, even though it was not capable of safely and responsibly engaging in this way with my child," Carrier said in a statement.