AITopics | Large Language Model

ProxySPEX: Inference-Efficient Interpretability via Sparse Feature Interactions in LLMs

Neural Information Processing Systems | Jun-29-2026, 03:08:44 GMT

Large Language Models (LLMs) have achieved remarkable performance by capturing complex interactions between input features. To identify these interactions, most existing approaches require enumerating all possible combinations of features up to a given order, causing them to scale poorly with the number of inputs $n$. Recently, Kang et al. (2025) proposed SPEX, an information-theoretic approach that uses interaction sparsity to scale to $n \approx 10^3$ features. SPEX greatly improves upon prior methods but requires tens of thousands of model inferences, which can be prohibitive for large models. In this paper, we observe that LLM feature interactions are often *hierarchical* (higher-order interactions are accompanied by their lower-order subsets), which enables more efficient discovery. To exploit this hierarchy, we propose ProxySPEX, an interaction attribution algorithm that first fits gradient boosted trees to masked LLM outputs and then extracts the important interactions. Experiments across four challenging high-dimensional datasets show that ProxySPEX reconstructs LLM outputs 20% more faithfully than marginal attribution approaches while using *$10\times$ fewer inferences* than SPEX. By accounting for interactions, ProxySPEX efficiently identifies the most influential features, providing a scalable approximation of their Shapley values. Further, we apply ProxySPEX to two interpretability tasks.

artificial intelligence, large language model, natural language, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Toward Engineering AGI: Benchmarking the Engineering Design Capabilities of LLMs

Neural Information Processing Systems | Jun-28-2026, 21:25:23 GMT

Modern engineering, spanning electrical, mechanical, aerospace, civil, and computer disciplines, stands as a cornerstone of human civilization and the foundation of our society. However, engineering design poses a fundamentally different challenge for large language models (LLMs) compared with traditional textbook-style problem solving or factual question answering. Although existing benchmarks have driven progress in areas such as language understanding, code synthesis, and scientific problem solving, real-world engineering design demands the synthesis of domain knowledge, navigation of complex trade-offs, and management of the tedious processes that consume much of practicing engineers' time. Despite these shared challenges across engineering disciplines, no benchmark currently captures the unique demands of engineering design work. In this work, we introduce EngDesign, an Engineering Design benchmark that evaluates LLMs' abilities to perform practical design tasks across nine engineering domains. Unlike existing benchmarks that focus on factual recall or question answering, EngDesign uniquely emphasizes LLMs' ability to synthesize domain knowledge, reason under constraints, and generate functional, objective-oriented engineering designs. Each task in EngDesign represents a real-world engineering design problem, accompanied by a detailed task description specifying design goals, constraints, and performance requirements. EngDesign pioneers a simulation-based evaluation paradigm that moves beyond textbook knowledge to assess genuine engineering design capabilities and shifts evaluation from static answer checking to dynamic, simulation-driven functional verification, marking a crucial step toward realizing the vision of engineering Artificial General Intelligence (AGI).

artificial intelligence, large language model, natural language, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

My Boyfriend Said His Family Friend Is Like a "Sister" to Him. Uh, That's Not What His Computer History Says.

Slate | Jun-27-2026, 16:00:00 GMT

How to Do It. I've been with my boyfriend for a year and things have been smooth sailing so far, with very little contention between us two. But about a week ago, he left his laptop open on the bed and I decided to take a look at his ChatGPT history.

large language model, machine learning, slate shop game newsletter sign, (12 more...)

Slate

Industry:

Marketing (1.00)
Information Technology > Security & Privacy (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.73)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.62)

Don't pay $20/month for ChatGPT--1 year of ChatOn gives you GPT, Gemini & Claude for just $30

PCWorld | Jun-27-2026, 08:00:00 GMT

Try the biggest AI models in one place for a year. Get ChatOn Premium for $29.99 (MSRP $39.99) through June 28 and access GPT, Claude, Gemini, Sonar, and more from a single app. Most people don't need another AI subscription. They need one app that does the job.

home robotic performance privacy productivity, large language model, machine learning, (16 more...)

PCWorld

Industry:

Information Technology > Security & Privacy (1.00)
Leisure & Entertainment > Games > Computer Games (0.56)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Apple executive in charge of Vision Pro is reportedly leaving for OpenAI

Engadget | Jun-27-2026, 07:36:58 GMT

Paul Meade will start OpenAI's hardware division, 'Bloomberg' says. Paul Meade, an Apple VP who heads the Vision Products Group, is reportedly leaving the company next week for OpenAI. According to Bloomberg, the top executive in charge of the Vision Pro headset and Apple's smart glasses projects will be starting up the AI company's hardware unit. OpenAI has been developing AI-powered devices with Jony Ive's startup since 2025. While Ive's io merged with OpenAI in a $6.5 billion deal, it remains independent.

large language model, machine learning, natural language, (13 more...)

Engadget

Industry: Leisure & Entertainment > Games > Computer Games (0.77)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Preference Optimization by Estimating the Ratio of the Data Distribution

Neural Information Processing Systems | Jun-27-2026, 07:35:16 GMT

Direct preference optimization (DPO) is widely used as a simple and stable method for aligning large language models (LLMs) with human preferences. This paper investigates a generalized DPO loss that enables a policy model to match the target policy from a likelihood ratio estimation perspective. The ratio of the target policy provides a unique identification of the policy distribution without relying on reward models or partition functions. This allows the generalized loss to retain both simplicity and theoretical guarantees, which prior work such as $f$-PO fails to achieve simultaneously. We propose *Bregman preference optimization* (BPO), a generalized framework for ratio matching that provides a family of objective functions achieving target policy optimality.
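
For orientation, the standard DPO loss that the abstract takes as its starting point can be sketched for a single preference pair as below. This is a minimal illustration of the well-known baseline objective only, not the paper's generalized loss or BPO; the function names, argument names, and the default `beta` value are assumptions made for this example.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    logp_chosen: float,       # log-prob of the preferred response under the policy
    logp_rejected: float,     # log-prob of the dispreferred response under the policy
    ref_logp_chosen: float,   # same quantities under the frozen reference model
    ref_logp_rejected: float,
    beta: float = 0.1,        # assumed temperature; a hyperparameter in practice
) -> float:
    """Standard DPO loss for one (chosen, rejected) pair."""
    # Difference of policy-to-reference log-ratios for the two responses.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -log_sigmoid(beta * margin)
```

When the policy equals the reference model the margin is zero and the loss is log 2; raising the policy's likelihood of the chosen response relative to the rejected one lowers the loss.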

artificial intelligence, large language model, natural language, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.78)

OpenAI launches a limited preview of GPT-5.6 for a 'small group of trusted partners'

Engadget | Jun-27-2026, 04:23:40 GMT

The model's three variants will be available more broadly in the coming weeks. OpenAI has started previewing its GPT-5.6 series, which will be available in three versions, to a limited number of trusted partners. The company says the variant Sol is its strongest model yet, while Terra is for everyday use and performs similarly to GPT-5.5 at half the cost. Luna, the last variant, is the company's lowest-cost model. OpenAI plans to give them a broad release sometime in the coming weeks.

large language model, machine learning, natural language, (14 more...)

Engadget

Country: North America > United States (0.19)

Industry: Leisure & Entertainment > Games > Computer Games (0.72)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

SYMPHONY: Synergistic Multi-agent Planning with Heterogeneous Language Model Assembly

Neural Information Processing Systems | Jun-27-2026, 02:15:46 GMT

Recent advancements have increasingly focused on leveraging large language models (LLMs) to construct autonomous agents for complex problem-solving tasks. However, existing approaches predominantly employ a single-agent framework to generate search branches and estimate rewards during Monte Carlo Tree Search (MCTS) planning. This single-agent paradigm inherently limits exploration capabilities, often resulting in insufficient diversity among generated branches and suboptimal planning performance.

artificial intelligence, large language model, natural language, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)

How People in China Keep Outsmarting Anthropic's Geolocation Restrictions

WIRED | Jun-26-2026, 21:02:54 GMT

As Anthropic tightens restrictions on access to Claude in China, users keep finding new workarounds, from proxy services to fake identities sourced on Telegram. Anthropic goes to great lengths to prevent people in China from using its AI models, but in practice, its safeguards have often failed. Over the past year, startups, researchers, and tech enthusiasts across the country have developed increasingly sophisticated workarounds to access Claude. Many of them consider it the world's most capable AI assistant, making the extra effort to obtain it worthwhile. In early June, Anthropic publicly released Fable 5, a safeguarded version of its most powerful AI model to date, Mythos.

large language model, machine learning, natural language, (18 more...)

WIRED

Country:

Asia > China (1.00)
North America > United States > California (0.29)

Industry:

Retail (1.00)
Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (0.70)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Most prominent AI chatbots have liberal bias, new study finds

FOX News | Jun-26-2026, 19:52:27 GMT

OpenAI's ChatGPT showed left-leaning bias 80% of the time, according to a new study testing AI chatbots including Google Gemini, Anthropic Claude, and Grok for political bias.

large language model, machine learning, natural language, (12 more...)

FOX News

Country: North America > United States (1.00)

Genre: Research Report > New Finding (1.00)

Industry:

Media > News (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
