AITopics | Genre

Stab-SGD: Noise-Adaptivity in Smooth Optimization with Stability Ratios

Neural Information Processing Systems | Jun-15-2026, 10:18:04 GMT

In the context of smooth stochastic optimization with first-order methods, we introduce the stability ratio of gradient estimates as a measure of local relative noise level, ranging from zero for pure noise to one for negligible noise. We show that a schedule-free variant (Stab-SGD) of stochastic gradient descent, obtained by just shrinking the learning rate by the stability ratio, achieves real adaptivity to noise levels (i.e. ...

artificial intelligence, experiment, machine learning, (16 more...)

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

1c2b1c8f7d317719a9ce32dd7386ba35-Paper-Conference.pdf

Neural Information Processing Systems | Jun-15-2026, 10:17:06 GMT

Neural radiance fields (NeRF) and 3D Gaussian Splatting (3DGS) are popular techniques to reconstruct and render photorealistic images. However, the prerequisite [...] completeness.

large language model, machine learning, natural language, (13 more...)

Genre: Research Report > Experimental Study (1.00)

Industry: Media > Photography (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)

TaiwanVQA: Benchmarking and Enhancing Cultural Understanding in Vision-Language Models

Neural Information Processing Systems | Jun-15-2026, 10:16:45 GMT

Vision-language models (VLMs) often struggle with culturally specific content, a challenge largely overlooked by existing benchmarks that focus on dominant languages and globalized datasets. We introduce TaiwanVQA, a VQA benchmark designed for Taiwanese culture to evaluate recognition and reasoning in regional contexts. TaiwanVQA contains 2,736 images and 5,472 manually curated questions covering topics such as traditional foods, public signs, festivals, and landmarks. The official benchmark set includes 1,000 images and 2,000 questions for systematic assessment, with the remainder of the data used as training material. Evaluations on state-of-the-art VLMs reveal strong visual recognition but notable weaknesses in cultural reasoning.

benchmark, large language model, machine learning, (21 more...)

Country: Asia > Taiwan (0.30)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Leisure & Entertainment (1.00)
Information Technology > Security & Privacy (1.00)
Media (0.92)
Law (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

GATEKEEPER: Improving Model Cascades Through Confidence Tuning

Neural Information Processing Systems | Jun-15-2026, 10:08:25 GMT

Large-scale machine learning models deliver strong performance across a wide range of tasks but come with significant computational and resource constraints. To mitigate these challenges, local smaller models are often deployed alongside larger models, relying on routing and deferral mechanisms to offload complex tasks.
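The routing-and-deferral setup described above can be illustrated with a minimal sketch. It shows a generic confidence-threshold cascade only, not GATEKEEPER's confidence-tuning method, which this excerpt does not describe. The model interface, the toy models, and the threshold value are all hypothetical.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical interface: a model maps an input to (label, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


def cascade_predict(x: str, small: Model, large: Model,
                    threshold: float) -> Tuple[str, bool]:
    """Answer with the small model when it is confident, else defer.

    Returns (label, deferred), where deferred is True if the large model
    produced the answer.
    """
    label, confidence = small(x)
    if confidence >= threshold:
        return label, False
    large_label, _ = large(x)
    return large_label, True


def deferral_rate(inputs: Sequence[str], small: Model, large: Model,
                  threshold: float) -> float:
    """Fraction of inputs that were routed to the large model."""
    deferred = [cascade_predict(x, small, large, threshold)[1] for x in inputs]
    return sum(deferred) / len(deferred)


# Toy stand-ins (assumptions for illustration, not real models).
def toy_small(x: str) -> Tuple[str, float]:
    # Always guesses "short", but is confident only on short inputs.
    return ("short", 0.9) if len(x) <= 5 else ("short", 0.4)


def toy_large(x: str) -> Tuple[str, float]:
    return ("short" if len(x) <= 5 else "long", 0.99)


queries = ["hi", "hello", "a long input", "another long one"]
for q in queries:
    print(q, cascade_predict(q, toy_small, toy_large, threshold=0.5))
print("deferral rate:", deferral_rate(queries, toy_small, toy_large, 0.5))
```

Raising the threshold sends more inputs to the large model, which trades compute for accuracy. The usefulness of such a rule depends on how well the small model's confidence tracks its correctness.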

arxiv preprint arxiv, machine learning, natural language, (19 more...)

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.67)

Industry:

Information Technology (0.67)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

ReplaceMe: Network Simplification via Depth Pruning and Transformer Block Linearization

Neural Information Processing Systems | Jun-15-2026, 10:07:24 GMT

We introduce ReplaceMe, a generalized training-free depth pruning method that effectively replaces transformer blocks with a linear operation, while maintaining high performance for low compression ratios. In contrast to conventional pruning approaches that require additional training or fine-tuning, our approach requires only a small calibration dataset that is used to estimate a linear transformation, which approximates the pruned blocks. The estimated linear mapping can be seamlessly merged with the remaining transformer blocks, eliminating the need for any additional network parameters. Our experiments show that ReplaceMe consistently outperforms other training-free approaches and remains highly competitive with state-of-the-art pruning methods that involve extensive retraining/fine-tuning and architectural modifications. Applied to several large language models (LLMs), ReplaceMe achieves up to 25% pruning while retaining approximately 90% of the original model's performance on open benchmarks, without any training or healing steps, resulting in minimal computational overhead.
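The idea of estimating a linear transformation from calibration data can be sketched as an ordinary least-squares fit. This is a simplified illustration under stated assumptions, not the paper's procedure: the least-squares objective, the synthetic activations, and the choice to fold the map into a following weight matrix are all assumed here, and the function names are hypothetical.

```python
import numpy as np


def estimate_linear_replacement(block_inputs: np.ndarray,
                                block_outputs: np.ndarray) -> np.ndarray:
    """Fit T minimizing ||block_inputs @ T - block_outputs||_F.

    block_inputs:  (n_tokens, d) activations entering the pruned blocks.
    block_outputs: (n_tokens, d) activations leaving the pruned blocks.
    """
    T, *_ = np.linalg.lstsq(block_inputs, block_outputs, rcond=None)
    return T


def merge_into_next_layer(T: np.ndarray, next_weight: np.ndarray) -> np.ndarray:
    """Fold T into a following linear layer: (x @ T) @ W == x @ (T @ W)."""
    return T @ next_weight


# Synthetic calibration data (assumption): the pruned blocks are imitated
# by a nearly linear map plus a small amount of noise.
rng = np.random.default_rng(0)
d = 8
calib_inputs = rng.normal(size=(256, d))
true_map = rng.normal(size=(d, d))
calib_outputs = calib_inputs @ true_map + 0.01 * rng.normal(size=(256, d))

T = estimate_linear_replacement(calib_inputs, calib_outputs)

# Merging means no extra parameters are kept: the next layer's weight
# simply absorbs T.
next_weight = rng.normal(size=(d, 4))
merged_weight = merge_into_next_layer(T, next_weight)

residual = np.linalg.norm(calib_inputs @ T - calib_outputs)
print("fit residual:", residual)
print("merged weight shape:", merged_weight.shape)
```

Because the composition of two linear maps is itself linear, the fitted matrix can be absorbed into an adjacent weight, which is consistent with the abstract's claim that no additional parameters are needed.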

large language model, machine learning, natural language, (20 more...)

Country:

North America > United States (1.00)
Europe (1.00)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry:

Education (0.67)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SteerConf: Steering LLMs for Confidence Elicitation

Neural Information Processing Systems | Jun-15-2026, 10:07:14 GMT

Large Language Models (LLMs) exhibit impressive performance across diverse domains but often suffer from overconfidence, limiting their reliability in critical applications. We propose SteerConf, a novel framework that systematically steers LLMs' confidence scores to improve their calibration and reliability. SteerConf introduces three key components: (1) a steering prompt strategy that guides LLMs to produce confidence scores in specified directions (e.g., conservative or optimistic) by leveraging prompts with varying steering levels; (2) a steered confidence consistency measure that quantifies alignment across multiple steered confidences to enhance calibration; and (3) a steered confidence calibration method that aggregates confidence scores using consistency measures and applies linear quantization for answer selection. SteerConf operates without additional training or fine-tuning, making it broadly applicable to existing LLMs. Experiments on seven benchmarks spanning professional knowledge, common sense, ethics, and reasoning tasks, using advanced LLMs (GPT-3.5, LLaMA 3, GPT-4), demonstrate that SteerConf outperforms existing methods, often by a significant margin. Our findings highlight the potential of steering the confidence of LLMs to enhance their reliability for safer deployment in real-world applications.

confidence score, large language model, machine learning, (21 more...)

Country: Asia > China (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (1.00)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Mamba Modulation: On the Length Generalization of Mamba

Neural Information Processing Systems | Jun-15-2026, 10:06:12 GMT

The quadratic complexity of the attention mechanism in Transformer models has motivated the development of alternative architectures with sub-quadratic scaling, such as state-space models. Among these, Mamba has emerged as a leading architecture, achieving state-of-the-art results across a range of language modeling tasks. However, Mamba's performance significantly deteriorates when applied to contexts longer than those seen during pre-training, revealing a sharp sensitivity to context length extension. Through detailed analysis, we attribute this limitation to the out-of-distribution behavior of its state-space dynamics, particularly within the parameterization of the state transition matrix $A$. Unlike recent works which attribute this sensitivity to the vanished accumulation of discretization time steps, $\exp(-\sum_{t=1}^{N} \Delta_t)$, we establish a connection between state convergence behavior as the input length approaches infinity and the spectrum of the transition matrix $A$, offering a well-founded explanation of its role in length extension. Next, to overcome this challenge, we propose an approach that applies spectrum scaling to pre-trained Mamba models to enable robust long-context generalization by selectively modulating the spectrum of $A$ matrices in each layer. We show that this can significantly improve performance in settings where simply modulating $\Delta_t$ fails, validating our insights and providing avenues for better length generalization of state-space models with structured transition matrices.

large language model, machine learning, natural language, (18 more...)

Country:

North America > United States (1.00)
Europe (1.00)
Asia > Middle East > UAE (0.45)
North America > Canada > Quebec (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

1bf3dbbd6346f50627e2ab1795f90435-Paper-Conference.pdf

Neural Information Processing Systems | Jun-15-2026, 10:05:54 GMT

Diffusion Transformers have emerged as the foundation for vision generative models, but their scalability is limited by the high cost of hyperparameter (HP) tuning at large scales. Recently, Maximal Update Parametrization (µP) was proposed for vanilla Transformers, which enables stable HP transfer from small to large language models, and dramatically reduces tuning costs. However, it remains unclear whether µP of vanilla Transformers extends to diffusion Transformers, which differ architecturally and objectively. In this work, we generalize standard µP to diffusion Transformers and validate its effectiveness through large-scale experiments. First, we rigorously prove that µP of mainstream diffusion Transformers, including DiT, U-ViT, PixArt-α, and MMDiT, aligns with that of the vanilla Transformer, enabling the direct application of existing µP methodologies. Leveraging this result, we systematically demonstrate that DiT-µP enjoys robust HP transferability. Notably, DiT-XL-2-µP with transferred learning rate achieves 2.9× faster convergence than the original DiT-XL-2.

large language model, machine learning, natural language, (18 more...)

Country: Asia (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

"Yuppies," "Mutiny," and "How to Start," Reviewed

The New Yorker | Jun-15-2026, 10:00:00 GMT

When Did White-Collar Work Start to Look So Bleak? In the nineteen-eighties, an office job promised security and fulfillment. For graduates starting careers today, the prospect is often tinged with dread. The workplace's sense of control can prove illusory, as it did in the era of yuppie-wrought corporate consolidation, and as it does now for graduates entering an economy destabilized by new uncertainties. This spring, across the nation's auditoriums and quadrangles, members of the class of 2026 took their seats to receive remarks from distinguished guests. The graduation speech is a thankless form: generalized, impersonal exhortation/congratulation is almost guaranteed to be forgettable, if all goes well. But this year, on at least a few American campuses, all did not go well. At the University of Arizona, Eric Schmidt, the former C.E.O. of Google, told the crowd that artificial intelligence "will touch every profession, every classroom, every hospital, every laboratory, every person, and every relationship you have," a sweeping promise that landed like a threat.

artificial intelligence, book review, culture fiction & poetry humor, (11 more...)

Country:

North America > United States > New York (0.30)
North America > United States > Arizona (0.24)

Genre: Summary/Review (0.64)

Industry:

Law (1.00)
Banking & Finance (1.00)
Health & Medicine (0.89)
(3 more...)

Technology: Information Technology > Artificial Intelligence (0.68)

1 in 4 World Cup Matches Could Be Played in Dangerous Temperatures

WIRED | Jun-15-2026, 10:00:00 GMT

A new report warns that Miami, Kansas City, Philadelphia, Dallas, and Houston could be particularly hot places to play during the 2026 World Cup. Extreme heat will be one of the biggest challenges for players and fans during the 2026 FIFA World Cup. According to an analysis by the World Weather Attribution (WWA), around 25 percent of the 104 matches of the tournament could be played under temperatures that exceed the recommended thermal safety limits. The study points out that the probability of facing these conditions is almost double that recorded in the 1994 tournament held in the United States. The projections were developed using a statistical model designed to calculate the probability of each match being played in extremely hot conditions.

artificial intelligence, social media, world cup, (11 more...)

Country: