
Understanding and Exploring the Network with Stochastic Architectures

Neural Information Processing Systems

There is an emerging trend to train a network with stochastic architectures so that various architectures can be plugged in and played during inference. However, existing investigations are highly entangled with neural architecture search (NAS), limiting their widespread use across scenarios. In this work, we decouple the training of a network with stochastic architectures (NSA) from NAS and provide the first systematic investigation of it as a stand-alone problem. We first uncover the characteristics of NSA in various aspects, ranging from training stability, convergence, and predictive behaviour to generalization capacity on unseen architectures. We identify various issues of the vanilla NSA, such as training/test disparity and function mode collapse, and propose solutions to these issues with theoretical and empirical insights. We believe these results can also serve as good heuristics for NAS. Given these understandings, we further apply NSA with our improvements to diverse scenarios to fully exploit its promise of inference-time architecture stochasticity, including model ensembling, uncertainty estimation, and semi-supervised learning. Remarkable performance (e.g., a 2.75% error rate and 0.0032 expected calibration error on CIFAR-10) validates the effectiveness of such a model, providing new perspectives on exploring the potential of networks with stochastic architectures beyond NAS.
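The abstract's core idea — one set of weights trained under a distribution of architectures, so that different sub-architectures can be sampled at inference time for ensembling or uncertainty estimation — can be sketched in a toy form. Everything below (the residual-block structure, the Bernoulli mask, the `keep_prob` value) is an illustrative assumption, not the paper's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class StochasticArchNet:
    """Toy network with stochastic architectures: each forward pass samples
    which residual blocks are active, so training optimizes one weight set
    under a distribution of architectures (hypothetical construction)."""

    def __init__(self, dim=8, n_blocks=4, keep_prob=0.5):
        self.weights = [rng.standard_normal((dim, dim)) * 0.1
                        for _ in range(n_blocks)]
        self.keep_prob = keep_prob

    def forward(self, x, mask=None):
        # `mask` encodes one sampled architecture; sampling a fresh mask per
        # training step is what makes the architecture stochastic.
        if mask is None:
            mask = rng.random(len(self.weights)) < self.keep_prob
        for w, keep in zip(self.weights, mask):
            if keep:
                x = x + relu(x @ w)  # this block is "plugged in"
        return x, mask

net = StochasticArchNet()
x = rng.standard_normal(8)
y1, m1 = net.forward(x)                                # one sampled architecture
y2, _ = net.forward(x, mask=np.ones(4, dtype=bool))    # the full architecture
# At inference, distinct masks act as distinct ensemble members, which is the
# property the paper exploits for ensembling and uncertainty estimation.
```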



Optimizing Native Sparse Attention with Latent Attention and Local Global Alternating Strategies

Hu, Yuxuan, Tan, Jianchao, Zhang, Jiaqi, Zan, Wen, Sun, Pingwei, Lu, Yifan, Sun, Yerui, Xie, Yuchen, Cai, Xunliang, Zhang, Jing

arXiv.org Artificial Intelligence

In this work, we conduct a systematic analysis of Native Sparse Attention (NSA) and propose targeted improvements that enhance long-context modeling. A key insight is that alternating between local (sliding-window) and global (compression, selective) attention across layers, rather than using fixed patterns, enables more effective propagation of long-range dependencies and substantially boosts performance on long-sequence tasks. Meanwhile, we further refine NSA's branches with Latent Attention: the sliding-window branch is enhanced with Multi-head Latent Attention (MLA), while the compression and selective branches adopt Group-head Latent Attention (GLA). These changes reduce KV-cache memory by 50% versus NSA while improving the model's common-sense reasoning and long-text understanding capabilities. Experiments on models from 340M to 1.3B parameters (trained on 15B and 100B tokens) show our method matches or exceeds full attention and native sparse attention in both common-sense reasoning and long-context understanding tasks.
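The local/global alternation the abstract describes can be illustrated with a minimal sketch: a causal sliding-window mask for the local branch, and a layer schedule that interleaves the two attention styles. The even/odd alternation and window size below are hypothetical placeholders; the paper's actual schedule and branch internals may differ:

```python
import numpy as np

def sliding_window_mask(n, window):
    """Boolean mask for the local (sliding-window) branch: each query token
    attends causally to itself and the previous `window - 1` tokens."""
    i = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    return (j <= i) & (i - j < window)

def layer_schedule(n_layers):
    """Hypothetical alternating schedule: instead of running every branch in
    every layer (fixed pattern), interleave local and global layers so
    long-range information can propagate through the global layers."""
    return ["local" if layer % 2 == 0 else "global"
            for layer in range(n_layers)]
```

In a fixed-pattern NSA layer all three branches run side by side; the alternating variant dedicates whole layers to one style, which is what the authors credit for better long-range dependency propagation.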


FSA: An Alternative Efficient Implementation of Native Sparse Attention Kernel

Yan, Ran, Jiang, Youhe, Chen, Zhuoming, Mai, Haohui, Chen, Beidi, Yuan, Binhang

arXiv.org Artificial Intelligence

Recent advances in sparse attention mechanisms have demonstrated strong potential for reducing the computational cost of long-context training and inference in large language models (LLMs). Native Sparse Attention (NSA), a state-of-the-art approach, introduces natively trainable, hardware-aligned sparse attention that delivers a substantial system-level performance boost while maintaining accuracy comparable to full attention. However, the kernel implementation of NSA forces a loop order that is only efficient with a relatively large number of query heads in each Grouped Query Attention (GQA) group, whereas existing LLMs widely adopt a much smaller number of query heads per GQA group -- an inconsistency that significantly limits the applicability of this sparse algorithmic advance. In this work, we propose Flash Sparse Attention (FSA), an alternative kernel implementation that enables efficient NSA computation across a wide range of popular LLMs with varied, smaller numbers of query heads per GQA group on modern GPUs. Compared to the vanilla NSA kernel implementation, our empirical evaluation demonstrates that FSA achieves (i) up to 3.5x and on average 1.6x kernel-level latency reduction, (ii) up to 1.25x and on average 1.09x end-to-end training speedup on state-of-the-art LLMs, and (iii) up to 1.36x and on average 1.11x prefill-phase speedup in LLM generative inference. Github Repo at https://github.com/Relaxed-System-Lab/Flash-Sparse-Attention.
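The GQA-group constraint the abstract mentions comes from NSA's block selection being shared within a group: all query heads in a group attend to the same selected KV blocks, so each block is loaded once per group and its cost is amortized over the group's heads. A toy sketch of that shared selection (the joint-scoring rule and shapes are illustrative, not the kernel's actual logic):

```python
import numpy as np

def select_blocks_per_group(scores, top_k):
    """NSA-style shared block selection.

    scores: (n_groups, heads_per_group, n_blocks) per-head importance scores.
    Heads within a GQA group pool their scores and share one top-k block
    list, so a kernel can load each selected KV block once per group.
    """
    group_scores = scores.sum(axis=1)  # pool over the heads in each group
    return np.argsort(-group_scores, axis=-1)[:, :top_k]

# With many heads per group, one block load serves all of them; with only
# 1-2 heads per group (common in recent LLMs), the vanilla loop order
# amortizes poorly -- the inefficiency FSA's alternative kernel targets.
```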


InfLLM-V2: Dense-Sparse Switchable Attention for Seamless Short-to-Long Adaptation

Zhao, Weilin, Zhou, Zihan, Su, Zhou, Xiao, Chaojun, Li, Yuxuan, Li, Yanghao, Zhang, Yudi, Zhao, Weilun, Li, Zhen, Huang, Yuxiang, Sun, Ao, Han, Xu, Liu, Zhiyuan

arXiv.org Artificial Intelligence

Long-sequence processing is a critical capability for modern large language models. However, the self-attention mechanism in the standard Transformer architecture faces severe computational and memory bottlenecks when processing long sequences. While trainable sparse attention methods offer a promising solution, existing approaches such as NSA introduce excessive extra parameters and disrupt the conventional "pretrain-on-short, finetune-on-long" workflow, resulting in slow convergence and difficulty in acceleration. To overcome these limitations, we introduce a dense-sparse switchable attention framework, termed InfLLM-V2. InfLLM-V2 is a trainable sparse attention mechanism that seamlessly adapts models from short to long sequences. Specifically, InfLLM-V2 reuses dense attention parameters through a parameter-free architecture modification, maintaining consistency between short- and long-sequence processing. Additionally, InfLLM-V2 ensures computational efficiency across all sequence lengths by using dense attention for short inputs and smoothly transitioning to sparse attention for long sequences. To achieve practical acceleration, we further introduce an efficient implementation of InfLLM-V2 that significantly reduces the computational overhead. Our experiments on long-context understanding and chain-of-thought reasoning demonstrate that InfLLM-V2 is 4x faster than dense attention while retaining 98.1% and 99.7% of the performance, respectively. Based on the InfLLM-V2 framework, we have trained and open-sourced MiniCPM4.1 (https://huggingface.co/openbmb/MiniCPM4.1-8B), a hybrid reasoning model, providing a reproducible implementation for the research community.
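The dense-sparse switching idea — the same projections serve both modes, and only the set of attendable keys changes with sequence length — can be sketched minimally. The switch threshold and the toy local-window sparse pattern below are assumptions for illustration; InfLLM-V2's real sparse pattern and dispatch rule are more elaborate:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(q, k, v, allowed):
    # One attention routine for both modes: `allowed` (a boolean mask) is
    # the only thing that differs, so all learned parameters are shared.
    s = q @ k.T / np.sqrt(q.shape[-1])
    s = np.where(allowed, s, -np.inf)
    return softmax(s) @ v

def switchable_attention(q, k, v, switch_len=8):
    """Hypothetical dispatch: dense causal attention for short inputs,
    a sparse (here: toy sliding-window) pattern for long ones."""
    n = q.shape[0]
    if n <= switch_len:                              # short input: dense
        allowed = np.tril(np.ones((n, n), dtype=bool))
    else:                                            # long input: sparse
        i = np.arange(n)[:, None]
        j = np.arange(n)[None, :]
        allowed = (j <= i) & (i - j < switch_len)
    return attend(q, k, v, allowed)
```

Because the switch is a mask choice rather than an architecture change, a model pretrained on short sequences keeps its dense behaviour exactly, which is the "seamless short-to-long adaptation" the abstract emphasizes.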


Cindy Cohn Is Leaving the EFF, but Not the Fight for Digital Rights

WIRED

After 25 years at the Electronic Frontier Foundation, Cindy Cohn is stepping down as executive director. In a WIRED interview, she reflects on encryption, AI, and why she's not ready to quit the battle. After a quarter century defending digital rights, Cindy Cohn announced on Tuesday that she is stepping down as executive director of the Electronic Frontier Foundation. Cohn, who has led the San Francisco-based nonprofit since 2015, says she will leave the role later this year, concluding a chapter that helped define the modern fight over online freedom. Cohn first rose to prominence as lead counsel in the 1990s case that overturned federal restrictions on publishing encryption code. As EFF's legal director and later executive director, she guided the group through legal challenges to government surveillance, reforms to computer crime laws, and efforts to hold corporations accountable for data collection. Over the past decade, EFF has expanded its influence, becoming a central force in shaping the debate over privacy, security, and digital freedom. In an interview with WIRED, Cohn reflected on EFF's foundational encryption victories, its unfinished battles against National Security Agency (NSA) surveillance, and the organization's work protecting independent security researchers.



Natively Trainable Sparse Attention for Hierarchical Point Cloud Datasets

Lapautre, Nicolas, Marchenko, Maria, Patiño, Carlos Miguel, Zhou, Xin

arXiv.org Artificial Intelligence

Unlocking the potential of transformers on datasets of large physical systems depends on overcoming the quadratic scaling of the attention mechanism. This work explores combining the Erwin architecture with the Native Sparse Attention (NSA) mechanism to improve the efficiency and receptive field of transformer models for large-scale physical systems, addressing the challenge of quadratic attention complexity. We adapt the NSA mechanism for non-sequential data, implement the Erwin NSA model, and evaluate it on three datasets from the physical sciences -- cosmology simulations, molecular dynamics, and air pressure modeling -- achieving performance that matches or exceeds that of the original Erwin model. Additionally, we reproduce the experimental results from the Erwin paper to validate our implementation.
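Adapting NSA's block-based branches to non-sequential data requires imposing some ordering and grouping on the points before blocks can be formed. A toy analogue of the compression branch under that adaptation (the single-axis sort stands in for Erwin's ball-tree hierarchy; block size and pooling rule are hypothetical):

```python
import numpy as np

def compress_point_blocks(points, feats, block_size=4):
    """Toy analogue of an NSA compression branch on a point cloud:
    order points spatially (a stand-in for a hierarchical/ball-tree
    ordering), group them into fixed-size blocks, and mean-pool each
    block into one coarse token for a global attention branch."""
    order = np.argsort(points[:, 0])           # hypothetical spatial ordering
    feats = feats[order]
    n_blocks = len(feats) // block_size
    blocks = feats[: n_blocks * block_size].reshape(n_blocks, block_size, -1)
    return blocks.mean(axis=1)                 # one coarse token per block
```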




Magnetic Resonance Spectroscopy Quantification Aided by Deep Estimations of Imperfection Factors and Macromolecular Signal

Chen, Dicheng, Lin, Meijin, Liu, Huiting, Li, Jiayu, Zhou, Yirong, Kang, Taishan, Lin, Liangjie, Wu, Zhigang, Wang, Jiazheng, Li, Jing, Lin, Jianzhong, Chen, Xi, Guo, Di, Qu, Xiaobo

arXiv.org Artificial Intelligence

Objective: Magnetic Resonance Spectroscopy (MRS) is an important technique for biomedical detection. However, it is challenging to accurately quantify metabolites with proton MRS due to serious overlaps of metabolite signals, imperfections arising from non-ideal acquisition conditions, and interference from strong background signals, mainly from macromolecules. The most popular method, LCModel, adopts complicated non-linear least squares to quantify metabolites and addresses these problems by designing empirical priors such as basis-sets and imperfection factors. However, when the signal-to-noise ratio of the MRS signal is low, the solution may deviate substantially. Methods: Linear Least Squares (LLS) is integrated with deep learning to reduce the complexity of the overall quantification. First, a neural network is designed to explicitly predict the imperfection factors and the overall signal from macromolecules. Then, metabolite quantification is solved analytically with the introduced LLS. In our Quantification Network (QNet), LLS takes part in the backpropagation of network training, which allows the quantification error to feed back into metabolite spectrum estimation. This scheme greatly improves generalization to metabolite concentrations unseen during training compared to end-to-end deep learning methods. Results: Experiments show that, compared with LCModel, the proposed QNet has smaller quantification errors on simulated data and presents more stable quantification on 20 in vivo datasets from healthy subjects across a wide range of signal-to-noise ratios. QNet also outperforms other end-to-end deep learning methods. Conclusion: This study provides an intelligent, reliable, and robust method for MRS quantification. Significance: QNet is the first LLS quantification aided by deep learning.
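The analytical LLS step the abstract describes has a simple closed form: once a network has supplied the (imperfection-corrected) basis spectra and the macromolecular background, metabolite concentrations follow from an ordinary least-squares solve. A toy version with synthetic data (the basis, background, and concentrations below are fabricated purely for illustration, not MRS data):

```python
import numpy as np

# Hypothetical setup: B holds basis spectra the network has already
# corrected for imperfection factors, m is the predicted macromolecular
# background, and y is the observed spectrum.
rng = np.random.default_rng(0)
n_points, n_metab = 64, 3
B = rng.standard_normal((n_points, n_metab))   # basis spectra (toy)
c_true = np.array([1.0, 0.5, 2.0])             # true concentrations (toy)
m = 0.1 * rng.standard_normal(n_points)        # macromolecular background
y = B @ c_true + m                             # observed spectrum

# Analytical LLS step: concentrations minimizing ||B c - (y - m)||^2.
# Because lstsq is differentiable in B and m, the quantification error can
# backpropagate into the network that predicts them, as QNet does.
c_hat, *_ = np.linalg.lstsq(B, y - m, rcond=None)
```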