AITopics | Technology

Collaborating Authors

Technology

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

News Overviews Instructional Materials AI-Alerts Classics

Hallo Brötchen! Berlin Zoo welcomes baby pygmy hippo

More information Adding us as a Preferred Source in Google by using this link indicates that you would like to see more of our content in Google News results. Pygmy hippos like Brötchen are endangered in the wild. Breakthroughs, discoveries, and DIY tips sent six days a week. By signing up, you confirm you are 16+, will receive newsletters and promotional content and agree to our Terms of Use and acknowledge the data practices in our Privacy Policy . The pygmy hippopotamus () was born on May 9th at Germany's Zoo Berlin, weighing in at 13 pounds.

artificial intelligence, physics popular science video space, tchen, (7 more...)

Popular Science

Country: North America > United States (0.98)

Industry:

Leisure & Entertainment (0.98)
Law (0.67)
Information Technology > Security & Privacy (0.37)

Technology: Information Technology > Artificial Intelligence (0.36)

Add feedback

NSNQuant: ADouble Normalization Approach for Calibration-Free Low-Bit Vector Quantization of KV Cache

Neural Information Processing SystemsJun-16-2026, 15:55:42 GMT

Large Language Model (LLM) inference is typically memory-intensive, especially when processing large batch sizes and long sequences, due to the large size of key-value (KV) cache. Vector Quantization (VQ) is recently adopted to alleviate this issue, but we find that the existing approach is susceptible to distribution shift due to its reliance on calibration datasets. To address this limitation, we introduce NSNQuant, a calibration-free Vector Quantization (VQ) technique designed for low-bit compression of the KV cache. By applying a three-step transformation--1) a token-wise normalization (Normalize), 2) a channel-wise centering (Shift), and 3) a second token-wise normalization (Normalize)--with Hadamard transform, NSNQuant effectively aligns the token distribution with the standard normal distribution. This alignment enables robust, calibration-free vector quantization using a single reusable codebook. Extensive experiments show that NSNQuant consistently outperforms prior methods in both 1-bit and 2-bit settings, offering strong generalization and up to 3 throughput gain over full-precision baselines.

large language model, machine learning, quantization, (20 more...)

Neural Information Processing Systems

Country: Asia (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Add feedback

Channel Simulation and Distributed Compression with Ensemble Rejection Sampling

Neural Information Processing SystemsJun-16-2026, 15:54:41 GMT

We study channel simulation and distributed matching, two fundamental problems with several applications to machine learning, using a recently introduced generalization of the standard rejection sampling (RS) algorithm known as Ensemble Rejection Sampling (ERS). For channel simulation, we propose a new coding scheme based on ERS that achieves a near-optimal coding rate. In this process, we demonstrate that standard RS can also achieve a near-optimal coding rate and generalize the result of Braverman and Garg (2014) to the continuous alphabet setting. Next, as our main contribution, we present a distributed matching lemma for ERS, which serves as the rejection sampling counterpart to the Poisson Matching Lemma (PML) introduced by Li and Anantharam (2021). Our result also generalizes a recent work on importance matching lemma (Phan et al, 2024) and, to our knowledge, is the first result on distributed matching in the family of rejection sampling schemes where the matching probability is close to PML. We demonstrate the practical significance of our approach over prior works by applying it to distributed compression. The effectiveness of our proposed scheme is validated through experiments involving synthetic Gaussian sources and distributed image compression using the MNIST dataset.

artificial intelligence, machine learning, pby, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario (0.27)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.27)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Gradient Multi-Normalization for Efficient LLMTraining

Neural Information Processing SystemsJun-16-2026, 15:53:23 GMT

Training large language models (LLMs) commonly relies on adaptive optimizers such as Adam (Kingma & Ba, 2015), which accelerate convergence through moment estimates but incur substantial memory overhead. Recent stateless approaches such as SWAN (Ma et al., 2024) have shown that appropriate preprocessing of instantaneous gradient matrices can match the performance of adaptive methods without storing optimizer states. Building on this insight, we introduce gradient multi-normalization, a principled framework for designing stateless optimizers that normalize gradients with respect to multiple norms simultaneously. Whereas standard first-order methods can be viewed as gradient normalization under a single norm (Bernstein & Newhouse, 2024), our formulation generalizes this perspective to a multi-norm setting. We derive an efficient alternating scheme that enforces these normalization constraints and show that our procedure can produce, up to an arbitrary precision, a fixed-point of the problem. This unifies and extends prior stateless optimizers, showing that SWAN arises as a specific instance with particular norm choices. Leveraging this principle, we develop SinkGD, a lightweight matrix optimizer that retains the memory footprint of SGD (w/o momentum) while substantially reducing computation relative to whitening-based methods. On the memory-efficient LLaMA training benchmark (Zhao et al., 2024a), SinkGD achieves state-of-the-art performance, reaching the same evaluation perplexity as Adam using only 40% of the training tokens.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Information Technology (0.46)
Law (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback

On the Hardness of Approximating Distributions with Tractable Probabilistic Models

Neural Information Processing SystemsJun-16-2026, 15:53:05 GMT

A fundamental challenge in probabilistic modeling is to balance expressivity and inference efficiency. Tractable probabilistic models (TPMs) aim to directly address this tradeoff by imposing constraints that guarantee efficient inference of certain queries while maintaining expressivity. In particular, probabilistic circuits (PCs) provide a unifying framework for many TPMs, by characterizing families of models as circuits satisfying different structural properties. Because the complexity of inference on PCs is a function of the circuit size, understanding the size requirements of different families of PCs is fundamental in mapping the trade-off between tractability and expressive efficiency. However, the study of expressive efficiency of circuits are often concerned with exact representations, which may not align with model learning, where we look to approximate the underlying data distribution closely by some distance measure.

approximation, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol

Neural Information Processing SystemsJun-16-2026, 15:48:03 GMT

Holdout validation and hyperparameter tuning from data is a long-standing problem in offline reinforcement learning (RL). A standard framework is to use off-policy evaluation (OPE) methods to evaluate and select between different policies, but OPE methods either incur exponential variance (e.g., importance sampling) or have hyperparameters of their own (e.g., FQE and model-based). We focus on model selection for OPE itself, which is even more under-investigated. Concretely, we select among candidate value functions ("model-free") or dynamics models ("model-based") to best assess the performance of a target policy. We develop: (1) new model-free and model-based selectors with theoretical guarantees, and (2) a new experimental protocol for empirically evaluating them. Compared to the model-free protocol in prior works, our new protocol allows for more stable generation and better control of candidate value functions in an optimizationfree manner, and evaluation of model-free and model-based methods alike. We exemplify the protocol on Gym-Hopper, and find that our new model-free selector, LSTD-Tournament, demonstrates promising empirical performance.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Towards Compositional Model Editing

Neural Information Processing SystemsJun-16-2026, 15:47:07 GMT

Model editing has become a de-facto practice to address hallucinations and outdated knowledge of large language models (LLMs). However, existing methods are predominantly evaluated in isolation, i.e., one edit at a time, failing to consider a critical scenario of compositional model editing, where multiple edits must be integrated and jointly utilized to answer real-world multifaceted questions. For instance, in medical domains, if one edit informs LLMs that COVID-19 causes "fever" and another that it causes "loss of taste", a qualified compositional editor should enable LLMs to answer the question "What are the symptoms of COVID-19?" with both "fever" and "loss of taste" (and potentially more). In this work, we define and systematically benchmark this compositional model editing (CME) task, identifying three key undesirable issues that existing methods struggle with: knowledge loss, incorrect preceding and knowledge sinking. To overcome these issues, we propose A3E, a novel compositional editor that (1) adaptively combines and adaptively regularizes pre-trained foundation knowledge in LLMs in the stage of edit training and (2) adaptively merges multiple edits to better meet compositional needs in the stage of edit composing. Extensive experiments demonstrate that A3E improves the composability by at least 22.45% without sacrificing the performance of non-compositional model editing.

artificial intelligence, large language model, natural language, (16 more...)

Neural Information Processing Systems

Country:

Asia (1.00)
North America > United States > New York (0.93)
Europe > Germany (0.93)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (0.67)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.44)
Health & Medicine > Therapeutic Area > Immunology (0.44)
Health & Medicine > Epidemiology (0.44)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

3d3a9e085540c65dd3e5731361f9320e-Paper-Conference.pdf

Neural Information Processing SystemsJun-16-2026, 15:46:50 GMT

Instruction fine-tuning (IFT) has emerged as a ubiquitous strategy for specializing large language models (LLMs), yet it implicitly assumes a single, coherent "groundtruth" preference behind all human-written instructions. In practice, annotators differ in the styles, emphases, and granularities they prefer, introducing preference bias that can erode both robustness and generalization. We propose Dynamic Cross-Layer Preference Correction (DCPC), it couples (i) a preference-sensitive similarity estimator that detects mismatched instructional cues, (ii) cross-layer prefix alignment to reconcile semantic representations across transformer layers, and (iii) a lightweight Preference Correction Module (PCM) that dynamically adjusts hidden states to honor the inferred dominant preference. On five Super/GLUE tasks and the ALPACA set--plus six preference-shifted variants--DCPC boosts accuracy/F1-EM by 4.0-6.7 points and gpt-score by +0.7, while cutting inter-seed variance up to 35% on LlaMA-2 13B and Mistral-7B, setting a new state of the art for robust instruction tuning.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Europe > Austria (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Turkey's Erdogan Is Running Out of Tricks

TIME - TechJun-16-2026, 15:45:00 GMT

Follow this section to personalize your feed and get instant alerts. Follow Go to your personalized feed WHY FOLLOW? Smart Alerts: Get notified about major news as it happens. Follow this tag to personalize your feed and get instant alerts. Follow Go to your personalized feed WHY FOLLOW?

artificial intelligence, erdogan, turkey, (10 more...)

TIME - Tech

Country: Asia > Middle East > Republic of Türkiye (1.00)

Industry:

Government > Voting & Elections (1.00)
Government > Regional Government > Asia Government > Middle East Government > Republic of Türkiye Government (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.43)

Add feedback

Median Selection with Noisy and Structural Information

Neural Information Processing SystemsJun-16-2026, 15:37:26 GMT

We study the problem of computing the exact median by leveraging side information to minimize costly, exact comparisons. We analyze this problem in two key settings: (1) using predictions from unreliable "weak" oracles, and (2) exploiting known structural information in the form of a partial order. In the classical setting, we introduce a modified LazySelect algorithm that combines weak comparisons with occasional strong comparisons through majority voting. We show that this hybrid strategy has near-linear running time and can achieve high-probability correctness using only sublinear strong comparisons, even when the weak oracle is only slightly better than random guessing. Our theoretical results hold under the persistent comparison model, where resampling will not amplify the probability of correctness. In the partially ordered setting, we generalize the notion of median to directed acyclic graphs (DAGs) and show that the complexity of median selection depends heavily on the DAG's width. We complement our analysis with extensive experiments on synthetic data.

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: Asia > South Korea (0.14)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Software (0.68)

Add feedback