AITopics

Neural Information Processing Systems, Jun-10-2026, 03:08:18 GMT

Optimistic Query Routing in Clustering-based Approximate Maximum Inner Product Search

Clustering-based nearest neighbor search algorithms partition points into shards to form an index, and search only a subset of shards to process a query. Even though search efficacy is heavily influenced by the algorithm that identifies the shards to probe, it has received little attention in the literature. We study routing in clustering-based maximum inner product search, which includes cosine similarity search. We unpack existing routers and notice the surprising role of optimism. We then take a page from the sequential decision making literature and formalize that insight following the principle of "optimism in the face of uncertainty." In particular, we present a framework that incorporates the moments of the distribution of inner products within each shard to estimate the maximum inner product. We then develop a practical instance of our algorithm that uses only the first two moments to reach the same accuracy as state-of-the-art routers by probing up to $50\%$ fewer points on benchmark datasets without compromising efficiency. Our algorithm is also space-efficient: we design a sketch of the second moment whose size is independent of the number of points and requires $\mathcal{O}(1)$ vectors per shard.
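
The routing idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each shard stores its full mean vector and covariance matrix (the paper instead describes a compact sketch of the second moment), and the function names and the `delta` optimism parameter are hypothetical. Each shard is scored by the mean inner product with the query plus `delta` standard deviations, so shards with high variance along the query direction are favored.

```python
import numpy as np

def shard_moments(points):
    """Return the mean vector and covariance matrix of one shard's points."""
    mean = points.mean(axis=0)
    centered = points - mean
    cov = centered.T @ centered / len(points)
    return mean, cov

def optimistic_scores(query, means, covs, delta=1.0):
    """Score each shard as mean + delta * std of the inner product <query, x>.

    `delta` is a hypothetical knob: delta=0 reduces to routing by shard mean.
    """
    mu = means @ query                                   # first moment per shard
    var = np.einsum("i,sij,j->s", query, covs, query)    # variance per shard
    return mu + delta * np.sqrt(np.maximum(var, 0.0))

def route(query, means, covs, n_probe, delta=1.0):
    """Return the indices of the n_probe shards with the highest scores."""
    scores = optimistic_scores(query, means, covs, delta)
    return np.argsort(-scores)[:n_probe]
```

With `delta=0` the router ranks shards by their mean inner product alone; a positive `delta` lets a shard with a lower mean but a wide spread along the query outrank a tight shard, which is the sense of optimism assumed here.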

artificial intelligence, natural language, proceedings, (7 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.97)
Information Technology > Information Management > Search (0.63)
Information Technology > Artificial Intelligence > Natural Language (0.60)

Suhas Jayaram Subramanya, Fnu Devvrit, Harsha Vardhan Simhadri, Ravishankar Krishnaswamy, Rohan Kadekodi

Rand-NSG: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node

Neural Information Processing Systems, Feb-11-2026, 10:06:17 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, dataset, graph, (16 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > India (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.65)

Neural Information Processing Systems, Feb-9-2026, 17:36:08 GMT

87f7ee4fdb57bdfd52179947211b7ebb-Supplemental.pdf

artificial intelligence, machine learning, probability, (18 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Feb-9-2026, 17:36:04 GMT

Adaptive Machine Unlearning

However, for sequences of deletions, most prior work in the non-convex setting gives valid guarantees only for sequences that are chosen independently of the models that are published. If people choose to delete their data as a function of the published models (because they don't like what the models reveal about them, for example), then the update sequence is adaptive.

algorithm, artificial intelligence, machine learning, (17 more...)

Country: North America > United States > Colorado > Denver County > Denver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Feb-9-2026, 07:46:54 GMT

UltraRE: Enhancing RecEraser for Recommendation Unlearning via Error Decomposition

As the state-of-the-art framework, i.e., RecEraser, naturally achieves full unlearning completeness,

data mining, machine learning, natural language, (19 more...)

Country:

Asia > China > Zhejiang Province > Hangzhou (0.05)
Asia > China > Guangdong Province > Shenzhen (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.68)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Greco, Davide, Rawlik, Konrad

Same model, better performance: the impact of shuffling on DNA Language Models benchmarking

arXiv.org Artificial Intelligence, Dec-12-2025

Large Language Models are increasingly popular in genomics due to their potential to decode complex biological sequences. Hence, researchers require a standardized benchmark to evaluate the capabilities of DNA Language Models (DNA LMs). However, evaluating DNA LMs is a complex task that intersects genomics' domain-specific challenges and machine learning methodologies, where seemingly minor implementation details can significantly compromise benchmark validity. We demonstrate this through BEND (Benchmarking DNA Language Models), where hardware-dependent hyperparameters (the number of data loading workers and buffer sizes) create spurious performance variations of up to 4% for identical models. The problem stems from inadequate data shuffling interacting with domain-specific data characteristics. Experiments with three DNA language models (HyenaDNA, DNABERT-2, ResNet-LM) show these artifacts affect both absolute performance and relative model rankings. We propose a simple solution: pre-shuffling data before storage eliminates hardware dependencies while maintaining efficiency. This work highlights how standard ML practices can interact unexpectedly with domain-specific data characteristics, with broader implications for benchmark design in specialized domains.
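
The shuffling issue described above can be illustrated with a small sketch. It is not taken from BEND: it assumes a generic buffer-based streaming shuffle of the kind many data loaders use, and the function names are hypothetical. With a small buffer, records stored in sorted order (for example, grouped by genomic region) stay close to that order, so the result depends on the buffer size; shuffling once with a fixed seed before storage removes that dependence.

```python
import random

def pre_shuffle(records, seed=0):
    """Return the records in a fixed, seeded random order (done once, before storage)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return shuffled

def buffered_shuffle(stream, buffer_size, seed=0):
    """Approximate streaming shuffle: emit a random item once the buffer is full."""
    rng = random.Random(seed)
    buffer = []
    for item in stream:
        buffer.append(item)
        if len(buffer) >= buffer_size:
            yield buffer.pop(rng.randrange(len(buffer)))
    # Drain whatever is left in random order.
    rng.shuffle(buffer)
    yield from buffer
```

In this sketch a buffer of size 1 returns the stream in its stored order, which shows how a hardware-dependent buffer setting can change what the model sees unless the stored order is already random.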

large language model, machine learning, natural language, (18 more...)

2510.12617

Genre: Research Report (0.40)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.60)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.34)

arXiv.org Artificial Intelligence, Dec-5-2025

Training Foundation Models on a Full-Stack AMD Platform: Compute, Networking, and System Design

Anthony, Quentin, Tokpanov, Yury, Szot, Skyler, Rajagopal, Srivatsan, Medepalli, Praneeth, Golubeva, Anna, Shyam, Vasu, Washbourne, Robert, Iyer, Rishi, Chaurasia, Ansh, Figliolia, Tomas, Yang, Xiao, Sarje, Abhinav, Thorstensen, Drew, Pearson, Amartey, Grossbart, Zack, van Patten, Jason, Barsoum, Emad, Gu, Zhenyu, Fu, Yao, Millidge, Beren

We report on the first large-scale mixture-of-experts (MoE) pretraining study on pure AMD hardware, utilizing both MI300X GPUs and Pollara networking. We distill practical guidance for both systems and model design. On the systems side, we deliver a comprehensive cluster and networking characterization: microbenchmarks for all core collectives (all-reduce, reduce-scatter, all-gather, broadcast) across message sizes and GPU counts over Pollara. To our knowledge, this is the first at this scale. We further provide MI300X microbenchmarks on kernel sizing and memory bandwidth to inform model design. On the modeling side, we introduce and apply MI300X-aware transformer sizing rules for attention and MLP blocks and justify MoE widths that jointly optimize training throughput and inference latency. We describe our training stack in depth, including often-ignored utilities such as fault-tolerance and checkpoint-reshaping, as well as detailed information on our training recipe. We also provide a preview of our model architecture and base model - ZAYA1 (760M active, 8.3B total parameters MoE, available at https://huggingface.co/Zyphra/ZAYA1-base) - which will be further improved upon in forthcoming papers. ZAYA1-base achieves performance comparable to leading base models such as Qwen3-4B and Gemma3-12B at its scale and larger, and outperforms models including Llama-3-8B and OLMoE across reasoning, mathematics, and coding benchmarks. Together, these results demonstrate that the AMD hardware, network, and software stack are mature and optimized enough for competitive large-scale pretraining.

large language model, machine learning, natural language, (19 more...)

2511.17127

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Communications > Networks (0.95)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)

Rasheed, Mohsin, Al-Mamun, Abdullah

Provenance-Driven Reliable Semantic Medical Image Vector Reconstruction via Lightweight Blockchain-Verified Latent Fingerprints

arXiv.org Artificial Intelligence, Dec-2-2025

Medical imaging is essential for clinical diagnosis, yet real-world data frequently suffers from corruption, noise, and potential tampering, challenging the reliability of AI-assisted interpretation. Conventional reconstruction techniques prioritize pixel-level recovery and may produce visually plausible outputs while compromising anatomical fidelity, an issue that can directly impact clinical outcomes. We propose a semantic-aware medical image reconstruction framework that integrates high-level latent embeddings with a hybrid U-Net architecture to preserve clinically relevant structures during restoration. To ensure trust and accountability, we incorporate a lightweight blockchain-based provenance layer using scale-free graph design, enabling verifiable recording of each reconstruction event without imposing significant overhead. Extensive evaluation across multiple datasets and corruption types demonstrates improved structural consistency, restoration accuracy, and provenance integrity compared with existing approaches. By uniting semantic-guided reconstruction with secure traceability, our solution advances dependable AI for medical imaging, enhancing both diagnostic confidence and regulatory compliance in healthcare environments.

artificial intelligence, machine learning, reconstruction, (19 more...)

2512.00999

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.34)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Nov-20-2025

Selective Forgetting in Option Calibration: An Operator-Theoretic Gauss-Newton Framework

Özsoy, Ahmet Umur

Modern financial models are not static; they are recalibrated as market conditions change. Calibrating parametric asset-pricing models to market data has therefore long been of interest to both practitioners and academics in mathematical finance. Risk management systems and trading desks rely heavily on repeatedly solving inverse problems that adjust the parameters θ so that the model-based prices m(x;θ) reproduce observed quotes to within some level of accuracy. Option-implied volatility surfaces evolve minute by minute, and model parameters such as mean reversion, volatility of volatility, and correlation are adapted to new market information.

artificial intelligence, calibration, machine learning, (17 more...)

2511.1498

Genre: Research Report (0.82)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Security & Privacy (0.66)