
Antipodes of Label Differential Privacy: PATE and ALIBI

Neural Information Processing Systems

A prominent example of label-only privacy is in online advertising, where the goal is to predict conversion of an ad impression (the label) given a user's profile and the spot's context (the features).


The Impact of Positional Encoding on Length Generalization in Transformers

Neural Information Processing Systems

Length generalization, the ability to generalize from small training context sizes to larger ones, is a critical challenge in the development of Transformer-based language models. Positional encoding (PE) has been identified as a major factor influencing length generalization, but the exact impact of different PE schemes on extrapolation in downstream tasks remains unclear. In this paper, we conduct a systematic empirical study comparing the length generalization performance of decoder-only Transformers with five different position encoding approaches including Absolute Position Embedding (APE), T5's Relative PE, ALiBi, and Rotary, in addition to Transformers without positional encoding (NoPE). Our evaluation encompasses a battery of reasoning and mathematical tasks. Our findings reveal that the most commonly used positional encoding methods, such as ALiBi, Rotary, and APE, are not well suited for length generalization in downstream tasks. More importantly, NoPE outperforms other explicit positional encoding methods while requiring no additional computation. We theoretically demonstrate that NoPE can represent both absolute and relative PEs, but when trained with SGD, it mostly resembles T5's relative PE attention patterns. Finally, we find that scratchpad is not always helpful to solve length generalization and its format highly impacts the model's performance. Overall, our work suggests that explicit position embeddings are not essential for decoder-only Transformers to generalize well to longer sequences.
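To make the ALiBi scheme discussed above concrete, here is a minimal NumPy sketch of its causal attention bias, following the head-specific geometric slopes of Press et al. (2022); the function names are ours, not from any library:

```python
import numpy as np

def alibi_slopes(n_heads):
    # Geometric sequence of per-head slopes: for 8 heads this gives
    # 1/2, 1/4, ..., 1/256; generalized via powers of 2^(-8/n_heads).
    start = 2.0 ** (-8.0 / n_heads)
    return np.array([start ** (i + 1) for i in range(n_heads)])

def alibi_bias(n_heads, seq_len):
    # Causal linear bias added to attention logits: head h contributes
    # -m_h * (i - j) when query i attends to key j <= i.
    slopes = alibi_slopes(n_heads)        # shape (H,)
    i = np.arange(seq_len)[:, None]       # query positions
    j = np.arange(seq_len)[None, :]       # key positions
    dist = np.maximum(i - j, 0)           # only past positions are penalized
    return -slopes[:, None, None] * dist  # shape (H, L, L)
```

The bias depends only on relative distance, which is why ALiBi needs no learned position parameters; whether that property helps or hurts extrapolation on downstream tasks is exactly what the study above probes.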


Towards Personalized Treatment Plan: Geometrical Model-Agnostic Approach to Counterfactual Explanations

Sin, Daniel, Toutounchian, Milad

arXiv.org Machine Learning

In our article, we describe a method for generating counterfactual explanations in high-dimensional spaces using four steps: fitting our dataset to a model, finding the decision boundary, determining constraints on the problem, and computing the closest point (the counterfactual explanation) from that boundary. We propose a discretized approach in which we find many discrete points on the boundary and then identify the closest feasible counterfactual explanation. This method, which we call $\textit{Segmented Sampling for Boundary Approximation}$ (SSBA), applies binary search to find decision boundary points and then searches for the closest boundary point. Across four datasets of varying dimensionality, we show that our method can outperform current methods for counterfactual generation, with reductions in distance of $5\%$ to $50\%$ in terms of the $L_2$ norm. Our method can also handle real-world constraints by restricting changes to immutable and categorical features, such as age, gender, sex, height, and other related characteristics, as in the case of a health-based dataset. In terms of runtime, the SSBA algorithm generates multiple orders of magnitude more decision boundary points in the same amount of time than a grid-based approach. In general, our method provides a simple and effective model-agnostic way to compute the nearest feasible (i.e., realistic under constraints) counterfactual explanations. All of our results and code are available at: https://github.com/dsin85691/SSBA_For_Counterfactuals
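The binary-search step described above can be sketched in a few lines. This is a minimal, hedged illustration of the general idea (the function names and the candidate-sampling strategy are our assumptions, not the paper's exact SSBA procedure):

```python
import numpy as np

def boundary_point(model, x_pos, x_neg, tol=1e-6):
    """Binary search along the segment between two points that the
    model classifies differently; returns a point near the decision
    boundary. `model` is any callable returning class labels, so the
    procedure is model-agnostic."""
    lo, hi = 0.0, 1.0
    target = model(x_pos)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        x_mid = (1 - mid) * x_pos + mid * x_neg
        if model(x_mid) == target:
            lo = mid  # still on the original side; move toward x_neg
        else:
            hi = mid
    return (1 - hi) * x_pos + hi * x_neg

def closest_counterfactual(model, x, candidates):
    """Generate one boundary point per opposite-class candidate and
    return the boundary point closest to x in the L2 norm."""
    pts = [boundary_point(model, x, c) for c in candidates
           if model(c) != model(x)]
    return min(pts, key=lambda p: np.linalg.norm(p - x))
```

Each binary search needs only O(log(1/tol)) model evaluations per candidate, which is the source of the runtime advantage over densely probing a grid.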


Bayesian Attention Mechanism: A Probabilistic Framework for Positional Encoding and Context Length Extrapolation

Bianchessi, Arthur S., Aguirre, Yasmin C., Barros, Rodrigo C., Kupssinskü, Lucas S.

arXiv.org Artificial Intelligence

Effective PE is vital, particularly for enabling LMs trained on shorter contexts to generalize to significantly longer sequences during inference, a desirable capability known as context length extrapolation. Several PE methods have been proposed to facilitate context length extrapolation, including Sinusoidal embeddings (Vaswani, 2017), RoPE (Su et al., 2024), ALiBi (Press et al., 2022), and even the omission of positional encoding altogether. We introduce a Bayesian attention mechanism, hereby called BAM. When the scoring function of the attention mechanism is additive, the positional dependency is trivially modeled by a scalar Z. With Theorem 1, we can frame positional encodings as priors of BAM: Lemma 2 shows that ALiBi is a special case of a BAM prior, and Lemma 3 shows that ALiBi becomes local attention as the relative distance |j - i| increases (see Appendix B.1, B.2, and B.3). We call the resulting new PE method GGD-BAM.
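As a hedged sketch of the additive-score setting above (the notation is assumed for illustration, not taken verbatim from the paper), ALiBi adds a head-specific linear penalty with slope $m$ to the pre-softmax attention score between query position $i$ and key position $j \le i$:

```latex
\mathrm{score}(i, j) \;=\; \frac{q_i^{\top} k_j}{\sqrt{d}} \;-\; m\,(i - j), \qquad j \le i,
```

so the post-softmax weight on key $j$ is proportional to $e^{-m(i-j)}$ times the content term. This exponential decay in relative distance is the intuition behind the local-attention limit stated in Lemma 3.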



37ecd27608480aa3569a511a638ca74f-Supplemental.pdf

Neural Information Processing Systems

Tables 3 and 4 summarize hyperparameters for PATE-FM and ALIBI, respectively; Table 3 lists PATE-FM (Algorithms 1 and 2) hyperparameters for select accuracy levels. By repeating the distinguishing game multiple times, we can estimate the adversary's success rate and convert it into a privacy bound; the probability is taken over the bit b, the randomness of the mechanism M, and the algorithm A (Theorem B.1). It remains to bound the adversary's correct guessing rate: using "canaries", we can compute a lower bound on the adversary's success rate, and we can improve the tightness of this bound further. The adversary simply looks at the model's confidence (Game 3).
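The conversion step mentioned above, from an empirical guessing rate in the distinguishing game to a privacy lower bound, can be sketched as follows. This is a generic illustration for a pure-DP mechanism, not the paper's (tighter) bound, and the function name is ours:

```python
import math

def eps_lower_bound(correct, trials):
    """For an eps-DP mechanism, the adversary's probability of guessing
    the bit b in the distinguishing game is at most e^eps / (1 + e^eps).
    Inverting this, an empirical success rate p implies
    eps >= log(p / (1 - p))."""
    p = correct / trials
    return math.log(p / (1 - p))
```

For example, an adversary that guesses correctly in 75% of repeated games certifies eps >= log(3); sampling error in the estimate of p should be handled with a confidence interval before inverting.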



Evaluation of Coding Schemes for Transformer-based Gene Sequence Modeling

Gong, Chenlei, Tian, Yuanhe, Mao, Lei, Song, Yan

arXiv.org Artificial Intelligence

Currently, many studies view DNA sequences as a special type of language and utilize Transformers to model them. These studies use fixed-length k-mer segmentation and BPE subword tokenization, but lack a systematic evaluation to determine which is superior. We compare k-mer segmentation with k=1,3,4,5,6, a 4,096-token BPE vocabulary, and three positional encoding methods--sinusoidal, ALiBi, and RoPE. Each configuration is trained from scratch in 3-, 6-, 12-, and 24-layer Transformer encoders and evaluated on the GUE benchmark. In general, BPE delivers higher and more stable performance across tasks by compressing frequent motifs into variable-length tokens, reducing sequence length, and improving model generalization. RoPE excels at capturing periodic motifs and extrapolating to long sequences, while ALiBi also performs well on tasks driven by local dependencies. In terms of depth, we observe significant gains when increasing layers from 3 to 12, with only marginal improvements or slight overfitting at 24 layers. This study provides practical guidance for designing tokenization and positional encoding in DNA Transformer models.
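The fixed-length k-mer segmentation compared above is simple enough to sketch directly; this minimal version (our own helper, not the paper's code) covers both the non-overlapping scheme and the overlapping variant via a stride parameter:

```python
def kmer_tokenize(seq, k=3, stride=None):
    """Segment a DNA sequence into k-mers. With stride == k (the
    default) the k-mers are non-overlapping fixed-length tokens;
    with stride < k they overlap, trading longer token sequences
    for denser local context."""
    stride = stride or k
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]
```

BPE, by contrast, learns variable-length tokens from motif frequencies, which is why it shortens sequences on motif-rich genomic data; the study above quantifies that trade-off across model depths.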