AITopics | negative sampling

Towards Better Evaluation for Dynamic Link Prediction

Neural Information Processing Systems | Apr-27-2026, 23:07:47 GMT

Despite recent success in learning from static graphs, learning from time-evolving graphs remains an open challenge. In this work, we design new, more stringent evaluation procedures for link prediction specific to dynamic graphs, which reflect real-world considerations, to better compare the strengths and weaknesses of methods. First, we create two visualization techniques to understand the recurring patterns of edges over time and show that many edges reoccur at later time steps. Based on this observation, we propose a pure memorization-based baseline called EdgeBank. EdgeBank achieves surprisingly strong performance across multiple settings, which highlights that the negative edges used in the current evaluation are easy. To sample more challenging negative edges, we introduce two novel negative sampling strategies that improve robustness and better match real-world applications. Lastly, we introduce six new dynamic graph datasets from a diverse set of domains missing from current benchmarks, providing new challenges and opportunities for future research. Our code repository is accessible at https://github.com/fpour/DGB.git.
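
A memorization-only baseline of the kind this abstract describes can be sketched in a few lines. The class name `EdgeBankSketch`, its methods, and the toy edge stream are illustrative assumptions, not the authors' implementation (their repository is the reference for that).

```python
class EdgeBankSketch:
    """Memorization-only link predictor: remembers every edge it has observed."""

    def __init__(self):
        self.seen = set()  # (src, dst) pairs observed so far

    def update(self, src, dst):
        """Store an observed edge."""
        self.seen.add((src, dst))

    def predict(self, src, dst):
        """Score 1.0 if the edge was observed before, otherwise 0.0."""
        return 1.0 if (src, dst) in self.seen else 0.0


# Hypothetical timestamped edge stream: (src, dst, time).
stream = [("a", "b", 1), ("b", "c", 2), ("a", "b", 3), ("c", "d", 4)]
train, test = stream[:2], stream[2:]

bank = EdgeBankSketch()
for src, dst, _ in train:
    bank.update(src, dst)

# The recurring edge ("a", "b") is recognized; the new edge ("c", "d") is not.
scores = [bank.predict(src, dst) for src, dst, _ in test]

# A randomly drawn negative pair is almost never in memory, so it is trivially rejected.
random_negative_score = bank.predict("a", "d")
```

Because a random negative pair has rarely been seen before, even this lookup table separates it from recurring positives, which is consistent with the abstract's point that such negatives are easy.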

artificial intelligence, data mining, machine learning, (20 more...)

Country:

North America > United States (0.28)
North America > Canada (0.28)

Genre:

Research Report (0.93)
Overview (0.93)

Industry: Education > Educational Setting (0.68)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Generalization Bounds for Graph Embedding Using Negative Sampling: Linear vs Hyperbolic

Neural Information Processing Systems | Dec-23-2025, 18:06:07 GMT

Graph embedding, which represents real-world entities in a mathematical space, has enabled numerous applications such as analyzing natural languages, social networks, biochemical networks, and knowledge bases. It has been experimentally shown that graph embedding in hyperbolic space can represent hierarchical tree-like data more effectively than embedding in linear space, owing to hyperbolic space's exponential growth property. However, since the theoretical comparison has been limited to ideal noiseless settings, the potential for the hyperbolic space's property to worsen the generalization error for practical data has not been analyzed. In this paper, we provide a generalization error bound applicable for graph embedding both in linear and hyperbolic spaces under various negative sampling settings that appear in graph embedding. Our bound states that error is polynomial and exponential with respect to the embedding space's radius in linear and hyperbolic spaces, respectively, which implies that hyperbolic space's exponential growth property worsens the error. Using our bound, we clarify the data size condition on which graph embedding in hyperbolic space can represent a tree better than in Euclidean space by discussing the bias-variance trade-off. Our bound also shows that imbalanced data distribution, which often appears in graph embedding, can worsen the error.

generalization bound, hyperbolic space, negative sampling, (7 more...)

Technology: Information Technology > Artificial Intelligence (0.76)

Causal Negative Sampling via Diffusion Model for Out-of-Distribution Recommendation

Zhao, Chu, Yang, Eneng, Dang, Yizhou, Zhao, Jianzhe, Guo, Guibing, Wang, Xingwei

arXiv.org Artificial Intelligence | Aug-12-2025

Heuristic negative sampling enhances recommendation performance by selecting negative samples of varying hardness levels from predefined candidate pools to guide the model toward learning more accurate decision boundaries. However, our empirical and theoretical analyses reveal that unobserved environmental confounders (e.g., exposure or popularity biases) in candidate pools may cause heuristic sampling methods to introduce false hard negatives (FHNS). These misleading samples can encourage the model to learn spurious correlations induced by such confounders, ultimately compromising its generalization ability under distribution shifts. To address this issue, we propose a novel method named Causal Negative Sampling via Diffusion (CNSDiff). By synthesizing negative samples in the latent space via a conditional diffusion process, CNSDiff avoids the bias introduced by predefined candidate pools and thus reduces the likelihood of generating FHNS. Moreover, it incorporates a causal regularization term to explicitly mitigate the influence of environmental confounders during the negative sampling process, leading to robust negatives that promote out-of-distribution (OOD) generalization. Comprehensive experiments under four representative distribution shift scenarios demonstrate that CNSDiff achieves an average improvement of 13.96% across all evaluation metrics compared to state-of-the-art baselines, verifying its effectiveness and robustness in OOD recommendation tasks.

artificial intelligence, machine learning, negative sample, (15 more...)

arXiv: 2508.07243

Country: Asia > China (0.29)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.30)

ChemHGNN: A Hierarchical Hypergraph Neural Network for Reaction Virtual Screening and Discovery

Huang, Xiaobao, Ma, Yihong, Gurajapu, Anjali, Schleinitz, Jules, Guo, Zhichun, Reisman, Sarah E., Chawla, Nitesh V.

arXiv.org Artificial Intelligence | Jun-16-2025

Reaction virtual screening and discovery are fundamental challenges in chemistry and materials science, where traditional graph neural networks (GNNs) struggle to model multi-reactant interactions. In this work, we propose ChemHGNN, a hypergraph neural network (HGNN) framework that effectively captures high-order relationships in reaction networks. Unlike GNNs, which require constructing complete graphs for multi-reactant reactions, ChemHGNN naturally models multi-reactant reactions through hyperedges, enabling more expressive reaction representations. To address key challenges, such as combinatorial explosion, model collapse, and chemically invalid negative samples, we introduce a reaction center-aware negative sampling strategy (RCNS) and a hierarchical embedding approach combining molecule, reaction and hypergraph level features. Experiments on the USPTO dataset demonstrate that ChemHGNN significantly outperforms HGNN and GNN baselines, particularly in large-scale settings, while maintaining interpretability and chemical plausibility. Our work establishes HGNNs as a superior alternative to GNNs for reaction virtual screening and discovery, offering a chemically informed framework for accelerating reaction discovery.

artificial intelligence, machine learning, reaction, (16 more...)

arXiv: 2506.11041

Genre: Research Report (0.50)

Industry: Law > Intellectual Property & Technology Law (0.39)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

MixDec Sampling: A Soft Link-based Sampling Method of Graph Neural Network for Recommendation

Xie, Xiangjin, Chen, Yuxin, Wang, Ruipeng, Ouyang, Kai, Zhang, Zihan, Zheng, Hai-Tao, Qian, Buyue, Zheng, Hansen, Hu, Bo, Zhuo, Chengxiang, Li, Zang

arXiv.org Artificial Intelligence | Feb-12-2025

Graph neural networks have been widely used in recent recommender systems, where negative sampling plays an important role. Existing negative sampling methods restrict the relationship between nodes to either hard positive pairs or hard negative pairs. This leads to a loss of structural information, and such methods lack a mechanism to generate positive pairs for nodes with few neighbors. To overcome these limitations, we propose a novel soft link-based sampling method, namely MixDec Sampling, which consists of a Mixup Sampling module and a Decay Sampling module. The Mixup Sampling module augments node features by synthesizing new nodes and soft links, which provides a sufficient number of samples for nodes with few neighbors. The Decay Sampling module strengthens the digestion of graph structure information by generating soft links for node embedding learning. To the best of our knowledge, we are the first to model sampling relationships between nodes by soft links in GNN-based recommender systems. Extensive experiments demonstrate that the proposed MixDec Sampling can significantly and consistently improve the recommendation performance of several representative GNN-based models on various recommendation benchmarks.
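
The idea of a soft link can be illustrated with generic mixup, which blends two examples and their labels by the same coefficient. The sketch below is not the authors' Mixup Sampling module: the function name `mixup_pair`, the toy embeddings, and the choice of a fixed coefficient are all assumptions made for illustration.

```python
def mixup_pair(pos_vec, neg_vec, lam):
    """Blend a positive and a negative embedding into one synthetic node.

    Returns the mixed embedding and a soft link weight in [0, 1]:
    lam = 1.0 reproduces the hard positive, lam = 0.0 the hard negative.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    mixed = [lam * p + (1.0 - lam) * n for p, n in zip(pos_vec, neg_vec)]
    return mixed, lam


# Hypothetical 2-dimensional item embeddings.
positive_item = [1.0, 0.0]
negative_item = [0.0, 1.0]

synthetic_item, soft_label = mixup_pair(positive_item, negative_item, 0.25)
# synthetic_item is [0.25, 0.75] and its link to the user has weight 0.25
```

A training loss could then use `soft_label` as the target instead of a hard 0 or 1, which is one way a node with few neighbors might receive additional graded training pairs.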

artificial intelligence, machine learning, sampling, (19 more...)

doi: 10.1109/ICDM54844.2022.00070

arXiv: 2502.08161

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Beijing > Beijing (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Enhancing Link Prediction with Fuzzy Graph Attention Networks and Dynamic Negative Sampling

Xing, Jinming, Xing, Ruilin

arXiv.org Artificial Intelligence | Nov-21-2024

Link prediction is crucial for understanding complex networks, but traditional Graph Neural Networks (GNNs) often rely on random negative sampling, leading to suboptimal performance. This paper introduces Fuzzy Graph Attention Networks (FGAT), a novel approach integrating fuzzy rough sets for dynamic negative sampling and enhanced node feature aggregation. Fuzzy Negative Sampling (FNS) systematically selects high-quality negative edges based on fuzzy similarities, improving training efficiency. The FGAT layer incorporates fuzzy rough set principles, enabling robust and discriminative node representations. Experiments on two research collaboration networks demonstrate FGAT's superior link prediction accuracy, outperforming state-of-the-art baselines by leveraging the power of fuzzy rough sets for effective negative sampling and node feature learning.

artificial intelligence, data mining, machine learning, (17 more...)

arXiv: 2411.07482

Country: North America > United States > North Carolina (0.04)

Genre:

Research Report (0.84)
Overview > Innovation (0.34)

Industry:

Information Technology > Security & Privacy (0.95)
Health & Medicine (0.69)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Evaluating Performance and Bias of Negative Sampling in Large-Scale Sequential Recommendation Models

Prakash, Arushi, Bermperidis, Dimitrios, Chennu, Srivas

arXiv.org Artificial Intelligence | Oct-29-2024

Large-scale industrial recommendation models predict the most relevant items from catalogs containing millions or billions of options. To train these models efficiently, a small set of irrelevant items (negative samples) is selected from the vast catalog for each relevant item (positive example), helping the model distinguish between relevant and irrelevant items. Choosing the right negative sampling method is a common challenge. We address this by implementing and comparing various negative sampling methods - random, popularity-based, in-batch, mixed, adaptive, and adaptive with mixed variants - on modern sequential recommendation models. Our experiments, including hyperparameter optimization and 20x repeats on three benchmark datasets with varying popularity biases, show how the choice of method and dataset characteristics impact key model performance metrics. We also reveal that average performance metrics often hide imbalances across popularity bands (head, mid, tail). We find that commonly used random negative sampling reinforces popularity bias and performs best for head items. Popularity-based methods (in-batch and global popularity negative sampling) can offer balanced performance at the cost of lower overall model performance results. Our study serves as a practical guide to the trade-offs in selecting a negative sampling method for large-scale sequential recommendation models. Code, datasets, experimental results and hyperparameters are available at: https://github.com/apple/ml-negative-sampling.
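
Three of the sampling families named in this abstract (random, popularity-based, and in-batch) can be sketched with the standard library alone. The function names, the toy catalog, and the interaction log are assumptions for illustration and are not taken from the linked repository.

```python
import random
from collections import Counter


def sample_random_negatives(catalog, positive, k, rng):
    """Draw k distinct negatives uniformly from the catalog, excluding the positive."""
    candidates = [item for item in catalog if item != positive]
    return rng.sample(candidates, k)


def sample_popularity_negatives(interactions, positive, k, rng):
    """Draw k negatives (with replacement) in proportion to interaction counts."""
    counts = Counter(interactions)
    candidates = [item for item in counts if item != positive]
    weights = [counts[item] for item in candidates]
    return rng.choices(candidates, weights=weights, k=k)


def in_batch_negatives(batch_positives):
    """For each positive in a batch, reuse the other positives as its negatives."""
    return [[other for other in batch_positives if other != pos]
            for pos in batch_positives]


rng = random.Random(0)                            # fixed seed for repeatability
catalog = ["i1", "i2", "i3", "i4", "i5"]          # hypothetical item ids
interaction_log = ["i1", "i2", "i2", "i2", "i3"]  # hypothetical click log

uniform_negs = sample_random_negatives(catalog, "i1", 3, rng)
popular_negs = sample_popularity_negatives(interaction_log, "i1", 4, rng)
batch_negs = in_batch_negatives(["i1", "i2", "i3"])
```

In-batch sampling draws from items that appear as positives, so it is implicitly popularity-weighted, which matches the abstract's grouping of in-batch and global popularity sampling together.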

dataset, proceedings, recommendation, (16 more...)

arXiv: 2410.17276

Country: Europe > Italy > Apulia > Bari (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.31)

Generalization Bounds for Graph Embedding Using Negative Sampling: Linear vs Hyperbolic

Neural Information Processing Systems | Oct-9-2024, 11:30:55 GMT

generalization bound, hyperbolic space, linear vs hyperbolic, (4 more...)

Technology: Information Technology > Artificial Intelligence (0.80)

Negative Sampling in Knowledge Graph Representation Learning: A Review

Madushanka, Tiroshan, Ichise, Ryutaro

arXiv.org Artificial Intelligence | Feb-29-2024

Knowledge graph representation learning (KGRL) or knowledge graph embedding (KGE) plays a crucial role in AI applications for knowledge construction and information exploration. These models aim to encode entities and relations present in a knowledge graph into a lower-dimensional vector space. During the training process of KGE models, using positive and negative samples becomes essential for discrimination purposes. However, obtaining negative samples directly from existing knowledge graphs poses a challenge, emphasizing the need for effective generation techniques. The quality of these negative samples greatly impacts the accuracy of the learned embeddings, making their generation a critical aspect of KGRL. This comprehensive survey paper systematically reviews various negative sampling (NS) methods and their contributions to the success of KGRL. Their respective advantages and disadvantages are outlined by categorizing existing NS methods into five distinct categories. Moreover, this survey identifies open research questions that serve as potential directions for future investigations. By offering a generalization and alignment of fundamental NS concepts, this survey provides valuable insights for designing effective NS methods in the context of KGRL and serves as a motivating force for further advancements in the field.
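
One widely used way to generate negatives for knowledge graph embedding is to corrupt the head or tail of an observed triple with a random entity. The sketch below shows that generic idea with a filter against known triples; the function name `corrupt_triple`, the `max_tries` parameter, and the toy graph are assumptions, and the survey itself covers many more elaborate strategies.

```python
import random


def corrupt_triple(triple, entities, known_triples, rng, max_tries=100):
    """Replace the head or the tail with a random entity.

    Candidates that already exist in the graph are rejected, which lowers
    (but does not eliminate) the chance of sampling a false negative.
    Returns None if no valid corruption is found within max_tries.
    """
    head, relation, tail = triple
    for _ in range(max_tries):
        replacement = rng.choice(entities)
        if rng.random() < 0.5:
            candidate = (replacement, relation, tail)  # corrupt the head
        else:
            candidate = (head, relation, replacement)  # corrupt the tail
        if candidate not in known_triples:
            return candidate
    return None


# Hypothetical toy knowledge graph.
known = {
    ("paris", "capital_of", "france"),
    ("berlin", "capital_of", "germany"),
}
entities = ["paris", "france", "berlin", "germany"]

rng = random.Random(0)
positive = ("paris", "capital_of", "france")
negative = corrupt_triple(positive, entities, known, rng)
```

Uniform corruption is cheap but can produce implausible triples such as a country being the capital of a country, which is one motivation for the more selective sampling methods the survey reviews.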

graph, knowledge graph, negative sample, (10 more...)

arXiv: 2402.19195

Country:

Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China (0.04)

Genre:

Research Report > New Finding (0.48)
Overview > Innovation (0.46)
Research Report > Promising Solution (0.46)
Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

Mitigating Pooling Bias in E-commerce Search via False Negative Estimation

Wang, Xiaochen, Xiao, Xiao, Zhang, Ruhan, Zhang, Xuan, Na, Taesik, Tenneti, Tejaswi, Wang, Haixun, Ma, Fenglong

arXiv.org Artificial Intelligence | Nov-18-2023

Efficient and accurate product relevance assessment is critical for user experiences and business success. Training a proficient relevance assessment model requires high-quality query-product pairs, often obtained through negative sampling strategies. Unfortunately, current methods introduce pooling bias by mistakenly sampling false negatives, diminishing performance and business impact. To address this, we present Bias-mitigating Hard Negative Sampling (BHNS), a novel negative sampling strategy tailored to identify and adjust for false negatives, building upon our original False Negative Estimation algorithm. Our experiments in the Instacart search setting confirm BHNS as effective for practical e-commerce use. Furthermore, comparative analyses on a public dataset showcase its domain-agnostic potential for diverse applications.

false negative estimation, instacart, query, (11 more...)

arXiv: 2311.06444

Country:

North America > United States > District of Columbia > Washington (0.05)
North America > United States > Pennsylvania (0.04)
Oceania > Australia > Western Australia > Perth (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Information Technology > Services > e-Commerce Services (0.73)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
