AITopics | maarten

maarten

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

CLAX: Fast and Flexible Neural Click Models in JAX

Hager, Philipp, Zoeter, Onno, de Rijke, Maarten

arXiv.org Artificial Intelligence, Nov-6-2025

CLAX is a JAX-based library that implements classic click models using modern gradient-based optimization. While neural click models have emerged over the past decade, complex click models based on probabilistic graphical models (PGMs) have not systematically adopted gradient-based optimization, preventing practitioners from leveraging modern deep learning frameworks while preserving the interpretability of classic models. CLAX addresses this gap by replacing EM-based optimization with direct gradient-based optimization in a numerically stable manner. The framework's modular design enables the integration of any component, from embeddings and deep networks to custom modules, into classic click models for end-to-end optimization. We demonstrate CLAX's efficiency by running experiments on the full Baidu-ULTR dataset comprising over a billion user sessions in $\approx$ 2 hours on a single GPU, orders of magnitude faster than traditional EM approaches. CLAX implements ten classic click models, serving both industry practitioners seeking to understand user behavior and improve ranking performance at scale and researchers developing new click models. CLAX is available at: https://github.com/philipphager/clax
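To make the idea of replacing EM with direct gradient-based optimization concrete, here is a minimal sketch for one classic click model, the position-based model (PBM), in which a click requires both examination of the rank and attractiveness of the document. It is plain Python rather than JAX and does not use the CLAX API; the function names, the simulated sessions, the hand-derived gradients, and the learning rate are all illustrative assumptions. Click probabilities are kept in log space, which is one common way to stay numerically stable.

```python
import math
import random


def log_sigmoid(x):
    # Stable log(sigmoid(x)) that never exponentiates a large positive number.
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def click_log_prob(exam_logit, attr_logit):
    # PBM assumption: P(click) = P(examined at rank) * P(document attractive).
    # Clamp just below 0 so that log(1 - p) stays finite.
    return min(log_sigmoid(exam_logit) + log_sigmoid(attr_logit), -1e-12)


def mean_log_likelihood(sessions, exam_logits, attr_logits):
    total = 0.0
    for rank, doc, click in sessions:
        log_p = click_log_prob(exam_logits[rank], attr_logits[doc])
        total += log_p if click else math.log1p(-math.exp(log_p))
    return total / len(sessions)


def gradient_step(sessions, exam_logits, attr_logits, lr):
    # One full-batch gradient ascent step on the mean log-likelihood.
    grad_exam = [0.0] * len(exam_logits)
    grad_attr = [0.0] * len(attr_logits)
    for rank, doc, click in sessions:
        theta = math.exp(log_sigmoid(exam_logits[rank]))
        gamma = math.exp(log_sigmoid(attr_logits[doc]))
        p = math.exp(click_log_prob(exam_logits[rank], attr_logits[doc]))
        # Shared factor of d(log-likelihood)/d(logit), derived by hand here.
        common = (click - p) / (1.0 - p)
        grad_exam[rank] += common * (1.0 - theta)
        grad_attr[doc] += common * (1.0 - gamma)
    n = len(sessions)
    for k, g in enumerate(grad_exam):
        exam_logits[k] += lr * g / n
    for d, g in enumerate(grad_attr):
        attr_logits[d] += lr * g / n


# Simulated click log (hypothetical ground truth, three ranks, three documents).
rng = random.Random(0)
true_exam = [0.9, 0.5, 0.2]
true_attr = [0.8, 0.4, 0.1]
sessions = []
for _ in range(2000):
    rank = rng.randrange(3)
    doc = rng.randrange(3)
    click = 1 if rng.random() < true_exam[rank] * true_attr[doc] else 0
    sessions.append((rank, doc, click))

exam_logits = [0.0, 0.0, 0.0]
attr_logits = [0.0, 0.0, 0.0]
initial_ll = mean_log_likelihood(sessions, exam_logits, attr_logits)
for _ in range(200):
    gradient_step(sessions, exam_logits, attr_logits, lr=1.0)
final_ll = mean_log_likelihood(sessions, exam_logits, attr_logits)
print("mean log-likelihood before and after:", initial_ll, final_ll)
```

In a framework with automatic differentiation the hand-written gradient would be unnecessary, and the two logit tables could be swapped for embeddings or networks. Note that the PBM only identifies the product of examination and attractiveness, so the fitted click probabilities are more meaningful than the individual parameters.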

artificial intelligence, bayesian inference, machine learning, (18 more...)

2511.0362

Country: Europe > Netherlands (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Proximal Ranking Policy Optimization for Practical Safety in Counterfactual Learning to Rank

Gupta, Shashank, Oosterhuis, Harrie, de Rijke, Maarten

arXiv.org Artificial Intelligence, Sep-15-2024

Counterfactual learning to rank (CLTR) can be risky and, in various circumstances, can produce sub-optimal models that hurt performance when deployed. Safe CLTR was introduced to mitigate these risks when using inverse propensity scoring to correct for position bias. However, the existing safety measure for CLTR is not applicable to state-of-the-art CLTR methods, cannot handle trust bias, and relies on specific assumptions about user behavior. We propose a novel approach, proximal ranking policy optimization (PRPO), that provides safety in deployment without assumptions about user behavior. PRPO removes incentives for learning ranking behavior that is too dissimilar to a safe ranking model. Thereby, PRPO imposes a limit on how much learned models can degrade performance metrics, without relying on any specific user assumptions. Our experiments show that PRPO provides higher performance than the existing safe inverse propensity scoring approach. PRPO always maintains safety, even in maximally adversarial situations. By avoiding assumptions, PRPO is the first method with unconditional safety in deployment that translates to robust safety for real-world applications.

counterfactual learning, harrie oosterhuis, learning, (13 more...)

2409.09881

Country:

Europe > Netherlands > North Holland > Amsterdam (0.05)
Europe > Netherlands > Gelderland > Nijmegen (0.04)
Europe > Finland (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Practical and Robust Safety Guarantees for Advanced Counterfactual Learning to Rank

Gupta, Shashank, Oosterhuis, Harrie, de Rijke, Maarten

arXiv.org Artificial Intelligence, Aug-6-2024

Counterfactual learning to rank (CLTR) can be risky and, in various circumstances, can produce sub-optimal models that hurt performance when deployed. Safe CLTR was introduced to mitigate these risks when using inverse propensity scoring to correct for position bias. However, the existing safety measure for CLTR is not applicable to state-of-the-art CLTR methods, cannot handle trust bias, and relies on specific assumptions about user behavior. Our contributions are two-fold. First, we generalize the existing safe CLTR approach to make it applicable to state-of-the-art doubly robust CLTR and trust bias. Second, we propose a novel approach, proximal ranking policy optimization (PRPO), that provides safety in deployment without assumptions about user behavior. PRPO removes incentives for learning ranking behavior that is too dissimilar to a safe ranking model. Thereby, PRPO imposes a limit on how much learned models can degrade performance metrics, without relying on any specific user assumptions. Our experiments show that both our novel safe doubly robust method and PRPO provide higher performance than the existing safe inverse propensity scoring approach. However, in unexpected circumstances, the safe doubly robust approach can become unsafe and bring detrimental performance. In contrast, PRPO always maintains safety, even in maximally adversarial situations. By avoiding assumptions, PRPO is the first method with unconditional safety in deployment that translates to robust safety for real-world applications.
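One way to picture how an objective can remove incentives for drifting too far from a safe model is PPO-style clipping of the ratio between what the learned policy and the safe policy assign to each document. The sketch below is an illustration under that assumption, not the authors' implementation: the per-document weights, the rewards, and the bounds `eps_minus` and `eps_plus` are hypothetical names and values, and the abstract itself gives no formula.

```python
def clipped_ratio(ratio, reward, eps_minus, eps_plus):
    # For a rewarding document, raising the ratio past eps_plus earns nothing extra.
    if reward >= 0:
        return min(ratio, eps_plus)
    # For a harmful document, lowering the ratio below eps_minus earns nothing extra.
    return max(ratio, eps_minus)


def clipped_objective(new_weights, safe_weights, rewards, eps_minus=0.5, eps_plus=1.5):
    # Sum of per-document rewards, weighted by the clipped ratio to the safe model.
    total = 0.0
    for w_new, w_safe, reward in zip(new_weights, safe_weights, rewards):
        ratio = w_new / w_safe
        total += clipped_ratio(ratio, reward, eps_minus, eps_plus) * w_safe * reward
    return total


# Hypothetical per-document weights under the safe model, and estimated rewards.
safe_weights = [1.0, 0.5, 0.25]
rewards = [0.2, 1.0, -0.5]

at_upper_bound = clipped_objective([1.0, 0.75, 0.25], safe_weights, rewards)
far_past_upper = clipped_objective([1.0, 5.0, 0.25], safe_weights, rewards)
at_lower_bound = clipped_objective([1.0, 0.5, 0.125], safe_weights, rewards)
far_past_lower = clipped_objective([1.0, 0.5, 0.01], safe_weights, rewards)

print("upper bound vs. far past it:", at_upper_bound, far_past_upper)
print("lower bound vs. far past it:", at_lower_bound, far_past_lower)
```

Because the objective is flat outside the clipping range, a gradient-based learner in this sketch receives no signal that rewards moving further from the safe model, whatever the estimated rewards are.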

harrie oosterhuis, learning, prpo, (15 more...)

doi: 10.1145/3627673.3679531

2407.19943

Country:

North America > United States > Idaho > Ada County > Boise (0.05)
Europe > Netherlands > North Holland > Amsterdam (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Unbiased Learning to Rank Meets Reality: Lessons from Baidu's Large-Scale Search Dataset

Hager, Philipp, Deffayet, Romain, Renders, Jean-Michel, Zoeter, Onno, de Rijke, Maarten

arXiv.org Artificial Intelligence, May-15-2024

Unbiased learning-to-rank (ULTR) is a well-established framework for learning from user clicks, which are often biased by the ranker collecting the data. While theoretically justified and extensively tested in simulation, ULTR techniques lack empirical validation, especially on modern search engines. The Baidu-ULTR dataset released for the WSDM Cup 2023, collected from Baidu's search engine, offers a rare opportunity to assess the real-world performance of prominent ULTR techniques. Despite multiple submissions during the WSDM Cup 2023 and the subsequent NTCIR ULTRE-2 task, it remains unclear whether the observed improvements stem from applying ULTR or other learning techniques. In this work, we revisit and extend the available experiments on the Baidu-ULTR dataset. We find that standard unbiased learning-to-rank techniques robustly improve click predictions but struggle to consistently improve ranking performance, especially considering the stark differences obtained by choice of ranking loss and query-document features. Our experiments reveal that gains in click prediction do not necessarily translate to enhanced ranking performance on expert relevance annotations, implying that conclusions strongly depend on how success is measured in this benchmark.

dataset, learning, proceedings, (15 more...)

doi: 10.1145/3626772.3657892

2404.02543

Country:

North America > United States > District of Columbia > Washington (0.05)
Europe > Netherlands > North Holland > Amsterdam (0.05)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Recent Advances in the Foundations and Applications of Unbiased Learning to Rank

Gupta, Shashank, Hager, Philipp, Huang, Jin, Vardasbi, Ali, Oosterhuis, Harrie

arXiv.org Artificial Intelligence, May-4-2023

Since its inception, the field of unbiased learning to rank (ULTR) has remained very active and has seen several impactful advancements in recent years. This tutorial provides both an introduction to the core concepts of the field and an overview of recent advancements in its foundations along with several applications of its methods. The tutorial is divided into four parts: Firstly, we give an overview of the different forms of bias that can be addressed with ULTR methods. Secondly, we present a comprehensive discussion of the latest estimation techniques in the ULTR field. Thirdly, we survey published results of ULTR in real-world applications. Fourthly, we discuss the connection between ULTR and fairness in ranking. We end by briefly reflecting on the future of ULTR research and its applications. This tutorial is intended to benefit both researchers and industry practitioners who are interested in developing new ULTR solutions or utilizing them in real-world applications.

artificial intelligence, machine learning, proceedings, (16 more...)

doi: 10.1145/3539618.3594247

2305.02914

Country:

Europe > Netherlands > North Holland > Amsterdam (0.06)
Asia > Taiwan > Taiwan Province > Taipei (0.05)
North America > United States > New York > New York County > New York City (0.05)
(3 more...)

Genre:

Overview (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.30)

Safe Deployment for Counterfactual Learning to Rank with Exposure-Based Risk Minimization

Gupta, Shashank, Oosterhuis, Harrie, de Rijke, Maarten

arXiv.org Artificial Intelligence, Apr-26-2023

Counterfactual learning to rank (CLTR) relies on exposure-based inverse propensity scoring (IPS), a LTR-specific adaptation of IPS to correct for position bias. While IPS can provide unbiased and consistent estimates, it often suffers from high variance. Especially when little click data is available, this variance can cause CLTR to learn sub-optimal ranking behavior. Consequently, existing CLTR methods bring significant risks with them, as naively deploying their models can result in very negative user experiences. We introduce a novel risk-aware CLTR method with theoretical guarantees for safe deployment. We apply a novel exposure-based concept of risk regularization to IPS estimation for LTR. Our risk regularization penalizes the mismatch between the ranking behavior of a learned model and a given safe model. Thereby, it ensures that learned ranking models stay close to a trusted model, when there is high uncertainty in IPS estimation, which greatly reduces the risks during deployment. Our experimental results demonstrate the efficacy of our proposed method, which is effective at avoiding initial periods of bad performance when little data is available, while also maintaining high performance at convergence. For the CLTR field, our novel exposure-based risk minimization method enables practitioners to adopt CLTR methods in a safer manner that mitigates many of the risks attached to previous methods.
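The interplay between IPS estimation and a risk penalty can be sketched as follows. This is an illustrative toy and not the paper's estimator: the chi-square style `exposure_mismatch`, the `penalty_weight`, the way the penalty shrinks with the number of sessions, and all numbers are assumptions chosen to show the qualitative behavior described above. The sketch also assumes that the safe model is the logging policy that collected the clicks.

```python
import math


def ips_utility(click_counts, new_exposure, logging_exposure, n_sessions):
    # Reweight logged clicks by the exposure ratio to correct for position bias.
    weighted = sum(
        clicks * new / logged
        for clicks, new, logged in zip(click_counts, new_exposure, logging_exposure)
    )
    return weighted / n_sessions


def exposure_mismatch(new_exposure, safe_exposure):
    # Chi-square style divergence: zero when both models expose documents equally.
    return sum(
        (new - safe) ** 2 / safe
        for new, safe in zip(new_exposure, safe_exposure)
    )


def risk_aware_objective(click_counts, new_exposure, safe_exposure, n_sessions,
                         penalty_weight=2.0):
    utility = ips_utility(click_counts, new_exposure, safe_exposure, n_sessions)
    # The penalty shrinks as more sessions are observed (less uncertainty).
    mismatch = exposure_mismatch(new_exposure, safe_exposure)
    penalty = penalty_weight * math.sqrt(mismatch / n_sessions)
    return utility - penalty


# Per-document exposure (examination probability) under two ranking models.
safe = [0.6, 0.3, 0.1]
aggressive = [0.1, 0.3, 0.6]

# Same click proportions, observed over few or many sessions.
small_safe = risk_aware_objective([1, 0, 1], safe, safe, 5)
small_aggressive = risk_aware_objective([1, 0, 1], aggressive, safe, 5)
large_safe = risk_aware_objective([1000, 0, 1000], safe, safe, 5000)
large_aggressive = risk_aware_objective([1000, 0, 1000], aggressive, safe, 5000)

print("5 sessions    (safe, aggressive):", small_safe, small_aggressive)
print("5000 sessions (safe, aggressive):", large_safe, large_aggressive)
```

With only five sessions the penalty dominates and the safe model scores higher; with five thousand sessions the same click proportions are trusted enough for the aggressive model to win.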

artificial intelligence, learning, machine learning, (18 more...)

doi: 10.1145/3539618.3591760

2305.01522

Country:

Asia > Taiwan > Taiwan Province > Taipei (0.05)
Europe > Netherlands > North Holland > Amsterdam (0.05)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
(5 more...)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Topic Modeling with BERTopic - Talking Language AI Ep#1

#artificialintelligence, Nov-28-2022, 11:00:54 GMT

In the first episode of the Talking Language AI series, I spoke with Maarten Grootendorst, author and maintainer of the BERTopic open source package (over 3,000 stars on GitHub). BERTopic is used to explore collections of text to spot trends and identify the topics in these texts. This is an NLP task called topic modeling. The episode is also embedded at the bottom of this overview. Feel free to post questions or comments in this thread in the Cohere Discord.

bertopic, maarten, topic modeling, (6 more...)

Genre:

Personal > Interview (0.58)
Overview (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.54)

Using AlphaFold to find complex protein knots

AIHub, Aug-11-2022, 13:22:21 GMT

Image caption: A complex protein knot with seven crossings (left) predicted by AlphaFold and a simplified representation (right).

The question of how the chemical composition of a protein, the amino acid sequence, determines its 3D structure has been one of the biggest challenges in biophysics for more than half a century. This knowledge about the so-called "folding" of proteins is in great demand, as it contributes significantly to the understanding of various diseases and their treatment, among other things. For these reasons, Google's DeepMind research team has developed AlphaFold, an artificial intelligence that predicts 3D structures. A team consisting of researchers from Johannes Gutenberg University Mainz (JGU) and the University of California, Los Angeles, has now taken a closer look at these structures and examined them with respect to knots. We know knots primarily from shoelaces and cables, but they also occur on the nanoscale in our cells.

alphafold, complex protein knot, protein knot, (9 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.59)
Europe > Germany > Rheinland-Pfalz > Mainz (0.29)

Technology: Information Technology > Artificial Intelligence (0.84)

Explaining Predictions from Tree-based Boosting Ensembles

Lucic, Ana, Haned, Hinda, de Rijke, Maarten

arXiv.org Artificial Intelligence, Jul-4-2019

Understanding how "black-box" models arrive at their predictions has sparked significant interest from both within and outside the AI community. Our work focuses on doing this by generating local explanations about individual predictions for tree-based ensembles, specifically Gradient Boosting Decision Trees (GBDTs). Given a correctly predicted instance in the training set, we wish to generate a counterfactual explanation for this instance, that is, the minimal perturbation of this instance such that the prediction flips to the opposite class. Most existing methods for counterfactual explanations are (1) model-agnostic, so they do not take into account the structure of the original model, and/or (2) involve building a surrogate model on top of the original model, which is not guaranteed to represent the original model accurately. There exists a method specifically for random forests; we wish to extend this method for GBDTs. This involves accounting for (1) the sequential dependency between trees and (2) training on the negative gradients instead of the original labels.
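The notion of a counterfactual explanation as a minimal perturbation that flips the prediction can be illustrated on a toy ensemble. The sketch below is a brute-force search over a hand-written set of decision stumps whose outputs are summed, loosely mimicking how boosted trees combine; it is not the method proposed in the paper, and the stumps, the `margin`, and the L1 distance are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical ensemble: (feature_index, threshold, value_if_leq, value_if_gt).
STUMPS = [
    (0, 2.0, -0.6, 0.4),
    (1, 5.0, -0.3, 0.5),
    (0, 4.0, -0.2, 0.3),
]


def score(x):
    # Additive ensemble: each stump contributes one of its two leaf values.
    return sum(left if x[f] <= t else right for f, t, left, right in STUMPS)


def predict(x):
    return 1 if score(x) > 0 else 0


def counterfactual(x, margin=0.01):
    # Only values at or just above a split threshold can change a stump's output,
    # so those are the only candidate values tried for each feature.
    original = predict(x)
    candidates = []
    for f in range(len(x)):
        values = {x[f]}
        for feature, threshold, _, _ in STUMPS:
            if feature == f:
                values.update((threshold, threshold + margin))
        candidates.append(sorted(values))

    best, best_cost = None, float("inf")
    for point in product(*candidates):
        if predict(point) != original:
            cost = sum(abs(a - b) for a, b in zip(point, x))  # L1 distance
            if cost < best_cost:
                best, best_cost = list(point), cost
    return best, best_cost


instance = [1.0, 4.0]
best, best_cost = counterfactual(instance)
print("prediction:", predict(instance))
print("counterfactual:", best, "with L1 cost", best_cost)
```

Exhaustive search like this grows exponentially with the number of features and splits, which is one reason methods that exploit the structure of the ensemble are of interest.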

machine learning, natural language, prediction, (19 more...)

1907.02582

Country:

Europe > Netherlands (0.16)
Europe > France (0.16)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.70)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.54)
