AITopics | van Krieken, Emile

Collaborating Authors

van Krieken, Emile

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Noise to the Rescue: Escaping Local Minima in Neurosymbolic Local Search

Daniele, Alessandro, van Krieken, Emile

arXiv.org Artificial IntelligenceMar-3-2025

Deep learning has achieved remarkable success across various domains, largely thanks to the efficiency of backpropagation (BP). However, BP's reliance on differentiability poses challenges in neurosymbolic learning, where discrete computation is combined with neural models. We show that applying BP to Godel logic, which represents conjunction and disjunction as min and max, is equivalent to a local search algorithm for SAT solving, enabling the optimisation of discrete Boolean formulas without sacrificing differentiability. However, deterministic local search algorithms get stuck in local optima. Therefore, we propose the Godel Trick, which adds noise to the model's logits to escape local optima. We evaluate the Godel Trick on SATLIB, and demonstrate its ability to solve a broad range of SAT problems. Additionally, we apply it to neurosymbolic models and achieve state-of-the-art performance on Visual Sudoku, all while avoiding expensive probabilistic reasoning. These results highlight the Godel Trick's potential as a robust, scalable approach for integrating symbolic reasoning with neural architectures.

artificial intelligence, fuzzy logic, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2503.01817

Country:

North America > Canada (0.14)
Europe > United Kingdom (0.14)
Europe > Italy (0.14)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Self-Training Large Language Models for Tool-Use Without Demonstrations

Luo, Ne, Gema, Aryo Pradipta, He, Xuanli, van Krieken, Emile, Lesci, Pietro, Minervini, Pasquale

arXiv.org Artificial IntelligenceFeb-9-2025

Large language models (LLMs) remain prone to factual inaccuracies and computational errors, including hallucinations and mistakes in mathematical reasoning. Recent work augmented LLMs with tools to mitigate these shortcomings, but often requires curated gold tool-use demonstrations. In this paper, we investigate whether LLMs can learn to use tools without demonstrations. First, we analyse zero-shot prompting strategies to guide LLMs in tool utilisation. Second, we propose a self-training method to synthesise tool-use traces using the LLM itself. We compare supervised fine-tuning and preference fine-tuning techniques for fine-tuning the model on datasets constructed using existing Question Answering (QA) datasets, i.e., TriviaQA and GSM8K. Experiments show that tool-use enhances performance on a long-tail knowledge task: 3.7% on PopQA, which is used solely for evaluation, but leads to mixed results on other datasets, i.e., TriviaQA, GSM8K, and NQ-Open. Our findings highlight the potential and challenges of integrating external tools into LLMs without demonstrations.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2502.05867

Country:

Europe (0.67)
North America > Canada (0.28)
North America > United States (0.28)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Mixtures of In-Context Learners

Hong, Giwon, van Krieken, Emile, Ponti, Edoardo, Malkin, Nikolay, Minervini, Pasquale

arXiv.org Artificial IntelligenceNov-5-2024

In-context learning (ICL) adapts LLMs by providing demonstrations without fine-tuning the model parameters; however, it does not differentiate between demonstrations and quadratically increases the complexity of Transformer LLMs, exhausting the memory. As a solution, we propose Mixtures of In-Context Learners (MoICL), a novel approach to treat subsets of demonstrations as experts and learn a weighting function to merge their output distributions based on a training set. In our experiments, we show performance improvements on 5 out of 7 classification datasets compared to a set of strong baselines (up to +13\% compared to ICL and LENS). Moreover, we enhance the Pareto frontier of ICL by reducing the inference time needed to achieve the same performance with fewer demonstrations. Finally, MoICL is more robust to out-of-domain (up to +11\%), imbalanced (up to +49\%), or noisy demonstrations (up to +38\%) or can filter these out from datasets. Overall, MoICL is a more expressive approach to learning from demonstrations without exhausting the context window or memory.

demonstration, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2411.0283

Country:

Europe (0.68)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Are We Done with MMLU?

Gema, Aryo Pradipta, Leang, Joshua Ong Jun, Hong, Giwon, Devoto, Alessio, Mancino, Alberto Carlo Maria, Saxena, Rohit, He, Xuanli, Zhao, Yu, Du, Xiaotang, Madani, Mohammad Reza Ghasemi, Barale, Claire, McHardy, Robert, Harris, Joshua, Kaddour, Jean, van Krieken, Emile, Minervini, Pasquale

arXiv.org Artificial IntelligenceJun-7-2024

We identify and analyse errors in the popular Massive Multitask Language Understanding (MMLU) benchmark. Even though MMLU is widely adopted, our analysis demonstrates numerous ground truth errors that obscure the true capabilities of LLMs. For example, we find that 57% of the analysed questions in the Virology subset contain errors. To address this issue, we introduce a comprehensive framework for identifying dataset errors using a novel error taxonomy. Then, we create MMLU-Redux, which is a subset of 3,000 manually re-annotated questions across 30 MMLU subjects. Using MMLU-Redux, we demonstrate significant discrepancies with the model performance metrics that were originally reported. Our results strongly advocate for revising MMLU's error-ridden questions to enhance its future utility and reliability as a benchmark.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2406.04127

Country:

Asia (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Therapeutic Area > Immunology (0.68)
Education > Curriculum > Subject-Specific Education (0.48)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

On the Independence Assumption in Neurosymbolic Learning

van Krieken, Emile, Minervini, Pasquale, Ponti, Edoardo M., Vergari, Antonio

arXiv.org Machine LearningApr-12-2024

State-of-the-art neurosymbolic learning systems use probabilistic reasoning to guide neural networks towards predictions that conform to logical constraints over symbols. Many such systems assume that the probabilities of the considered symbols are conditionally independent given the input to simplify learning and reasoning. We study and criticise this assumption, highlighting how it can hinder optimisation and prevent uncertainty quantification. We prove that loss functions bias conditionally independent neural networks to become overconfident in their predictions. As a result, they are unable to represent uncertainty over multiple valid options. Furthermore, we prove that these loss functions are difficult to optimise: they are non-convex, and their minima are usually highly disconnected. Our theoretical analysis gives the foundation for replacing the conditional independence assumption and designing more expressive neurosymbolic probabilistic models.

artificial intelligence, implicant, machine learning, (14 more...)

arXiv.org Machine Learning

2404.08458

Country:

North America > United States > New York (0.14)
North America > United States > Hawaii (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Optimisation in Neurosymbolic Learning Systems

van Krieken, Emile

arXiv.org Artificial IntelligenceJan-19-2024

Neurosymbolic AI aims to integrate deep learning with symbolic AI. This integration has many promises, such as decreasing the amount of data required to train a neural network, improving the explainability and interpretability of answers given by models and verifying the correctness of trained systems. We study neurosymbolic learning, where we have both data and background knowledge expressed using symbolic languages. How do we connect the symbolic and neural components to communicate this knowledge? One option is fuzzy reasoning, which studies degrees of truth. For example, being tall is not a binary concept. Instead, probabilistic reasoning studies the probability that something is true or will happen. Our first research question studies how different forms of fuzzy reasoning combine with learning. We find surprising results like a connection to the Raven paradox stating we confirm "ravens are black" when we observe a green apple. In this study, we did not use the background knowledge when we deployed our models after training. In our second research question, we studied how to use background knowledge in deployed models. We developed a new neural network layer based on fuzzy reasoning. Probabilistic reasoning is a natural fit for neural networks, which we usually train to be probabilistic. However, they are expensive to compute and do not scale well to large tasks. In our third research question, we study how to connect probabilistic reasoning with neural networks by sampling to estimate averages, while in the final research question, we study scaling probabilistic neurosymbolic learning to much larger problems than before. Our insight is to train a neural network with synthetic data to predict the result of probabilistic reasoning.

artificial intelligence, differentiable fuzzy logic operator, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.5463/thesis.466

2401.10819

Country:

North America > United States > California (0.14)
North America > United States > Maryland (0.13)
Europe > Austria > Vienna (0.13)
Asia > Middle East > Israel (0.13)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation (0.67)
Leisure & Entertainment (0.45)
Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.45)

Add feedback

GRAPES: Learning to Sample Graphs for Scalable Graph Neural Networks

Younesian, Taraneh, Thanapalasingam, Thiviyan, van Krieken, Emile, Daza, Daniel, Bloem, Peter

arXiv.org Artificial IntelligenceOct-5-2023

Graph neural networks (GNNs) learn the representation of nodes in a graph by aggregating the neighborhood information in various ways. As these networks grow in depth, their receptive field grows exponentially due to the increase in neighborhood sizes, resulting in high memory costs. Graph sampling solves memory issues in GNNs by sampling a small ratio of the nodes in the graph. This way, GNNs can scale to much larger graphs. Most sampling methods focus on fixed sampling heuristics, which may not generalize to different structures or tasks. We introduce GRAPES, an adaptive graph sampling method that learns to identify sets of influential nodes for training a GNN classifier. GRAPES uses a GFlowNet to learn node sampling probabilities given the classification objectives. We evaluate GRAPES across several small- and large-scale graph benchmarks and demonstrate its effectiveness in accuracy and scalability. In contrast to existing sampling methods, GRAPES maintains high accuracy even with small sample sizes and, therefore, can scale to very large graphs. Our code is publicly available at https://github.com/dfdazac/grapes.

learning, sample graph, scalable graph neural network, (1 more...)

arXiv.org Artificial Intelligence

2310.03399

Genre: Research Report (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.60)

Add feedback

A-NeSI: A Scalable Approximate Method for Probabilistic Neurosymbolic Inference

van Krieken, Emile, Thanapalasingam, Thiviyan, Tomczak, Jakub M., van Harmelen, Frank, Teije, Annette ten

arXiv.org Machine LearningSep-22-2023

We study the problem of combining neural networks with symbolic reasoning. Recently introduced frameworks for Probabilistic Neurosymbolic Learning (PNL), such as DeepProbLog, perform exponential-time exact inference, limiting the scalability of PNL solutions. We introduce Approximate Neurosymbolic Inference (A-NeSI): a new framework for PNL that uses neural networks for scalable approximate inference. A-NeSI 1) performs approximate inference in polynomial time without changing the semantics of probabilistic logics; 2) is trained using data generated by the background knowledge; 3) can generate symbolic explanations of predictions; and 4) can guarantee the satisfaction of logical constraints at test time, which is vital in safety-critical applications. Our experiments show that A-NeSI is the first end-to-end method to solve three neurosymbolic tasks with exponential combinatorial scaling. Finally, our experiments show that A-NeSI achieves explainability and safety without a penalty in performance.

artificial intelligence, latexit sha1, machine learning, (17 more...)

arXiv.org Machine Learning

2212.12393

Country:

North America > Canada (0.28)
Europe > Austria (0.27)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

IntelliGraphs: Datasets for Benchmarking Knowledge Graph Generation

Thanapalasingam, Thiviyan, van Krieken, Emile, Bloem, Peter, Groth, Paul

arXiv.org Artificial IntelligenceAug-25-2023

Knowledge Graph Embedding (KGE) models are used to learn continuous representations of entities and relations. A key task in the literature is predicting missing links between entities. However, Knowledge Graphs are not just sets of links but also have semantics underlying their structure. Semantics is crucial in several downstream tasks, such as query answering or reasoning. We introduce the subgraph inference task, where a model has to generate likely and semantically valid subgraphs. We propose IntelliGraphs, a set of five new Knowledge Graph datasets. The IntelliGraphs datasets contain subgraphs with semantics expressed in logical rules for evaluating subgraph inference. We also present the dataset generator that produced the synthetic datasets. We designed four novel baseline models, which include three models based on traditional KGEs. We evaluate their expressiveness and show that these models cannot capture the semantics. We believe this benchmark will encourage the development of machine learning models that emphasize semantic understanding.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2307.06698

Country: Europe > Netherlands (0.46)

Genre: Research Report (0.82)

Industry: Media > Film (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Prompting as Probing: Using Language Models for Knowledge Base Construction

Alivanistos, Dimitrios, Santamaría, Selene Báez, Cochez, Michael, Kalo, Jan-Christoph, van Krieken, Emile, Thanapalasingam, Thiviyan

arXiv.org Artificial IntelligenceJun-19-2023

Language Models (LMs) have proven to be useful in various downstream applications, such as summarisation, translation, question answering and text classification. LMs are becoming increasingly important tools in Artificial Intelligence, because of the vast quantity of information they can store. In this work, we present ProP (Prompting as Probing), which utilizes GPT-3, a large Language Model originally proposed by OpenAI in 2020, to perform the task of Knowledge Base Construction (KBC). ProP implements a multi-step approach that combines a variety of prompting techniques to achieve this. Our results show that manual prompt curation is essential, that the LM must be encouraged to give answer sets of variable lengths, in particular including empty answer sets, that true/false questions are a useful device to increase precision on suggestions generated by the LM, that the size of the LM is a crucial factor, and that a dictionary of entity aliases improves the LM score. Our evaluation study indicates that these proposed techniques can substantially enhance the quality of the final predictions: ProP won track 2 of the LM-KBC competition, outperforming the baseline by 36.4 percentage points.

artificial intelligence, expert system, natural language, (17 more...)

arXiv.org Artificial Intelligence

2208.11057

Country:

North America > United States (1.00)
Europe > Spain (1.00)
Asia > Middle East (1.00)
(2 more...)

Genre: Research Report > New Finding (0.86)

Industry:

Media (1.00)
Automobiles & Trucks > Manufacturer (0.93)
Government (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.76)

Add feedback