AITopics | Lukasiewicz, Thomas

Collaborating Authors

Lukasiewicz, Thomas

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Improving Language Models Meaning Understanding and Consistency by Learning Conceptual Roles from Dictionary

Jang, Myeongjun Erik, Lukasiewicz, Thomas

arXiv.org Artificial IntelligenceOct-24-2023

The non-humanlike behaviour of contemporary pre-trained language models (PLMs) is a leading cause undermining their trustworthiness. A striking phenomenon of such faulty behaviours is the generation of inconsistent predictions, which produces logically contradictory results, such as generating different predictions for texts delivering the same meaning or violating logical properties. Previous studies exploited data augmentation or implemented specialised loss functions to alleviate the issue. However, their usage is limited, because they consume expensive training resources for large-sized PLMs and can only handle a certain consistency type. To this end, we propose a practical approach that alleviates the inconsistent behaviour issue by fundamentally improving PLMs' meaning awareness. Based on the conceptual role theory, our method allows PLMs to capture accurate meaning by learning precise interrelationships between concepts from word-definition pairs in a dictionary. Next, we propose an efficient parameter integration technique that updates only a few additional parameters to combine the learned interrelationship with PLMs' pre-trained knowledge. Our experimental results reveal that the approach can concurrently improve multiple types of consistency, enables efficient knowledge integration, and easily applies to other languages.

computational linguistic, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2310.15541

Country:

Europe (1.00)
North America (0.68)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Brain-Inspired Computational Intelligence via Predictive Coding

Salvatori, Tommaso, Mali, Ankur, Buckley, Christopher L., Lukasiewicz, Thomas, Rao, Rajesh P. N., Friston, Karl, Ororbia, Alexander

arXiv.org Artificial IntelligenceAug-15-2023

Artificial intelligence (AI) is rapidly becoming one of the key technologies of this century. The majority of results in AI thus far have been achieved using deep neural networks trained with the error backpropagation learning algorithm. However, the ubiquitous adoption of this approach has highlighted some important limitations such as substantial computational cost, difficulty in quantifying uncertainty, lack of robustness, unreliability, and biological implausibility. It is possible that addressing these limitations may require schemes that are inspired and guided by neuroscience theories. One such theory, called predictive coding (PC), has shown promising performance in machine intelligence tasks, exhibiting exciting properties that make it potentially valuable for the machine learning community: PC can model information processing in different brain areas, can be used in cognitive control and robotics, and has a solid mathematical grounding in variational inference, offering a powerful inversion scheme for a specific class of continuous-state generative models. With the hope of foregrounding research in this direction, we survey the literature that has contributed to this perspective, highlighting the many ways that PC might play a role in the future of machine learning and computational intelligence at large.

artificial intelligence, arxiv preprint arxiv, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2308.0787

Country:

Europe (1.00)
North America > United States > Washington > King County > Seattle (0.27)
North America > United States > Florida (0.27)

Genre:

Research Report (1.00)
Overview (1.00)
Instructional Material (0.67)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback

NP-SemiSeg: When Neural Processes meet Semi-Supervised Semantic Segmentation

Wang, Jianfeng, Massiceti, Daniela, Hu, Xiaolin, Pavlovic, Vladimir, Lukasiewicz, Thomas

arXiv.org Artificial IntelligenceAug-5-2023

Semi-supervised semantic segmentation involves assigning pixel-wise labels to unlabeled images at training time. This is useful in a wide range of real-world applications where collecting pixel-wise labels is not feasible in time or cost. Current approaches to semi-supervised semantic segmentation work by predicting pseudo-labels for each pixel from a class-wise probability distribution output by a model. If the predicted probability distribution is incorrect, however, this leads to poor segmentation results, which can have knock-on consequences in safety critical systems, like medical images or self-driving cars. It is, therefore, important to understand what a model does not know, which is mainly achieved by uncertainty quantification. Recently, neural processes (NPs) have been explored in semi-supervised image classification, and they have been a computationally efficient and effective method for uncertainty quantification. In this work, we move one step forward by adapting NPs to semi-supervised semantic segmentation, resulting in a new model called NP-SemiSeg. We experimentally evaluated NP-SemiSeg on the public benchmarks PASCAL VOC 2012 and Cityscapes, with different training settings, and the results verify its effectiveness.

artificial intelligence, machine learning, np-semiseg, (15 more...)

arXiv.org Artificial Intelligence

2308.02866

Country:

North America > United States (0.46)
Europe (0.46)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.49)
Transportation > Ground > Road (0.34)
Information Technology > Robotics & Automation (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Mathematical Capabilities of ChatGPT

Frieder, Simon, Pinchetti, Luca, Chevalier, Alexis, Griffiths, Ryan-Rhys, Salvatori, Tommaso, Lukasiewicz, Thomas, Petersen, Philipp Christian, Berner, Julius

arXiv.org Artificial IntelligenceJul-20-2023

We investigate the mathematical capabilities of two iterations of ChatGPT (released 9-January-2023 and 30-January-2023) and of GPT-4 by testing them on publicly available datasets, as well as hand-crafted ones, using a novel methodology. In contrast to formal mathematics, where large databases of formal proofs are available (e.g., the Lean Mathematical Library), current datasets of natural-language mathematics, used to benchmark language models, either cover only elementary mathematics or are very small. We address this by publicly releasing two new datasets: GHOSTS and miniGHOSTS. These are the first natural-language datasets curated by working researchers in mathematics that (1) aim to cover graduate-level mathematics, (2) provide a holistic overview of the mathematical capabilities of language models, and (3) distinguish multiple dimensions of mathematical reasoning. These datasets also test whether ChatGPT and GPT-4 can be helpful assistants to professional mathematicians by emulating use cases that arise in the daily professional activities of mathematicians. We benchmark the models on a range of fine-grained performance metrics. For advanced mathematics, this is the most detailed evaluation effort to date. We find that ChatGPT can be used most successfully as a mathematical assistant for querying facts, acting as a mathematical search engine and knowledge base interface. GPT-4 can additionally be used for undergraduate-level mathematics but fails on graduate-level difficulty. Contrary to many positive reports in the media about GPT-4 and ChatGPT's exam-solving abilities (a potential case of selection bias), their overall mathematical performance is well below the level of a graduate student. Hence, if your goal is to use ChatGPT to pass a graduate-level math exam, you would be better off copying from your average peer!

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2301.13867

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.27)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Information Technology (0.92)
Education > Educational Setting > Higher Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Faithfulness Tests for Natural Language Explanations

Atanasova, Pepa, Camburu, Oana-Maria, Lioma, Christina, Lukasiewicz, Thomas, Simonsen, Jakob Grue, Augenstein, Isabelle

arXiv.org Artificial IntelligenceJun-30-2023

Explanations of neural models aim to reveal a model's decision-making process for its predictions. However, recent work shows that current methods giving explanations such as saliency maps or counterfactuals can be misleading, as they are prone to present reasons that are unfaithful to the model's inner workings. This work explores the challenging question of evaluating the faithfulness of natural language explanations (NLEs). To this end, we present two tests. First, we propose a counterfactual input editor for inserting reasons that lead to counterfactual predictions but are not reflected by the NLEs. Second, we reconstruct inputs from the reasons stated in the generated NLEs and check how often they lead to the same predictions. Our tests can evaluate emerging NLE models, proving a fundamental tool in the development of faithful NLEs.

machine learning, natural language, prediction, (15 more...)

arXiv.org Artificial Intelligence

2305.18029

Country: Europe (1.00)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.46)

Add feedback

Causal Inference via Predictive Coding

Salvatori, Tommaso, Pinchetti, Luca, M'Charrak, Amine, Millidge, Beren, Lukasiewicz, Thomas

arXiv.org Artificial IntelligenceJun-27-2023

Bayesian and causal inference are fundamental processes for intelligence. Bayesian inference models observations: what can be inferred about y if we observe a related variable x? Causal inference models interventions: if we directly change x, how will y change? Predictive coding is a neuroscience-inspired method for performing Bayesian inference on continuous state variables using local information only. In this work, we go beyond Bayesian inference, and show how a simple change in the inference process of predictive coding enables interventional and counterfactual inference in scenarios where the causal graph is known. We then extend our results, and show how predictive coding can be generalized to cases where this graph is unknown, and has to be inferred from data, hence performing causal discovery. What results is a novel and straightforward technique that allows us to perform end-to-end causal inference on predictive-coding-based structural causal models, and demonstrate its utility for potential applications in machine learning.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2306.15479

Country:

Europe (0.46)
North America > United States > California (0.45)

Genre:

Research Report > New Finding (0.66)
Research Report > Promising Solution (0.46)

Industry:

Law > Litigation (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Add feedback

NP-Match: Towards a New Probabilistic Model for Semi-Supervised Learning

Wang, Jianfeng, Hu, Xiaolin, Lukasiewicz, Thomas

arXiv.org Artificial IntelligenceJun-25-2023

Semi-supervised learning (SSL) has been widely explored in recent years, and it is an effective way of leveraging unlabeled data to reduce the reliance on labeled data. In this work, we adjust neural processes (NPs) to the semi-supervised image classification task, resulting in a new method named NP-Match. NP-Match is suited to this task for two reasons. Firstly, NP-Match implicitly compares data points when making predictions, and as a result, the prediction of each unlabeled data point is affected by the labeled data points that are similar to it, which improves the quality of pseudo-labels. Secondly, NP-Match is able to estimate uncertainty that can be used as a tool for selecting unlabeled samples with reliable pseudo-labels. Compared with uncertainty-based SSL methods implemented with Monte-Carlo (MC) dropout, NP-Match estimates uncertainty with much less computational overhead, which can save time at both the training and the testing phases. We conducted extensive experiments on five public datasets under three semi-supervised image classification settings, namely, the standard semi-supervised image classification, the imbalanced semi-supervised image classification, and the multi-label semi-supervised image classification, and NP-Match outperforms state-of-the-art (SOTA) approaches or achieves competitive results on them, which shows the effectiveness of NP-Match and its potential for SSL. The codes are at https://github.com/Jianf-Wang/NP-Match

artificial intelligence, machine learning, proc, (17 more...)

arXiv.org Artificial Intelligence

2301.13569

Country: Europe (0.28)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

An Empirical Analysis of Parameter-Efficient Methods for Debiasing Pre-Trained Language Models

Xie, Zhongbin, Lukasiewicz, Thomas

arXiv.org Artificial IntelligenceJun-6-2023

The increasingly large size of modern pretrained language models not only makes them inherit more human-like biases from the training corpora, but also makes it computationally expensive to mitigate such biases. In this paper, we investigate recent parameter-efficient methods in combination with counterfactual data augmentation (CDA) for bias mitigation. We conduct extensive experiments with prefix tuning, prompt tuning, and adapter tuning on different language models and bias types to evaluate their debiasing performance and abilities to preserve the internal knowledge of a pre-trained model. We find that the parameter-efficient methods (i) are effective in mitigating gender bias, where adapter tuning is consistently the most effective one and prompt tuning is more suitable for GPT-2 than BERT, (ii) are less effective when it comes to racial and religious bias, which may be attributed to the limitations of CDA, and (iii) can perform similarly to or sometimes better than full fine-tuning with improved time and memory efficiency, as well as maintain the internal knowledge in BERT and GPT-2, evaluated via fact retrieval and downstream fine-tuning.

artificial intelligence, computational linguistic, natural language, (15 more...)

arXiv.org Artificial Intelligence

2306.04067

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.58)

Add feedback

KNOW How to Make Up Your Mind! Adversarially Detecting and Alleviating Inconsistencies in Natural Language Explanations

Jang, Myeongjun, Majumder, Bodhisattwa Prasad, McAuley, Julian, Lukasiewicz, Thomas, Camburu, Oana-Maria

arXiv.org Artificial IntelligenceJun-5-2023

While recent works have been considerably improving the quality of the natural language explanations (NLEs) generated by a model to justify its predictions, there is very limited research in detecting and alleviating inconsistencies among generated NLEs. In this work, we leverage external knowledge bases to significantly improve on an existing adversarial attack for detecting inconsistent NLEs. We apply our attack to high-performing NLE models and show that models with higher NLE quality do not necessarily generate fewer inconsistencies. Moreover, we propose an off-the-shelf mitigation method to alleviate inconsistencies by grounding the model into external background knowledge. Our method decreases the inconsistencies of previous high-performing NLE models as detected by our attack.

machine learning, natural language, xplanation, (20 more...)

arXiv.org Artificial Intelligence

2306.0298

Country: North America > United States (0.48)

Genre: Research Report (1.00)

Industry:

Transportation (0.69)
Government (0.50)
Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.34)

Add feedback

Machine Learning with Requirements: a Manifesto

Giunchiglia, Eleonora, Imrie, Fergus, van der Schaar, Mihaela, Lukasiewicz, Thomas

arXiv.org Artificial IntelligenceApr-7-2023

In the recent years, machine learning has made great advancements that have been at the root of many breakthroughs in different application domains. However, it is still an open issue how make them applicable to high-stakes or safety-critical application domains, as they can often be brittle and unreliable. In this paper, we argue that requirements definition and satisfaction can go a long way to make machine learning models even more fitting to the real world, especially in critical domains. To this end, we present two problems in which (i) requirements arise naturally, (ii) machine learning models are or can be fruitfully deployed, and (iii) neglecting the requirements can have dramatic consequences. We show how the requirements specification can be fruitfully integrated into the standard machine learning development pipeline, proposing a novel pyramid development process in which requirements definition may impact all the subsequent phases in the pipeline, and viceversa.

artificial intelligence, machine learning, requirement, (14 more...)

arXiv.org Artificial Intelligence

2304.03674

Country:

North America > United States (1.00)
Europe > United Kingdom > England (0.68)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.95)
Transportation > Ground > Road (0.69)
Information Technology (0.68)
Health & Medicine > Diagnostic Medicine > Imaging (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.69)

Add feedback