AITopics | Touileb, Samia

Touileb, Samia

Benchmarking Abstractive Summarisation: A Dataset of Human-authored Summaries of Norwegian News Articles

Touileb, Samia, Mikhailov, Vladislav, Kroka, Marie, Øvrelid, Lilja, Velldal, Erik

arXiv.org Artificial Intelligence, Jan-13-2025

We introduce a dataset of high-quality human-authored summaries of news articles in Norwegian. The dataset is intended for benchmarking the abstractive summarisation capabilities of generative language models. Each document in the dataset is provided with three different candidate gold-standard summaries written by native Norwegian speakers, and all summaries are provided in both of the written variants of Norwegian: Bokmål and Nynorsk. The paper describes the data creation effort and evaluates existing open LLMs for Norwegian on the dataset. We also provide insights from a manual human evaluation, comparing human-authored to model-generated summaries. Our results indicate that the dataset provides a challenging LLM benchmark for Norwegian summarisation capabilities.

annotator, large language model, natural language, (20 more...)

2501.07718

Country: Europe > Norway > Eastern Norway (0.14)

Genre:

Research Report (0.70)
Overview (0.68)

Industry: Media > News (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)

Learning Horn Envelopes via Queries from Large Language Models

Blum, Sophie, Koudijs, Raoul, Ozaki, Ana, Touileb, Samia

arXiv.org Artificial Intelligence, Sep-13-2023

We investigate an approach for extracting knowledge from trained neural networks based on Angluin's exact learning model with membership and equivalence queries to an oracle. In this approach, the oracle is a trained neural network. We consider Angluin's classical algorithm for learning Horn theories and study the necessary changes to make it applicable to learn from neural networks. In particular, we have to consider that trained neural networks may not behave as Horn oracles, meaning that their underlying target theory may not be Horn. We propose a new algorithm that aims at extracting the "tightest Horn approximation" of the target theory and that is guaranteed to terminate in exponential time (in the worst case) and in polynomial time if the target has polynomially many non-Horn examples. To showcase the applicability of the approach, we perform experiments on pre-trained language models and extract rules that expose occupation-based gender biases.

artificial intelligence, machine learning, natural language, (19 more...)

2305.12143

Country:

Europe (0.93)
North America > United States > Louisiana (0.14)
North America > United States > Arizona (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

JSEEGraph: Joint Structured Event Extraction as Graph Parsing

You, Huiling, Touileb, Samia, Øvrelid, Lilja

arXiv.org Artificial Intelligence, Jun-26-2023

We propose a graph-based event extraction framework, JSEEGraph, that approaches the task of event extraction as general graph parsing in the tradition of Meaning Representation Parsing. It explicitly encodes entities and events in a single semantic graph, and further has the flexibility to encode a wider range of additional IE relations and jointly infer individual tasks. JSEEGraph performs in an end-to-end manner via general graph parsing: (1) instead of flat sequence labelling, nested structures between entities/triggers are efficiently encoded as separate nodes in the graph, allowing for nested and overlapping entities and triggers; (2) entities, relations, and events can all be encoded in the same graph, where entities and event triggers are represented as nodes and entity relations and event arguments are constructed via edges; (3) joint inference avoids error propagation and enhances the interpolation of different IE tasks. We experiment on two benchmark datasets of varying structural complexity, ACE05 and Rich ERE, covering three languages: English, Chinese, and Spanish. Experimental results show that JSEEGraph can handle nested event structures, that it is beneficial to solve different IE tasks jointly, and that event argument extraction in particular benefits from entity extraction. Our code and models are released as open-source.

artificial intelligence, extraction, natural language, (16 more...)

2306.14633

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California (0.14)
North America > United States > Colorado (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

NorBench -- A Benchmark for Norwegian Language Models

Samuel, David, Kutuzov, Andrey, Touileb, Samia, Velldal, Erik, Øvrelid, Lilja, Rønningstad, Egil, Sigdel, Elina, Palatkina, Anna

arXiv.org Artificial Intelligence, May-5-2023

We present NorBench: a streamlined suite of NLP tasks and probes for evaluating Norwegian language models (LMs) on standardized data splits and evaluation metrics. We also introduce a range of new Norwegian language models (both encoder and encoder-decoder based). Finally, we compare and analyze their performance, along with other existing LMs, across the different benchmark tests of NorBench.

artificial intelligence, natural language, norbert 3, (18 more...)

2305.0388

Country:

Europe > Sweden (0.28)
North America > United States > Minnesota (0.28)
Asia > Middle East (0.28)
Asia > Japan (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Measuring Normative and Descriptive Biases in Language Models Using Census Data

Touileb, Samia, Øvrelid, Lilja, Velldal, Erik

arXiv.org Artificial Intelligence, Apr-12-2023

In this paper, we investigate how distributions of occupations with respect to gender are reflected in pre-trained language models. Such distributions are not always aligned with normative ideals, nor do they necessarily reflect a descriptive assessment of reality. We introduce an approach for measuring to what degree pre-trained language models are aligned with normative and descriptive occupational distributions. To this end, we use official demographic information about gender–occupation distributions provided by the national statistics agencies of France, Norway, the United Kingdom, and the United States. We manually generate template-based sentences combining gendered pronouns and nouns with occupations, and subsequently probe a selection of ten language models covering the English, French, and Norwegian languages. The scoring system we introduce in this work is language independent, and can be used on any combination of template-based sentences, occupations, and languages. The approach could also be extended to other dimensions of national census data and other demographic variables.

artificial intelligence, natural language, occupation, (15 more...)

2304.05764

Country:

Europe > France (0.55)
Europe > Norway (0.35)
North America > United States (0.35)
Europe > United Kingdom (0.34)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Measuring Harmful Representations in Scandinavian Language Models

Touileb, Samia, Nozza, Debora

arXiv.org Artificial Intelligence, Nov-21-2022

Scandinavian countries are perceived as role models when it comes to gender equality. With the advent of pre-trained language models and their widespread usage, we investigate to what extent gender-based harmful and toxic content exists in selected Scandinavian language models. We examine nine models, covering Danish, Swedish, and Norwegian, by manually creating template-based sentences and probing the models for completion. We evaluate the completions using two methods for measuring harmful and toxic completions and provide a thorough analysis of the results. We show that Scandinavian pre-trained language models contain harmful and gender-based stereotypes with similar values across all languages. This finding goes against the general expectations related to gender equality in Scandinavian countries and shows the possible problematic outcomes of using such models in real-world settings.

artificial intelligence, computational linguistic, natural language, (17 more...)

2211.11678

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre:

Overview (0.46)
Research Report (0.40)

Industry: Law > Civil Rights & Constitutional Law (0.70)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)