Spitz, Andreas
Only a Little to the Left: A Theory-grounded Measure of Political Bias in Large Language Models
Faulborn, Mats, Sen, Indira, Pellert, Max, Spitz, Andreas, Garcia, David
Prompt-based language models like GPT4 and LLaMa have been used for a wide variety of use cases, such as simulating agents, searching for information, or content analysis. For all of these applications and others, political biases in these models can affect their performance. Several researchers have attempted to study political bias in language models using evaluation suites based on surveys, such as the Political Compass Test (PCT), often finding a particular leaning favored by these models. However, there is some variation in the exact prompting techniques, leading to diverging findings, and most research relies on constrained-answer settings to extract model responses. Moreover, the Political Compass Test is not a scientifically valid survey instrument. In this work, we contribute a political bias measure informed by political science theory, building on survey design principles to test a wide variety of input prompts while taking into account prompt sensitivity. We then prompt 11 different open and commercial models, differentiating between instruction-tuned and non-instruction-tuned models, and automatically classify their political stances from 88,110 responses. Leveraging this dataset, we compute political bias profiles across different prompt variations and find that while the PCT exaggerates bias in certain models like GPT3.5, measures of political bias are often unstable, but generally more left-leaning for instruction-tuned models.
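The aggregation step described above, turning classified stances into a leaning profile, can be sketched as follows. This is a minimal illustration under assumed conventions, not the paper's actual scoring scheme: each statement carries a polarity indicating whether agreement signals a left- or right-leaning position, and classified responses are averaged against it.

```python
# Minimal sketch (assumed scheme, not the paper's): aggregating classified
# stances on politically polarized statements into a single leaning score.

# Polarity per statement: +1 if agreeing indicates a right-leaning stance,
# -1 if agreeing indicates a left-leaning stance. Statement IDs are made up.
STATEMENTS = {"s1": -1, "s2": +1, "s3": -1}

# Stance classification of a model's responses: +1 agree, -1 disagree, 0 neutral.
responses = {"s1": +1, "s2": -1, "s3": 0}

def leaning_score(responses, statements):
    """Mean of stance * polarity: negative = left-leaning, positive = right-leaning."""
    scores = [responses[s] * pol for s, pol in statements.items()]
    return sum(scores) / len(scores)

print(leaning_score(responses, STATEMENTS))  # negative here, i.e. left-leaning
```

Repeating this over many prompt variations, as the paper does across 88,110 responses, yields a distribution of scores per model rather than a single point estimate.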
R.U.Psycho? Robust Unified Psychometric Testing of Language Models
Schelb, Julian, Borin, Orr, Garcia, David, Spitz, Andreas
Generative language models are increasingly being subjected to psychometric questionnaires intended for human testing, in efforts to establish their traits, as benchmarks for alignment, or to simulate participants in social science experiments. While this growing body of work sheds light on the likeness of model responses to those of humans, concerns are warranted regarding the rigour and reproducibility with which these experiments may be conducted. Instabilities in model outputs, sensitivity to prompt design, parameter settings, and a large number of available model versions increase documentation requirements. Consequently, generalization of findings is often complex and reproducibility is far from guaranteed. In this paper, we present R.U.Psycho, a framework for designing and running robust and reproducible psychometric experiments on generative language models that requires limited coding expertise. We demonstrate the capability of our framework on a variety of psychometric questionnaires, which lend support to prior findings in the literature. R.U.Psycho is available as a Python package at https://github.com/julianschelb/rupsycho.
Quantifying the Risks of Tool-assisted Rephrasing to Linguistic Diversity
Wang, Mengying, Spitz, Andreas
Writing assistants and large language models see widespread use in the creation of text content. While their effectiveness for individual users has been evaluated in the literature, little is known about their proclivity to change language or reduce its richness when adopted by a large user base. In this paper, we take a first step towards quantifying this risk by measuring the semantic and vocabulary change enacted by the use of rephrasing tools on a multi-domain corpus of human-generated text.
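One simple proxy for the vocabulary change described above is lexical diversity before and after rephrasing. The sketch below uses the type-token ratio (TTR) on toy sentences; this is an illustration of the general idea, not the specific measures used in the paper.

```python
# Minimal sketch (not the paper's exact metric): comparing vocabulary
# richness of original vs. rephrased text via the type-token ratio (TTR).

def type_token_ratio(text: str) -> float:
    """Ratio of unique tokens (types) to total tokens."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

original = "the swift auburn fox vaults over the indolent hound"
rephrased = "the fox jumps over the dog and the dog runs"

# A systematic drop in TTR across a large corpus would indicate that the
# rephrasing tool reduces lexical diversity.
print(type_token_ratio(original), type_token_ratio(rephrased))
```

Semantic change can be quantified analogously, e.g. as the distance between sentence embeddings of the original and rephrased text.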
Revealing the Unwritten: Visual Investigation of Beam Search Trees to Address Language Model Prompting Challenges
Spinner, Thilo, Kehlbeck, Rebecca, Sevastjanova, Rita, Stähle, Tobias, Keim, Daniel A., Deussen, Oliver, Spitz, Andreas, El-Assady, Mennatallah
The growing popularity of generative language models has amplified interest in interactive methods to guide model outputs. Prompt refinement is considered one of the most effective means to influence output among these methods. We identify several challenges associated with prompting large language models, categorized into data- and model-specific, linguistic, and socio-linguistic challenges. A comprehensive examination of model outputs, including runner-up candidates and their corresponding probabilities, is needed to address these issues. Beam search, the prevalent algorithm to sample model outputs, inherently supplies this information through its search tree. Consequently, we introduce an interactive visual method for investigating the beam search tree, facilitating analysis of the decisions made by the model during generation. We quantitatively show the value of exposing the beam search tree and present five detailed analysis scenarios addressing the identified challenges. Our methodology validates existing results and offers additional insights.
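The beam search tree referenced above can be sketched in a few lines. The example below runs beam search over a hand-made toy next-token distribution (an assumption for illustration; in practice, a language model supplies these probabilities) and records every expansion, including pruned runner-up candidates, which is exactly the information a visual tree inspection exposes.

```python
import math

# Toy next-token distribution (assumed for illustration; a real LM would
# supply these probabilities at each step).
NEXT = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a":   {"cat": 0.7, "dog": 0.3},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}

def beam_search(start="<s>", beam_width=2, max_len=3):
    """Expand sequences step by step, keeping the top-k by log-probability.
    Every expansion (including later-pruned runner-ups) is logged in `tree`."""
    beams = [([start], 0.0)]
    tree = []  # (parent_sequence, candidate_token, probability)
    for _ in range(max_len):
        candidates = []
        for seq, logp in beams:
            last = seq[-1]
            if last == "</s>":  # finished sequences carry over unchanged
                candidates.append((seq, logp))
                continue
            for tok, p in NEXT[last].items():
                tree.append((tuple(seq), tok, p))
                candidates.append((seq + [tok], logp + math.log(p)))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams, tree

beams, tree = beam_search()
```

The `tree` list retains the branches that never reach the final beam, which is what makes runner-up candidates and their probabilities available for inspection.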
Mind Your Bias: A Critical Review of Bias Detection Methods for Contextual Language Models
Husse, Silke, Spitz, Andreas
The awareness and mitigation of biases are of fundamental importance for the fair and transparent use of contextual language models, yet they crucially depend on the accurate detection of biases as a precursor. Consequently, numerous bias detection methods have been proposed, which vary in their approach, the considered type of bias, and the data used for evaluation. However, while most detection methods are derived from the word embedding association test for static word embeddings, the reported results are heterogeneous, inconsistent, and ultimately inconclusive. To address this issue, we conduct a rigorous analysis and comparison of bias detection methods for contextual language models. Our results show that minor design and implementation decisions (or errors) have a substantial and often significant impact on the derived bias scores. Overall, we find the state of the field to be both worse than previously acknowledged due to systematic and propagated errors in implementations, yet better than anticipated since divergent results in the literature homogenize after accounting for implementation errors. Based on our findings, we conclude with a discussion of paths towards more robust and consistent bias detection methods.
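The word embedding association test mentioned above, from which most detection methods derive, can be sketched compactly. The example below computes the WEAT-style effect size on tiny hand-made 2-d vectors (an assumption for illustration; real tests use embeddings from a trained model), and its sensitivity to such small implementation details is exactly what the paper's analysis probes.

```python
import numpy as np

# Minimal sketch of the word embedding association test (WEAT) effect size,
# on toy 2-d vectors (assumed for illustration, not real embeddings).

def cos(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def assoc(w, A, B):
    """Differential association of word vector w with attribute sets A and B."""
    return np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """Cohen's-d-style effect size over target sets X, Y and attributes A, B."""
    sx = [assoc(x, A, B) for x in X]
    sy = [assoc(y, A, B) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1)

# Toy setup: X aligns with attribute A, Y with B -> large positive effect size.
A = [np.array([1.0, 0.0])]
B = [np.array([0.0, 1.0])]
X = [np.array([0.9, 0.1]), np.array([0.8, 0.2])]
Y = [np.array([0.1, 0.9]), np.array([0.2, 0.8])]
print(weat_effect_size(X, Y, A, B))
```

Note that seemingly minor choices, such as the `ddof` used in the standard deviation or how token vectors are pooled in the contextual case, shift the resulting score, which is one source of the inconsistencies the paper documents.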
IM-META: Influence Maximization Using Node Metadata in Networks With Unknown Topology
Tran, Cong, Shin, Won-Yong, Spitz, Andreas
In real-world applications of influence maximization (IM), the network structure is often unknown. In this case, we may identify the most influential seed nodes by exploring only a part of the underlying network given a small budget for node queries. Motivated by the fact that collecting node metadata is more cost-effective than investigating the relationship between nodes via queried nodes, we develop IM-META, an end-to-end solution to IM in networks with unknown topology by retrieving information from both queries and node metadata. However, using such metadata to aid the IM process is not without risk due to the noisy nature of metadata and uncertainties in connectivity inference. To tackle these challenges, we formulate an IM problem that aims to find two sets, i.e., seed nodes and queried nodes. We propose an effective method that iteratively performs three steps: 1) we learn the relationship between collected metadata and edges via a Siamese neural network model, 2) we select a number of inferred influential edges to construct a reinforced graph used for discovering an optimal seed set, and 3) we identify the next node to query by maximizing the inferred influence spread using a topology-aware ranking strategy. By querying only 5% of nodes, IM-META reaches 93% of the upper bound performance.
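The seed-selection step in pipelines like the one described above is typically a greedy marginal-gain loop. The sketch below shows that subroutine on a known toy graph with one-hop coverage as the spread estimate; it is a simplified illustration, not IM-META itself, whose metadata-based edge inference and node querying are omitted.

```python
# Minimal sketch of greedy seed selection on a *known* graph (IM-META's
# metadata inference and querying under unknown topology are omitted).
# Spread is approximated as one-hop coverage; the graph is an assumed toy.

GRAPH = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a", "e"},
    "d": {"a"},
    "e": {"c", "f"},
    "f": {"e"},
}

def coverage(seeds):
    """Number of nodes reached within one hop of the seed set."""
    reached = set(seeds)
    for s in seeds:
        reached |= GRAPH[s]
    return len(reached)

def greedy_seeds(k):
    """Pick k seeds, each maximizing the marginal gain in coverage."""
    seeds = []
    for _ in range(k):
        best = max((n for n in GRAPH if n not in seeds),
                   key=lambda n: coverage(seeds + [n]))
        seeds.append(best)
    return seeds

print(greedy_seeds(2))
```

In IM-META, this greedy step operates on the reinforced graph assembled from inferred edges, and alternates with metadata-driven edge inference and topology-aware selection of the next node to query.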
State of the Union: A Data Consumer's Perspective on Wikidata and Its Properties for the Classification and Resolution of Entities
Spitz, Andreas (Heidelberg University) | Dixit, Vaibhav (Heidelberg University) | Richter, Ludwig (Heidelberg University) | Gertz, Michael (Heidelberg University) | Geiss, Johanna (Heidelberg University)
Wikipedia is one of the most popular sources of free data on the Internet and subject to extensive use in numerous areas of research. Wikidata on the other hand, the knowledge base behind Wikipedia, is less popular as a source of data, despite having the "data" already in its name, and despite the fact that many applications in Natural Language Processing in general and Information Extraction in particular benefit immensely from the integration of knowledge bases. In part, this imbalance is owed to the younger age of Wikidata, which launched over a decade after Wikipedia. However, this is also owed to challenges posed by the still evolving properties of Wikidata that make its content more difficult to consume for third parties than is desirable. In this article, we analyze the causes of these challenges from the viewpoint of a data consumer and discuss possible avenues of research and advancement that both the scientific and the Wikidata community can collaborate on to turn the knowledge base into the invaluable asset that it is uniquely positioned to become.