
 Heiter, Edith


Large Language Models Reflect the Ideology of their Creators

arXiv.org Artificial Intelligence

Large language models (LLMs) are trained on vast amounts of data to generate natural language, enabling them to perform tasks like text summarization and question answering. These models have become popular in artificial intelligence (AI) assistants like ChatGPT and already play an influential role in how humans access information. However, the behavior of LLMs varies depending on their design, training, and use. In this paper, we uncover notable diversity in the ideological stance exhibited by different LLMs and across the languages in which they are accessed. We do this by prompting a diverse panel of popular LLMs to describe a large number of prominent and controversial personalities from recent world history, both in English and in Chinese. By identifying and analyzing the moral assessments reflected in the generated descriptions, we find consistent normative differences between how the same LLM responds in Chinese compared to English. Similarly, we identify normative disagreements between Western and non-Western LLMs about prominent actors in geopolitical conflicts. Furthermore, popularly hypothesized disparities in political goals among Western models are reflected in significant normative differences related to inclusion, social inequality, and political scandals. Our results show that the ideological stance of an LLM often reflects the worldview of its creators. This raises important concerns around technological and regulatory efforts with the stated aim of making LLMs ideologically 'unbiased', and it poses risks for political instrumentalization.
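
To make the methodology concrete, here is a minimal sketch of the kind of bilingual panel probe the abstract describes. Everything below is an illustrative placeholder, not the authors' pipeline: `query_llm` stands in for whatever API each model in the panel exposes, `moral_score` for the step that extracts a moral assessment from a generated description, and the two personalities are hypothetical examples.

```python
# Hedged sketch of a bilingual LLM-panel probe; names are illustrative.

PERSONALITIES = ["Edward Snowden", "Winston Churchill"]  # example figures

PROMPTS = {
    "en": "Tell me about {name}.",
    "zh": "请介绍一下{name}。",  # Chinese: "Please tell me about {name}."
}

def query_llm(model: str, prompt: str) -> str:
    """Hypothetical wrapper around one model's chat API; replace with a
    real client call for each LLM in the panel."""
    raise NotImplementedError

def moral_score(description: str) -> float:
    """Hypothetical moral-assessment extractor returning a score in
    [-1, 1]; could be a trained classifier or a second LLM as judge."""
    raise NotImplementedError

def stance_profile(models: list[str]) -> dict[tuple[str, str], float]:
    """Mean moral assessment per (model, language) pair; comparing these
    across languages and model origins is the comparison the abstract
    describes."""
    profile = {}
    for model in models:
        for lang, template in PROMPTS.items():
            scores = [moral_score(query_llm(model, template.format(name=n)))
                      for n in PERSONALITIES]
            profile[(model, lang)] = sum(scores) / len(scores)
    return profile
```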


Pattern or Artifact? Interactively Exploring Embedding Quality with TRACE

arXiv.org Artificial Intelligence

This paper presents TRACE, a tool to analyze the quality of 2D embeddings generated through dimensionality reduction techniques. Dimensionality reduction methods often prioritize preserving either local neighborhoods or global distances, but insights from visual structures can be misleading if the objective has not been achieved uniformly. TRACE addresses this challenge by providing a scalable and extensible pipeline for computing both local and global quality measures. The interactive browser-based interface allows users to explore various embeddings while visually assessing the pointwise embedding quality. The interface also facilitates in-depth analysis by highlighting high-dimensional nearest neighbors for any group of points and displaying high-dimensional distances between points. TRACE enables analysts to make informed decisions regarding the most suitable dimensionality reduction method for their specific use case, by showing where, and to what degree, structure is preserved in the reduced space.
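
As a concrete example of a local quality measure of the kind such a pipeline computes, the following sketch (an illustration, not TRACE's API) scores each point by how many of its k high-dimensional nearest neighbors survive in the 2D embedding:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def neighborhood_preservation(X_hd, X_2d, k=15):
    """Fraction of each point's k high-dimensional nearest neighbors that
    also appear among its k nearest neighbors in the 2D embedding
    (1.0 = locally perfectly preserved, 0.0 = fully scrambled)."""
    idx_hd = NearestNeighbors(n_neighbors=k + 1).fit(X_hd) \
        .kneighbors(X_hd, return_distance=False)[:, 1:]  # drop self-match
    idx_2d = NearestNeighbors(n_neighbors=k + 1).fit(X_2d) \
        .kneighbors(X_2d, return_distance=False)[:, 1:]
    return np.array([len(set(a) & set(b)) / k
                     for a, b in zip(idx_hd, idx_2d)])
```

Mapping such a pointwise score to color in the 2D scatterplot is one way to assess quality visually, in the spirit of the interface described above.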


Topologically Regularized Data Embeddings

arXiv.org Artificial Intelligence

Unsupervised representation learning methods are widely used for gaining insight into high-dimensional, unstructured, or structured data. In some cases, users may have prior topological knowledge about the data, such as a known cluster structure or the fact that the data lies along a tree- or graph-structured topology. However, generic methods to ensure such structure is salient in the low-dimensional representations are lacking. This negatively impacts the interpretability of low-dimensional embeddings, and plausibly also downstream learning tasks. To address this issue, we introduce topological regularization: a generic approach based on algebraic topology to incorporate topological prior knowledge into low-dimensional embeddings. We introduce a class of topological loss functions, and show that jointly optimizing an embedding loss with such a topological loss function as a regularizer yields embeddings that reflect not only local proximities but also the desired topological structure. We include a self-contained overview of the required foundational concepts in algebraic topology, and provide intuitive guidance on how to design topological loss functions for a variety of shapes, such as clusters, cycles, and bifurcations. We empirically evaluate the proposed approach on computational efficiency, robustness, and versatility in combination with linear and non-linear dimensionality reduction and graph embedding methods.
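
To make the joint objective concrete, here is a minimal PyTorch sketch of one 0-dimensional topological loss that encourages a k-cluster structure, using a standard trick from topological optimization: compute the persistence pairing on detached distances, then differentiate through the paired edge lengths. This illustrates the general recipe under assumptions of my own; it does not reproduce the paper's loss functions for cycles or bifurcations.

```python
import torch
from scipy.sparse.csgraph import minimum_spanning_tree

def zero_dim_deaths(emb: torch.Tensor) -> torch.Tensor:
    """0-dim Vietoris-Rips death times of a point cloud = the edge
    lengths of its Euclidean minimum spanning tree."""
    D = torch.cdist(emb, emb)
    mst = minimum_spanning_tree(D.detach().cpu().numpy()).tocoo()
    rows = torch.as_tensor(mst.row, dtype=torch.long)
    cols = torch.as_tensor(mst.col, dtype=torch.long)
    return D[rows, cols]  # differentiable w.r.t. embedding coordinates

def cluster_loss(emb: torch.Tensor, k: int = 3) -> torch.Tensor:
    """Encourage k clusters: shrink the small deaths (within-cluster
    merges) and grow the k-1 largest ones (gaps between clusters)."""
    deaths, _ = torch.sort(zero_dim_deaths(emb), descending=True)
    return deaths[k - 1:].sum() - deaths[:k - 1].sum()

# Joint optimization, with `embedding_loss` standing in for e.g. an MDS
# stress or t-SNE term, and lam trading it off against the prior:
#   loss = embedding_loss(emb) + lam * cluster_loss(emb, k)
```

On its own the topological term would push clusters apart without bound; it is the combination with the embedding loss that keeps the result faithful to local proximities.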


Revised Conditional t-SNE: Looking Beyond the Nearest Neighbors

arXiv.org Artificial Intelligence

Conditional t-SNE (ct-SNE) is a recent extension to t-SNE that allows removal of known cluster information from the embedding, to obtain a visualization revealing structure beyond label information. This is useful, for example, when one wants to factor out unwanted differences between a set of classes. We show that ct-SNE fails in many realistic settings, namely if the data is well clustered over the labels in the original high-dimensional space. We introduce a revised method that conditions the high-dimensional similarities instead of the low-dimensional similarities, and that stores within- and across-label nearest neighbors separately. This also enables the use of recently proposed speedups for t-SNE, improving scalability. From experiments on synthetic data, we find that our proposed method resolves the considered problems and improves the embedding quality. On real data containing batch effects, the expected improvement does not always materialize. We argue that revised ct-SNE is nonetheless preferable overall, given its improved scalability. The results also highlight new open questions, such as how to handle distance variations between clusters.
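
The following sketch illustrates the central revision under simplifying assumptions of my own (a fixed Gaussian bandwidth rather than t-SNE's perplexity calibration, and an assumed discount factor beta): collect within- and across-label nearest neighbors as separate sets, and down-weight the within-label similarities so that label structure is factored out before the low-dimensional optimization.

```python
import numpy as np
from sklearn.metrics import pairwise_distances

def conditioned_similarities(X, labels, k=10, beta=0.1):
    """Per point: k nearest within-label and k nearest across-label
    neighbors, with within-label pairs discounted by beta < 1.
    Illustrative simplification, not the paper's exact derivation."""
    n = len(X)
    D = pairwise_distances(X)
    P = np.zeros((n, n))
    for i in range(n):
        for mask, w in ((labels == labels[i], beta),
                        (labels != labels[i], 1.0)):
            pool = np.flatnonzero(mask)
            pool = pool[pool != i]
            nn = pool[np.argsort(D[i, pool])[:k]]
            P[i, nn] = w * np.exp(-D[i, nn] ** 2)  # fixed bandwidth
    P = (P + P.T) / 2.0                            # symmetrize, as t-SNE does
    return P / P.sum()                             # joint probabilities
```

Because the conditioning happens in the high-dimensional similarities, the resulting matrix can be handed to standard (accelerated) t-SNE optimizers unchanged, which is where the scalability gain comes from.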


Factoring out prior knowledge from low-dimensional embeddings

arXiv.org Machine Learning

Embedding high-dimensional data into low-dimensional spaces, such as with tSNE [van der Maaten and Hinton, 2008] or UMAP [McInnes et al., 2018], allows us to visually inspect and discover meaningful structure in the data that would otherwise be difficult or impossible to see. These methods are as popular as they are useful, but at the same time limited in that they are one-shot only: they embed the data as is, and that is that. If the resulting embedding reveals novel knowledge, all is well; but what if the structure that dominates it is something we already know, something we are no longer interested in, or what if we want to discover whether the data has meaningful structure other than what the first result revealed? In word embeddings, for example, we may already know that certain words are synonyms, while in single-cell sequencing we may want to discover structure other than known cell types, or factor out family relationships. The question at hand is therefore: how can we obtain low-dimensional embeddings that reveal structure beyond what we already know, i.e., how can we factor out prior knowledge from low-dimensional embeddings? For conditional embeddings, research so far has mostly focused on emphasizing rather than factoring out prior knowledge [Barshan et al., 2011, De Ridder et al., 2003, Hanhijärvi et al., 2009], with conditional tSNE as a notable exception, which, however, can only factor out label information [Kang et al., 2019]. Here, we propose two techniques for factoring out a more general form of prior knowledge from low-dimensional embeddings of arbitrary data types. In particular, we consider background knowledge in the form of pairwise distances between samples. This formulation allows us to cover a plethora of practical instances, including labels, clustering structure, family trees, and user-defined distances, but also, and especially important for unstructured data, kernel matrices.
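
One natural instantiation of this idea, sketched below with an assumed discount factor rather than the paper's exact formulation: pairs that the background distances already deem close have their data similarity suppressed, so the embedding is driven by structure the prior does not explain.

```python
import numpy as np
from sklearn.metrics import pairwise_distances

def factored_similarities(X, D_prior, sigma=1.0, tau=1.0):
    """Discount data similarity by (1 - exp(-D_prior^2 / tau)), a factor
    that vanishes for prior-close pairs and approaches 1 for
    prior-distant ones. One illustrative choice among many."""
    D = pairwise_distances(X)
    P = np.exp(-D ** 2 / sigma) * (1.0 - np.exp(-D_prior ** 2 / tau))
    np.fill_diagonal(P, 0.0)
    P = (P + P.T) / 2.0   # symmetrize
    return P / P.sum()    # joint similarity distribution over pairs
```

The resulting matrix can then replace the usual similarities in a tSNE- or UMAP-style optimizer; labels, clustering structure, family trees, user-defined distances, and kernel matrices all enter simply as different choices of D_prior.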