measurement instrument
Using latent representations to link disjoint longitudinal data for mixed-effects regression
Schächter, Clemens, Hackenberg, Maren, Pfaffenlehner, Michelle, Tambe-Ndonfack, Félix B., Schmidt, Thorsten, Pechmann, Astrid, Kirschner, Janbernd, Hasenauer, Jan, Binder, Harald
Many rare diseases offer limited established treatment options, leading patients to switch therapies when new medications emerge. To analyze the impact of such treatment switches within the low sample size limitations of rare disease trials, it is important to use all available data sources. This, however, is complicated when the usage of measurement instruments changes during the observation period, for example when instruments are adapted to specific age ranges. The resulting disjoint longitudinal data trajectories complicate the application of traditional modeling approaches like mixed-effects regression. We tackle this by mapping observations of each instrument to an aligned low-dimensional temporal trajectory, enabling longitudinal modeling across instruments. Specifically, we employ a set of variational autoencoder architectures to embed item values into a shared latent space for each time point. Temporal disease dynamics and treatment switch effects are then captured through a mixed-effects regression model applied to latent representations. To enable statistical inference, we present a novel statistical testing approach that accounts for the joint parameter estimation of mixed-effects regression and variational autoencoders. The methodology is applied to quantify the impact of treatment switches for patients with spinal muscular atrophy. Here, our approach aligns motor performance items from different measurement instruments for mixed-effects regression and maps estimated effects back to the observed item level to quantify the treatment switch effect. Our approach allows for model selection as well as for assessing effects of treatment switching. The results highlight the potential of modeling in joint latent representations for addressing small data challenges.
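As a rough illustration of the modeling step described in this abstract, and not the authors' exact specification, a random-intercept mixed-effects model on a single latent dimension could be written as below. All symbols are assumptions introduced here: $x_{i,t}$ are the item values of patient $i$ at time $t$, $k(i,t)$ is the instrument used at that visit, $f_k$ is the encoder of the instrument-specific variational autoencoder, and $s_{i,t}$ indicates whether the treatment switch has occurred.

```latex
% Illustrative sketch only; all symbols are assumptions, not taken from the paper.
\begin{align}
  z_{i,t} &= f_{k(i,t)}(x_{i,t}), \\
  z_{i,t} &= \beta_0 + \beta_1\, t + \beta_2\, s_{i,t} + b_i + \varepsilon_{i,t},
  \qquad b_i \sim \mathcal{N}(0, \sigma_b^2), \quad
  \varepsilon_{i,t} \sim \mathcal{N}(0, \sigma^2)
\end{align}
```

Under this simplified reading, $\beta_2$ would carry the treatment switch effect on the latent scale, and $b_i$ would absorb patient-specific baseline differences.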
- Europe > Germany > Baden-Württemberg > Freiburg (0.05)
- Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
- North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)
Human and AI Trust: Trust Attitude Measurement Instrument
With the current progress of Artificial Intelligence (AI) technology and its increasingly broader applications, trust is seen as a required criterion for AI usage, acceptance, and deployment. A robust measurement instrument is essential to correctly evaluate trust from a human-centered perspective. This paper describes the development and validation process of a trust measurement instrument, which follows psychometric principles and consists of a 16-item trust scale. The instrument was built explicitly for research in human-AI interaction to measure trust attitudes towards AI systems from a layperson (non-expert) perspective. The use case we used to develop the scale was in the context of AI medical support systems (specifically cancer/health prediction). The scale development (Measurement Item Development) and validation (Measurement Item Evaluation) involved six research stages: item development, item evaluation, survey administration, test of dimensionality, test of reliability, and test of validity. The results of the six-stage evaluation show that the proposed trust measurement instrument is empirically reliable and valid for systematically measuring and comparing non-experts' trust in AI Medical Support Systems.
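The abstract does not say which statistic was used in the test of reliability. As a minimal sketch, assuming the common choice of Cronbach's alpha for a multi-item scale, internal consistency could be computed as follows. The function name and the toy response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondent rows, each holding k item scores.

    Assumption: alpha is only one possible reliability statistic; the
    abstract above does not state which one the authors used.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Transpose rows (respondents) into columns (items).
    items = list(zip(*responses))
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(col) for col in items)
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical toy data: three respondents, two perfectly consistent items.
toy = [[1, 1], [2, 2], [3, 3]]
alpha = cronbach_alpha(toy)
```

Values close to 1 indicate that the items vary together across respondents. A real 16-item scale would be passed in the same row-per-respondent layout.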
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Asia > China (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
- Overview (1.00)
- Information Technology (1.00)
- Health & Medicine > Health Care Providers & Services (1.00)
- Education (1.00)
Stop Evaluating AI with Human Tests, Develop Principled, AI-specific Tests instead
Sühr, Tom, Dorner, Florian E., Salaudeen, Olawale, Kelava, Augustin, Samadi, Samira
Large Language Models (LLMs) have achieved remarkable results on a range of standardized tests originally designed to assess human cognitive and psychological traits, such as intelligence and personality. While these results are often interpreted as strong evidence of human-like characteristics in LLMs, this paper argues that such interpretations constitute an ontological error. Human psychological and educational tests are theory-driven measurement instruments, calibrated to a specific human population. Applying these tests to non-human subjects without empirical validation risks mischaracterizing what is being measured. Furthermore, a growing trend frames AI performance on benchmarks as measurements of traits such as "intelligence", despite known issues with validity, data contamination, cultural bias and sensitivity to superficial prompt changes. We argue that interpreting benchmark performance as measurements of human-like traits lacks sufficient theoretical and empirical justification. This leads to our position: Stop Evaluating AI with Human Tests, Develop Principled, AI-specific Tests instead. We call for the development of principled, AI-specific evaluation frameworks tailored to AI systems. Such frameworks might build on existing frameworks for constructing and validating psychometric tests, or could be created entirely from scratch to fit the unique context of AI.
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
- North America > United States > Wyoming (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Law (1.00)
- Education > Assessment & Standards (1.00)
- Health & Medicine > Therapeutic Area (0.93)
- Government > Regional Government > North America Government > United States Government (0.67)
Understanding and Meeting Practitioner Needs When Measuring Representational Harms Caused by LLM-Based Systems
Harvey, Emma, Sheng, Emily, Blodgett, Su Lin, Chouldechova, Alexandra, Garcia-Gathright, Jean, Olteanu, Alexandra, Wallach, Hanna
The NLP research community has made publicly available numerous instruments for measuring representational harms caused by large language model (LLM)-based systems. These instruments have taken the form of datasets, metrics, tools, and more. In this paper, we examine the extent to which such instruments meet the needs of practitioners tasked with evaluating LLM-based systems. Via semi-structured interviews with 12 such practitioners, we find that practitioners are often unable to use publicly available instruments for measuring representational harms. We identify two types of challenges. In some cases, instruments are not useful because they do not meaningfully measure what practitioners seek to measure or are otherwise misaligned with practitioner needs. In other cases, instruments - even useful instruments - are not used by practitioners due to practical and institutional barriers impeding their uptake. Drawing on measurement theory and pragmatic measurement, we provide recommendations for addressing these challenges to better meet practitioner needs.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Austria > Vienna (0.14)
- Asia > Singapore (0.04)
- Research Report (1.00)
- Questionnaire & Opinion Survey (1.00)
- Personal > Interview (1.00)
- Health & Medicine (1.00)
- Information Technology (0.93)
- Law (0.93)
Taxonomizing Representational Harms using Speech Act Theory
Corvi, Emily, Washington, Hannah, Reed, Stefanie, Atalla, Chad, Chouldechova, Alexandra, Dow, P. Alex, Garcia-Gathright, Jean, Pangakis, Nicholas, Sheng, Emily, Vann, Dan, Vogel, Matthew, Wallach, Hanna
Representational harms are widely recognized among fairness-related harms caused by generative language systems. However, their definitions are commonly under-specified. We present a framework, grounded in speech act theory (Austin, 1962), that conceptualizes representational harms caused by generative language systems as the perlocutionary effects (i.e., real-world impacts) of particular types of illocutionary acts (i.e., system behaviors). Building on this argument and drawing on relevant literature from linguistic anthropology and sociolinguistics, we provide new definitions of stereotyping, demeaning, and erasure. We then use our framework to develop a granular taxonomy of illocutionary acts that cause representational harms, going beyond the high-level taxonomies presented in previous work. We also discuss the ways that our framework and taxonomy can support the development of valid measurement instruments. Finally, we demonstrate the utility of our framework and taxonomy via a case study that engages with recent conceptual debates about what constitutes a representational harm and how such harms should be measured.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
Coupling quantum-like cognition with the neuronal networks within generalized probability theory
Khrennikov, Andrei, Ozawa, Masanao, Benninger, Felix, Shor, Oded
The past few years have seen a surge in the application of quantum theory methodologies and quantum-like modeling in fields such as cognition, psychology, and decision-making. Despite the success of this approach in explaining various psychological phenomena such as order, conjunction, disjunction, and response replicability effects, there remains a potential dissatisfaction due to its lack of clear connection to neurophysiological processes in the brain. Currently, it remains a phenomenological approach. In this paper, we develop a quantum-like representation of networks of communicating neurons. This representation is not based on standard quantum theory but on generalized probability theory (GPT), with a focus on the operational measurement framework. Specifically, we use a version of GPT that relies on ordered linear state spaces rather than the traditional complex Hilbert spaces. A network of communicating neurons is modeled as a weighted directed graph, which is encoded by its weight matrix. The state space of these weight matrices is embedded within the GPT framework, incorporating effect observables and state updates within the theory of measurement instruments, a critical aspect of this model. This GPT-based approach successfully reproduces key quantum-like effects, such as order, non-repeatability, and disjunction effects (commonly associated with decision interference). Moreover, this framework supports quantum-like modeling in medical diagnostics for neurological conditions such as depression and epilepsy. While this paper focuses primarily on cognition and neuronal networks, the proposed formalism and methodology can be directly applied to a wide range of biological and social networks.
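To make the graph encoding mentioned in this abstract concrete, here is a minimal sketch of turning a weighted directed graph into its weight matrix. The function name, the edge list, and the convention that entry `W[i][j]` holds the weight of the connection from neuron `i` to neuron `j` are assumptions for illustration. The sketch covers only the encoding step and none of the GPT machinery built on top of it.

```python
def weight_matrix(n_neurons, edges):
    """Build the weight matrix of a weighted directed graph.

    edges: iterable of (source, target, weight) triples.
    Assumed convention: W[i][j] is the weight of the edge i -> j,
    and 0.0 means there is no connection.
    """
    W = [[0.0] * n_neurons for _ in range(n_neurons)]
    for src, dst, w in edges:
        W[src][dst] = w
    return W


# Hypothetical three-neuron network with a directed cycle 0 -> 1 -> 2 -> 0.
W = weight_matrix(3, [(0, 1, 0.5), (1, 2, 1.5), (2, 0, -0.3)])
```

Because the graph is directed, the matrix is generally not symmetric: in this toy network `W[0][1]` is 0.5 while `W[1][0]` stays 0.0.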
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Switzerland (0.04)
From Measurement Instruments to Data: Leveraging Theory-Driven Synthetic Training Data for Classifying Social Constructs
Birkenmaier, Lukas, Roth, Matthias, Sen, Indira
Computational text classification is a challenging task, especially for multi-dimensional social constructs. Recently, there has been increasing discussion that synthetic training data could enhance classification by offering examples of how these constructs are represented in texts. In this paper, we systematically examine the potential of theory-driven synthetic training data for improving the measurement of social constructs. In particular, we explore how researchers can transfer established knowledge from measurement instruments in the social sciences, such as survey scales or annotation codebooks, into theory-driven generation of synthetic data. Using two studies on measuring sexism and political topics, we assess the added value of synthetic training data for fine-tuning text classification models. Although the results of the sexism study were less promising, our findings demonstrate that synthetic data can be highly effective in reducing the need for labeled data in political topic classification. With only a minimal drop in performance, synthetic data allows for substituting large amounts of labeled data. Furthermore, theory-driven synthetic data performed markedly better than data generated without conceptual information in mind.
- North America > United States (0.28)
- South America (0.04)
- Oceania > New Zealand (0.04)
- Health & Medicine (0.94)
- Law > Civil Rights & Constitutional Law (0.93)
- Banking & Finance > Economy (0.68)
- Government > Regional Government (0.67)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.74)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Investigating a domain adaptation approach for integrating different measurement instruments in a longitudinal clinical registry
Hackenberg, Maren, Pfaffenlehner, Michelle, Behrens, Max, Pechmann, Astrid, Kirschner, Janbernd, Binder, Harald
In a longitudinal clinical registry, different measurement instruments might have been used for assessing individuals at different time points. To combine them, we investigate deep learning techniques for obtaining a joint latent representation, to which the items of different measurement instruments are mapped. This corresponds to domain adaptation, an established concept in computer science for image data. Using the proposed approach as an example, we evaluate the potential of domain adaptation in a longitudinal cohort setting with a rather small number of time points, motivated by an application with different motor function measurement instruments in a registry of spinal muscular atrophy (SMA) patients. There, we model trajectories in the latent representation by ordinary differential equations (ODEs), where person-specific ODE parameters are inferred from baseline characteristics. The goodness of fit and complexity of the ODE solutions then allow us to judge the measurement instrument mappings. We subsequently explore how alignment can be improved by incorporating corresponding penalty terms into model fitting. To systematically investigate the effect of differences between measurement instruments, we consider several scenarios based on modified SMA data, including scenarios where a mapping should be feasible in principle and scenarios where no perfect mapping is available. While misalignment increases in more complex scenarios, some structure is still recovered, even if the availability of measurement instruments depends on patient state. A reasonable mapping is feasible also in the more complex real SMA dataset. These results indicate that domain adaptation might be more generally useful in statistical modeling for longitudinal registry data.
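The idea of a latent trajectory governed by an ODE with person-specific parameters can be sketched in a few lines. This is a toy version under loud assumptions: a one-dimensional linear ODE dz/dt = a*z + b, a hypothetical linear map from baseline covariates to (a, b), and explicit Euler integration. The abstract does not specify the ODE form, the mapping, or the solver the authors use.

```python
import math


def ode_params(baseline, coef_a, coef_b):
    """Hypothetical linear map from baseline covariates to ODE parameters (a, b)."""
    a = sum(c * x for c, x in zip(coef_a, baseline))
    b = sum(c * x for c, x in zip(coef_b, baseline))
    return a, b


def euler_trajectory(z0, a, b, t_end, n_steps):
    """Integrate dz/dt = a*z + b from t=0 to t_end with explicit Euler steps."""
    dt = t_end / n_steps
    z = z0
    traj = [z]
    for _ in range(n_steps):
        z = z + dt * (a * z + b)
        traj.append(z)
    return traj


def exact_solution(z0, a, b, t):
    """Closed-form solution of dz/dt = a*z + b, valid for a != 0."""
    return (z0 + b / a) * math.exp(a * t) - b / a


# Hypothetical patient with two baseline covariates (all numbers are made up).
a, b = ode_params([1.0, 2.0], coef_a=[-0.5, 0.0], coef_b=[0.0, 0.5])
traj = euler_trajectory(z0=0.0, a=a, b=b, t_end=2.0, n_steps=2000)
```

For a linear ODE the closed form makes it easy to check the numerical trajectory. A richer model would need a proper ODE solver and a learned mapping from baseline characteristics.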
- Europe > Germany > Baden-Württemberg > Freiburg (0.05)
- North America > United States > New York (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- Asia > Middle East > Jordan (0.04)
MAILS -- Meta AI Literacy Scale: Development and Testing of an AI Literacy Questionnaire Based on Well-Founded Competency Models and Psychological Change- and Meta-Competencies
Carolus, Astrid, Koch, Martin, Straka, Samantha, Latoschik, Marc Erich, Wienrich, Carolin
The goal of the present paper is to develop and validate a questionnaire to assess AI literacy. In particular, the questionnaire should be deeply grounded in the existing literature on AI literacy, should be modular (i.e., including different facets that can be used independently of each other) to be flexibly applicable in professional life depending on the goals and use cases, and should meet psychological requirements and thus include further psychological competencies in addition to the typical facets of AIL. We derived 60 items to represent different facets of AI Literacy according to Ng and colleagues' conceptualisation of AI literacy and an additional 12 items to represent psychological competencies such as problem solving, learning, and emotion regulation in regard to AI. For this purpose, data were collected online from 300 German-speaking adults. The items were tested for factorial structure in confirmatory factor analyses. The result is a measurement instrument that measures AI literacy with the facets Use & apply AI, Understand AI, Detect AI, and AI Ethics and the ability to Create AI as a separate construct, and AI Self-efficacy in learning and problem solving and AI Self-management. This study contributes to the research on AI literacy by providing a measurement instrument relying on profound competency models. In addition, higher-order psychological competencies are included that are particularly important in the context of pervasive change through AI systems.
- Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Germany > Saxony-Anhalt > Magdeburg (0.04)
- Questionnaire & Opinion Survey (1.00)
- Research Report > Experimental Study (0.88)
- Research Report > New Finding (0.66)
- Health & Medicine (1.00)
- Education > Educational Setting > K-12 Education (0.93)
Digital doubles: In the future, virtual versions of ourselves could predict our behaviour
A digital twin is a copy of a person, product or process that is created using data. This might sound like science fiction, but some have claimed that you will likely have a digital double within the next decade. As a copy of a person, a digital twin would -- ideally -- make the same decisions that you would make if you were presented with the same materials. This might seem like yet another speculative claim by futurists.
- Media (0.52)
- Leisure & Entertainment (0.32)
- Information Technology > Security & Privacy (0.31)
- Information Technology > Artificial Intelligence (0.96)
- Information Technology > Data Science > Data Mining > Big Data (0.31)