Differentiating Emigration from Return Migration of Scholars Using Name-Based Nationality Detection Models

arXiv.org Artificial Intelligence

Most web and digital trace data do not include information about an individual's nationality due to privacy concerns. The lack of data on nationality can create challenges for migration research. It can lead to a left-censoring issue since we are uncertain about the migrant's country of origin. Once we observe an emigration event, if we know the nationality, we can differentiate it from return migration. We propose methods to detect the nationality with the least available data, i.e., full names. We use the detected nationality in comparison with the country of academic origin, which is a common approach in studying the migration of researchers. We gathered 2.6 million unique name-nationality pairs from Wikipedia and categorized them into families of nationalities with three granularity levels to use as our training data. Using a character-based machine learning model, we achieved a weighted F1 score of 84% for the broadest and 67% for the most granular, country-level categorization. In our empirical study, we used the trained and tested model to assign nationality to 8+ million scholars' full names in Scopus data. Our results show that using the country of first publication as a proxy for nationality underestimates the size of return flows, especially for countries with a more diverse academic workforce, such as the USA, Australia, and Canada. We found that around 48% of emigration from the USA was return migration once we used the country of name origin, in contrast to 33% based on academic origin. In the most recent period, 79% of scholars whose affiliation has consistently changed from the USA to China, and are considered emigrants, have Chinese names in contrast to 41% with a Chinese academic origin. Our proposed methods for addressing left-censoring issues are beneficial for other research that uses digital trace data to study migration.
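
The weighted F1 score reported above averages per-class F1 scores in proportion to each class's frequency in the evaluation set, so frequent nationality groups dominate the figure. The following is a minimal sketch of that metric, assuming single-label predictions; the function name `weighted_f1` and the toy labels are illustrative and are not taken from the paper.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (illustrative helper)."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        # True positives: predicted `label` and the truth is `label`.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        # False positives: predicted `label` but the truth is another class.
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        # False negatives: true `label` examples that were predicted otherwise.
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Each class contributes in proportion to its share of the true labels.
        score += (n / total) * f1
    return score


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    truth = ["A", "A", "A", "B"]
    preds = ["A", "A", "B", "B"]
    print(round(weighted_f1(truth, preds), 4))
```

Because the weights follow class frequency, a model can post a high weighted F1 while still performing poorly on rare classes, which is one reason scores tend to drop as the categorization becomes more granular.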


NeoQA: Evidence-based Question Answering with Generated News Events

arXiv.org Artificial Intelligence

Evaluating Retrieval-Augmented Generation (RAG) in large language models (LLMs) is challenging because benchmarks can quickly become stale. Questions initially requiring retrieval may become answerable from pretraining knowledge as newer models incorporate more recent information during pretraining, making it difficult to distinguish evidence-based reasoning from recall. We introduce NeoQA (News Events for Out-of-training Question Answering), a benchmark designed to address this issue. To construct NeoQA, we generated timelines and knowledge bases of fictional news events and entities along with news articles and Q&A pairs to prevent LLMs from leveraging pretraining knowledge, ensuring that no prior evidence exists in their training data. We propose our dataset as a new platform for evaluating evidence-based question answering, as it requires LLMs to generate responses exclusively from retrieved evidence and only when sufficient evidence is available. NeoQA enables controlled evaluation across various evidence scenarios, including cases with missing or misleading details. Our findings indicate that LLMs struggle to distinguish subtle mismatches between questions and evidence, and suffer from short-cut reasoning when key information required to answer a question is missing from the evidence, underscoring key limitations in evidence-based reasoning.


Summarisation of German Judgments in conjunction with a Class-based Evaluation

arXiv.org Artificial Intelligence

The automated summarisation of long legal documents can be a great aid for legal experts in their daily work. We automatically create summaries (guiding principles) of German judgments by fine-tuning a decoder-based large language model. We enrich the judgments with information about legal entities before the training. For the evaluation of the created summaries, we define a set of evaluation classes which allows us to measure their language, pertinence, completeness and correctness. Our results show that employing legal entities helps the generative model to find the relevant content, but the quality of the created summaries is not yet sufficient for use in practice.


Automating Infrastructure Surveying: A Framework for Geometric Measurements and Compliance Assessment Using Point Cloud Data

arXiv.org Artificial Intelligence

Automation can play a prominent role in improving efficiency, accuracy, and scalability in infrastructure surveying and assessing construction and compliance standards. This paper presents a framework for automation of geometric measurements and compliance assessment using point cloud data. The proposed approach integrates deep learning-based detection and segmentation, in conjunction with geometric and signal processing techniques, to automate surveying tasks. As a proof of concept, we apply this framework to automatically evaluate the compliance of curb ramps with the Americans with Disabilities Act (ADA), demonstrating the utility of point cloud data in survey automation. The method leverages a newly collected, large annotated dataset of curb ramps, made publicly available as part of this work, to facilitate robust model training and evaluation. Experimental results, including comparison with manual field measurements of several ramps, validate the accuracy and reliability of the proposed method, highlighting its potential to significantly reduce manual effort and improve consistency in infrastructure assessment. Beyond ADA compliance, the proposed framework lays the groundwork for broader applications in infrastructure surveying and automated construction evaluation, promoting wider adoption of point cloud data in these domains. The annotated database, manual ramp survey data, and developed algorithms are publicly available on the project's GitHub page: https://github.com/Soltanilara/SurveyAutomation.


Crowding Out The Noise: Algorithmic Collective Action Under Differential Privacy

arXiv.org Artificial Intelligence

The integration of AI into daily life has generated considerable attention and excitement, while also raising concerns about automating algorithmic harms and re-entrenching existing social inequities. While the responsible deployment of trustworthy AI systems is a worthy goal, there are many possible ways to realize it, from policy and regulation to improved algorithm design and evaluation. In fact, since AI trains on social data, there is even a possibility for everyday users, citizens, or workers to directly steer its behavior through Algorithmic Collective Action, by deliberately modifying the data they share with a platform to drive its learning process in their favor. This paper considers how these grassroots efforts to influence AI interact with methods already used by AI firms and governments to improve model trustworthiness. In particular, we focus on the setting where the AI firm deploys a differentially private model, motivated by the growing regulatory focus on privacy and data protection. We investigate how the use of Differentially Private Stochastic Gradient Descent (DPSGD) affects the collective's ability to influence the learning process. Our findings show that while differential privacy contributes to the protection of individual data, it introduces challenges for effective algorithmic collective action. We characterize lower bounds on the success of algorithmic collective action under differential privacy as a function of the collective's size and the firm's privacy parameters, and verify these trends experimentally by simulating collective action during the training of deep neural network classifiers across several datasets.
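
The mechanism that makes collective action harder under DPSGD is visible in a single update step: each example's gradient is clipped to a fixed norm, and Gaussian noise is added to the sum before averaging, so no individual contribution can move the model by more than a bounded amount. The following is a minimal sketch of one such step in plain Python, following the standard textbook form of the algorithm rather than the paper's experimental code; the function name `dp_sgd_step` and all parameter names are assumptions made for illustration.

```python
import math
import random


def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng=random):
    """One illustrative DP-SGD update on a flat list of parameters."""
    dim = len(params)
    summed = [0.0] * dim
    for grad in per_example_grads:
        # Clip each example's gradient to an L2 norm of at most clip_norm.
        norm = math.sqrt(sum(x * x for x in grad))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i in range(dim):
            summed[i] += grad[i] * scale
    batch_size = len(per_example_grads)
    # Add Gaussian noise with std noise_multiplier * clip_norm to the clipped
    # sum, then average over the batch.
    noisy_mean = [
        (s + rng.gauss(0.0, noise_multiplier * clip_norm)) / batch_size
        for s in summed
    ]
    # Plain gradient descent on the privatized gradient.
    return [p - lr * g for p, g in zip(params, noisy_mean)]


if __name__ == "__main__":
    # Hypothetical gradients; the first one exceeds the clipping norm.
    grads = [[3.0, 4.0], [0.3, 0.4]]
    print(dp_sgd_step([0.0, 0.0], grads, lr=1.0, clip_norm=1.0, noise_multiplier=0.0))
```

With `noise_multiplier=0.0` the step is deterministic and shows clipping alone; raising it adds noise whose scale does not shrink with the size of any one participant's gradient, which is the trade-off between privacy and the collective's influence that the abstract describes.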


Safety by Measurement: A Systematic Literature Review of AI Safety Evaluation Methods

arXiv.org Artificial Intelligence

As frontier AI systems advance toward transformative capabilities, we need a parallel transformation in how we measure and evaluate these systems to ensure safety and inform governance. While benchmarks have been the primary method for estimating model capabilities, they often fail to establish true upper bounds or predict deployment behavior. This literature review consolidates the rapidly evolving field of AI safety evaluations, proposing a systematic taxonomy around three dimensions: what properties we measure, how we measure them, and how these measurements integrate into frameworks. We show how evaluations go beyond benchmarks by measuring what models can do when pushed to the limit (capabilities), the behavioral tendencies exhibited by default (propensities), and whether our safety measures remain effective even when faced with subversive adversarial AI (control). These properties are measured through behavioral techniques like scaffolding, red teaming and supervised fine-tuning, alongside internal techniques such as representation analysis and mechanistic interpretability. We provide deeper explanations of some safety-critical capabilities like cybersecurity exploitation, deception, autonomous replication, and situational awareness, alongside concerning propensities like power-seeking and scheming. The review explores how these evaluation methods integrate into governance frameworks to translate results into concrete development decisions. We also highlight challenges to safety evaluations - proving absence of capabilities, potential model sandbagging, and incentives for "safetywashing" - while identifying promising research directions. By synthesizing scattered resources, this literature review aims to provide a central reference point for understanding AI safety evaluations.


GenAI in Entrepreneurship: a systematic review of generative artificial intelligence in entrepreneurship research: current issues and future directions

arXiv.org Artificial Intelligence

Generative Artificial Intelligence (GenAI) and Large Language Models (LLMs) are recognized to have significant effects on industry and business dynamics, not least because of their impact on the preconditions for entrepreneurship. There is still a lack of knowledge of GenAI as a theme in entrepreneurship research. This paper presents a systematic literature review aimed at identifying and analyzing the evolving landscape of research on the effects of GenAI on entrepreneurship. We analyze 83 peer-reviewed articles obtained from leading academic databases: Web of Science and Scopus. Using natural language processing and unsupervised machine learning techniques with TF-IDF vectorization, Principal Component Analysis (PCA), and hierarchical clustering, five major thematic clusters are identified: (1) Digital Transformation and Behavioral Models, (2) GenAI-Enhanced Education and Learning Systems, (3) Sustainable Innovation and Strategic AI Impact, (4) Business Models and Market Trends, and (5) Data-Driven Technological Trends in Entrepreneurship. Based on the review, we discuss future research directions, gaps in the current literature, as well as ethical concerns raised in the literature. We highlight the need for more macro-level research on GenAI and LLMs as external enablers for entrepreneurship and for research on effective regulatory frameworks that facilitate business experimentation, innovation, and further technology development.
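
The first stage of the pipeline described above, TF-IDF vectorization, turns each article into a weighted bag of terms before dimensionality reduction and clustering. The following is a minimal sketch of the textbook weighting (term frequency times the log of inverse document frequency), assuming whitespace tokenization; the function name `tfidf` and the sample strings are illustrative, and the review's actual preprocessing and library settings are not stated in the abstract.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document (unsmoothed TF-IDF)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


if __name__ == "__main__":
    # Hypothetical mini-corpus, for illustration only.
    for vector in tfidf(["genai startup", "genai education"]):
        print(vector)
```

A term that appears in every document receives weight zero under this unsmoothed form, which is why corpus-wide vocabulary contributes little to the clusters. Library implementations often apply smoothing and normalization, so their values will differ from this sketch.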


Elton John and Dua Lipa seek protection from AI

BBC News

Not everyone agrees with the artists' approach. Julia Willemyns, co-founder of the Centre for British Progress think tank, said such proposals could hamper the UK and its bid for growth. The measures would "do nothing to stop foreign firms from using content from the British creative industries," she told the BBC. These tools, which can produce new content in response to simple text prompts, have become increasingly popular and available to consumers. But their capabilities have been accompanied by concerns and criticism over their data use and energy demand.


Here's How to Claim Up to $100 in Apple's Siri Settlement

WIRED

In January, Apple agreed to pay out $95 million to settle a class action lawsuit over claims its voice assistant Siri listened in on private conversations. Now, affected users have less than eight weeks to stake their claim to a slice of the cash. The Lopez v Apple Inc. lawsuit was filed back in December, accusing Apple of recording private conversations as a result of unintended Siri activations, and then sharing that data with third parties. Two plaintiffs claim they had related advertisements served to them after having personal conversations about particular brands, with another alleging they received an ad for a medical treatment following a private discussion with a doctor. This is not the first time Siri has been accused of eavesdropping.


Computer Ban Gave the Government Unfair Advantage in Anti-War Activist's Case, Lawyer Says

WIRED

An attorney with the American Civil Liberties Union (ACLU) who's overseeing a high-profile deportation case in Louisiana says she was stripped of her electronics moments before a pivotal hearing, preventing her from accessing evidence and court records that remained available to the three US government attorneys in the room, each of whom was allowed to use a laptop by the court. Louisiana immigration judge Jamee Comans ruled late last month that Columbia graduate student Mahmoud Khalil was eligible for deportation. During that hearing, however, Khalil's attorney Nora Ahmed says she was barred from bringing her laptop into the courtroom, despite having filed the proper paperwork in advance and being a frequent visitor to the immigration facility. "There should not be an advantage, no matter how small or how large, provided to a particular party over the other," says Ahmed. "Because that starts to infect the proceedings themselves and the notion of fundamental fairness that we all uphold in courtroom proceedings." The Justice Department did not respond to a request for comment.