Goto

Collaborating Authors

 Law


Assumed Identities: Quantifying Gender Bias in Machine Translation of Ambiguous Occupational Terms

arXiv.org Artificial Intelligence

Machine Translation (MT) systems frequently encounter ambiguous scenarios where they must assign gender to certain occupations when translating without explicit guidance or contextual cues. While individual translations in such cases may not be inherently biased, systematic patterns-such as the repeated association of certain professions with specific genders-can emerge, reflecting and perpetuating societal stereotypes. This ambiguity challenges traditional instance-level single-answer evaluation approaches, as no single gold standard translation exists. To address this, we propose an approach that evaluates gender bias through aggregated model responses. Specifically, we introduce a methodology to detect gender imbalances between source texts and translations, a benchmarking dataset with ambiguous English inputs, and probability-based metrics to quantify a model's divergence from normative standards or reference distributions.


Blockchain As a Platform For Artificial Intelligence (AI) Transparency

arXiv.org Artificial Intelligence

As artificial intelligence (AI) systems become increasingly complex and autonomous, concerns over transparency and accountability have intensified. The "black box" problem in AI decision-making limits stakeholders' ability to understand, trust, and verify outcomes, particularly in high-stakes sectors such as healthcare, finance, and autonomous systems. Blockchain technology, with its decentralized, immutable, and transparent characteristics, presents a potential solution to enhance AI transparency and auditability. This paper explores the integration of blockchain with AI to improve decision traceability, data provenance, and model accountability. By leveraging blockchain as an immutable record-keeping system, AI decision-making can become more interpretable, fostering trust among users and regulatory compliance. However, challenges such as scalability, integration complexity, and computational overhead must be addressed to fully realize this synergy. This study discusses existing research, proposes a framework for blockchain-enhanced AI transparency, and highlights practical applications, benefits, and limitations. The findings suggest that blockchain could be a foundational technology for ensuring AI systems remain accountable, ethical, and aligned with regulatory standards.


Disparities in LLM Reasoning Accuracy and Explanations: A Case Study on African American English

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have demonstrated remarkable capabilities in reasoning tasks, leading to their widespread deployment. However, recent studies have highlighted concerning biases in these models, particularly in their handling of dialectal variations like African American English (AAE). In this work, we systematically investigate dialectal disparities in LLM reasoning tasks. We develop an experimental framework comparing LLM performance given Standard American English (SAE) and AAE prompts, combining LLM-based dialect conversion with established linguistic analyses. We find that LLMs consistently produce less accurate responses and simpler reasoning chains and explanations for AAE inputs compared to equivalent SAE questions, with disparities most pronounced in social science and humanities domains. These findings highlight systematic differences in how LLMs process and reason about different language varieties, raising important questions about the development and deployment of these systems in our multilingual and multidialectal world. Our code repository is publicly available at https://github.com/Runtaozhou/dialect_bias_eval.


Partial Distribution Alignment via Adaptive Optimal Transport

arXiv.org Artificial Intelligence

As an compare non-parametric probability distributions by exploiting instantiation application, we propose a novel machine learning the geometry of the underlying metric space. To name a few, paradigm based on adaptive optimal transport. It conducts optimal transport plays a crucial role in a wide variety of the partial distribution alignment between source and target machine learning applications, such as generative adversarial domains by treating the noises, outliers, and distribution shifts networks [1], computer vision [2], natural language processing in a principled way. Furthermore, we investigate the mass [3], clustering [4], semi-supervised learning [5], and domain allocation mechanism of adaptive optimal transport and derive adaptation [6]. The essential problem in these applications is the duality theory. The theoretical analysis provides insights how to compare two probability distributions such as aligning into adaptive optimal transport and reinforces its mathematical the fake images with the real images, aligning images with foundation. We believe that adaptive optimal transport is of audio, or aligning the AI generated content with human great interests to the broad areas such as artificial intelligence, feedback in large language model.


The study of short texts in digital politics: Document aggregation for topic modeling

arXiv.org Artificial Intelligence

Statistical topic modeling is widely used in political science to study text. Researchers examine documents of varying lengths, from tweets to speeches. There is ongoing debate on how document length affects the interpretability of topic models. We investigate the effects of aggregating short documents into larger ones based on natural units that partition the corpus. In our study, we analyze one million tweets by U.S. state legislators from April 2016 to September 2020. We find that for documents aggregated at the account level, topics are more associated with individual states than when using individual tweets. This finding is replicated with Wikipedia pages aggregated by birth cities, showing how document definitions can impact topic modeling results.


A Consensus Privacy Metrics Framework for Synthetic Data

arXiv.org Artificial Intelligence

Synthetic data generation is one approach for sharing individual-level data. However, to meet legislative requirements, it is necessary to demonstrate that the individuals' privacy is adequately protected. There is no consolidated standard for measuring privacy in synthetic data. Through an expert panel and consensus process, we developed a framework for evaluating privacy in synthetic data. Our findings indicate that current similarity metrics fail to measure identity disclosure, and their use is discouraged. For differentially private synthetic data, a privacy budget other than close to zero was not considered interpretable. There was consensus on the importance of membership and attribute disclosure, both of which involve inferring personal information about an individual without necessarily revealing their identity. The resultant framework provides precise recommendations for metrics that address these types of disclosures effectively. Our findings further present specific opportunities for future research that can help with widespread adoption of synthetic data.


SafeArena: Evaluating the Safety of Autonomous Web Agents

arXiv.org Artificial Intelligence

LLM-based agents are becoming increasingly proficient at solving web-based tasks. With this capability comes a greater risk of misuse for malicious purposes, such as posting misinformation in an online forum or selling illicit substances on a website. To evaluate these risks, we propose SafeArena, the first benchmark to focus on the deliberate misuse of web agents. SafeArena comprises 250 safe and 250 harmful tasks across four websites. We classify the harmful tasks into five harm categories -- misinformation, illegal activity, harassment, cybercrime, and social bias, designed to assess realistic misuses of web agents. We evaluate leading LLM-based web agents, including GPT-4o, Claude-3.5 Sonnet, Qwen-2-VL 72B, and Llama-3.2 90B, on our benchmark. To systematically assess their susceptibility to harmful tasks, we introduce the Agent Risk Assessment framework that categorizes agent behavior across four risk levels. We find agents are surprisingly compliant with malicious requests, with GPT-4o and Qwen-2 completing 34.7% and 27.3% of harmful requests, respectively. Our findings highlight the urgent need for safety alignment procedures for web agents. Our benchmark is available here: https://safearena.github.io


IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval

arXiv.org Artificial Intelligence

We introduce IFIR, the first comprehensive benchmark designed to evaluate instruction-following information retrieval (IR) in expert domains. IFIR includes 2,426 high-quality examples and covers eight subsets across four specialized domains: finance, law, healthcare, and science literature. Each subset addresses one or more domain-specific retrieval tasks, replicating real-world scenarios where customized instructions are critical. IFIR enables a detailed analysis of instruction-following retrieval capabilities by incorporating instructions at different levels of complexity. We also propose a novel LLM-based evaluation method to provide a more precise and reliable assessment of model performance in following instructions. Through extensive experiments on 15 frontier retrieval models, including those based on LLMs, our results reveal that current models face significant challenges in effectively following complex, domain-specific instructions. We further provide in-depth analyses to highlight these limitations, offering valuable insights to guide future advancements in retriever development.


HalluCounter: Reference-free LLM Hallucination Detection in the Wild!

arXiv.org Artificial Intelligence

Response consistency-based, reference-free hallucination detection (RFHD) methods do not depend on internal model states, such as generation probabilities or gradients, which Grey-box models typically rely on but are inaccessible in closed-source LLMs. However, their inability to capture query-response alignment patterns often results in lower detection accuracy. Additionally, the lack of large-scale benchmark datasets spanning diverse domains remains a challenge, as most existing datasets are limited in size and scope. To this end, we propose HalluCounter, a novel reference-free hallucination detection method that utilizes both response-response and query-response consistency and alignment patterns. This enables the training of a classifier that detects hallucinations and provides a confidence score and an optimal response for user queries. Furthermore, we introduce HalluCounterEval, a benchmark dataset comprising both synthetically generated and human-curated samples across multiple domains. Our method outperforms state-of-the-art approaches by a significant margin, achieving over 90\% average confidence in hallucination detection across datasets.


Privacy Preserving and Robust Aggregation for Cross-Silo Federated Learning in Non-IID Settings

arXiv.org Artificial Intelligence

Federated Averaging remains the most widely used aggregation strategy in federated learning due to its simplicity and scalability. However, its performance degrades significantly in non-IID data settings, where client distributions are highly imbalanced or skewed. Additionally, it relies on clients transmitting metadata, specifically the number of training samples, which introduces privacy risks and may conflict with regulatory frameworks like the European GDPR. In this paper, we propose a novel aggregation strategy that addresses these challenges by introducing class-aware gradient masking. Unlike traditional approaches, our method relies solely on gradient updates, eliminating the need for any additional client metadata, thereby enhancing privacy protection. Furthermore, our approach validates and dynamically weights client contributions based on class-specific importance, ensuring robustness against non-IID distributions, convergence prevention, and backdoor attacks. Extensive experiments on benchmark datasets demonstrate that our method not only outperforms FedAvg and other widely accepted aggregation strategies in non-IID settings but also preserves model integrity in adversarial scenarios. Our results establish the effectiveness of gradient masking as a practical and secure solution for federated learning.