 Law


WangchanThaiInstruct: An instruction-following Dataset for Culture-Aware, Multitask, and Multi-domain Evaluation in Thai

arXiv.org Artificial Intelligence

Large language models excel at instruction-following in English, but their performance in low-resource languages like Thai remains underexplored. Existing benchmarks often rely on translations, missing cultural and domain-specific nuances needed for real-world use. We present WangchanThaiInstruct, a human-authored Thai dataset for evaluation and instruction tuning, covering four professional domains and seven task types. Created through a multi-stage quality control process with annotators, domain experts, and AI researchers, WangchanThaiInstruct supports two studies: (1) a zero-shot evaluation showing performance gaps on culturally and professionally specific tasks, and (2) an instruction tuning study with ablations isolating the effect of native supervision. Models fine-tuned on WangchanThaiInstruct outperform those using translated data in both in-domain and out-of-domain benchmarks. These findings underscore the need for culturally and professionally grounded instruction data to improve LLM alignment in low-resource, linguistically diverse settings.


Beyond Linear Steering: Unified Multi-Attribute Control for Language Models

arXiv.org Artificial Intelligence

Controlling multiple behavioral attributes in large language models (LLMs) at inference time is a challenging problem due to interference between attributes and the limitations of linear steering methods, which assume additive behavior in activation space and require per-attribute tuning. We introduce K-Steering, a unified and flexible approach that trains a single non-linear multi-label classifier on hidden activations and computes intervention directions via gradients at inference time. This avoids linearity assumptions, removes the need for storing and tuning separate attribute vectors, and allows dynamic composition of behaviors without retraining. To evaluate our method, we propose two new benchmarks, ToneBank and DebateMix, targeting compositional behavioral control. Empirical results across 3 model families, validated by both activation-based classifiers and LLM-based judges, demonstrate that K-Steering outperforms strong baselines in accurately steering multiple behaviors.
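The gradient-based steering idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny two-layer classifier with random, untrained weights, the cross-entropy-style target loss, and the names `make_classifier`, `loss_gradient`, and `steer` are all assumptions made for illustration. The sketch only shows how a single non-linear multi-label classifier over an activation vector can yield an intervention direction, via its input gradient, for any chosen subset of labels.

```python
import math
import random


def make_classifier(dim, hidden, n_labels, seed=0):
    """Hypothetical stand-in for a trained classifier: random, untrained weights."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0, 0.3) for _ in range(dim)] for _ in range(hidden)]
    w2 = [[rng.gauss(0, 0.3) for _ in range(hidden)] for _ in range(n_labels)]
    return w1, w2


def forward(clf, h):
    """Return hidden tanh units and per-label sigmoid probabilities."""
    w1, w2 = clf
    a = [math.tanh(sum(w * x for w, x in zip(row, h))) for row in w1]
    p = [1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, a)))) for row in w2]
    return a, p


def target_loss(clf, h, targets):
    """Negative log-probability of the targeted labels (an assumed loss choice)."""
    _, p = forward(clf, h)
    return -sum(math.log(p[k]) for k in targets)


def loss_gradient(clf, h, targets):
    """Gradient of target_loss with respect to the activation vector h."""
    w1, w2 = clf
    a, p = forward(clf, h)
    # d loss / d logit_k is (p_k - 1) for targeted labels and 0 otherwise.
    g_logit = [(p[k] - 1.0) if k in targets else 0.0 for k in range(len(p))]
    # Backpropagate through the output layer and the tanh non-linearity.
    g_pre = [
        (1.0 - a[j] ** 2) * sum(g_logit[k] * w2[k][j] for k in range(len(w2)))
        for j in range(len(a))
    ]
    # Backpropagate through the first layer to the input activation.
    return [sum(g_pre[j] * w1[j][i] for j in range(len(w1))) for i in range(len(h))]


def steer(clf, h, targets, alpha=0.02, steps=20):
    """Move h along the negative gradient so the targeted labels become more likely."""
    h = list(h)
    for _ in range(steps):
        grad = loss_gradient(clf, h, targets)
        h = [x - alpha * g for x, g in zip(h, grad)]
    return h


if __name__ == "__main__":
    clf = make_classifier(dim=8, hidden=16, n_labels=3)
    h0 = [0.1 * i for i in range(8)]
    targets = {0, 2}  # compose two attributes without retraining anything
    h1 = steer(clf, h0, targets)
    print("loss before:", target_loss(clf, h0, targets))
    print("loss after: ", target_loss(clf, h1, targets))
```

Changing `targets` is enough to steer toward a different combination of labels, which is the property the abstract contrasts with storing and tuning one vector per attribute.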


US children among five killed in Israeli drone strike on southern Lebanon

Al Jazeera

An Israeli drone strike has killed five people, including three children, in the southern Lebanese town of Bint Jbeil, Lebanon's Health Ministry has said, as Israel continues to target its neighbour despite a US-brokered truce that took effect in November. The state-run National News Agency (NNA) reported on Sunday that the strike targeted a motorcycle and a vehicle, and wounded two other people. The mother of the children was injured in the attack.


Chatbot site depicting child sexual abuse images raises fears over misuse of AI

The Guardian

A chatbot site offering explicit scenarios with preteen characters, illustrated by illegal abuse images, has raised fresh fears about the misuse of artificial intelligence. A report by a child safety watchdog has triggered calls for the UK government to impose safety guidelines on AI companies, amid a surge in child sexual abuse material (CSAM) created by the technology. The Internet Watch Foundation said it had been alerted to a chatbot site that offered a number of scenarios including "child prostitute in a hotel", "sex with your child while your wife is on holiday" and "child and teacher alone after class".


Meta Accused of Torrenting Porn to Advance Its Goal of AI 'Superintelligence'

WIRED

The complaint, filed in July, alleges Meta has been torrenting and seeding Strike 3's videos since 2018. Associated exhibits and details of the complaint were unsealed last week. Strike 3 alleges Meta's motive was partly to obtain otherwise difficult-to-scrape visual angles, parts of the human body, and extended, uninterrupted scenes--rare in mainstream movies and TV--to help it create what Mark Zuckerberg calls AI "superintelligence." "They have an interest in getting our content because it can give them a competitive advantage for the quality, fluidity, and humanity of the AI," alleges Christian Waugh, an attorney for Strike 3. This process made Strike 3's porn videos accessible to minors, the complaint alleges, since BitTorrent does not have age verification.


ChatGPT was used 'to help scammers do their thing' at Asia fraud compound

The Japan Times

ChatGPT owner OpenAI says it actively works to identify and disrupt scam-related misuse of ChatGPT. Duncan Okindo says he was lured to Southeast Asia last year by the promise of a customer service job in Thailand. Instead, he ended up spending four months in a scam compound on the lawless Myanmar-Thai border, where he saw first-hand how criminal groups are at scale. Okindo, 26, says he was struggling to find a job as the breadwinner for his family in his native Kenya when a local recruitment agency promised him work in Bangkok. The flight was his first trip overseas. On landing, he says, he was abducted at the airport and spirited across the border, into the notorious KK Park complex, guarded by heavily armed men and fortified like it was meant for war.


Trustless Autonomy: Understanding Motivations, Benefits, and Governance Dilemmas in Self-Sovereign Decentralized AI Agents

arXiv.org Artificial Intelligence

The recent trend of self-sovereign Decentralized AI Agents (DeAgents) combines Large Language Model (LLM)-based AI agents with decentralization technologies such as blockchain smart contracts and trusted execution environments (TEEs). These tamper-resistant trustless substrates allow agents to achieve self-sovereignty through ownership of cryptowallet private keys and control of digital assets and social media accounts. DeAgents eliminate centralized control and reduce human intervention, addressing key trust concerns inherent in centralized AI systems. This contributes to social computing by enabling a new human cooperative paradigm, "intelligence as commons." However, given ongoing challenges in LLM reliability such as hallucinations, this creates a paradoxical tension between trustlessness and unreliable autonomy. This study addresses this empirical research gap through interviews with DeAgents stakeholders (experts, founders, and developers) to examine their motivations, benefits, and governance dilemmas. The findings will guide future DeAgents system and protocol design and inform discussions about governance in sociotechnical AI systems in the future agentic web.


CausalPre: Scalable and Effective Data Pre-processing for Causal Fairness

arXiv.org Artificial Intelligence

Causal fairness in databases is crucial to preventing biased and inaccurate outcomes in downstream tasks. While most prior work assumes a known causal model, recent efforts relax this assumption by enforcing additional constraints. However, these approaches often fail to capture broader attribute relationships that are critical to maintaining utility. This raises a fundamental question: Can we harness the benefits of causal reasoning to design efficient and effective fairness solutions without relying on strong assumptions about the underlying causal model? In this paper, we seek to answer this question by introducing CausalPre, a scalable and effective causality-guided data pre-processing framework that guarantees justifiable fairness, a strong causal notion of fairness. CausalPre extracts causally fair relationships by reformulating the originally complex and computationally infeasible extraction task into a tailored distribution estimation problem. To ensure scalability, CausalPre adopts a carefully crafted variant of low-dimensional marginal factorization to approximate the joint distribution, complemented by a heuristic algorithm that efficiently tackles the associated computational challenge. Extensive experiments on benchmark datasets demonstrate that CausalPre is both effective and scalable, challenging the conventional belief that achieving causal fairness requires trading off relationship coverage for relaxed model assumptions.
Machine learning (ML) systems are increasingly integrated into decision-making processes in domains such as education [1], finance [2], employment [3], advertising [4], and law enforcement [5], [6]. While these systems offer efficiency and scalability, they also pose serious concerns about fairness [7]-[14]. In particular, their reliance on historical data can unintentionally amplify biases, producing inaccurate, discriminatory outcomes with severe real-world impacts in high-stakes areas like criminal justice.
These concerns have motivated the development of fairness-aware data pre-processing techniques within database management systems (DBMS) [15]-[22]. Compared to traditional fairness interventions at the model training or inference stages [23]-[28], pre-processing methods offer: (i) a once-for-all benefit, meaning that once data is calibrated for fairness, it can be used in any downstream task, regardless of the ML model employed; and (ii) a user-friendly workflow, as fairness considerations are directly embedded into the data pre-processing pipeline, enabling practitioners to focus on the downstream task without specialized fairness expertise. A straightforward approach to achieve this is to remove all sensitive attributes (e.g., gender and race) from the training data. However, such ad hoc solutions often fail in practice, as non-sensitive attributes may act as proxies for sensitive ones, particularly when strong correlations exist [18], [29].
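The proxy problem described above can be made concrete with a small synthetic sketch. It is not part of CausalPre: the data are generated, and the names `make_synthetic`, `sensitive`, `proxy`, and `unrelated` are hypothetical. It shows that dropping a sensitive column does not remove its signal when a retained attribute is strongly correlated with it.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def make_synthetic(n=2000, seed=0, agreement=0.9):
    """Generate a binary sensitive attribute, a proxy that copies it with
    probability `agreement`, and an independent attribute (all synthetic)."""
    rng = random.Random(seed)
    sensitive = [rng.randint(0, 1) for _ in range(n)]
    proxy = [s if rng.random() < agreement else 1 - s for s in sensitive]
    unrelated = [rng.randint(0, 1) for _ in range(n)]
    return sensitive, proxy, unrelated


if __name__ == "__main__":
    sensitive, proxy, unrelated = make_synthetic()
    # Even if `sensitive` is removed from the training data, `proxy` still
    # carries most of its information, while `unrelated` carries almost none.
    print("corr(sensitive, proxy):    ", round(pearson(sensitive, proxy), 3))
    print("corr(sensitive, unrelated):", round(pearson(sensitive, unrelated), 3))
```

A downstream model trained only on `proxy` and `unrelated` could therefore still reproduce decisions that track the removed attribute, which is why simple column removal is described as insufficient.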


Emergent Alignment via Competition

arXiv.org Artificial Intelligence

Aligning AI systems with human values remains a fundamental challenge, but does our inability to create perfectly aligned models preclude obtaining the benefits of alignment? We study a strategic setting where a human user interacts with multiple differently misaligned AI agents, none of which are individually well-aligned. Our key insight is that when the user's utility lies approximately within the convex hull of the agents' utilities, a condition that becomes easier to satisfy as model diversity increases, strategic competition can yield outcomes comparable to interacting with a perfectly aligned model. We model this as a multi-leader Stackelberg game, extending Bayesian persuasion to multi-round conversations between differently informed parties, and prove three results: (1) when perfect alignment would allow the user to learn her Bayes-optimal action, she can also do so in all equilibria under the convex hull condition; (2) under weaker assumptions requiring only approximate utility learning, a non-strategic user employing quantal response achieves near-optimal utility in all equilibria; and (3) when the user selects the best single AI after an evaluation period, equilibrium guarantees remain near-optimal without further distributional assumptions. We complement the theory with two sets of experiments.
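The convex hull condition can be checked numerically in a simple finite setting. The sketch below is an illustration under assumptions, not the paper's formalization: utilities are represented as plain vectors over a finite set of outcomes, distance is Euclidean, and the function name `distance_to_hull` is hypothetical. It uses the Frank-Wolfe method with exact line search to approximate how far the user's utility vector is from the convex hull of the agents' utility vectors.

```python
import math


def distance_to_hull(agent_utils, user_util, iters=5000):
    """Approximate the Euclidean distance from user_util to the convex hull of
    agent_utils, returning (distance, mixing_weights)."""
    n = len(agent_utils)
    dims = range(len(user_util))
    w = [1.0 / n] * n  # start from the uniform mixture of agents

    def mix(weights):
        return [sum(wi * u[d] for wi, u in zip(weights, agent_utils)) for d in dims]

    for _ in range(iters):
        x = mix(w)
        r = [xi - ui for xi, ui in zip(x, user_util)]  # residual x - u
        # Gradient of 0.5 * ||x - u||^2 with respect to weight i is r . agent_i.
        scores = [sum(rd * ud for rd, ud in zip(r, u)) for u in agent_utils]
        i = min(range(n), key=scores.__getitem__)  # best vertex of the simplex
        d = [ad - xd for ad, xd in zip(agent_utils[i], x)]
        dd = sum(v * v for v in d)
        if dd == 0.0:
            break  # already at the selected vertex
        # Exact line search for a quadratic objective, clipped to [0, 1].
        gamma = max(0.0, min(1.0, -sum(rd * v for rd, v in zip(r, d)) / dd))
        if gamma == 0.0:
            break  # no descent direction left: current mixture is optimal
        w = [(1.0 - gamma) * wi for wi in w]
        w[i] += gamma

    x = mix(w)
    return math.sqrt(sum((xi - ui) ** 2 for xi, ui in zip(x, user_util))), w


if __name__ == "__main__":
    # Three agents, each caring about exactly one of three outcomes (synthetic).
    agents = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    inside, _ = distance_to_hull(agents, [0.2, 0.3, 0.5])
    outside, _ = distance_to_hull(agents, [1.0, 1.0, 1.0])
    print("distance for a user inside the hull: ", inside)
    print("distance for a user outside the hull:", outside)
```

A small distance means some weighted mixture of the agents' utilities closely approximates the user's, which is the situation the abstract says becomes easier to reach as model diversity increases.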


Sentinel Agents for Secure and Trustworthy Agentic AI in Multi-Agent Systems

arXiv.org Artificial Intelligence

This paper proposes a novel architectural framework aimed at enhancing security and reliability in multi-agent systems (MAS). A central component of this framework is a network of Sentinel Agents, functioning as a distributed security layer that integrates techniques such as semantic analysis via large language models (LLMs), behavioral analytics, retrieval-augmented verification, and cross-agent anomaly detection. Such agents can oversee inter-agent communications, identify potential threats, enforce privacy and access controls, and maintain comprehensive audit records. Complementary to the idea of Sentinel Agents is the use of a Coordinator Agent. The Coordinator Agent supervises policy implementation and manages agent participation. In addition, the Coordinator also ingests alerts from Sentinel Agents. Based on these alerts, it can adapt policies, isolate or quarantine misbehaving agents, and contain threats to maintain the integrity of the MAS ecosystem. This dual-layered security approach, combining the continuous monitoring of Sentinel Agents with the governance functions of Coordinator Agents, supports dynamic and adaptive defense mechanisms against a range of threats, including prompt injection, collusive agent behavior, hallucinations generated by LLMs, privacy breaches, and coordinated multi-agent attacks. In addition to the architectural design, we present a simulation study where 162 synthetic attacks of different families (prompt injection, hallucination, and data exfiltration) were injected into a multi-agent conversational environment. The Sentinel Agents successfully detected the attack attempts, confirming the practical feasibility of the proposed monitoring approach. The framework also offers enhanced system observability, supports regulatory compliance, and enables policy evolution over time.