AITopics

The auditing of financial documents, historically a labor-intensive process, stands on the precipice of transformation. AI-driven solutions have made inroads into streamlining this process by recommending pertinent text passages from financial reports to align with the legal requirements of accounting standards. However, a glaring limitation remains: these systems commonly fall short in verifying whether the recommended excerpts indeed comply with the specific legal mandates. Hence, in this paper, we probe the efficiency of publicly available Large Language Models (LLMs) in the realm of regulatory compliance across different model configurations. We place particular emphasis on comparing cutting-edge open-source LLMs, such as Llama-2, with their proprietary counterparts like OpenAI's GPT models. This comparative analysis leverages two custom datasets provided by our partner PricewaterhouseCoopers (PwC) Germany. We find that the open-source Llama-2 70 billion parameter model demonstrates outstanding performance in detecting non-compliance or true negative occurrences, outperforming all of its proprietary counterparts. Nevertheless, proprietary models such as GPT-4 perform the best in a broad variety of scenarios, particularly in non-English contexts.
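The "true negative" performance mentioned above can be made concrete with a small confusion-matrix calculation. The following is a minimal sketch, assuming binary labels where 1 means a passage complies and 0 means it does not; the function name and the toy labels are hypothetical and are not taken from the paper or its datasets.

```python
def true_negative_rate(y_true, y_pred):
    """Share of truly non-compliant items (label 0) that were predicted as 0."""
    # True negatives: non-compliant items correctly flagged as non-compliant.
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    # False positives: non-compliant items wrongly predicted as compliant.
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tn + fp == 0:
        raise ValueError("no negative (non-compliant) examples in y_true")
    return tn / (tn + fp)


# Hypothetical toy labels, for illustration only.
y_true = [0, 0, 0, 1, 1]
y_pred = [0, 0, 1, 1, 0]
rate = true_negative_rate(y_true, y_pred)
print(f"true negative rate: {rate:.3f}")
```

A model can score well on this measure while still missing compliant passages, which is why it is usually read alongside recall on the positive class.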

large language model, machine learning, natural language

doi: 10.1109/BigData59044.2023.10386518

2507.16642

Country: Europe > Germany (0.49)

Genre: Research Report > New Finding (0.46)

Industry:

Law (1.00)
Government (1.00)
Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Comprehensive Data-centric Overview of Federated Graph Learning

Wu, Zhengyu, Li, Xunkai, Zhu, Yinlin, Chen, Zekai, Yan, Guochen, Yan, Yanyu, Zhang, Hao, Ai, Yuming, Jin, Xinmo, Li, Rong-Hua, Wang, Guoren

In the era of big data applications, Federated Graph Learning (FGL) has emerged as a prominent solution that reconciles the tradeoff between optimizing the collective intelligence of decentralized dataset holders and preserving sensitive information to the greatest possible extent. Existing FGL surveys have contributed meaningfully but largely focus on integrating Federated Learning (FL) and Graph Machine Learning (GML), resulting in early-stage taxonomies that emphasize methodology and simulated scenarios. Notably, a data-centric perspective, which systematically examines FGL methods through the lens of data properties and usage, has not yet been adopted to reorganize FGL research, yet it is critical for assessing how FGL studies tackle data-centric constraints to enhance model performance. This survey proposes a two-level data-centric taxonomy: Data Characteristics, which categorizes studies based on the structural and distributional properties of datasets used in FGL, and Data Utilization, which analyzes the training procedures and techniques employed to overcome key data-centric challenges. Each taxonomy level is defined by three orthogonal criteria, each representing a distinct data-centric configuration. Beyond taxonomy, this survey examines FGL integration with Pretrained Large Models, showcases realistic applications, and highlights future directions aligned with emerging trends in GML.

artificial intelligence, graph, machine learning

2507.16541

Country:

Asia > China (0.28)
North America > United States (0.28)

Genre: Overview (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Disability Across Cultures: A Human-Centered Audit of Ableism in Western and Indic LLMs

Phutane, Mahika, Vashistha, Aditya

People with disabilities (PwD) experience disproportionately high levels of discrimination and hate online, particularly in India, where entrenched stigma and limited resources intensify these challenges. Large language models (LLMs) are increasingly used to identify and mitigate online hate, yet most research on online ableism focuses on Western audiences with Western AI models. Are these models adequately equipped to recognize ableist harm in non-Western places like India? Do localized, Indic language models perform better? To investigate, we adopted and translated a publicly available ableist speech dataset to Hindi, and prompted eight LLMs--four developed in the U.S. (GPT-4, Gemini, Claude, Llama) and four in India (Krutrim, Nanda, Gajendra, Airavata)--to score and explain ableism. In parallel, we recruited 175 PwD from both the U.S. and India to perform the same task, revealing stark differences between groups. Western LLMs consistently overestimated ableist harm, while Indic LLMs underestimated it. Even more concerning, all LLMs were more tolerant of ableism when it was expressed in Hindi and asserted Western framings of ableist harm. In contrast, Indian PwD interpreted harm through intention, relationality, and resilience--emphasizing a desire to inform and educate perpetrators. This work provides groundwork for global, inclusive standards of ableism, demonstrating the need to center local disability experiences in the design and evaluation of AI systems.

large language model, machine learning, pwd

2507.1613

Country:

North America > United States (1.00)
Asia > India (0.89)
Asia > Middle East > UAE (0.46)

Genre: Research Report > New Finding (0.95)

Industry:

Law (1.00)
Government (1.00)
Health & Medicine > Consumer Health (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

TaxCalcBench: Evaluating Frontier Models on the Tax Calculation Task

Bock, Michael R., Molisee, Kara, Ozer, Zachary, Shah, Sumit

Can AI file your taxes? Not yet. Calculating US personal income taxes is a task that requires building an understanding of vast amounts of English text and using that knowledge to carefully compute results. We propose TaxCalcBench, a benchmark for determining models' abilities to calculate personal income tax returns given all of the necessary information. Our experiment shows that state-of-the-art models succeed in calculating less than a third of federal income tax returns even on this simplified sample set. Our analysis concludes that models consistently misuse tax tables, make errors in tax calculation, and incorrectly determine eligibility. Our findings point to the need for additional infrastructure to apply LLMs to the personal income tax calculation task.

large language model, machine learning, natural language

2507.16126

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Law > Taxation Law (1.00)
Government > Tax (1.00)
Government > Regional Government > North America Government > United States Government (0.96)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)

"Just a strange pic": Evaluating 'safety' in GenAI Image safety annotation tasks from diverse annotators' perspectives

Wang, Ding, Díaz, Mark, Rastogi, Charvi, Davani, Aida, Prabhakaran, Vinodkumar, Mishra, Pushkar, Patel, Roma, Parrish, Alicia, Ashwood, Zoe, Paganini, Michela, Teh, Tian Huey, Rieser, Verena, Aroyo, Lora

Understanding what constitutes safety in AI-generated content is complex. While developers often rely on predefined taxonomies, real-world safety judgments also involve personal, social, and cultural perceptions of harm. This paper examines how annotators evaluate the safety of AI-generated images, focusing on the qualitative reasoning behind their judgments. Analyzing 5,372 open-ended comments, we find that annotators consistently invoke moral, emotional, and contextual reasoning that extends beyond structured safety categories. Many reflect on potential harm to others more than to themselves, grounding their judgments in lived experience, collective risk, and sociocultural awareness. Beyond individual perceptions, we also find that the structure of the task itself -- including annotation guidelines -- shapes how annotators interpret and express harm. Guidelines influence not only which images are flagged, but also the moral judgment behind the justifications. Annotators frequently cite factors such as image quality, visual distortion, and mismatches between prompt and output as contributing to perceived harm dimensions, which are often overlooked in standard evaluation frameworks. Our findings reveal that existing safety pipelines miss critical forms of reasoning that annotators bring to the task. We argue for evaluation designs that scaffold moral reflection, differentiate types of harm, and make space for subjective, context-sensitive interpretations of AI-generated content.

annotator, machine learning, natural language

2507.16033

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Media (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Los Angeles Times, Jul-22-2025, 18:47:17 GMT

Chabria: 3 things that should scare us about Trump's fake video of Obama

On Sunday, our thoughtful and reserved president reposted on his Truth Social site a video generated by artificial intelligence that falsely showed former President Obama being arrested and imprisoned. There are those among you who think this is high humor; those among you who find it as tiresome as it is offensive; and those among you blissfully unaware of the mental morass that is Truth Social. Whatever camp you fall into, the video crosses all demographics by being expected -- just another crazy Trump stunt in a repetitive cycle of division and diversion so frequent it makes Groundhog Day seem fresh. But there are three reasons why this particular video -- not made by the president but amplified to thousands -- is worth noting, and maybe even worth fearing. First, it is flat-out racist. In it, Obama is ripped out of a chair in the Oval Office and forced onto his knees, almost bowing, to a laughing Trump.

artificial intelligence, social media, trump

Los Angeles Times

Country:

North America > United States > Ohio (0.05)
North America > United States > California > San Francisco County > San Francisco (0.05)
Europe > Russia (0.05)
Asia > Russia (0.05)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Law (0.95)

Technology:

Information Technology > Artificial Intelligence (0.55)
Information Technology > Communications > Social Media (0.46)

LLM Economist: Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra

Karten, Seth, Li, Wenzhe, Ding, Zihan, Kleiner, Samuel, Bai, Yu, Jin, Chi

We present the LLM Economist, a novel framework that uses agent-based modeling to design and assess economic policies in strategic environments with hierarchical decision-making. At the lower level, bounded rational worker agents -- instantiated as persona-conditioned prompts sampled from U.S. Census-calibrated income and demographic statistics -- choose labor supply to maximize text-based utility functions learned in-context. At the upper level, a planner agent employs in-context reinforcement learning to propose piecewise-linear marginal tax schedules anchored to the current U.S. federal brackets. This construction endows economic simulacra with three capabilities requisite for credible fiscal experimentation: (i) optimization of heterogeneous utilities, (ii) principled generation of large, demographically realistic agent populations, and (iii) mechanism design -- the ultimate nudging problem -- expressed entirely in natural language. Experiments with populations of up to one hundred interacting agents show that the planner converges near Stackelberg equilibria that improve aggregate social welfare relative to Saez solutions, while a periodic, persona-level voting procedure furthers these gains under decentralized governance. These results demonstrate that large language model-based agents can jointly model, simulate, and govern complex economic systems, providing a tractable test bed for policy evaluation at the societal scale to help build better civilizations.
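To make the tax-schedule idea above concrete, here is a minimal sketch of how a bracketed marginal schedule maps income to tax owed. It assumes one common reading, in which the marginal rate is constant within each bracket so that total tax is a piecewise-linear function of income. The function name, thresholds, and rates are hypothetical illustrations; they are not the paper's implementation and not actual U.S. federal brackets.

```python
def tax_owed(income, thresholds, rates):
    """Tax owed when rates[i] applies to income above thresholds[i]
    and below the next threshold (the top bracket is unbounded)."""
    if income < 0:
        raise ValueError("income must be non-negative")
    if len(thresholds) != len(rates):
        raise ValueError("thresholds and rates must have the same length")
    owed = 0.0
    for i, (lower, rate) in enumerate(zip(thresholds, rates)):
        if income <= lower:
            break  # income never reaches this bracket or any higher one
        # Upper edge of this bracket; the last bracket has no upper edge.
        upper = thresholds[i + 1] if i + 1 < len(thresholds) else float("inf")
        # Only the slice of income that falls inside the bracket is taxed at its rate.
        owed += (min(income, upper) - lower) * rate
    return owed


# Hypothetical schedule: 10% up to 10,000, 20% up to 40,000, 30% above that.
thresholds = [0, 10_000, 40_000]
rates = [0.10, 0.20, 0.30]
print(tax_owed(50_000, thresholds, rates))
```

Under this reading, a planner that proposes a new schedule only needs to emit the list of rates (and possibly thresholds), and the liability for any worker follows mechanically.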

arxiv preprint arxiv, large language model, machine learning

2507.15815

Country: North America > United States (0.93)

Genre: Research Report > New Finding (1.00)

Industry:

Law > Taxation Law (1.00)
Health & Medicine (1.00)
Government > Tax (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Challenges of Trustworthy Federated Learning: What's Done, Current Trends and Remaining Work

Rodríguez-Barroso, Nuria, García-Márquez, Mario, Luzón, M. Victoria, Herrera, Francisco

In recent years, the development of Trustworthy Artificial Intelligence (TAI) has emerged as a critical objective in the deployment of AI systems across sensitive and high-risk domains. TAI frameworks articulate a comprehensive set of ethical, legal, and technical requirements to ensure that AI technologies are aligned with human values, rights, and societal expectations. Among the various AI paradigms, Federated Learning (FL) presents a promising solution to pressing privacy concerns. However, aligning FL with the rest of the requirements of TAI presents a series of challenges, most of which arise from its inherently distributed nature. In this work, we adopt the requirements of TAI as a guiding structure to systematically analyze the challenges of adapting FL to TAI. Specifically, we classify and examine the key obstacles to aligning FL with TAI, providing a detailed exploration of what has been done, the trends, and the remaining work within each of the identified challenges.

artificial intelligence, data mining, machine learning

2507.15796

Country:

North America > United States (0.28)
Europe (0.28)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.87)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Government (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Applying the Chinese Wall Reverse Engineering Technique to Large Language Model Code Editing

Hanmongkolchai, Manatsawin

This work does not provide legal advice and does not claim that any legal opinion provided is correct. [...] and block the model's output accordingly. This technique might not completely block partial matches and does not work for open source development, as reproduction of the original code is expected. Some models address this issue by using curated datasets with appropriately licensed content. For example, the Stack v2 dataset [2] and the Starcoder2 model limit data to permissively licensed sources and content with unknown licenses. The Common Pile dataset [3] and the accompanying Comma model improve on this by limiting the dataset to permissively licensed content only. Most permissive licenses have attribution as their sole primary licensing condition and may be easier to comply with than the GPLv2 license. Ideally, models trained on public domain content may be the best in terms of legal compliance, as such content carries no restrictions or requirements, but to our knowledge no such text generation models of reasonable quality exist today.

large language model, machine learning, natural language

2507.15599

Genre: Research Report (0.40)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.98)

The Constitutional Controller: Doubt-Calibrated Steering of Compliant Agents

Kohaut, Simon, Divo, Felix, Hamid, Navid, Flade, Benedict, Eggert, Julian, Dhami, Devendra Singh, Kersting, Kristian

Ensuring reliable and rule-compliant behavior of autonomous agents in uncertain environments remains a fundamental challenge in modern robotics. Our work shows how neuro-symbolic systems, which integrate probabilistic, symbolic white-box reasoning models with deep learning methods, offer a powerful solution to this challenge. This enables the simultaneous consideration of explicit rules and neural models trained on noisy data, combining the strength of structured reasoning with flexible representations. To this end, we introduce the Constitutional Controller (CoCo), a novel framework designed to enhance the safety and reliability of agents by reasoning over deep probabilistic logic programs representing constraints such as those found in shared traffic spaces. Furthermore, we propose the concept of self-doubt, implemented as a probability density conditioned on doubt features such as travel velocity, employed sensors, or health factors. In a real-world aerial mobility study, we demonstrate CoCo's advantages for intelligent autonomous systems to learn appropriate doubts and navigate complex and uncertain environments safely and compliantly.

artificial intelligence, deep learning, machine learning

2507.15478

Country: Europe > Germany (0.69)

Genre: Research Report (0.50)

Industry:

Transportation > Air (1.00)
Law (1.00)
Transportation > Infrastructure & Services (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)