AITopics

2503.23979

Country:

Europe > Spain > Galicia > Madrid (0.04)
Europe > United Kingdom > England > Bristol (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Banking & Finance > Credit (1.00)
Law (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Bermudez-Villalva, Adrian, Mehrnezhad, Maryam, Toreini, Ehsan

Measuring Online Hate on 4chan using Pre-trained Deep Learning Models

Online hate speech can harmfully impact individuals and groups, specifically on non-moderated platforms such as 4chan where users can post anonymous content. This work focuses on analysing and measuring the prevalence of online hate on 4chan's politically incorrect board (/pol/) using state-of-the-art Natural Language Processing (NLP) models, specifically transformer-based models such as RoBERTa and Detoxify. By leveraging these advanced models, we provide an in-depth analysis of hate speech dynamics and quantify the extent of online hate on non-moderated platforms. The study advances understanding through multi-class classification of hate speech (racism, sexism, religion, etc.), while also incorporating the classification of toxic content (e.g., identity attacks and threats) and a further topic modelling analysis. The results show that 11.20% of this dataset is identified as containing hate in different categories. These evaluations show that online hate is manifested in various forms, confirming the complicated and volatile nature of detection in the wild.
Index Terms: Hate speech, machine learning, natural language processing (NLP), online hate, toxicity analysis.
INTRODUCTION
The spread of hate speech on online platforms has become a serious problem in our society. As digital communication becomes ubiquitous, platforms like 4chan, known for their anonymity and minimal moderation, have become hotspots for this harmful behaviour. This is particularly evident on its politically incorrect board, /pol/, a notorious board dedicated to discussing politics and current events, often associated with hate speech, extremist content, and conspiracy theories [1]. The anonymity provided by these platforms often encourages users to express extreme ideologies [2]. This issue raises significant concerns about the impact on at-risk and vulnerable groups as it can cause real-world harm, including psychological trauma.
Therefore, a systematic approach is needed to measure and understand the prevalence and forms of online hate. Received 28 August 2024; revised 23 December 2024, 10 February 2025, and 6 March 2025; accepted 6 March 2025. This work is supported by the UK Research and Innovation (UKRI), through the Strategic Priority Fund as part of the Protecting Citizens Online programme (AGENCY: Assuring Citizen Agency in a World with Complex Online Harms, EP/W032481/2).

artificial intelligence, machine learning, natural language, (20 more...)

doi: 10.1109/TTS.2025.3549931

2504.00045

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Surrey > Guildford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government (1.00)
Law Enforcement & Public Safety (0.88)
Law > Civil Rights & Constitutional Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

MiZero: The Shadowy Defender Against Text Style Infringements

Zhang, Ziwei, Wen, Juan, Peng, Wanli, Wu, Zhengxian, Zhou, Yinghan, Xue, Yiming

In-Context Learning (ICL) and efficient fine-tuning methods have significantly enhanced the efficiency of applying Large Language Models (LLMs) to downstream tasks. However, they also raise concerns about the imitation and infringement of personal creative data. Current methods for data copyright protection primarily focus on content security but lack effectiveness in protecting the copyrights of text styles. In this paper, we introduce a novel implicit zero-watermarking scheme, namely MiZero. This scheme establishes a precise watermark domain to protect the copyrighted style, surpassing traditional watermarking methods that distort the style characteristics. Specifically, we employ LLMs to extract condensed-lists utilizing the designed instance delimitation mechanism. These lists guide MiZero in generating the watermark. Extensive experiments demonstrate that MiZero effectively verifies text style copyright ownership against AI imitation.

large language model, machine learning, natural language, (18 more...)

2504.00035

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.67)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

JudgeLRM: Large Reasoning Models as a Judge

Chen, Nuo, Hu, Zhiyuan, Zou, Qingyun, Wu, Jiaying, Wang, Qian, Hooi, Bryan, He, Bingsheng

The rise of Large Language Models (LLMs) as evaluators offers a scalable alternative to human annotation, yet existing Supervised Fine-Tuning (SFT) approaches for judges often fall short in domains requiring complex reasoning. In this work, we investigate whether LLM judges truly benefit from enhanced reasoning capabilities. Through a detailed analysis of reasoning requirements across evaluation tasks, we reveal a negative correlation between SFT performance gains and the proportion of reasoning-demanding samples, highlighting the limitations of SFT in such scenarios. To address this, we introduce JudgeLRM, a family of judgment-oriented LLMs trained using reinforcement learning (RL) with judge-wise, outcome-driven rewards. JudgeLRM models consistently outperform both SFT-tuned and state-of-the-art reasoning models. Notably, JudgeLRM-3B surpasses GPT-4, and JudgeLRM-7B outperforms DeepSeek-R1 by 2.79% in F1 score, particularly excelling in judge tasks requiring deep reasoning.

large language model, machine learning, natural language, (20 more...)

2504.0005

Country:

North America > United States > Florida > Miami-Dade County > Miami (0.04)
Asia > Singapore (0.04)

Genre: Research Report (1.00)

Industry:

Media (1.00)
Health & Medicine > Consumer Health (0.93)
Law > Civil Rights & Constitutional Law (0.68)
Health & Medicine > Therapeutic Area > Dermatology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Artificial intelligence and democracy: Towards digital authoritarianism or a democratic upgrade?

Panagopoulou, Fereniki

I) Introduction
Do robots vote? Do machines make decisions instead of us? No (at least not yet), but this is something that could happen. At the most important level, that of the electoral process, it is noted that it is not determined by AI, but it is greatly impacted by its multiple applications. New types of online campaigns, driven by AI applications, are replacing traditional ones. The potential for manipulating voters and indirectly influencing the electoral outcome should not be underestimated. Certainly, instances of voter manipulation are not absent from traditional political campaigns, with the only difference being that digital manipulation is often carried out without our knowledge, e.g. by monitoring our behavior on social media. Nevertheless, we should not overlook the positive impact that AI has in upgrading democratic institutions by providing a forum for participation in decision-making. In this context, as a first step, we look into the potential jeopardization of democratic processes posed by the use of AI tools. Secondly, we consider the possibility of strengthening democratic processes by using AI, as well as the democratization of AI itself through the possibilities it offers. Thirdly, the impact of AI on the representative system is also discussed. The paper concludes with recommendations.
II) Risks posed for democracy
Misuse of AI tools can lead to the undermining of democratic political processes or the manipulation of individuals through specific targeting, which will destabilize democracy.

artificial intelligence, democracy, social media, (14 more...)

2504.01034

Country:

Europe > France (0.14)
Asia > Taiwan (0.14)
Asia > Russia (0.14)
(18 more...)

Genre: Research Report (0.40)

Industry:

Media (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
(6 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

The Guardian Mar-29-2025, 01:50:01 GMT

Elon Musk's xAI firm buys social media platform X for $33bn

Elon Musk's xAI artificial intelligence firm has acquired Musk's X – the social media platform formerly known as Twitter – for $33bn, marking the latest twist in the billionaire's rapid consolidation of power. The all-stock deal announced on Friday combines two of Musk's multiple portfolio companies, which also include automaker Tesla and SpaceX, and potentially eases Musk's ability to train his AI model known as Grok. Musk announced the transaction in a post on X, saying: "The combination values xAI at $80bn and X at $33bn ($45B less $12B debt)." "xAI and X's futures are intertwined," he wrote. "Today, we officially take the step to combine the data, models, compute, distribution and talent."

large language model, machine learning, musk, (18 more...)

The Guardian

Country:

North America > United States > California (0.16)
North America > United States > Tennessee > Shelby County > Memphis (0.05)
North America > United States > District of Columbia > Washington (0.05)

Industry:

Law (0.73)
Government (0.51)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.55)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

Mezzi, Emanuele, Mertzani, Asimina, Manis, Michael P., Lilova, Siyanna, Vadivoulis, Nicholas, Gatirdakis, Stamatis, Roussou, Styliani, Hmede, Rodayna

Who Owns the Output? Bridging Law and Technology in LLMs Attribution

arXiv.org Artificial Intelligence Mar-29-2025

Since the introduction of ChatGPT in 2022, Large Language Models (LLMs) and Large Multimodal Models (LMMs) have transformed content creation, enabling the generation of human-quality content spanning every medium: text, images, videos, and audio. The opportunities offered by generative AI models are endless; they drastically reduce the time required to generate content and usually raise the quality of the result. However, considering the complexity and the difficult traceability of the generated content, the use of these tools poses challenges in attributing AI-generated content. Attribution is difficult for a variety of reasons, starting from the lack of systematic fingerprinting of the generated content and ending with the enormous amount of data on which LLMs and LMMs are trained, which makes it difficult to connect generated content to the training data. This scenario is raising concerns about intellectual property and ethical responsibilities. To address these concerns, in this paper we bridge the technological, ethical, and legislative aspects by reviewing the legislative and technological instruments available today and proposing a legal framework to ensure accountability. Finally, we propose three use cases showing how these can be combined to guarantee that attribution is respected. However, even though the techniques available today can guarantee attribution to a greater extent, strong limitations still apply, which can be solved only by the development of new attribution techniques to be applied to LLMs and LMMs.

attribution, large language model, machine learning, (19 more...)

2504.01032

Country:

Europe > United Kingdom (0.14)
Europe > Bulgaria (0.04)
Europe > Belgium (0.04)
(4 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Media (1.00)
Law > Intellectual Property & Technology Law (1.00)
Information Technology > Security & Privacy (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Hadan, Hilda, Mogavi, Reza Hadi, Zhang-Kennedy, Leah, Nacke, Lennart E.

Who is Responsible When AI Fails? Mapping Causes, Entities, and Consequences of AI Privacy and Ethical Incidents

arXiv.org Artificial Intelligence Mar-28-2025

The rapid growth of artificial intelligence (AI) technologies has changed decision-making in many fields, but it has also raised major privacy and ethical concerns. However, many AI incident taxonomies and guidelines for academia, industry, and government lack grounding in real-world incidents. We analyzed 202 real-world AI privacy and ethical incidents. This produced a taxonomy that classifies incident types across AI lifecycle stages. It accounts for contextual factors such as causes, responsible entities, disclosure sources, and impacts. Our findings show insufficient incident reporting from AI developers and users. Many incidents are caused by poor organizational decisions and legal non-compliance. Only a few legal actions and corrective measures exist, while risk-mitigation efforts are limited. Our taxonomy contributes a structured approach to reporting future AI incidents. Our findings demonstrate that current AI governance frameworks are inadequate. We urgently need child-specific protections and AI policies on social media. They must moderate and reduce the spread of harmful AI-generated content. Our research provides insights for policymakers and practitioners, which lets them design ethical AI. It also supports AI incident detection and risk management. Finally, it guides AI policy development. Improved policies will protect people from harmful AI applications and support innovation in AI systems.

artificial intelligence, incident, machine learning, (18 more...)

doi: 10.13140/RG.2.2.31076.90244

2504.01029

Country:

Asia > Philippines (0.14)
North America > United States > District of Columbia > Washington (0.14)
Africa > Kenya (0.14)
(23 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

Gurung, Alexander, Lapata, Mirella

Learning to Reason for Long-Form Story Generation

arXiv.org Artificial Intelligence Mar-28-2025

Generating high-quality stories spanning thousands of tokens requires competency across a variety of skills, from tracking plot and character arcs to keeping a consistent and engaging style. Due to the difficulty of sourcing labeled datasets and precise quality measurements, most work using large language models (LLMs) for long-form story generation uses combinations of hand-designed prompting techniques to elicit author-like behavior. This is a manual process that is highly dependent on the specific story-generation task. Motivated by the recent success of applying RL with Verifiable Rewards to domains like math and coding, we propose a general story-generation task (Next-Chapter Prediction) and a reward formulation (Verified Rewards via Completion Likelihood Improvement) that allows us to use an unlabeled book dataset as a learning signal for reasoning. We learn to reason over a story's condensed information and generate a detailed plan for the next chapter. Our reasoning is evaluated via the chapters it helps a story-generator create, and compared against non-trained and supervised finetuning (SFT) baselines. Pairwise human judgments reveal the chapters our learned reasoning produces are preferred across almost all metrics, and the effect is more pronounced in Scifi and Fantasy genres.

large language model, machine learning, natural language, (17 more...)

2503.22828

Country:

Asia > Middle East > Jordan (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(11 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Law (0.67)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.93)

Majumdar, Ayan, Kanubala, Deborah D., Gupta, Kavya, Valera, Isabel

A Causal Framework to Measure and Mitigate Non-binary Treatment Discrimination

arXiv.org Artificial Intelligence Mar-28-2025

Fairness studies of algorithmic decision-making systems often simplify complex decision processes, such as bail or loan approvals, into binary classification tasks. However, these approaches overlook that such decisions are not inherently binary (e.g., approve or not approve bail or loan); they also involve non-binary treatment decisions (e.g., bail conditions or loan terms) that can influence the downstream outcomes (e.g., loan repayment or reoffending). In this paper, we argue that non-binary treatment decisions are integral to the decision process and controlled by decision-makers and, therefore, should be central to fairness analyses in algorithmic decision-making. We propose a causal framework that extends fairness analyses and explicitly distinguishes between decision-subjects' covariates and the treatment decisions. This specification allows decision-makers to use our framework to (i) measure treatment disparity and its downstream effects in historical data and, using counterfactual reasoning, (ii) mitigate the impact of past unfair treatment decisions when automating decision-making. We use our framework to empirically analyze four widely used loan approval datasets to reveal potential disparity in non-binary treatment decisions and their discriminatory impact on outcomes, highlighting the need to incorporate treatment decisions in fairness assessments. Moreover, by intervening in treatment decisions, we show that our framework effectively mitigates treatment discrimination from historical data to ensure fair risk score estimation and (non-binary) decision-making processes that benefit all stakeholders.

artificial intelligence, machine learning, treatment decision, (18 more...)

2503.22454

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Germany > Saarland (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law (1.00)
Banking & Finance > Loans (1.00)
Banking & Finance > Credit (0.93)
Health & Medicine (0.93)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)