AITopics

The Japan Times, Aug-26-2025, 07:54:00 GMT

Nikkei and Asahi Shimbun sue Perplexity AI over alleged copyright violations

The newspapers are seeking an injunction and ¥2.2 billion ($15 million) each in damages from Perplexity, they said in a joint statement Tuesday. The suit was filed at the Tokyo District Court. The legal action by the Nikkei, which owns Japan's biggest financial newspaper, and the left-leaning Asahi underscores a widening rift between publishers and AI companies over who controls -- and profits from -- the distribution of news. The media industry argues that AI tools using their work without licenses siphon away readership and ad revenue, threatening already fragile business models. "These actions amount to continuous and large-scale freeloading on journalists' time and effort," Nikkei and Asahi said in the statement.

artificial intelligence, asahi shimbun sue perplexity ai, newspaper, (4 more...)

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.28)
North America > United States > California > San Francisco County > San Francisco (0.08)
Europe (0.08)

Industry:

Media > News (1.00)
Law (1.00)

Technology: Information Technology > Artificial Intelligence (1.00)

The Japan Times, Aug-26-2025, 02:10:00 GMT

Musk sues Apple and OpenAI, saying they hurt AI competition

Elon Musk has accused Apple and OpenAI in a lawsuit of unfairly favoring the artificial intelligence company across iPhones and thwarting competition for other chatbot makers. Musk's X and xAI seek billions of dollars in damages in the suit filed Monday in U.S. federal court in Fort Worth, Texas, arguing that Apple's decision to integrate OpenAI into the iPhone's operating system inhibits rivalry and innovation within the AI industry and harms consumers by depriving them of choice. The billionaire founder of xAI, which now houses the Grok AI team and X social network, said Apple makes it impossible for anyone other than OpenAI's ChatGPT to reach the top of the App Store charts, a sought-after global spotlight for app developers.

apple and openai, large language model, machine learning, (7 more...)

Country: North America > United States > Texas > Tarrant County > Fort Worth (0.32)

Industry:

Law (0.69)
Information Technology (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Fallah, Alireza, Jordan, Michael I., Ulichney, Annie

The Statistical Fairness-Accuracy Frontier

arXiv.org Machine Learning, Aug-26-2025

Machine learning models must balance accuracy and fairness, but these goals often conflict, particularly when data come from multiple demographic groups. A useful tool for understanding this trade-off is the fairness-accuracy (FA) frontier, which characterizes the set of models that cannot be simultaneously improved in both fairness and accuracy. Prior analyses of the FA frontier provide a full characterization under the assumption of complete knowledge of population distributions -- an unrealistic ideal. We study the FA frontier in the finite-sample regime, showing how it deviates from its population counterpart and quantifying the worst-case gap between them. In particular, we derive minimax-optimal estimators that depend on the designer's knowledge of the covariate distribution. For each estimator, we characterize how finite-sample effects asymmetrically impact each group's risk, and identify optimal sample allocation strategies. Our results transform the FA frontier from a theoretical construct into a practical tool for policymakers and practitioners who must often design algorithms with limited data.
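As a rough illustration of the frontier idea described in this abstract, the sketch below picks out the candidate models that no other candidate beats on both error and unfairness at once. It is not the paper's estimator: the function name `pareto_frontier`, the candidate list, and the two scores (an error rate and a generic unfairness gap, both lower-is-better) are assumptions made up for the example.

```python
from typing import List, Tuple

# Each candidate is (name, error, unfairness_gap); lower is better for both.
# The names and numbers are hypothetical, chosen only for illustration.
Candidate = Tuple[str, float, float]


def pareto_frontier(models: List[Candidate]) -> List[str]:
    """Return the names of models that no other model dominates.

    Model X dominates model Y when X is at least as good on both
    criteria and strictly better on at least one of them.
    """
    frontier = []
    for name, err, gap in models:
        dominated = any(
            (e2 <= err and g2 <= gap) and (e2 < err or g2 < gap)
            for _, e2, g2 in models
        )
        if not dominated:
            frontier.append(name)
    return frontier


candidates = [
    ("A", 0.10, 0.30),
    ("B", 0.15, 0.10),
    ("C", 0.20, 0.05),
    ("D", 0.18, 0.20),  # worse than B on both criteria
    ("E", 0.12, 0.35),  # worse than A on both criteria
]

print(pareto_frontier(candidates))
```

With these made-up scores, models D and E drop out because B and A, respectively, are better on both axes; the remaining models trade error against unfairness.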

artificial intelligence, estimator, machine learning, (16 more...)

2508.17622

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Law (0.67)
Government (0.48)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Zhao, Jiahao, Dong, Liwei

Jinx: Unlimited LLMs for Probing Alignment Failures

Unlimited, or so-called helpful-only, language models are trained without safety alignment constraints and never refuse user queries. They are widely used by leading AI companies as internal tools for red teaming and alignment evaluation. For example, if a safety-aligned model produces harmful outputs similar to an unlimited model, this indicates alignment failures that require further attention. Despite their essential role in assessing alignment, such models are not available to the research community. We introduce Jinx, a helpful-only variant of popular open-weight LLMs. Jinx responds to all queries without refusals or safety filtering, while preserving the base model's capabilities in reasoning and instruction following. It provides researchers with an accessible tool for probing alignment failures, evaluating safety boundaries, and systematically studying failure modes in language model safety.

category, large language model, machine learning, (17 more...)

2508.08243

Genre: Research Report (0.50)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (0.96)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.33)

Silcenco, Oleg, Machad, Marcos R., Ugulino, Wallace C., Braun, Daniel

A Retail-Corpus for Aspect-Based Sentiment Analysis with Large Language Models

Aspect-based sentiment analysis enhances sentiment detection by associating it with specific aspects, offering deeper insights than traditional sentiment analysis. This study introduces a manually annotated dataset of 10,814 multilingual customer reviews covering brick-and-mortar retail stores, labeled with eight aspect categories and their sentiment. Using this dataset, the performance of GPT-4 and LLaMA-3 in aspect-based sentiment analysis is evaluated to establish a baseline for the newly introduced data. The results show both models achieving over 85% accuracy, while GPT-4 outperforms LLaMA-3 overall with regard to all relevant metrics.

large language model, machine learning, sentiment analysis, (17 more...)

2508.17994

Country:

Europe (1.00)
North America > United States (0.93)

Genre:

Overview (0.68)
Research Report > New Finding (0.48)

Industry:

Retail (1.00)
Information Technology > Services (0.94)
Law (0.93)
Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Mirsch, Marie, Wegner, Laila, Strube, Jonas, Leicht-Scholten, Carmen

A Feminist Account of Intersectional Algorithmic Fairness

Intersectionality has profoundly influenced research and political action by revealing how interconnected systems of privilege and oppression influence lived experiences, yet its integration into algorithmic fairness research remains limited. Existing approaches often rely on single-axis or formal subgroup frameworks that risk oversimplifying social realities and neglecting structural inequalities. We propose Substantive Intersectional Algorithmic Fairness, extending Green's (2022) notion of substantive algorithmic fairness with insights from intersectional feminist theory. Building on this foundation, we introduce ten desiderata within the ROOF methodology to guide the design, assessment, and deployment of algorithmic systems in ways that address systemic inequities while mitigating harms to intersectionally marginalized communities. Rather than prescribing fixed operationalizations, these desiderata encourage reflection on assumptions of neutrality, the use of protected attributes, the inclusion of multiply marginalized groups, and enhancing algorithmic systems' potential. Our approach emphasizes that fairness cannot be separated from social context, and that in some cases, principled non-deployment may be necessary. By bridging computational and social science perspectives, we provide actionable guidance for more equitable, inclusive, and context-sensitive intersectional algorithmic practices.

artificial intelligence, machine learning, natural language, (17 more...)

2508.17944

Country:

North America > United States > California (0.46)
Europe > United Kingdom > England (0.28)
North America > United States > Massachusetts > Middlesex County (0.28)
(2 more...)

Genre: Research Report (0.64)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Education (0.67)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.46)

Savigny, Henri, Yun, Bruno

AMELIA: A Family of Multi-task End-to-end Language Models for Argumentation

Argument mining is a subfield of argumentation that aims to automatically extract argumentative structures and their relations from natural language texts. This paper investigates how a single large language model can be leveraged to perform one or several argument mining tasks. Our contributions are two-fold. First, we construct a multi-task dataset by surveying and converting 19 well-known argument mining datasets from the literature into a unified format. Second, we explore various training strategies using Meta AI's Llama-3.1-8B-Instruct model: (1) fine-tuning on individual tasks, (2) fine-tuning jointly on multiple tasks, and (3) merging models fine-tuned separately on individual tasks. Our experiments show that task-specific fine-tuning significantly improves individual performance across all tasks. Moreover, multi-task fine-tuning maintains strong performance without degradation, suggesting effective transfer learning across related tasks. Finally, we demonstrate that model merging offers a viable compromise: it yields competitive performance while mitigating the computational costs associated with full multi-task fine-tuning.
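The abstract does not say which merging method is used, so the following is only a minimal sketch of one common option, uniform parameter averaging, applied to toy checkpoints. The function `merge_by_averaging`, the parameter names, and the values are hypothetical; real checkpoints would hold tensors rather than short Python lists.

```python
from typing import Dict, List

# A toy "checkpoint": parameter name -> flat list of weights.
# Assumption: all checkpoints share the same names and shapes.
Weights = Dict[str, List[float]]


def merge_by_averaging(checkpoints: List[Weights]) -> Weights:
    """Merge checkpoints by averaging each parameter element-wise."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint to merge")
    merged: Weights = {}
    for name in checkpoints[0]:
        per_model = [ckpt[name] for ckpt in checkpoints]
        # zip(*per_model) groups the i-th weight of every model together.
        merged[name] = [sum(vals) / len(vals) for vals in zip(*per_model)]
    return merged


# Two hypothetical models fine-tuned separately on different tasks.
task_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
task_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}

merged = merge_by_averaging([task_a, task_b])
print(merged)
```

Averaging needs only the already fine-tuned weights and no further training, which is the kind of cost saving the abstract attributes to merging relative to joint multi-task fine-tuning.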

large language model, machine learning, natural language, (17 more...)

2508.17926

Country:

Europe (1.00)
North America > United States (0.46)

Genre: Research Report (1.00)

Industry:

Government (0.93)
Law (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

UQ: Assessing Language Models on Unsolved Questions

Nie, Fan, Liu, Ken Ziyu, Wang, Zihao, Sun, Rui, Liu, Wei, Shi, Weijia, Yao, Huaxiu, Zhang, Linjun, Ng, Andrew Y., Zou, James, Koyejo, Sanmi, Choi, Yejin, Liang, Percy, Muennighoff, Niklas

Benchmarks shape progress in AI research. A useful benchmark should be both difficult and realistic: questions should challenge frontier models while also reflecting real-world usage. Yet, current paradigms face a difficulty-realism tension: exam-style benchmarks are often made artificially difficult with limited real-world value, while benchmarks based on real user interaction often skew toward easy, high-frequency problems. In this work, we explore a radically different paradigm: assessing models on unsolved questions. Rather than a static benchmark scored once, we curate unsolved questions and evaluate models asynchronously over time with validator-assisted screening and community verification. We introduce UQ, a testbed of 500 challenging, diverse questions sourced from Stack Exchange, spanning topics from CS theory and math to sci-fi and history, probing capabilities including reasoning, factuality, and browsing. UQ is difficult and realistic by construction: unsolved questions are often hard and naturally arise when humans seek answers, thus solving them yields direct real-world value. Our contributions are threefold: (1) UQ-Dataset and its collection pipeline combining rule-based filters, LLM judges, and human review to ensure question quality (e.g., well-defined and difficult); (2) UQ-Validators, compound validation strategies that leverage the generator-validator gap to provide evaluation signals and pre-screen candidate solutions for human review; and (3) UQ-Platform, an open platform where experts collectively verify questions and solutions. The top model passes UQ-validation on only 15% of questions, and preliminary human verification has already identified correct answers among those that passed. UQ charts a path for evaluating frontier models on real-world, open-ended challenges, where success pushes the frontier of human knowledge. We release UQ at https://uq.stanford.edu.

arxiv preprint arxiv, large language model, machine learning, (20 more...)

2508.1758

Country:

Europe (1.00)
North America > United States > California > Santa Clara County > Palo Alto (0.24)

Genre: Research Report (1.00)

Industry:

Government (1.00)
Banking & Finance > Trading (0.92)
Transportation (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Chinese Court Simulation with LLM-Based Agent System

Zhang, Kaiyuan, Li, Jiaqi, Wu, Yueyue, Li, Haitao, Luo, Cheng, Zou, Shaokun, Zhou, Yujia, Su, Weihang, Ai, Qingyao, Liu, Yiqun

Mock trials have long served as an important platform for legal professional training and education. They not only help students learn about realistic trial procedures, but also provide practical value for case analysis and judgment prediction. Traditional mock trials are difficult for the public to access because they rely on professional tutors and human participants. Fortunately, the rise of large language models (LLMs) provides new opportunities for creating more accessible and scalable court simulations. While promising, existing research mainly focuses on agent construction while ignoring the systematic design and evaluation of court simulations, which are actually more important for the credibility and usage of court simulation in practice. To this end, we present the first court simulation framework -- SimCourt -- based on the real-world procedure structure of Chinese courts. Our framework replicates all 5 core stages of a Chinese trial and incorporates 5 courtroom roles, faithfully following the procedural definitions in China. To simulate trial participants with different roles, we propose and craft legal agents equipped with memory, planning, and reflection abilities. Experiments on legal judgment prediction show that our framework can generate simulated trials that better guide the system to predict the imprisonment, probation, and fine of each case. Further annotations by human experts show that agents' responses under our simulation framework even outperformed judges and lawyers from the real trials in many scenarios. These results further demonstrate the potential of LLM-based court simulation.

artificial intelligence, large language model, natural language, (16 more...)

2508.17322

Country: Asia > China (1.00)

Genre: Research Report (0.64)

Industry:

Law > Litigation (1.00)
Government > Regional Government > Asia Government > China Government (0.70)
Education > Curriculum > Subject-Specific Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)