AITopics

Rao, Delip, Young, Jonathan, Dietterich, Thomas, Callison-Burch, Chris

WithdrarXiv: A Large-Scale Dataset for Retraction Study

Retractions play a vital role in maintaining scientific integrity, yet systematic studies of retractions in computer science and other STEM fields remain scarce. We present WithdrarXiv, the first large-scale dataset of withdrawn papers from arXiv, containing over 14,000 papers and their associated retraction comments spanning the repository's entire history through September 2024. Through careful analysis of author comments, we develop a comprehensive taxonomy of retraction reasons, identifying 10 distinct categories ranging from critical errors to policy violations. We demonstrate a simple yet highly accurate zero-shot automatic categorization of retraction reasons, achieving a weighted average F1-score of 0.96. Additionally, we release WithdrarXiv-SciFy, an enriched version including scripts for parsed full-text PDFs, specifically designed to enable research in scientific feasibility studies, claim verification, and automated theorem proving. These findings provide valuable insights for improving scientific quality control and automated verification systems. Finally, and most importantly, we discuss ethical issues and take a number of steps to implement responsible data release while fostering open science in this area.
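For reference, a weighted average F1-score weights each category's F1 by its support (the number of true examples in that category). The sketch below computes it in plain Python on made-up labels. The function name `weighted_f1` and the two category names are illustrative assumptions, not the authors' taxonomy or evaluation code.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1  # weight each class by its number of true examples
    return total / len(y_true)

# Hypothetical labels, for illustration only.
y_true = ["error", "error", "error", "policy"]
y_pred = ["error", "error", "policy", "policy"]
score = weighted_f1(y_true, y_pred)
```

Because the weights are class supports, frequent categories dominate the score, so a high weighted F1 does not by itself guarantee good performance on rare categories.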

category, large language model, machine learning, (21 more...)

2412.03775

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > Oregon (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.85)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (0.94)
Law (0.70)
Information Technology > Security & Privacy (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

Felfernig, Alexander, Wundara, Manfred, Tran, Thi Ngoc Trang, Polat-Erdeniz, Seda, Lubos, Sebastian, El-Mansi, Merfat, Garber, Damian, Le, Viet-Man

Recommender Systems for Sustainability: Overview and Research Issues

Sustainability development goals (SDGs) are regarded as a universal call to action with the overall objectives of planet protection, ending of poverty, and ensuring peace and prosperity for all people. In order to achieve these objectives, different AI technologies play a major role. Specifically, recommender systems can provide support for organizations and individuals to achieve the defined goals. Recommender systems integrate AI technologies such as machine learning, explainable AI (XAI), case-based reasoning, and constraint solving in order to find and explain user-relevant alternatives from a potentially large set of options. In this article, we summarize the state of the art in applying recommender systems to support the achievement of sustainability development goals. In this context, we discuss open issues for future research.

artificial intelligence, recommendation, recommender system, (12 more...)

doi: 10.3389/fdata.2023.1284511

2412.0362

Country:

Europe > Austria > Styria > Graz (0.05)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
South America > Brazil (0.04)
(3 more...)

Genre: Overview (1.00)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Consumer Health (1.00)
(7 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)

Mou, Xinyi, Ding, Xuanwen, He, Qi, Wang, Liang, Liang, Jingcong, Zhang, Xinnong, Sun, Libo, Lin, Jiayu, Zhou, Jie, Huang, Xuanjing, Wei, Zhongyu

From Individual to Society: A Survey on Social Simulation Driven by Large Language Model-based Agents

Traditional sociological research often relies on human participation, which, though effective, is expensive, challenging to scale, and raises ethical concerns. Recent advancements in large language models (LLMs) highlight their potential to simulate human behavior, enabling the replication of individual responses and facilitating many interdisciplinary studies. In this paper, we conduct a comprehensive survey of this field, illustrating the recent progress in simulation driven by LLM-empowered agents. We categorize the simulations into three types: (1) Individual Simulation, which mimics specific individuals or demographic groups; (2) Scenario Simulation, where multiple agents collaborate to achieve goals within specific contexts; and (3) Society Simulation, which models interactions within agent societies to reflect the complexity and variety of real-world dynamics. These simulations follow a progression, ranging from detailed individual modeling to large-scale societal phenomena. We provide a detailed discussion of each simulation type, including the architecture or key components of the simulation, the classification of objectives or scenarios, and the evaluation method. Afterward, we summarize commonly used datasets and benchmarks. Finally, we discuss the trends across these three types of simulation. A repository for the related sources is at https://github.com/FudanDISC/SocialAgent.

large language model, natural language, simulation, (17 more...)

2412.03563

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
(6 more...)

Genre:

Questionnaire & Opinion Survey (1.00)
Overview (0.87)
Research Report (0.81)

Industry:

Law (1.00)
Health & Medicine (1.00)
Information Technology (0.93)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Park, Joonyong, Saito, Daisuke, Minematsu, Nobuaki

Analytic Study of Text-Free Speech Synthesis for Raw Audio using a Self-Supervised Learning Model

We examine the text-free speech representations of raw audio obtained from a self-supervised learning (SSL) model by analyzing the synthesized speech using the SSL representations instead of conventional text representations. Since raw audio does not have paired speech representations as transcribed texts do, obtaining speech representations from unpaired speech is crucial for augmenting available datasets for speech synthesis. Specifically, the proposed speech synthesis is conducted using discrete symbol representations from the SSL model in comparison with text representations, and analytical examinations of the synthesized speech have been carried out. The results empirically show that using text representations is advantageous for preserving semantic information, while using discrete symbol representations is superior for preserving acoustic content, including prosodic and intonational information. Current speech synthesis has significantly advanced through deep learning models, greatly surpassing the performance of traditional speech synthesis models [1].

input representation, representation, speech, (16 more...)

2412.03074

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre: Research Report (0.64)

Industry: Law > Civil Rights & Constitutional Law (0.60)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Synthesis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Hausenloy, Jason, McClements, Duncan, Thakur, Madhavendra

Towards Data Governance of Frontier AI Models

Data is essential to train and fine-tune today's frontier artificial intelligence (AI) models and to develop future ones. To date, academic, legal, and regulatory work has primarily addressed how data can directly harm consumers and creators, such as through privacy breaches, copyright infringements, and bias and discrimination. Our work, instead, focuses on the comparatively neglected question of how data can enable new governance capacities for frontier AI models. This approach for "frontier data governance" opens up new avenues for monitoring and mitigating risks from advanced AI models, particularly as they scale and acquire specific dangerous capabilities. Still, frontier data governance faces challenges that stem from the fundamental properties of data itself: data is non-rival, often non-excludable, easily replicable, and increasingly synthesizable. Despite these inherent difficulties, we propose a set of policy mechanisms targeting key actors along the data supply chain, including data producers, aggregators, model developers, and data vendors. We provide a brief overview of 15 governance mechanisms, of which we centrally introduce five underexplored policy recommendations. These include developing canary tokens to detect unauthorized use for producers; (automated) data filtering to remove malicious content for pre-training and post-training datasets; mandatory dataset reporting requirements for developers and vendors; improved security for datasets and data generation algorithms; and know-your-customer requirements for vendors. By considering data not just as a source of potential harm, but as a critical governance lever, this work aims to equip policymakers with a new tool for the governance and regulation of frontier AI models.
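To make the canary-token idea concrete, here is a minimal sketch of how a data producer might tag a document with a unique marker and later test whether that marker appears in some text, such as a scraped dataset or a model output. The abstract does not describe an implementation, so every name here (`make_canary`, `embed_canary`, `contains_canary`, the `CANARY` prefix) is a hypothetical illustration.

```python
import secrets

def make_canary(prefix="CANARY"):
    """Return a unique, hard-to-guess marker string (32 random hex characters)."""
    return f"{prefix}-{secrets.token_hex(16)}"

def embed_canary(document, canary):
    """Append the marker to a document on its own line before publishing it."""
    return f"{document}\n{canary}"

def contains_canary(text, canary):
    """Check whether a text sample reproduces the marker verbatim."""
    return canary in text

canary = make_canary()
published = embed_canary("Example article body.", canary)
found = contains_canary(published, canary)
```

An exact substring check only catches verbatim reproduction; a marker that is paraphrased or stripped during preprocessing would go undetected.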

arxiv, dataset, mechanism, (14 more...)

2412.03824

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York > Monroe County > Rochester (0.04)
North America > United States > Massachusetts (0.04)
(5 more...)

Genre:

Research Report (1.00)
Overview (0.88)

Industry:

Law > Statutes (1.00)
Information Technology > Security & Privacy (1.00)
Law > Intellectual Property & Technology Law (0.68)
(4 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Innocenti, Francesco, Kinghorn, Paul, Yun-Farmbrough, Will, Varona, Miguel De Llanza, Singh, Ryan, Buckley, Christopher L.

JPC: Flexible Inference for Predictive Coding Networks in JAX

We introduce JPC, a JAX library for training neural networks with Predictive Coding (PC). JPC provides a simple, fast and flexible interface to train a variety of PC networks (PCNs) including discriminative, generative and hybrid models. Unlike existing libraries, JPC leverages ordinary differential equation solvers to integrate the gradient flow inference dynamics of PCNs. We find that a second-order solver achieves significantly faster runtimes compared to standard Euler integration, with comparable performance on a range of datasets and network depths. JPC also provides some theoretical tools that can be used to study PCNs. We hope that JPC will facilitate future research on PC.
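The difference between Euler integration and a second-order solver can be illustrated on a toy problem. The sketch below does not use JPC's API; it is plain Python under stated assumptions: a single scalar hidden activity `z`, a clamped input `x` and target `y`, fixed weights `w1` and `w2`, and a quadratic energy F(z) = 0.5 (z - w1 x)^2 + 0.5 (y - w2 z)^2. Inference follows the gradient flow dz/dt = -dF/dz. Heun's method is used as one example of a second-order scheme; the abstract does not say which solver JPC uses.

```python
def inference_flow(z, x, y, w1, w2):
    """dz/dt = -dF/dz for F = 0.5*(z - w1*x)**2 + 0.5*(y - w2*z)**2."""
    return -(z - w1 * x) + w2 * (y - w2 * z)

def euler_step(z, dt, f):
    """First-order explicit Euler update."""
    return z + dt * f(z)

def heun_step(z, dt, f):
    """Second-order Heun update: average the slope at the start and at the Euler prediction."""
    k1 = f(z)
    k2 = f(z + dt * k1)
    return z + 0.5 * dt * (k1 + k2)

def run_inference(step, z0=0.0, dt=0.1, n_steps=200, x=1.0, y=2.0, w1=0.8, w2=0.5):
    """Integrate the inference dynamics from z0 with the given step function."""
    f = lambda z: inference_flow(z, x, y, w1, w2)
    z = z0
    for _ in range(n_steps):
        z = step(z, dt, f)
    return z

# Analytic fixed point of the flow: z* = (w1*x + w2*y) / (1 + w2**2).
z_star = (0.8 * 1.0 + 0.5 * 2.0) / (1.0 + 0.5 ** 2)
z_euler = run_inference(euler_step)
z_heun = run_inference(heun_step)
```

Both integrators settle at the same energy minimum here. A second-order step costs two gradient evaluations instead of one, so any runtime advantage depends on how much larger a step size it tolerates for a given accuracy.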

accuracy, max mean accuracy, test accuracy, (12 more...)

2412.03676

Country: Africa > Mali (0.04)

Genre: Research Report (0.64)

Industry: Law > Litigation (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Singh, Shivalika, Romanou, Angelika, Fourrier, Clémentine, Adelani, David I., Ngui, Jian Gang, Vila-Suero, Daniel, Limkonchotiwat, Peerat, Marchisio, Kelly, Leong, Wei Qi, Susanto, Yosephine, Ng, Raymond, Longpre, Shayne, Ko, Wei-Yin, Smith, Madeline, Bosselut, Antoine, Oh, Alice, Martins, Andre F. T., Choshen, Leshem, Ippolito, Daphne, Ferrante, Enzo, Fadaee, Marzieh, Ermis, Beyza, Hooker, Sara

Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation

Cultural biases in multilingual datasets pose significant challenges for their effectiveness as global benchmarks. These biases stem not only from language but also from the cultural knowledge required to interpret questions, reducing the practical utility of translated datasets like MMLU. Furthermore, translation often introduces artifacts that can distort the meaning or clarity of questions in the target language. A common practice in multilingual evaluation is to rely on machine-translated evaluation sets, but simply translating a dataset is insufficient to address these challenges. In this work, we trace the impact of both of these issues on multilingual evaluations and ensuing model performances. Our large-scale evaluation of state-of-the-art open and proprietary models illustrates that progress on MMLU depends heavily on learning Western-centric concepts, with 28% of all questions requiring culturally sensitive knowledge. Moreover, for questions requiring geographic knowledge, an astounding 84.9% focus on either North American or European regions. Rankings of models change depending on whether they are evaluated on the full set or on the subset of questions annotated as culturally sensitive, showing the distortion to model rankings when blindly relying on translated MMLU. We release Global-MMLU, an improved MMLU with evaluation coverage across 42 languages, with overall quality improved by engaging compensated professional and community annotators to verify translation quality, while also rigorously evaluating cultural biases present in the original dataset. This comprehensive Global-MMLU set also includes designated subsets labeled as culturally sensitive and culturally agnostic to allow for more holistic, complete evaluation.

dataset, evaluation, knowledge, (14 more...)

2412.03304

Country:

Europe > United Kingdom (0.14)
North America > Central America (0.14)
Asia > Middle East > Israel (0.04)
(54 more...)

Genre: Research Report > New Finding (0.92)

Industry:

Law (1.00)
Government (1.00)
Education > Curriculum > Subject-Specific Education (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Bhaskar, Uday, Gayen, Jayadratha, Sharma, Charu, Manwani, Naresh

Node Classification With Integrated Reject Option

One of the key tasks in graph learning is node classification. While graph neural networks (GNNs) have been used for various applications, their adaptation to the reject option setting has not previously been explored. In this paper, we propose NCwR, a novel approach to node classification in GNNs with an integrated reject option, which allows the model to abstain from making predictions when uncertainty is high. We propose both cost-based and coverage-based methods for classification with abstention in the node classification setting using GNNs. We perform experiments using our method on three standard citation network datasets (Cora, Citeseer, and Pubmed) and compare with relevant baselines. We also model the legal judgment prediction problem on the ILDC dataset as a node classification problem where nodes represent legal cases and edges represent citations. We further interpret the model by analyzing the cases on which it abstains, visualizing which parts of the input features influenced this decision.
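The two flavours of abstention can be sketched with generic post-hoc rules applied to per-node class probabilities. This is not the NCwR method itself, whose details the abstract does not give; it is a minimal illustration that assumes the probabilities come from some already-trained classifier. The cost-based rule is Chow's rule (abstain when the top probability is below 1 minus the rejection cost), and the coverage-based rule keeps only the most confident fraction of nodes. Function names and the example probabilities are hypothetical.

```python
def argmax(p):
    """Index of the largest entry in a list of probabilities."""
    return max(range(len(p)), key=p.__getitem__)

def predict_with_cost(probs, reject_cost):
    """Chow's rule: abstain (None) when the top probability is below 1 - reject_cost."""
    return [argmax(p) if max(p) >= 1.0 - reject_cost else None for p in probs]

def predict_with_coverage(probs, coverage):
    """Predict on the most confident `coverage` fraction of nodes; abstain on the rest."""
    n_keep = round(coverage * len(probs))
    order = sorted(range(len(probs)), key=lambda i: max(probs[i]), reverse=True)
    keep = set(order[:n_keep])
    return [argmax(p) if i in keep else None for i, p in enumerate(probs)]

# Hypothetical class probabilities for four nodes and two classes.
probs = [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8], [0.5, 0.5]]
cost_preds = predict_with_cost(probs, reject_cost=0.3)
cov_preds = predict_with_coverage(probs, coverage=0.75)
```

The cost-based rule fixes the confidence threshold and lets coverage vary, while the coverage-based rule fixes the fraction of nodes predicted and lets the threshold vary.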

classification, dataset, prediction, (16 more...)

2412.0319

Country:

Asia > India (0.04)
North America > United States > Texas > Dallas County > Dallas (0.04)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Pantha, Nishan, Ramasubramanian, Muthukumaran, Gurung, Iksha, Maskey, Manil, Ramachandran, Rahul

Challenges in Guardrailing Large Language Models for Science

The rapid development in large language models (LLMs) has transformed the landscape of natural language processing and understanding (NLP/NLU), offering significant benefits across various domains. However, when applied to scientific research, these powerful models exhibit critical failure modes related to scientific integrity and trustworthiness. Existing general-purpose LLM guardrails are insufficient to address these unique challenges in the scientific domain. We provide comprehensive guidelines for deploying LLM guardrails in the scientific domain. We identify specific challenges -- including time sensitivity, knowledge contextualization, conflict resolution, and intellectual property concerns -- and propose a guideline framework for the guardrails that can align with scientific needs. These guardrail dimensions include trustworthiness, ethics & bias, safety, and legal aspects. We also outline in detail the implementation strategies that employ white-box, black-box, and gray-box methodologies that can be enforced within scientific contexts.

language model

2411.08181

Genre: Research Report (0.69)

Industry: Law (0.87)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)