AITopics

2501.18177

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Switzerland (0.04)
Africa > Nigeria (0.04)
(25 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law > Taxation Law (1.00)
Law > Criminal Law (1.00)
Health & Medicine > Therapeutic Area (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.92)

Tomei, Philip Moreira, Jain, Rupal, Franklin, Matija

AI Governance through Markets

This paper argues that market governance mechanisms should be considered a key approach in the governance of artificial intelligence (AI), alongside traditional regulatory frameworks. While current governance approaches have predominantly focused on regulation, we contend that market-based mechanisms offer effective incentives for responsible AI development. We examine four emerging vectors of market governance: insurance, auditing, procurement, and due diligence, demonstrating how these mechanisms can affirm the relationship between AI risk and financial risk while addressing capital allocation inefficiencies. While we do not claim that market forces alone can adequately protect societal interests, we maintain that standardised AI disclosures and market mechanisms can create powerful incentives for safe and responsible AI development. This paper urges regulators, economists, and machine learning researchers to investigate and implement market-based approaches to AI governance.

artificial intelligence, arxiv preprint arxiv, machine learning, (14 more...)

2501.17755

Country:

North America > United States (1.00)
Europe (1.00)

Genre: Research Report > Experimental Study (0.67)

Industry:

Social Sector (1.00)
Law > Statutes (1.00)
Law > Intellectual Property & Technology Law (1.00)
(11 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

International AI Safety Report

Bengio, Yoshua, Mindermann, Sören, Privitera, Daniel, Besiroglu, Tamay, Bommasani, Rishi, Casper, Stephen, Choi, Yejin, Fox, Philip, Garfinkel, Ben, Goldfarb, Danielle, Heidari, Hoda, Ho, Anson, Kapoor, Sayash, Khalatbari, Leila, Longpre, Shayne, Manning, Sam, Mavroudis, Vasilios, Mazeika, Mantas, Michael, Julian, Newman, Jessica, Ng, Kwan Yee, Okolo, Chinasa T., Raji, Deborah, Sastry, Girish, Seger, Elizabeth, Skeadas, Theodora, South, Tobin, Strubell, Emma, Tramèr, Florian, Velasco, Lucia, Wheeler, Nicole, Acemoglu, Daron, Adekanmbi, Olubayo, Dalrymple, David, Dietterich, Thomas G., Felten, Edward W., Fung, Pascale, Gourinchas, Pierre-Olivier, Heintz, Fredrik, Hinton, Geoffrey, Jennings, Nick, Krause, Andreas, Leavy, Susan, Liang, Percy, Ludermir, Teresa, Marda, Vidushi, Margetts, Helen, McDermid, John, Munga, Jane, Narayanan, Arvind, Nelson, Alondra, Neppel, Clara, Oh, Alice, Ramchurn, Gopal, Russell, Stuart, Schaake, Marietje, Schölkopf, Bernhard, Song, Dawn, Soto, Alvaro, Tiedrich, Lee, Varoquaux, Gaël, Yao, Andrew, Zhang, Ya-Qin, Albalawi, Fahad, Alserkal, Marwan, Ajala, Olubunmi, Avrin, Guillaume, Busch, Christian, de Carvalho, André Carlos Ponce de Leon Ferreira, Fox, Bronwyn, Gill, Amandeep Singh, Hatip, Ahmet Halit, Heikkilä, Juha, Jolly, Gill, Katzir, Ziv, Kitano, Hiroaki, Krüger, Antonio, Johnson, Chris, Khan, Saif M., Lee, Kyoung Mu, Ligot, Dominic Vincent, Molchanovskyi, Oleksii, Monti, Andrea, Mwamanzi, Nusu, Nemer, Mona, Oliver, Nuria, Portillo, José Ramón López, Ravindran, Balaraman, Rivera, Raquel Pezoa, Riza, Hammam, Rugege, Crystal, Seoighe, Ciarán, Sheehan, Jerry, Sheikh, Haroon, Wong, Denise, Zeng, Yi

I am honoured to present the International AI Safety Report. It is the work of 96 international AI experts who collaborated in an unprecedented effort to establish an internationally shared scientific understanding of risks from advanced AI and methods for managing them. We embarked on this journey just over a year ago, shortly after the countries present at the Bletchley Park AI Safety Summit agreed to support the creation of this report. Since then, we published an Interim Report in May 2024, which was presented at the AI Seoul Summit. We are now pleased to publish the present, full report ahead of the AI Action Summit in Paris in February 2025. Since the Bletchley Summit, the capabilities of general-purpose AI, the type of AI this report focuses on, have increased further. For example, new models have shown markedly better performance at tests of programming and scientific reasoning.

data mining, large language model, machine learning, (27 more...)

2501.17805

Country:

South America (1.00)
North America > Canada (1.00)
Asia > Middle East (1.00)
(7 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
(5 more...)

Industry:

Transportation > Air (1.00)
Social Sector (1.00)
Media > News (1.00)
(30 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Information Management > Search (1.00)
Information Technology > Data Science > Data Quality (1.00)
(21 more...)

Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation

Huang, Tiansheng, Hu, Sihao, Ilhan, Fatih, Tekin, Selim Furkan, Liu, Ling

Recent research shows that Large Language Models (LLMs) are vulnerable to harmful fine-tuning attacks -- models lose their safety alignment ability after fine-tuning on a few harmful samples. For risk mitigation, a guardrail is typically used to filter out harmful samples before fine-tuning. By designing a new red-teaming method, we show in this paper that relying purely on the moderation guardrail for data filtration is not reliable. Our proposed attack method, dubbed Virus, easily bypasses the guardrail moderation by slightly modifying the harmful data. Experimental results show that the harmful data optimized by Virus is not detectable by the guardrail, with up to a 100% leakage ratio, and can simultaneously achieve superior attack performance. Finally, the key message we want to convey through this paper is that it is reckless to consider guardrail moderation as a clutch at straws towards harmful fine-tuning attack, as it cannot solve the inherent safety issue of the pre-trained LLMs. Our code is available at https://github.com/git-disl/Virus

large language model, machine learning, natural language, (15 more...)

2501.17433

Country: North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.68)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Law > Criminal Law (0.46)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

A Comprehensive Survey on Legal Summarization: Challenges and Future Directions

Akter, Mousumi, Çano, Erion, Weber, Erik, Dobler, Dennis, Habernal, Ivan

The constant engagement with extensive written materials is fundamental and immensely time-consuming [104]. Legal professionals often spend hours, if not days, combing through documents to find precedents or relevant cases that could be pivotal to their current cases. This laborious process is a significant part of the workload of legal professionals like lawyers and judges, taking up lots of time that could be invested otherwise. Automatic summarization tools could help to condense lengthy legal documents into concise summaries, helping to save both time and costs. Moreover, integrating advanced Natural Language Processing (NLP) techniques into legal research holds significant promise for democratizing access to legal information. Figure 1 shows the general pipeline for legal summarization. Compared to other domains, legal texts present unique challenges that distinguish them from other document types. Legal documents tend to be longer and more detailed than those from other domains.

information retrieval, large language model, machine learning, (20 more...)

2501.17830

Country:

Europe > Germany (0.14)
Oceania > Australia (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(34 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Law > Litigation (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Regional Government > Europe Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Cheng, Myra, Lee, Angela Y., Rapuano, Kristina, Niederhoffer, Kate, Liebscher, Alex, Hancock, Jeffrey

From tools to thieves: Measuring and understanding public perceptions of AI through crowdsourced metaphors

How has the public responded to the increasing prevalence of artificial intelligence (AI)-based technologies? We investigate public perceptions of AI by collecting over 12,000 responses over 12 months from a nationally representative U.S. sample. Participants provided open-ended metaphors reflecting their mental models of AI, a methodology that overcomes the limitations of traditional self-reported measures. Using a mixed-methods approach combining quantitative clustering and qualitative coding, we identify 20 dominant metaphors shaping public understanding of AI. To analyze these metaphors systematically, we present a scalable framework integrating language modeling (LM)-based techniques to measure key dimensions of public perception: anthropomorphism (attribution of human-like qualities), warmth, and competence. We find that Americans generally view AI as warm and competent, and that over the past year, perceptions of AI's human-likeness and warmth have significantly increased (+34%, r = 0.80, p < 0.01; +41%, r = 0.62, p < 0.05). Furthermore, these implicit perceptions, along with the identified dominant metaphors, strongly predict trust in and willingness to adopt AI (r^2 = 0.21, 0.18, p < 0.001). We further explore how differences in metaphors and implicit perceptions--such as the higher propensity of women, older individuals, and people of color to anthropomorphize AI--shed light on demographic disparities in trust and adoption. In addition to our dataset and framework for tracking evolving public attitudes, we provide actionable insights on using metaphors for inclusive and responsible AI development.

machine learning, metaphor, natural language, (19 more...)

2501.18045

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Middle East > Jordan (0.04)
Africa > Eswatini > Manzini > Manzini (0.04)
(16 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law (1.00)
Government (1.00)
Education > Educational Setting (0.69)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
(3 more...)

Li, Yuxuan, Shirado, Hirokazu, Das, Sauvik

Actions Speak Louder than Words: Agent Decisions Reveal Implicit Biases in Language Models

While advances in fairness and alignment have helped mitigate overt biases exhibited by large language models (LLMs) when explicitly prompted, we hypothesize that these models may still exhibit implicit biases when simulating human behavior. To test this hypothesis, we propose a technique to systematically uncover such biases across a broad range of sociodemographic categories by assessing decision-making disparities among agents with LLM-generated, sociodemographically-informed personas. Using our technique, we tested six LLMs across three sociodemographic groups and four decision-making scenarios. Our results show that state-of-the-art LLMs exhibit significant sociodemographic disparities in nearly all simulations, with more advanced models exhibiting greater implicit biases despite reducing explicit biases. Furthermore, when comparing our findings to real-world disparities reported in empirical studies, we find that the biases we uncovered are directionally aligned but markedly amplified. This directional alignment highlights the utility of our technique in uncovering systematic biases in LLMs rather than random variations; moreover, the presence and amplification of implicit biases emphasize the need for novel strategies to address these biases.

large language model, machine learning, natural language, (16 more...)

2501.17420

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Ohio > Franklin County > Columbus (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (1.00)
Law (0.93)
Government > Regional Government > North America Government > United States Government (0.68)
Education > Educational Setting > K-12 Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.99)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)

Arrieta, Aitor, Ugarte, Miriam, Valle, Pablo, Parejo, José Antonio, Segura, Sergio

Early External Safety Testing of OpenAI's o3-mini: Insights from the Pre-Deployment Evaluation

Large Language Models (LLMs) have become an integral part of our daily lives. However, they pose certain risks, including those that can harm individuals' privacy, perpetuate biases and spread misinformation. These risks highlight the need for robust safety mechanisms, ethical guidelines, and thorough testing to ensure their responsible deployment. Safety of LLMs is a key property that needs to be thoroughly tested before the model is deployed and made accessible to general users. This paper reports the external safety testing experience conducted by researchers from Mondragon University and University of Seville on OpenAI's new o3-mini LLM as part of OpenAI's early access for safety testing program. In particular, we apply our tool, ASTRAL, to automatically and systematically generate up-to-date unsafe test inputs (i.e., prompts) that help us test and assess different safety categories of LLMs. We automatically generate and execute a total of 10,080 unsafe test inputs on an early o3-mini beta version. After manually verifying the test cases classified as unsafe by ASTRAL, we identify a total of 87 actual instances of unsafe LLM behavior. We highlight key insights and findings uncovered during the pre-deployment external testing phase of OpenAI's latest LLM.

large language model, machine learning, test input, (17 more...)

2501.17749

Country:

North America > United States (0.46)
Europe > Spain > Andalusia > Seville Province > Seville (0.04)
Europe > Spain > Basque Country (0.04)
Asia > Middle East > Palestine > Gaza Strip > Gaza Governorate > Gaza (0.04)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Health & Medicine (1.00)
Government > Regional Government (0.94)
Media (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Revisiting gender bias research in bibliometrics: Standardizing methodological variability using Scholarly Data Analysis (SoDA) Cards

Lee, HaeJin, Mishra, Shubhanshu, Mishra, Apratim, You, Zhiwen, Kim, Jinseok, Diesner, Jana

Gender biases in scholarly metrics remain a persistent concern, despite numerous bibliometric studies exploring their presence and absence across productivity, impact, acknowledgment, and self-citations. However, methodological inconsistencies, particularly in author name disambiguation and gender identification, limit the reliability and comparability of these studies, potentially perpetuating misperceptions and hindering effective interventions. A review of 70 relevant publications over the past 12 years reveals a wide range of approaches, from name-based and manual searches to more algorithmic and gold-standard methods, with no clear consensus on best practices. This variability, compounded by challenges such as accurately disambiguating Asian names and managing unassigned gender labels, underscores the urgent need for standardized and robust methodologies. To address this critical gap, we propose the development and implementation of "Scholarly Data Analysis (SoDA) Cards." These cards will provide a structured framework for documenting and reporting key methodological choices in scholarly data analysis, including author name disambiguation and gender identification procedures. By promoting transparency and reproducibility, SoDA Cards will facilitate more accurate comparisons and aggregations of research findings, ultimately supporting evidence-informed policymaking and enabling the longitudinal tracking of analytical approaches in the study of gender and other social biases in academia.

artificial intelligence, machine learning, natural language, (15 more...)

2501.18129

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > North Korea (0.14)
North America > Canada > Quebec (0.04)
(18 more...)

Genre:

Research Report > New Finding (0.93)
Overview (0.93)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Government (0.93)
Law > Civil Rights & Constitutional Law (0.69)
Education > Educational Setting > Higher Education (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Tavares, Luiz, Mazzon, Jose, Paletta, Francisco, Barros, Fabio

Bankruptcy analysis using images and convolutional neural networks (CNN)

The marketing departments of financial institutions strive to craft products and services that cater to the diverse needs of businesses of all sizes. However, it is evident upon analysis that larger corporations often receive a more substantial portion of available funds. This disparity arises from the relative ease of assessing the risk of default and bankruptcy in these more prominent companies. Historically, risk analysis studies have focused on data from publicly traded or stock exchange-listed companies, leaving a gap in knowledge about small and medium-sized enterprises (SMEs). Addressing this gap, this study introduces a method for evaluating SMEs by generating images for processing via a convolutional neural network (CNN). To this end, more than 10,000 images, one for each company in the sample, were created to identify scenarios in which the CNN can operate with higher assertiveness and reduced training error probability. The findings demonstrate a significant predictive capacity, achieving 97.8% accuracy, when a substantial number of images are utilized. Moreover, the image creation method paves the way for potential applications of this technique in various sectors and for different analytical purposes.

accuracy, dataset, neural network, (17 more...)

2502.15726

Country:

South America > Brazil > São Paulo (0.04)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law > Statutes (1.00)
Law > Business Law (1.00)
Government (0.93)
Banking & Finance > Economy (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)