AITopics

2503.19339

Country: Asia > Middle East > Saudi Arabia (0.14)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Internet of Things (1.00)
Information Technology > Communications > Networks (1.00)
(2 more...)

Anagnou, Stavros, Salge, Christoph, Lewis, Peter R.

Uncertainty, bias and the institution bootstrapping problem

Institutions play a critical role in enabling communities to manage common-pool resources and avert tragedies of the commons. However, a fundamental issue arises: Individuals typically perceive participation as advantageous only after an institution is established, creating a paradox: How can institutions form if no one will join before a critical mass exists? We term this conundrum the institution bootstrapping problem and propose that misperception, specifically, agents' erroneous belief that an institution already exists, could resolve this paradox. By integrating well-documented psychological phenomena, including cognitive biases, probability distortion, and perceptual noise, into a game-theoretic framework, we demonstrate how these factors collectively mitigate the bootstrapping problem. Notably, unbiased perceptual noise (e.g., noise arising from agents' heterogeneous physical or social contexts) drastically reduces the critical mass of cooperators required for institutional emergence. This effect intensifies with greater diversity of perceptions. We explain this counter-intuitive result through asymmetric boundary conditions: proportional underestimation of low-probability sanctions produces distinct outcomes compared to equivalent overestimation. Furthermore, the type of perceptual distortion, proportional versus absolute, yields qualitatively different evolutionary pathways. These findings challenge conventional assumptions about rationality in institutional design, highlighting how "noisy" cognition can paradoxically enhance cooperation. Finally, we contextualize these insights within broader discussions of multi-agent system design and collective action. Our analysis underscores the importance of incorporating human-like cognitive constraints, not just idealized rationality, into models of institutional emergence and resilience.
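The claim that unbiased perceptual noise can lower the barrier to joining can be illustrated with a toy calculation. This is a minimal sketch and not the authors' model: it assumes each agent joins when its perceived sanction probability reaches a fixed threshold, and that perception is the true probability multiplied by `1 + e`, with `e` drawn uniformly from `[-spread, spread]`. The function name and all parameters are hypothetical.

```python
def fraction_joining(p: float, threshold: float, spread: float) -> float:
    """Share of agents whose perceived sanction probability reaches the threshold.

    Toy assumption: perceived = p * (1 + e), e ~ Uniform(-spread, spread), p > 0.
    The noise is unbiased (mean zero), yet some agents overestimate p enough to join.
    """
    if spread == 0:
        # No noise: everyone sees the true probability.
        return 1.0 if p >= threshold else 0.0
    needed = threshold / p - 1.0          # relative overestimate required to join
    frac = (spread - needed) / (2.0 * spread)
    return min(1.0, max(0.0, frac))


# True probability is below the threshold, so nobody joins without noise.
for spread in (0.0, 0.1, 0.4, 0.8):
    print(spread, fraction_joining(0.1, 0.12, spread))
```

In this toy setting the share of joiners grows with the spread of perceptions, which mirrors the abstract's qualitative statement; the paper's game-theoretic framework and its asymmetric boundary conditions are not reproduced here.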

agent, artificial intelligence, machine learning, (19 more...)

2504.21579

Country:

North America (0.46)
Europe > United Kingdom > England (0.14)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.83)

Shahbazi, Shermin, Nasiri, Mohammad-Reza, Ramezani, Majid

MPEC: Manifold-Preserved EEG Classification via an Ensemble of Clustering-Based Classifiers

Accurate classification of EEG signals is crucial for brain-computer interfaces (BCIs) and neuroprosthetic applications, yet many existing methods fail to account for the non-Euclidean, manifold structure of EEG data, resulting in suboptimal performance. Preserving this manifold information is essential to capture the true geometry of EEG signals, but traditional classification techniques largely overlook this need. To this end, we propose MPEC (Manifold-Preserved EEG Classification via an Ensemble of Clustering-Based Classifiers), that introduces two key innovations: (1) a feature engineering phase that combines covariance matrices and Radial Basis Function (RBF) kernels to capture both linear and non-linear relationships among EEG channels, and (2) a clustering phase that employs a modified K-means algorithm tailored for the Riemannian manifold space, ensuring local geometric sensitivity. Ensembling multiple clustering-based classifiers, MPEC achieves superior results, validated by significant improvements on the BCI Competition IV dataset 2a. Keywords: brain-computer interfaces (BCIs), EEG signal classification, ensemble modeling, clustering-based classification. EEG signal classification is essential in brain-computer interfaces (BCIs) and neuroprosthetics, where precise interpretation supports real-time control and cognitive applications. However, traditional techniques often overlook the non-Euclidean, manifold structure of EEG data, leading to suboptimal results [1]. We propose Manifold-Preserved EEG Classification via an Ensemble of Clustering-Based Classifiers (MPEC), a novel method that enhances classification accuracy by preserving the intrinsic manifold structure of EEG signals.
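The two feature types named in step (1) can be sketched for a single EEG trial as follows. This is an illustrative sketch only: the abstract does not say how MPEC combines the two matrices, so they are computed separately here, and the function names, the `gamma` parameter, and the toy trial are assumptions. The Riemannian K-means step is not shown.

```python
import math


def channel_covariance(trial):
    """Sample covariance between channels (linear relationships).

    `trial` is a list of channels, each a list of samples of equal length.
    """
    n = len(trial[0])
    means = [sum(ch) / n for ch in trial]
    centered = [[v - m for v in ch] for ch, m in zip(trial, means)]
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
            for ci in centered]


def channel_rbf_kernel(trial, gamma=1.0):
    """RBF kernel between channel time series (non-linear relationships).

    Entry (i, j) is exp(-gamma * ||x_i - x_j||^2).
    """
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(ci, cj)))
             for cj in trial] for ci in trial]


# Toy trial with two channels and three samples each.
trial = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
cov = channel_covariance(trial)
rbf = channel_rbf_kernel(trial, gamma=0.1)
```

Both matrices are symmetric, and the covariance matrix is positive semi-definite, which is what makes manifold-aware distances between such matrices meaningful in the clustering phase.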

artificial intelligence, classification, machine learning, (13 more...)

2504.21427

Country: Asia > Middle East > Iran (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Abdel-Ghaffar, Samy, Galatzer-Levy, Isaac, Heneghan, Conor, Liu, Xin, Kernasovskiy, Sarah, Garrett, Brennan, Barakat, Andrew, McDuff, Daniel

Passive Measurement of Autonomic Arousal in Real-World Settings

The autonomic nervous system (ANS) is activated during stress, which can have negative effects on cardiovascular health, sleep, the immune system, and mental health. While there are ways to quantify ANS activity in laboratories, there is a paucity of methods that have been validated in real-world contexts. We present the Fitbit Body Response Algorithm, an approach to continuous remote measurement of ANS activation through widely available remote wrist-based sensors. The design was validated via two experiments, a Trier Social Stress Test (n = 45) and ecological momentary assessments (EMA) of perceived stress (n = 87), providing both controlled and ecologically valid test data. Model performance predicting perceived stress when using all available sensor modalities was consistent with expectations (accuracy = 0.85) and outperformed models with access to only a subset of the signals. We discuss and address sensing challenges that arise in real-world settings but not in conventional lab environments.

algorithm, artificial intelligence, machine learning, (15 more...)

2504.21242

Country: North America > United States > California (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)
Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.90)

Technology:

Information Technology > Communications (0.94)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

Soltaniani, Farnaz, Ghafari, Mohammad, Sayagh, Mohammed

Security Bug Report Prediction Within and Across Projects: A Comparative Study of BERT and Random Forest

Early detection of security bug reports (SBRs) is crucial for preventing vulnerabilities and ensuring system reliability. While machine learning models have been developed for SBR prediction, their predictive performance still has room for improvement. In this study, we conduct a comprehensive comparison between BERT and Random Forest (RF), a competitive baseline for predicting SBRs. The results show that RF outperforms BERT with a 34% higher average G-measure for within-project predictions. Adding only SBRs from various projects improves both models' average performance. However, including both security and non-security bug reports significantly reduces RF's average performance to 46%, while boosting BERT to its best average performance of 66%, surpassing RF. In cross-project SBR prediction, BERT achieves a remarkable 62% G-measure, which is substantially higher than RF.
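The abstract reports results as G-measure without defining it. A minimal sketch, assuming the definition commonly used in SBR prediction work, namely the harmonic mean of recall and one minus the false-alarm rate; the function name and the confusion-matrix counts below are illustrative and not taken from the study.

```python
def g_measure(tp: int, fp: int, tn: int, fn: int) -> float:
    """Harmonic mean of recall and specificity (1 - false-alarm rate)."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    if recall + specificity == 0:
        return 0.0
    return 2 * recall * specificity / (recall + specificity)


# Illustrative counts: 10 security reports (8 found), 100 non-security (10 false alarms).
score = g_measure(tp=8, fp=10, tn=90, fn=2)
print(round(score, 3))
```

Because it is a harmonic mean, the score stays low unless both recall and specificity are high, which suits the strong class imbalance typical of security bug reports.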

large language model, machine learning, natural language, (19 more...)

2504.21037

Country: North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
(2 more...)

arXiv.org Artificial Intelligence Apr-30-2025

A Generative-AI-Driven Claim Retrieval System Capable of Detecting and Retrieving Claims from Social Media Platforms in Multiple Languages

Vykopal, Ivan, Hyben, Martin, Moro, Robert, Gregor, Michal, Simko, Jakub

Online disinformation poses a global challenge, placing significant demands on fact-checkers who must verify claims efficiently to prevent the spread of false information. A major issue in this process is the redundant verification of already fact-checked claims, which increases workload and delays responses to newly emerging claims. This research introduces an approach that retrieves previously fact-checked claims, evaluates their relevance to a given input, and provides supplementary information to support fact-checkers. Our method employs large language models (LLMs) to filter irrelevant fact-checks and generate concise summaries and explanations, enabling fact-checkers to assess more quickly whether a claim has been verified before. In addition, we evaluate our approach through both automatic and human assessments, where humans interact with the developed tool to review its effectiveness. Our results demonstrate that LLMs are able to filter out many irrelevant fact-checks and, therefore, reduce effort and streamline the fact-checking process.

large language model, machine learning, natural language, (21 more...)

2504.20668

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > France (0.04)
Europe > Czechia > South Moravian Region > Brno (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Media > News (0.66)
Information Technology (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.50)

Kapar, Jan, Koenen, Niklas, Jullum, Martin

What's Wrong with Your Synthetic Tabular Data? Using Explainable AI to Evaluate Generative Models

arXiv.org Machine Learning Apr-30-2025

Evaluating synthetic tabular data is challenging, since they can differ from the real data in so many ways. There exist numerous metrics of synthetic data quality, ranging from statistical distances to predictive performance, often providing conflicting results. Moreover, they fail to explain or pinpoint the specific weaknesses in the synthetic data. To address this, we apply explainable AI (XAI) techniques to a binary detection classifier trained to distinguish real from synthetic data. While the classifier identifies distributional differences, XAI concepts such as feature importance and feature effects, analyzed through methods like permutation feature importance, partial dependence plots, Shapley values and counterfactual explanations, reveal why synthetic data are distinguishable, highlighting inconsistencies, unrealistic dependencies, or missing patterns. This interpretability increases transparency in synthetic data evaluation and provides deeper insights beyond conventional metrics, helping diagnose and improve synthetic data quality. We apply our approach to two tabular datasets and generative models, showing that it uncovers issues overlooked by standard evaluation techniques.
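The core idea, training a detector to separate real from synthetic rows and then asking which features it relies on, can be sketched with permutation feature importance. This is a minimal sketch under stated assumptions: a nearest-centroid detector stands in for whatever classifier the authors use, the data are simulated so that only the second feature is badly generated, and importance is measured on the training rows for brevity (a held-out split would be the usual choice). All names are hypothetical.

```python
import random


def fit_centroids(rows, labels):
    """Nearest-centroid detector: mean feature vector per class (0 = real, 1 = synthetic)."""
    cents = {}
    for c in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == c]
        cents[c] = [sum(col) / len(members) for col in zip(*members)]
    return cents


def predict(cents, row):
    dist = {c: sum((a - b) ** 2 for a, b in zip(row, m)) for c, m in cents.items()}
    return min(dist, key=dist.get)


def accuracy(cents, rows, labels):
    return sum(predict(cents, r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(cents, rows, labels, rng):
    """Accuracy drop after shuffling each feature column in turn."""
    base = accuracy(cents, rows, labels)
    drops = []
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        rng.shuffle(col)
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, col)]
        drops.append(base - accuracy(cents, shuffled, labels))
    return drops


rng = random.Random(0)
# Simulated data: feature 0 matches the real distribution, feature 1 is shifted.
real = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
synthetic = [[rng.gauss(0, 1), rng.gauss(3, 1)] for _ in range(200)]
rows = real + synthetic
labels = [0] * 200 + [1] * 200

cents = fit_centroids(rows, labels)
importance = permutation_importance(cents, rows, labels, rng)
print(importance)
```

A large accuracy drop for a feature indicates that the detector depends on it to tell synthetic from real rows, which points to where the generator's output is unrealistic.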

artificial intelligence, machine learning, natural language, (18 more...)

2504.20687

Country:

Europe > Germany > Bremen > Bremen (0.14)
North America > United States (0.04)
Europe > Norway > Eastern Norway > Oslo (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
(4 more...)

Yang, Kun, Krishnan, Nikhil, Kulkarni, Sanjeev R.

Financial Data Analysis with Robust Federated Logistic Regression

arXiv.org Machine Learning Apr-30-2025

Financial data analysis plays a pivotal role in today's business landscape [1, 2, 3, 4, 5, 6, 7], including credit risk assessment (such as loan prediction and credit scoring), fraud detection, and cost optimization. However, when we develop solutions to address financial problems, we inevitably encounter a number of key challenges [1, 2, 3, 4, 5]. For example, financial data is often voluminous, dynamically and frequently generated in real time, and distributed across diverse locations, making it challenging to process and analyze in a centralized manner [1]; the New York Stock Exchange (NYSE) alone has billions of transactions per day. Similarly, other major exchanges, such as the Shanghai Stock Exchange (SSE) and the London Stock Exchange (LSE), also generate vast amounts of stock data. Additionally, noise and missing values unavoidably occur in financial data, which can cause results and predictions to be skewed (or even completely wrong). These challenges require firms to come up with more efficient and smarter solutions. In recent decades, machine learning has achieved remarkable success across various domains [8, 9, 10], owing to its effective generalization ability and adaptability, and has also received increasing attention in financial data analysis [11, 12], such as credit risk assessment, resource allocation, and cost optimization. However, these classical (supervised) machine learning based solutions, such as logistic regression and random forest, usually implicitly assume that 1) all the data is stored and centralized at one location, typically a single machine, and that we have full access to the entire data; 2) these algorithms expect to run on a single machine with minimal concerns for memory or disk storage limitations; and 3) the provided data is clean and free from outliers introduced by malicious adversaries, as it is stored at a single location equipped with high security protection mechanisms to prevent data corruption. Nonetheless, these assumptions do not always hold in practice.

artificial intelligence, machine learning, outlier, (13 more...)

2504.2025

Country:

Europe > United Kingdom > England > Greater London > London > City of London (0.24)
Asia > China > Shanghai > Shanghai (0.24)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
(5 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance > Credit (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Guimard, Quentin, D'Incà, Moreno, Mancini, Massimiliano, Ricci, Elisa

Classifier-to-Bias: Toward Unsupervised Automatic Bias Detection for Visual Classifiers

arXiv.org Artificial Intelligence Apr-30-2025

A person downloading a pre-trained model from the web should be aware of its biases. Existing approaches for bias identification rely on datasets containing labels for the task of interest, something that a non-expert may not have access to, or may not have the necessary resources to collect: this greatly limits the number of tasks where model biases can be identified. In this work, we present Classifier-to-Bias (C2B), the first bias discovery framework that works without access to any labeled data: it only relies on a textual description of the classification task to identify biases in the target classification model. This description is fed to a large language model to generate bias proposals and corresponding captions depicting biases together with task-specific target labels. A retrieval model collects images for those captions, which are then used to assess the accuracy of the model w.r.t. the given biases. C2B is training-free, does not require any annotations, has no constraints on the list of biases, and can be applied to any pre-trained model on any classification task. Experiments on two publicly available datasets show that C2B discovers biases beyond those of the original datasets and outperforms a recent state-of-the-art bias detection baseline that relies on task-specific annotations, being a promising first step toward addressing task-agnostic unsupervised bias detection.

large language model, machine learning, natural language, (21 more...)

2504.20902

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Dell'Anna, Stefano, Montibeller, Andrea, Boato, Giulia

TrueFake: A Real World Case Dataset of Last Generation Fake Images also Shared on Social Networks

arXiv.org Artificial Intelligence Apr-30-2025

AI-generated synthetic media are increasingly used in real-world scenarios, often with the purpose of spreading misinformation and propaganda through social media platforms, where compression and other processing can degrade fake detection cues. Currently, many forensic tools fail to account for these in-the-wild challenges. In this work, we introduce TrueFake, a large-scale benchmarking dataset of 600,000 images, covering top-notch generative techniques and sharing via three different social networks. This dataset allows for rigorous evaluation of state-of-the-art fake image detectors under very realistic and challenging conditions. Through extensive experimentation, we analyze how social media sharing impacts detection performance, and identify the currently most effective detection and training strategies. Our findings highlight the need for evaluating forensic models in conditions that mirror real-world use. In recent years, AI-generated media (such as images, videos, and audio) have increasingly become part of everyday life [3], becoming widely used in the entertainment industry, including movie production and advertising. The literature provides a broad range of AI media generators capable of producing hyper-realistic images [4], [5], videos [6], and even audio [7].

artificial intelligence, detector, machine learning, (18 more...)

2504.20658

Country: Europe > Italy (0.14)

Genre: Research Report > New Finding (0.88)

Industry:

Information Technology > Services (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)