AITopics | wilcoxon test

wilcoxon test

Learning Generalizable Models for Vehicle Routing Problems via Knowledge Distillation (Appendix): A. Details of the considered distributions

Neural Information Processing Systems, Aug-18-2025, 22:19:23 GMT

In this paper, we consider various distributions for the node coordinates in VRPs, following which we randomly generate instances for both training and testing. Below we present details on how to generate those instances. It considers uniformly distributed nodes. An exemplary instance is displayed in Figure 1(i). It considers a mixture of the two distributions above, each with half of the nodes.

artificial intelligence, exemplar distribution, machine learning, (17 more...)

Genre: Research Report (0.30)

Industry: Transportation > Freight & Logistics Services (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.94)

Vision language models are unreliable at trivial spatial cognition

Khemlani, Sangeet, Tran, Tyler, Gyory, Nathaniel, Harrison, Anthony M., Lawson, Wallace E., Thielstrom, Ravenna, Thompson, Hunter, Singh, Taaren, Trafton, J. Gregory

arXiv.org Artificial Intelligence, Apr-23-2025

Vision language models (VLMs) are designed to extract relevant visuospatial information from images. Some research suggests that VLMs can exhibit humanlike scene understanding, while other investigations reveal difficulties in their ability to process relational information. To achieve widespread applicability, VLMs must perform reliably, yielding comparable competence across a wide variety of related tasks. We sought to test how reliable these architectures are at engaging in trivial spatial cognition, e.g., recognizing whether one object is left of another in an uncluttered scene. We developed a benchmark dataset -- TableTest -- whose images depict 3D scenes of objects arranged on a table, and used it to evaluate state-of-the-art VLMs. Results show that performance could be degraded by minor variations of prompts that use logically equivalent descriptions. These analyses suggest limitations in how VLMs may reason about spatial relations in real-world applications. They also reveal novel opportunities for bolstering image caption corpora for more efficient training and testing.

large language model, machine learning, relation, (19 more...)

arXiv:2504.16061

Country: North America > United States (0.93)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine (0.46)
Government > Military > Navy (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

Analyzing Human Perceptions of a MEDEVAC Robot in a Simulated Evacuation Scenario

Jordan, Tyson, Pandey, Pranav, Doshi, Prashant, Parasuraman, Ramviyas, Goodie, Adam

arXiv.org Artificial Intelligence, Oct-29-2024

The use of autonomous systems in medical evacuation (MEDEVAC) scenarios is promising, but existing implementations overlook key insights from human-robot interaction (HRI) research. Studies on human-machine teams demonstrate that human perceptions of a machine teammate are critical in governing the machine's performance. Here, we present a mixed factorial design to assess human perceptions of a MEDEVAC robot in a simulated evacuation scenario. Participants were assigned to the role of casualty (CAS) or bystander (BYS) and subjected to three within-subjects conditions based on the MEDEVAC robot's operating mode: autonomous-slow (AS), autonomous-fast (AF), and teleoperation (TO). During each trial, a MEDEVAC robot navigated an 11-meter path, acquiring a casualty and transporting them to an ambulance exchange point while avoiding an idle bystander. Following each trial, subjects completed a questionnaire measuring their emotional states, perceived safety, and social compatibility with the robot. Results indicate a consistent main effect of operating mode on reported emotional states and perceived safety. Pairwise analyses suggest that the employment of the AF operating mode negatively impacted perceptions along these dimensions. There were no persistent differences between casualty and bystander responses.

medevac robot, perception, robot, (17 more...)

arXiv:2410.19072

Country:

North America > United States > Georgia > Clarke County > Athens (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Government > Military > Army (0.68)
Education (0.68)
Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Proactive and Reactive Constraint Programming for Stochastic Project Scheduling with Maximal Time-Lags

Houten, Kim van den, Planken, Léon, Freydell, Esteban, Tax, David M. J., de Weerdt, Mathijs

arXiv.org Artificial Intelligence, Sep-13-2024

This study investigates scheduling strategies for the stochastic resource-constrained project scheduling problem with maximal time lags (SRCPSP/max). Recent advances in Constraint Programming (CP) and Temporal Networks have renewed interest in evaluating the advantages and drawbacks of various proactive and reactive scheduling methods. First, we present a new, CP-based fully proactive method. Second, we show how a reactive approach can be constructed using an online rescheduling procedure. A third contribution is based on partial order schedules and uses Simple Temporal Networks with Uncertainty (STNUs). Our statistical analysis shows that the STNU-based algorithm performs best in terms of solution quality, while also showing good relative offline and online computation time.

proactive and reactive constraint programming, saa, stnu, (11 more...)

arXiv:2409.09107

Country: Europe > Ireland > Munster > County Cork > Cork (0.04)

Genre:

Research Report > Experimental Study (0.49)
Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)

Too Good to be True? Turn Any Model Differentially Private With DP-Weights

Zagardo, David

arXiv.org Artificial Intelligence, Jun-27-2024

Imagine training a machine learning model with Differentially Private Stochastic Gradient Descent (DP-SGD), only to discover post-training that the noise level was either too high, crippling your model's utility, or too low, compromising privacy. The dreaded realization hits: you must start the lengthy training process from scratch. But what if you could avoid this retraining nightmare? In this study, we introduce a groundbreaking approach (to our knowledge) that applies differential privacy noise to the model's weights after training. We offer a comprehensive mathematical proof for this novel approach's privacy bounds, use formal methods to validate its privacy guarantees, and empirically evaluate its effectiveness using membership inference attacks and performance evaluations. This method allows for a single training run, followed by post-hoc noise adjustments to achieve optimal privacy-utility trade-offs. We compare this novel fine-tuned model (DP-Weights model) to a traditional DP-SGD model, demonstrating that our approach yields statistically similar performance and privacy guarantees. Our results validate the efficacy of post-training noise application, promising significant time savings and flexibility in fine-tuning differential privacy parameters, making it a practical alternative for deploying differentially private models in real-world scenarios.
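The idea of adjusting noise after a single training run can be illustrated with a minimal sketch. The abstract does not give the paper's noise calibration, so this sketch swaps in the classic Gaussian mechanism (sigma = sensitivity * sqrt(2 ln(1.25/delta)) / epsilon, valid for 0 < epsilon < 1) as a stand-in. The `sensitivity` value, the weight vector, and the function names are hypothetical and chosen only for illustration; this is not the DP-Weights method itself.

```python
import math
import random


def gaussian_sigma(epsilon, delta, sensitivity):
    """Noise scale of the classic Gaussian mechanism (assumes 0 < epsilon < 1)."""
    if not 0 < epsilon < 1:
        raise ValueError("this calibration assumes 0 < epsilon < 1")
    if not 0 < delta < 1:
        raise ValueError("delta must lie in (0, 1)")
    return sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon


def add_noise(weights, epsilon, delta, sensitivity, seed=None):
    """Return a noisy copy of the weights; the trained weights are left untouched."""
    rng = random.Random(seed)
    sigma = gaussian_sigma(epsilon, delta, sensitivity)
    return [w + rng.gauss(0.0, sigma) for w in weights]


# Hypothetical trained weights and an assumed L2 sensitivity bound.
trained = [0.12, -0.40, 0.33, 0.08]
SENSITIVITY = 0.01  # assumption: not derived from any real model

# Two privacy levels from the same trained weights, with no retraining.
noisy_strict = add_noise(trained, epsilon=0.1, delta=1e-5, sensitivity=SENSITIVITY, seed=0)
noisy_loose = add_noise(trained, epsilon=0.9, delta=1e-5, sensitivity=SENSITIVITY, seed=0)
```

Because the noise is applied to a copy, the same trained weights can be reused to try several epsilon values, which is the trade-off exploration the abstract describes.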

coefficient, pearson correlation, wilcoxon test, (13 more...)

arXiv:2406.19507

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.97)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)

An Approach to Multiple Comparison Benchmark Evaluations that is Stable Under Manipulation of the Comparate Set

Ismail-Fawaz, Ali, Dempster, Angus, Tan, Chang Wei, Herrmann, Matthieu, Miller, Lynn, Schmidt, Daniel F., Berretti, Stefano, Weber, Jonathan, Devanne, Maxime, Forestier, Germain, Webb, Geoffrey I.

arXiv.org Artificial Intelligence, May-19-2023

The measurement of progress using benchmark evaluations is ubiquitous in computer science and machine learning. However, common approaches to analyzing and presenting the results of benchmark comparisons of multiple algorithms over multiple datasets, such as the critical difference diagram introduced by Demšar (2006), have important shortcomings and, we show, are open to both inadvertent and intentional manipulation. To address these issues, we propose a new approach to presenting the results of benchmark comparisons, the Multiple Comparison Matrix (MCM), that prioritizes pairwise comparisons and precludes the means of manipulating experimental results in existing approaches. MCM can be used to show the results of an all-pairs comparison, or to show the results of a comparison between one or more selected algorithms and the state of the art. MCM is implemented in Python and is publicly available.
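Why a pairwise comparison is unaffected by which other algorithms are included can be shown with a small sketch. It is not the MCM implementation: the abstract does not say which statistics MCM reports, so the mean difference, win/tie/loss counts, and a Wilcoxon signed-rank test (normal approximation, no tie correction in the variance) are assumptions, and all function names and scores are hypothetical.

```python
import math
from itertools import combinations


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test via the normal approximation.

    Zero differences are dropped and tied absolute differences share the
    average rank. Returns (W, p), where W is the smaller signed rank sum.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w, min(p, 1.0)


def pairwise_comparison(scores):
    """Compare every pair of algorithms using only that pair's own scores."""
    results = {}
    for a, b in combinations(sorted(scores), 2):
        xa, xb = scores[a], scores[b]
        w, p = wilcoxon_signed_rank(xa, xb)
        results[(a, b)] = {
            "mean_diff": sum(u - v for u, v in zip(xa, xb)) / len(xa),
            "wins": sum(u > v for u, v in zip(xa, xb)),
            "ties": sum(u == v for u, v in zip(xa, xb)),
            "losses": sum(u < v for u, v in zip(xa, xb)),
            "W": w,
            "p": p,
        }
    return results


# Hypothetical accuracy scores (percent) on six datasets.
scores = {"A": [90, 80, 85, 70, 95, 60], "B": [85, 80, 80, 75, 90, 50]}
pair_only = pairwise_comparison(scores)
with_c = pairwise_comparison({**scores, "C": [70, 75, 72, 68, 80, 55]})
```

Each cell is computed from two score lists only, so adding or removing algorithm `C` leaves the `("A", "B")` entry unchanged. Rank-based summaries across all algorithms do not have this property.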

artificial intelligence, data mining, machine learning, (14 more...)

arXiv:2305.11921

Country: Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Interpretable ML-driven Strategy for Automated Trading Pattern Extraction

Sokolovsky, Artur, Arnaboldi, Luca, Bacardit, Jaume, Gross, Thomas

arXiv.org Artificial Intelligence, Mar-23-2021

Financial markets are a source of non-stationary multidimensional time series that have been drawing attention for decades. Each financial instrument has its own properties that change over time, making its analysis a complex task. Improving the understanding of financial time series and developing methods for their analysis is essential for successful operation on financial markets. In this study we propose a volume-based data pre-processing method for making financial time series more suitable for machine learning pipelines. We use a statistical approach for assessing the performance of the method. Namely, we formally state the hypotheses, set up associated classification tasks, compute effect sizes with confidence intervals, and run statistical tests to validate the hypotheses. We additionally assess the trading performance of the proposed method on historical data and compare it to a previously published approach. Our analysis shows that the proposed volume-based method allows successful classification of the financial time series patterns, and also leads to better classification performance than a price action-based method, excelling specifically on more liquid financial instruments. Finally, we propose an approach for obtaining feature interactions directly from tree-based models, using the CatBoost estimator as an example, and formally assess the relatedness of the proposed approach and SHAP feature interactions, with a positive outcome.

configuration, effect size, instrument, (15 more...)

arXiv:2103.12419

Country: Europe > United Kingdom > England > Tyne and Wear > Newcastle (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)
Information Technology > Data Science > Data Quality > Data Cleaning (0.34)

Comparison of Atom Representations in Graph Neural Networks for Molecular Property Prediction

Pocha, Agnieszka, Danel, Tomasz, Maziarka, Łukasz

arXiv.org Machine Learning, Nov-23-2020

Graph neural networks have recently become a standard method for analysing chemical compounds. In the field of molecular property prediction, the emphasis is now put on designing new model architectures, and the importance of atom featurisation is oftentimes belittled. When contrasting two graph neural networks, the use of different atom features possibly leads to the incorrect attribution of the results to the network architecture. To provide a better understanding of this issue, we compare multiple atom representations for graph models and evaluate them on the prediction of free energy, solubility, and metabolic stability. To the best of our knowledge, this is the first methodological study that focuses on the relevance of atom representation to the predictive performance of graph neural networks.

dataset, molecule, representation, (17 more...)

arXiv:2012.04444

Country: Europe > Poland (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.48)
Materials > Chemicals (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Bayes metaclassifier and Soft-confusion-matrix classifier in the task of multi-label classification

Trajdos, Pawel, Majak, Marcin

arXiv.org Machine Learning, Jan-25-2019

The aim of this paper was to compare the soft confusion matrix approach and the Bayes metaclassifier under the multi-label classification framework. Although the methods have been successfully applied under the multi-label classification framework, they have not been compared directly thus far. Such a comparison is of vital importance because the two methods are quite similar: both are based on the concept of a randomized reference classifier. Since both algorithms were designed to deal with single-label problems, they are combined with the problem-transformation approach to multi-label classification. The present study included 29 benchmark datasets and four different base classifiers. The algorithms were compared in terms of 11 quality criteria, and the results were subjected to statistical analysis.

base classifier, classification, classifier, (14 more...)

arXiv:1901.08827

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Newton (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)

Dynamic Interaction Mechanics CrossAnt

Gomes, Samuel, Martinho, Carlos

arXiv.org Artificial Intelligence, Nov-17-2018

Nowadays, considerable effort is being put into studying gamification and what gamified applications can do to engage players. Accordingly, aspects such as the impact of social game mechanics are being investigated. In this work, we focus on the generation of certain types of interaction mechanics to lead players to achieve what we think are the three basic types of social interactions: cooperation, competition and individual exploration. This was done by adapting a game called CrossAnt so that certain interaction mechanics could be generated at certain moments. Our results show that although cooperation could be promoted, longer interactions may be needed so that the other types of behavior can emerge.

artificial intelligence, interaction, social behavior, (15 more...)

arXiv:1811.07243

Country: Europe > Portugal (0.15)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology: Information Technology > Artificial Intelligence > Cognitive Science (0.35)
