AITopics

2302.11049

Country: North America > United States > California > San Francisco County > San Francisco (0.15)

Genre: Research Report (0.40)

Industry:

Transportation > Air (1.00)
Aerospace & Defense (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Pale, Una, Teijeiro, Tomas, Atienza, David

Importance of methodological choices in data manipulation for validating epileptic seizure detection models

arXiv.org Artificial IntelligenceFeb-21-2023

Epilepsy is a chronic neurological disorder that affects a significant portion of the human population and imposes serious risks in the daily life of patients. Despite advances in machine learning and IoT, small, nonstigmatizing wearable devices for continuous monitoring and detection in outpatient environments are not yet available. Part of the reason is the complexity of epilepsy itself, including highly imbalanced data, multimodal nature, and very subject-specific signatures. However, another problem is the heterogeneity of methodological approaches in research, leading to slower progress, difficulty comparing results, and low reproducibility. Therefore, this article identifies a wide range of methodological decisions that must be made and reported when training and evaluating the performance of epilepsy detection systems. We characterize the influence of individual choices using a typical ensemble random-forest model and the publicly available CHB-MIT database, providing a broader picture of each decision and giving good-practice recommendations, based on our experience, where possible.

artificial intelligence, machine learning, seizure, (17 more...)

2302.10672

Country: Europe (0.46)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology > Epilepsy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)

Dyrland, K., Lundervold, A. S., Mana, P. G. L. Porta

Does the evaluation stand up to evaluation? A first-principle approach to the evaluation of classifiers

arXiv.org Artificial IntelligenceFeb-21-2023

How can one meaningfully make a measurement, if the meter does not conform to any standard and its scale expands or shrinks depending on what is measured? In the present work it is argued that current evaluation practices for machine-learning classifiers are affected by this kind of problem, leading to negative consequences when classifiers are put to real use; consequences that could have been avoided. It is proposed that evaluation be grounded on Decision Theory, and the implications of such foundation are explored. The main result is that every evaluation metric must be a linear combination of confusion-matrix elements, with coefficients - "utilities" - that depend on the specific classification problem. For binary classification, the space of such possible metrics is effectively two-dimensional. It is shown that popular metrics such as precision, balanced accuracy, Matthews Correlation Coefficient, Fowlkes-Mallows index, F1-measure, and Area Under the Curve are never optimal: they always give rise to an in-principle avoidable fraction of incorrect evaluations. This fraction is even larger than would be caused by the use of a decision-theoretic metric with moderately wrong coefficients.

artificial intelligence, machine learning, matrix, (16 more...)

2302.12006

Country:

North America > United States (0.47)
Europe (0.46)

Genre:

Research Report (0.82)
Instructional Material (0.67)

Industry: Health & Medicine > Diagnostic Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Poyatos, Javier, Molina, Daniel, Martínez, Aitor, Del Ser, Javier, Herrera, Francisco

Multiobjective Evolutionary Pruning of Deep Neural Networks with Transfer Learning for improving their Performance and Robustness

Evolutionary Computation algorithms have been used to solve optimization problems in relation with architectural, hyper-parameter or training configuration, forging the field known today as Neural Architecture Search. These algorithms have been combined with other techniques such as the pruning of Neural Networks, which reduces the complexity of the network, and the Transfer Learning, which lets the import of knowledge from another problem related to the one at hand. The usage of several criteria to evaluate the quality of the evolutionary proposals is also a common case, in which the performance and complexity of the network are the most used criteria. This work proposes MO-EvoPruneDeepTL, a multi-objective evolutionary pruning algorithm. \proposal uses Transfer Learning to adapt the last layers of Deep Neural Networks, by replacing them with sparse layers evolved by a genetic algorithm, which guides the evolution based in the performance, complexity and robustness of the network, being the robustness a great quality indicator for the evolved models. We carry out different experiments with several datasets to assess the benefits of our proposal. Results show that our proposal achieves promising results in all the objectives, and direct relation are presented among them. The experiments also show that the most influential neurons help us explain which parts of the input images are the most relevant for the prediction of the pruned neural network. Lastly, by virtue of the diversity within the Pareto front of pruning patterns produced by the proposal, it is shown that an ensemble of differently pruned models improves the overall performance and robustness of the trained networks.

artificial intelligence, deep learning, machine learning, (18 more...)

2302.10253

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Italy (0.04)
(8 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Schumann, Julian Frederik, Kober, Jens, Zgonnikov, Arkady

Benchmark for Models Predicting Human Behavior in Gap Acceptance Scenarios

Autonomous vehicles currently suffer from a time-inefficient driving style caused by uncertainty about human behavior in traffic interactions. Accurate and reliable prediction models enabling more efficient trajectory planning could make autonomous vehicles more assertive in such interactions. However, the evaluation of such models is commonly oversimplistic, ignoring the asymmetric importance of prediction errors and the heterogeneity of the datasets used for testing. We examine the potential of recasting interactions between vehicles as gap acceptance scenarios and evaluating models in this structured environment. To that end, we develop a framework aiming to facilitate the evaluation of any model, by any metric, and in any scenario. We then apply this framework to state-of-the-art prediction models, which all show themselves to be unreliable in the most safety-critical situations.

data mining, machine learning, prediction, (15 more...)

doi: 10.1109/TIV.2023.3244280

2211.05455

Country:

Europe > Netherlands > South Holland > Delft (0.05)
North America > United States > California (0.04)
Europe > Ireland > Connaught > County Galway > Galway (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (0.88)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Zhou, Sheng, Blanchart, Pierre, Crucianu, Michel, Ferecatu, Marin

Why is the prediction wrong? Towards underfitting case explanation via meta-classification

In this paper we present a heuristic method to provide individual explanations for those elements in a dataset (data points) which are wrongly predicted by a given classifier. Since the general case is too difficult, in the present work we focus on faulty data from an underfitted model. First, we project the faulty data into a hand-crafted, and thus human readable, intermediate representation (meta-representation, profile vectors), with the aim of separating the two main causes of miss-classification: the classifier is not strong enough, or the data point belongs to an area of the input space where classes are not separable. Second, in the space of these profile vectors, we present a method to fit a meta-classifier (decision tree) and express its output as a set of interpretable (human readable) explanation rules, which leads to several target diagnosis labels: data point is either correctly classified, or faulty due to a too weak model, or faulty due to mixed (overlapped) classes in the input space. Experimental results on several real datasets show more than 80% diagnosis label accuracy and confirm that the proposed intermediate representation allows to achieve a high degree of invariance with respect to the classifier used in the input space and to the dataset being classified, i.e. we can learn the metaclassifier on a dataset with a given classifier and successfully predict diagnosis labels for a different dataset or classifier (or both).

classifier, data mining, machine learning, (18 more...)

doi: 10.1109/DSAA54385.2022.10032332

2302.09952

Country:

Oceania > Australia (0.14)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Søgaard, Anders, Hershcovich, Daniel, de Lhoneux, Miryam

A Two-Sided Discussion of Preregistration of NLP Research

Van Miltenburg et al. (2021) suggest NLP research should adopt preregistration to prevent fishing expeditions and to promote publication of negative results. At face value, this is a very reasonable suggestion, seemingly solving many methodological problems with NLP research. We discuss pros and cons -- some old, some new: a) Preregistration is challenged by the practice of retrieving hypotheses after the results are known; b) preregistration may bias NLP toward confirmatory research; c) preregistration must allow for reclassification of research as exploratory; d) preregistration may increase publication bias; e) preregistration may increase flag-planting; f) preregistration may increase p-hacking; and finally, g) preregistration may make us less risk tolerant. We cast our discussion as a dialogue, presenting both sides of the debate.

machine learning, natural language, preregistration, (18 more...)

2302.10086

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Finland > Uusimaa > Helsinki (0.04)
North America > United States > New York > New York County > New York City (0.04)
(11 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.94)

Industry:

Health & Medicine > Epidemiology (0.47)
Health & Medicine > Pharmaceuticals & Biotechnology (0.47)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)

Badian, Yael, Ophir, Yaakov, Tikochinski, Refael, Calderon, Nitay, Klomek, Anat Brunstein, Reichart, Roi

A Picture May Be Worth a Thousand Lives: An Interpretable Artificial Intelligence Strategy for Predictions of Suicide Risk from Social Media Images

arXiv.org Artificial IntelligenceFeb-19-2023

The promising research on Artificial Intelligence usages in suicide prevention has principal gaps, including black box methodologies, inadequate outcome measures, and scarce research on non-verbal inputs, such as social media images (despite their popularity today, in our digital era). This study addresses these gaps and combines theory-driven and bottom-up strategies to construct a hybrid and interpretable prediction model of valid suicide risk from images. The lead hypothesis was that images contain valuable information about emotions and interpersonal relationships, two central concepts in suicide-related treatments and theories. The dataset included 177,220 images by 841 Facebook users who completed a gold-standard suicide scale. The images were represented with CLIP, a state-of-the-art algorithm, which was utilized, unconventionally, to extract predefined features that served as inputs to a simple logistic-regression prediction model (in contrast to complex neural networks). The features addressed basic and theory-driven visual elements using everyday language (e.g., bright photo, photo of sad people). The results of the hybrid model (that integrated theory-driven and bottom-up methods) indicated high prediction performance that surpassed common bottom-up algorithms, thus providing a first proof that images (alone) can be leveraged to predict validated suicide risk. Corresponding with the lead hypothesis, at-risk users had images with increased negative emotions and decreased belonginess. The results are discussed in the context of non-verbal warning signs of suicide. Notably, the study illustrates the advantages of hybrid models in such complicated tasks and provides simple and flexible prediction strategies that could be utilized to develop real-life monitoring tools of suicide.

artificial intelligence, machine learning, suicide risk, (17 more...)

2302.09488

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Texas > Kleberg County (0.04)
North America > United States > Texas > Chambers County (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Shaer, Shalev, Maman, Gal, Romano, Yaniv

Model-X Sequential Testing for Conditional Independence via Testing by Betting

arXiv.org Machine LearningFeb-19-2023

This paper develops a model-free sequential test for conditional independence. The proposed test allows researchers to analyze an incoming i.i.d. data stream with any arbitrary dependency structure, and safely conclude whether a feature is conditionally associated with the response under study. We allow the processing of data points online, as soon as they arrive, and stop data acquisition once significant results are detected, rigorously controlling the type-I error rate. Our test can work with any sophisticated machine learning algorithm to enhance data efficiency to the extent possible. The developed method is inspired by two statistical frameworks. The first is the model-X conditional randomization test, a test for conditional independence that is valid in offline settings where the sample size is fixed in advance. The second is testing by betting, a ``game-theoretic'' approach for sequential hypothesis testing. We conduct synthetic experiments to demonstrate the advantage of our test over out-of-the-box sequential tests that account for the multiplicity of tests in the time horizon, and demonstrate the practicality of our proposal by applying it to real-world tasks.

artificial intelligence, information technology 0, machine learning, (13 more...)

arXiv.org Machine Learning

2210.00354

Country:

Asia > Middle East > Israel (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

arXiv.org Artificial IntelligenceFeb-19-2023

Text Classification in the Wild: a Large-scale Long-tailed Name Normalization Dataset

Qi, Jiexing, Li, Shuhao, Guo, Zhixin, Huang, Yusheng, Zhou, Chenghu, Zhang, Weinan, Wang, Xinbing, Lin, Zhouhan

Real-world data usually exhibits a long-tailed distribution,with a few frequent labels and a lot of few-shot labels. The study of institution name normalization is a perfect application case showing this phenomenon. There are many institutions worldwide with enormous variations of their names in the publicly available literature. In this work, we first collect a large-scale institution name normalization dataset LoT-insts1, which contains over 25k classes that exhibit a naturally long-tailed distribution. In order to isolate the few-shot and zero-shot learning scenarios from the massive many-shot classes, we construct our test set from four different subsets: many-, medium-, and few-shot sets, as well as a zero-shot open set. We also replicate several important baseline methods on our data, covering a wide range from search-based methods to neural network methods that use the pretrained BERT model. Further, we propose our specially pretrained, BERT-based model that shows better out-of-distribution generalization on few-shot and zero-shot test sets. Compared to other datasets focusing on the long-tailed phenomenon, our dataset has one order of magnitude more training data than the largest existing long-tailed datasets and is naturally long-tailed rather than manually synthesized. We believe it provides an important and different scenario to study this problem. To our best knowledge, this is the first natural language dataset that focuses on long-tailed and open-set classification problems.

large language model, machine learning, natural language, (21 more...)

2302.09509

Country:

North America > United States > New Jersey > Burlington County (0.15)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Ohio (0.05)
(6 more...)

Genre:

Research Report (0.50)
Workflow (0.46)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)