AITopics

2006.03487

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Virginia (0.04)

Genre: Research Report (0.50)

Industry:

Government (0.68)
Law Enforcement & Public Safety (0.68)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.74)

Vilone, Giulia, Longo, Luca

Explainable Artificial Intelligence: a Systematic Review

arXiv.org Artificial Intelligence, Jun-4-2020

This has led to the development of a plethora of domain-dependent and context-specific methods for dealing with the interpretation of machine learning (ML) models and the formation of explanations for humans. Unfortunately, this trend is far from being over, and the abundant knowledge in the field is scattered and needs organisation. The goal of this article is to systematically review research works in the field of XAI and to try to define some boundaries in the field. From several hundred research articles focused on the concept of explainability, about 350 were considered for review using the following search methodology. In a first phase, Google Scholar was queried to find papers related to "explainable artificial intelligence", "explainable machine learning" and "interpretable machine learning". Subsequently, the bibliographic sections of these articles were thoroughly examined to retrieve further relevant scientific studies. The first noticeable thing, as shown in figure 2 (a), is the distribution of the publication dates of selected research articles: sporadic in the 70s and 80s, receiving preliminary attention in the 90s, showing rising interest in 2000 and becoming a recognised body of knowledge after 2010. The first research concerned the development of an explanation-based system and its integration in a computer program designed to help doctors make diagnoses [3]. Some of the more recent papers focus on work devoted to the clustering of methods for explainability, motivating the need for organising the XAI literature [4, 5, 6].

machine learning, natural language, neural information processing system, (17 more...)

2006.00093

Country:

North America > United States > California > San Francisco County > San Francisco (0.28)
North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > New York > New York County > New York City (0.14)
(90 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
(5 more...)

arXiv.org Machine Learning, Jun-4-2020

COVID-19 diagnosis by routine blood tests using machine learning

Kukar, Matjaž, Gunčar, Gregor, Vovko, Tomaž, Podnar, Simon, Černelč, Peter, Brvar, Miran, Zalaznik, Mateja, Notar, Mateja, Moškon, Sašo, Notar, Marko

Physicians taking care of patients with coronavirus disease (COVID-19) have described different changes in routine blood parameters. However, these changes hinder them from performing COVID-19 diagnosis. We constructed a machine learning predictive model for COVID-19 diagnosis. The model was built and cross-validated on the routine blood tests of 5,333 patients with various bacterial and viral infections, and 160 COVID-19-positive patients. We selected an operational ROC point at a sensitivity of 81.9% and specificity of 97.9%. The cross-validated area under the curve (AUC) was 0.97. The five most useful routine blood parameters for COVID-19 diagnosis according to the feature importance scoring of the XGBoost algorithm were MCHC, eosinophil count, albumin, INR, and prothrombin activity percentage. t-SNE visualization showed that the blood parameters of the patients with a severe COVID-19 course are more like the parameters of bacterial than viral infection. The reported diagnostic accuracy is at least comparable and probably complementary to RT-PCR and chest CT studies. Patients with fever, cough, myalgia, and other symptoms can now have initial routine blood tests assessed by our diagnostic tool. All patients with a positive COVID-19 prediction would then undergo standard RT-PCR studies to confirm the diagnosis. We believe that our results present a significant contribution to improvements in COVID-19 diagnosis.
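The abstract above reports a chosen ROC operating point (sensitivity and specificity) and an AUC. As a rough illustration of how those three quantities relate to a classifier's scores, here is a minimal Python sketch. The toy labels, scores, and the 0.5 threshold are invented for illustration and have nothing to do with the study's data or its XGBoost model.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity), predicting positive when score >= threshold."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumption): 1 = disease present, 0 = disease absent.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]

sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
area = auc(labels, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={area:.3f}")
```

Moving the threshold trades sensitivity against specificity, while the AUC summarises ranking quality across all thresholds, which is why an abstract can quote both an operating point and an AUC.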

artificial intelligence, diagnosis, machine learning, (16 more...)

2006.03476

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.05)
(10 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Chandra, Andreas, Stefanus, Ruben

Experiments on Paraphrase Identification Using Quora Question Pairs Dataset

arXiv.org Artificial Intelligence, Jun-4-2020

We modeled the Quora question pairs dataset to identify similar questions. The dataset that we use is provided by Quora. The task is a binary classification. We tried several methods and algorithms, and approaches different from previous works. For feature extraction, we used Bag of Words, including Count Vectorizer and Term Frequency-Inverse Document Frequency with unigrams, for XGBoost and CatBoost. Furthermore, we also experimented with the WordPiece tokenizer, which improves the model performance significantly. We achieved up to 97 percent accuracy.
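To make the unigram TF-IDF features mentioned above concrete, the following Python sketch computes a textbook variant (raw term count times log(N / document frequency)) on two made-up questions. This is an illustrative assumption, not the authors' pipeline: the example questions are invented, tokenisation is a plain whitespace split, and library implementations typically use smoothed or normalised variants that give different numbers.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]  # unigram tokens
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term counts (the bag-of-words part)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


# Hypothetical question pair, not taken from the Quora dataset.
questions = ["how do i learn python", "how do i learn java"]
vectors = tfidf(questions)
print(vectors[0])
```

Terms shared by both questions receive zero weight under this variant, so only the distinguishing words ("python", "java") carry signal.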

computational linguistic, machine learning, natural language, (19 more...)

2006.02648

Country: North America > United States (0.34)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.69)

Los Angeles Times, Jun-3-2020, 13:48:00 GMT

Hype and science collide as FDA tries to rein in 'wild West' of COVID-19 blood tests

Save your business while saving lives,

antibody test, artificial intelligence, machine learning, (15 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.05)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.05)
North America > United States > Illinois > Cook County > Chicago (0.05)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Government > Regional Government > North America Government > United States Government > FDA (0.92)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.50)

arXiv.org Machine Learning, Jun-3-2020

Learning Multi-Modal Nonlinear Embeddings: Performance Bounds and an Algorithm

Kaya, Semih, Vural, Elif

While many approaches exist in the literature to learn representations for data collections in multiple modalities, the generalizability of the learnt representations to previously unseen data is a largely overlooked subject. In this work, we first present a theoretical analysis of learning multi-modal nonlinear embeddings in a supervised setting. Our performance bounds indicate that for successful generalization in multi-modal classification and retrieval problems, the regularity of the interpolation functions extending the embedding to the whole data space is as important as the between-class separation and cross-modal alignment criteria. We then propose a multi-modal nonlinear representation learning algorithm that is motivated by these theoretical findings, where the embeddings of the training samples are optimized jointly with the Lipschitz regularity of the interpolators. Experimental comparison to recent multi-modal and single-modal learning algorithms suggests that the proposed method yields promising performance in multi-modal image classification and cross-modal image-text retrieval applications.
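The abstract above ties generalization to the Lipschitz regularity of the interpolation functions. As a rough illustration of what that quantity measures, this Python sketch estimates an empirical lower bound on a function's Lipschitz constant from sample pairs. It is not the authors' algorithm; the functions `smooth` and `steep` and the sample points are hypothetical.

```python
import math
from itertools import combinations


def empirical_lipschitz(f, points):
    """Largest observed ratio ||f(x) - f(y)|| / ||x - y|| over all sample pairs.

    This is only a lower bound on the true Lipschitz constant of f.
    """
    best = 0.0
    for x, y in combinations(points, 2):
        dx = math.dist(x, y)
        if dx == 0.0:
            continue  # skip duplicate points
        best = max(best, math.dist(f(x), f(y)) / dx)
    return best


def smooth(x):
    """Hypothetical interpolator that contracts distances by a factor of 2."""
    return (0.5 * x[0], 0.5 * x[1])


def steep(x):
    """Hypothetical interpolator that stretches the first coordinate 5x."""
    return (5.0 * x[0], x[1])


samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(empirical_lipschitz(smooth, samples), empirical_lipschitz(steep, samples))
```

A smaller constant means nearby inputs map to nearby embeddings, which is the kind of regularity the abstract says matters alongside class separation and cross-modal alignment.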

artificial intelligence, data mining, machine learning, (21 more...)

2006.0233

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Republic of Türkiye > Ankara Province > Ankara (0.04)

Genre: Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)

Incorporating Physical Knowledge into Machine Learning for Planetary Space Physics

Azari, A. R., Lockhart, J. W., Liemohn, M. W., Jia, X.

Recent improvements in data collection volume from planetary and space physics missions have allowed the application of novel data science techniques. The Cassini mission, for example, collected over 600 gigabytes of scientific data from 2004 to 2017. This represents a surge of data on the Saturn system. Machine learning can help scientists work with data on this larger scale. Unlike many applications of machine learning, a primary use in planetary space physics applications is to infer behavior about the system itself. This raises three concerns: first, the performance of the machine learning model; second, the need for interpretable applications to answer scientific questions; and third, how characteristics of spacecraft data change these applications. In comparison to these concerns, uses of black box or un-interpretable machine learning methods tend toward evaluations of performance only, either ignoring the underlying physical process or, less often, providing misleading explanations for it. We build off a previous effort applying a semi-supervised physics-based classification of plasma instabilities in Saturn's magnetosphere. We then compare this previous effort with other machine learning classifiers given varying data size access and physical information access. We show that incorporating knowledge of these orbiting spacecraft data characteristics improves the performance and interpretability of machine learning methods, which is essential for deriving scientific meaning. Building on these findings, we present a framework for incorporating physics knowledge into machine learning problems targeting semi-supervised classification for space physics data in planetary environments. These findings present a path forward for incorporating physical knowledge into space physics and planetary mission data analyses for scientific discovery.

application, artificial intelligence, machine learning, (14 more...)

2006.01927

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (1.00)

Industry:

Transportation (0.71)
Government > Regional Government > North America Government > United States Government (0.68)
Government > Space Agency (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Cochrane, Courtney, Castineira, David, Shiban, Nisreen, Protopapas, Pavlos

Application of Machine Learning to Predict the Risk of Alzheimer's Disease: An Accurate and Practical Solution for Early Diagnostics

Alzheimer's Disease (AD) ravages the cognitive ability of more than 5 million Americans and creates an enormous strain on the health care system. This paper proposes a machine learning predictive model for AD development without medical imaging and with fewer clinical visits and tests, in hopes of earlier and cheaper diagnoses. Such earlier diagnoses could be critical to the effectiveness of any drug or medical treatment to cure this disease. Our model is trained and validated using demographic, biomarker and cognitive test data from two prominent research studies: Alzheimer's Disease Neuroimaging Initiative (ADNI) and Australian Imaging, Biomarker & Lifestyle Flagship Study of Aging (AIBL). We systematically explore different machine learning models, pre-processing methods and feature selection techniques. The most performant model demonstrates greater than 90% accuracy and recall in predicting AD, and the results generalize across sub-studies of ADNI and to the independent AIBL study. We also demonstrate that these results are robust to reducing the number of clinical visits or tests per visit. Using a metaclassification algorithm and longitudinal data analysis, we are able to produce a "lean" diagnostic protocol with only 3 tests and 4 clinical visits that can predict Alzheimer's development with 87% accuracy and 79% recall. This novel work can be adapted into a practical early diagnostic tool for predicting the development of Alzheimer's that maximizes accuracy while minimizing the number of necessary diagnostic tests and clinical visits.

alzheimer, artificial intelligence, machine learning, (17 more...)

2006.08702

Country:

North America > United States > New York (0.04)
Oceania > Australia (0.04)
North America > United States > New Jersey > Hudson County > Secaucus (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Schuler, Alejandro, Bhardwaj, Aashish, Liu, Vincent

Performance metrics for intervention-triggering prediction models do not reflect an expected reduction in outcomes from using the model

Clinical researchers often select among and evaluate risk prediction models using standard machine learning metrics based on confusion matrices. However, if these models are used to allocate interventions to patients, standard metrics calculated from retrospective data are only related to model utility (in terms of reductions in outcomes) under certain assumptions. When predictions are delivered repeatedly throughout time (e.g. in a patient encounter), the relationship between standard metrics and utility is further complicated. Several kinds of evaluations have been used in the literature, but it has not been clear what the target of estimation is in each evaluation. We synthesize these approaches, determine what is being estimated in each of them, and discuss under what assumptions those estimates are valid. We demonstrate our insights using simulated data as well as real data used in the design of an early warning system. Our theoretical and empirical results show that evaluations without interventional data either do not estimate meaningful quantities, require strong assumptions, or are limited to estimating best-case scenario bounds.

alert, artificial intelligence, machine learning, (18 more...)

2006.01752

Country: North America > United States > Florida > Palm Beach County > Boca Raton (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.73)

Tay, J. Kenneth, Aghaeepour, Nima, Hastie, Trevor, Tibshirani, Robert

Feature-weighted elastic net: using "features of features" for better prediction

In some supervised learning settings, the practitioner might have additional information on the features used for prediction. We propose a new method which leverages this additional information for better prediction. The method, which we call the feature-weighted elastic net ("fwelnet"), uses these "features of features" to adapt the relative penalties on the feature coefficients in the elastic net penalty. In our simulations, fwelnet outperforms the lasso in terms of test mean squared error and usually gives an improvement in true positive rate or false positive rate for feature selection. We also apply this method to early prediction of preeclampsia, where fwelnet outperforms the lasso in terms of 10-fold cross-validated area under the curve (0.86 vs. 0.80). We also provide a connection between fwelnet and the group lasso and suggest how fwelnet might be used for multi-task learning.
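The idea of adapting per-feature penalties in the elastic net can be sketched as follows. The penalty below is the standard elastic net form with an extra weight per coefficient. How fwelnet actually derives those weights from the "features of features" is not stated in the abstract, so the `feature_weights` function here (an exponential, softmax-style mapping from one scalar score per feature) is purely an illustrative assumption, as are all names and numbers.

```python
import math


def feature_weights(scores):
    """Hypothetical mapping from per-feature scores to penalty weights.

    Assumption for illustration: w_j = sum_l exp(s_l) / (p * exp(s_j)),
    so features with higher scores are penalised less.
    """
    total = sum(math.exp(s) for s in scores)
    p = len(scores)
    return [total / (p * math.exp(s)) for s in scores]


def weighted_elastic_net_penalty(beta, weights, lam, alpha):
    """lam * sum_j w_j * (alpha * |beta_j| + (1 - alpha) / 2 * beta_j ** 2)."""
    return lam * sum(
        w * (alpha * abs(b) + 0.5 * (1.0 - alpha) * b * b)
        for b, w in zip(beta, weights)
    )


beta = [1.0, -2.0]

# Equal scores give unit weights, recovering the ordinary elastic net penalty.
uniform = feature_weights([0.0, 0.0])
plain = weighted_elastic_net_penalty(beta, uniform, lam=0.1, alpha=0.5)

# A higher score on the second feature lowers its penalty weight.
adapted = feature_weights([0.0, 2.0])
print(plain, adapted)
```

With equal scores every weight is 1, so the sketch reduces to the usual elastic net; unequal scores shift the penalty away from features the side information favours.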

artificial intelligence, fwelnet, machine learning, (18 more...)

2006.01395

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)