AITopics

2101.1071

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Denmark > North Jutland > Aalborg (0.05)

Genre: Research Report (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
(3 more...)

#artificialintelligence, Jan-24-2021, 10:25:05 GMT

Comparison of Read Mapping and Variant Calling Tools for the Analysis of Plant NGS Data

High-throughput sequencing technologies have rapidly developed during the past years and have become an essential tool in plant sciences. However, the analysis of genomic data remains challenging and relies mostly on the performance of automatic pipelines. Frequently applied pipelines involve the alignment of sequence reads against a reference sequence and the identification of sequence variants. Since most benchmarking studies of bioinformatics tools for this purpose have been conducted on human datasets, there is a lack of benchmarking studies in plant sciences. In this study, we evaluated the performance of 50 different variant calling pipelines, including five read mappers and ten variant callers, on six real plant datasets of the model organism Arabidopsis thaliana. Sets of variants were evaluated based on various parameters including sensitivity and specificity. We found that all investigated tools are suitable for analysis of NGS data in plant research. When looking at different performance metrics, BWA-MEM and Novoalign were the best mappers and GATK returned the best results in the variant calling step.

genome sequence, sequence, variant, (10 more...)


Genre: Research Report (0.49)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Biomedical Informatics > Translational Bioinformatics (0.42)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.36)

Qiu, Lin, Murrugarra-Llerena, Nils, Silva, Vítor, Lin, Lin, Chinchilli, Vernon M.

NeurT-FDR: Controlling FDR by Incorporating Feature Hierarchy

arXiv.org Machine Learning, Jan-24-2021

Controlling the false discovery rate (FDR) while leveraging side information in multiple hypothesis testing is an emerging research topic in modern data science. Existing methods rely on test-level covariates while ignoring possible hierarchy among the covariates. This strategy may not be optimal for complex large-scale problems, where hierarchical information often exists among those test-level covariates. We propose NeurT-FDR, which boosts statistical power and controls FDR for multiple hypothesis testing while leveraging the hierarchy among test-level covariates. Our method parametrizes the test-level covariates as a neural network and adjusts the feature hierarchy through a regression framework, which enables flexible handling of high-dimensional features as well as efficient end-to-end optimization. We show that NeurT-FDR has strong FDR guarantees and makes substantially more discoveries in synthetic and real datasets compared to competitive baselines.
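As background for the FDR-control setting in the abstract above, here is a minimal sketch of the classical Benjamini-Hochberg step-up procedure, which uses p-values only and ignores covariates. It is not NeurT-FDR; the function name `benjamini_hochberg` and the example p-values are illustrative assumptions.

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.1):
    """Classical BH step-up: return a boolean mask of rejected hypotheses."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                          # indices that sort p ascending
    thresholds = alpha * np.arange(1, m + 1) / m   # alpha * k / m for k = 1..m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()             # largest k with p_(k) <= alpha*k/m
        reject[order[:k + 1]] = True               # reject everything up to that rank
    return reject

# Hypothetical p-values, for illustration only.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
mask = benjamini_hochberg(p_vals, alpha=0.05)
print(mask.sum(), "hypotheses rejected")
```

Covariate-aware methods such as the one described in the abstract aim to make more discoveries than this baseline at the same nominal FDR level.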

covariate, discovery, hypothesis, (14 more...)

2101.09809

Country: North America > United States (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

arXiv.org Machine Learning, Jan-22-2021

Predicting Recession Probabilities Using Term Spreads: New Evidence from a Machine Learning Approach

Choi, Jaehyuk, Ge, Desheng, Kang, Kyu Ho, Sohn, Sungbin

The literature on using yield curves to forecast recessions typically measures the term spread as the difference between the 10-year and the three-month Treasury rates. Furthermore, using the term spread constrains the long- and short-term interest rates to have the same absolute effect on the recession probability. In this study, we adopt a machine learning method to investigate whether the predictive ability of interest rates can be improved. The machine learning algorithm identifies the best maturity pair, separating the effects of interest rates from those of the term spread. Our comprehensive empirical exercise shows that, despite the likelihood gain, the machine learning approach does not significantly improve the predictive accuracy, owing to the estimation error. Our finding supports the conventional use of the 10-year minus three-month Treasury yield spread. This is robust to the forecasting horizon, control variable, sample period, and oversampling of the recession observations.
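The "same absolute effect" restriction mentioned in the abstract above can be written out explicitly. The notation here is an assumption for illustration and is not taken from the paper: $F$ is a generic binary-response link (for example probit or logit), $R_{t+h}$ is the recession indicator at horizon $h$, and $y^{L}_t$, $y^{S}_t$ are the long- and short-maturity rates.

```latex
% Spread model: a single coefficient on the term spread
\Pr(R_{t+h} = 1) = F\bigl(\alpha + \beta\,(y^{L}_t - y^{S}_t)\bigr)
                 = F\bigl(\alpha + \beta\,y^{L}_t - \beta\,y^{S}_t\bigr)

% Unrestricted model: separate coefficients on each rate
\Pr(R_{t+h} = 1) = F\bigl(\alpha + \beta_L\,y^{L}_t + \beta_S\,y^{S}_t\bigr)

% The spread model is the special case
\beta_L = -\beta_S = \beta
```

Under this reading, relaxing the restriction adds one free parameter, which is consistent with the abstract's remark that a likelihood gain can be offset by estimation error.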

coefficient, forecasting horizon, recession, (14 more...)

2101.09394

Country:

North America > United States > California > San Francisco County > San Francisco (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Missouri > Jackson County > Kansas City (0.04)
(4 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Banking & Finance > Economy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Yang, Eddie, Roberts, Margaret E.

Censorship of Online Encyclopedias: Implications for NLP Models

arXiv.org Artificial Intelligence, Jan-22-2021

While artificial intelligence provides the backbone for many tools people use around the world, recent work has brought to attention that the algorithms powering AI are not free of politics, stereotypes, and bias. While most work in this area has focused on the ways in which AI can exacerbate existing inequalities and discrimination, very little work has studied how governments actively shape training data. We describe how censorship has affected the development of Wikipedia corpuses, text data which are regularly used for pre-trained inputs into NLP algorithms. We show that word embeddings ...

NLP impacts how firms provide products to users, content individuals receive through search and social media, and how individuals interact with news and emails. Despite the growing importance of NLP algorithms in shaping our lives, recently scholars, policymakers, and the business community have raised the alarm of how gender and racial biases may be baked into these algorithms. Because they are trained on human data, the algorithms themselves can replicate implicit and explicit human biases and aggravate discrimination [6, 8, 39]. Additionally, training data that over-represents a subset of the population may do a worse job at predicting outcomes for other groups in the population [13].

baidu baike, category, wikipedia, (14 more...)

doi: 10.1145/3442188.3445916

2101.09294

Country:

North America > United States > California > San Diego County > La Jolla (0.14)
North America > Canada (0.05)
Asia > China > Hong Kong (0.04)
(13 more...)

Genre: Research Report (1.00)

Industry:

Media > News (1.00)
Law > Civil Rights & Constitutional Law (1.00)
Government (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

arXiv.org Artificial Intelligence, Jan-22-2021

How can I choose an explainer? An Application-grounded Evaluation of Post-hoc Explanations

Jesus, Sérgio, Belém, Catarina, Balayan, Vladimir, Bento, João, Saleiro, Pedro, Bizarro, Pedro, Gama, João

Several research works have proposed new Explainable AI (XAI) methods designed to generate model explanations having specific properties, or desiderata, such as fidelity, robustness, or human-interpretability. However, explanations are seldom evaluated based on their true practical impact on decision-making tasks. Without that assessment, explanations might be chosen that, in fact, hurt the overall performance of the combined system of ML model + end-users. This study aims to bridge this gap by proposing XAI Test, an application-grounded evaluation methodology tailored to isolate the impact of providing the end-user with different levels of information. We conducted an experiment following XAI Test to evaluate three popular post-hoc explanation methods (LIME, SHAP, and TreeInterpreter) on a real-world fraud detection task, with real data, a deployed ML model, and fraud analysts. During the experiment, we gradually increased the information provided to the fraud analysts in three stages: Data Only, i.e., just transaction data without access to the model score or explanations; Data + ML Model Score; and Data + ML Model Score + Explanations. Using strong statistical analysis, we show that, in general, these popular explainers have a worse impact than desired. Key conclusions include: i) showing Data Only results in the highest decision accuracy and the slowest decision time among all variants tested; ii) all the explainers improve accuracy over the Data + ML Model Score variant but still result in lower accuracy when compared with Data Only; iii) LIME was the least preferred by users, probably due to its substantially lower variability of explanations from case to case.

experiment, explainer, explanation, (13 more...)

doi: 10.1145/3442188.3445941

2101.08758

Country:

Oceania > Australia > New South Wales > Sydney (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Portugal (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.46)

Industry:

Health & Medicine (0.93)
Information Technology > Security & Privacy (0.67)
Law Enforcement & Public Safety > Fraud (0.48)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

arXiv.org Artificial Intelligence, Jan-21-2021

Noisy intermediate-scale quantum (NISQ) algorithms

Bharti, Kishor, Cervera-Lierta, Alba, Kyaw, Thi Ha, Haug, Tobias, Alperin-Lea, Sumner, Anand, Abhinav, Degroote, Matthias, Heimonen, Hermanni, Kottmann, Jakob S., Menke, Tim, Mok, Wai-Keong, Sim, Sukin, Kwek, Leong-Chuan, Aspuru-Guzik, Alán

A universal fault-tolerant quantum computer that can efficiently solve problems such as integer factorization and unstructured database search requires millions of qubits with low error rates and long coherence times. While the experimental advancement towards realizing such devices will potentially take decades of research, noisy intermediate-scale quantum (NISQ) computers already exist. These computers are composed of hundreds of noisy qubits, i.e., qubits that are not error-corrected, and therefore perform imperfect operations in a limited coherence time. In the search for quantum advantage with these devices, algorithms have been proposed for applications in various disciplines spanning physics, machine learning, quantum chemistry and combinatorial optimization. The goal of such algorithms is to leverage the limited available resources to perform classically challenging tasks. In this review, we provide a thorough summary of NISQ computational paradigms and algorithms. We discuss the key structure of these algorithms, their limitations, and advantages. We additionally provide a comprehensive overview of various benchmarking and software tools useful for programming and testing NISQ devices.

neural network, output generation problem, upstream oil & gas, (26 more...)

2101.08448

Country:

Europe > United Kingdom > England (0.45)
Europe > Netherlands (0.27)
Asia > Middle East (0.14)
(2 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)
Research Report > Experimental Study (0.45)

Industry:

Banking & Finance (1.00)
Energy > Oil & Gas > Upstream (0.67)
Government (0.67)

Technology:

Information Technology > Software (1.00)
Information Technology > Mathematics of Computing (1.00)
Information Technology > Hardware (1.00)
(6 more...)

Pata, Joosep, Duarte, Javier, Vlimant, Jean-Roch, Pierini, Maurizio, Spiropulu, Maria

MLPF: Efficient machine-learned particle-flow reconstruction using graph neural networks

arXiv.org Machine Learning, Jan-21-2021

In general-purpose particle detectors, the particle flow algorithm may be used to reconstruct a coherent particle-level view of the event by combining information from the calorimeters and the trackers, significantly improving the detector resolution for jets and the missing transverse momentum. In view of the planned high-luminosity upgrade of the CERN Large Hadron Collider, it is necessary to revisit existing reconstruction algorithms and ensure that both the physics and computational performance are sufficient in a high-pileup environment. Recent developments in machine learning may offer a prospect for efficient event reconstruction based on parametric models. We introduce MLPF, an end-to-end trainable machine-learned particle flow algorithm for reconstructing particle flow candidates based on parallelizable, computationally efficient, scalable graph neural networks and a multi-task objective. We report the physics and computational performance of the MLPF algorithm on a synthetic dataset of ttbar events in HL-LHC running conditions, including the simulation of multiple interaction effects, and discuss potential next steps and considerations towards ML-based reconstruction in a general-purpose particle detector.

algorithm, arxiv, reconstruction, (16 more...)

2101.08578

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Zhang, Sikai, Lang, Zi-Qiang

Orthogonal Least Squares Based Fast Feature Selection for Linear Classification

arXiv.org Machine Learning, Jan-21-2021

An Orthogonal Least Squares (OLS) based feature selection method is proposed for both binomial and multinomial classification. The novel Squared Orthogonal Correlation Coefficient (SOCC) is defined based on the Error Reduction Ratio (ERR) in OLS and used as the feature ranking criterion. The equivalence between the canonical correlation coefficient, Fisher's criterion, and the sum of the SOCCs is revealed, which unveils the statistical implication of ERR in OLS for the first time. It is also shown that the OLS-based feature selection method has speed advantages when applied to greedy search. The proposed method is comprehensively compared with mutual-information-based feature selection methods on 2 synthetic and 7 real-world datasets. The results show that the proposed method is always in the top 5 among the 10 candidate methods. In addition, the proposed method can be directly applied to continuous features without discretisation, which is another significant advantage over mutual-information-based methods.
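To make the ranking criterion in the abstract above more concrete, here is a minimal sketch of classical OLS forward selection with the error reduction ratio, ERR_i = (w_i^T y)^2 / ((w_i^T w_i)(y^T y)), where w_i is a candidate column orthogonalized against the features already chosen. This is the standard single-output regression form, not the paper's SOCC for classification; the function name `ols_forward_select` and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def ols_forward_select(X, y, n_select):
    """Greedy forward selection ranked by the error reduction ratio (ERR)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X = X - X.mean(axis=0)            # centre so ERR equals a squared correlation
    y = y - y.mean()
    yy = y @ y                        # assumes y is not constant
    W = X.copy()                      # candidates, deflated as features are chosen
    orig_norms = np.einsum("ij,ij->j", X, X)
    selected, errs = [], []
    for _ in range(n_select):
        norms = np.einsum("ij,ij->j", W, W)
        proj = W.T @ y
        err = np.zeros_like(norms)
        valid = norms > 1e-10 * orig_norms   # skip (near-)dependent columns
        err[valid] = proj[valid] ** 2 / (norms[valid] * yy)
        if selected:
            err[selected] = 0.0              # never pick the same feature twice
        best = int(np.argmax(err))
        selected.append(best)
        errs.append(float(err[best]))
        w = W[:, best].copy()
        # Gram-Schmidt step: remove the chosen direction from every candidate.
        W -= np.outer(w, (w @ W) / (w @ w))
    return selected, errs

# Synthetic example: only columns 2 and 4 drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 2] - 2.0 * X[:, 4] + 0.1 * rng.normal(size=200)
selected, errs = ols_forward_select(X, y, n_select=2)
print(selected, errs)
```

Because each chosen direction is orthogonal to the earlier ones, the ERR values add up to the fraction of variance in `y` explained so far, which is what makes the greedy search cheap to update.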

canonical correlation coefficient, correlation coefficient, feature selection method, (12 more...)

2101.08539

Country:

Europe > United Kingdom > England > South Yorkshire > Sheffield (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

arXiv.org Artificial Intelligence, Jan-21-2021

A scalable approach for developing clinical risk prediction applications in different hospitals

Sun, Hong, Depraetere, Kristof, Meesseman, Laurent, De Roo, Jos, Vanbiervliet, Martijn, De Baerdemaeker, Jos, Muys, Herman, von Dossow, Vera, Hulde, Nikolai, Szymanowsky, Ralph

Objective: Machine learning algorithms are now widely used in predicting acute events for clinical applications. While most such prediction applications are developed to predict the risk of a particular acute event at one hospital, few efforts have been made to extend the developed solutions to other events or to different hospitals. We provide a scalable solution to extend the process of clinical risk prediction model development to multiple diseases and their deployment in different Electronic Health Records (EHR) systems. Materials and Methods: We defined a generic process for clinical risk prediction model development. A calibration tool has been created to automate the model generation process. We applied the model calibration process at four hospitals, and generated risk prediction models for delirium, sepsis, and acute kidney injury (AKI) at each of these hospitals. Results: The delirium risk prediction models achieved area under the receiver-operating characteristic curve (AUROC) ranging from 0.82 to 0.95 over different stages of a hospital stay on the test datasets of the four hospitals. The sepsis models achieved AUROC ranging from 0.88 to 0.95, and the AKI models achieved AUROC ranging from 0.85 to 0.92. Discussion: The scalability discussed in this paper is based on building common data representations (syntactic interoperability) between EHRs stored in different hospitals. Semantic interoperability, a more challenging requirement that different EHRs share the same meaning of data, e.g., the same lab coding system, is not mandated by our approach. Conclusions: Our study describes a method to develop and deploy clinical risk prediction models in a scalable way. We demonstrate its feasibility by developing risk prediction models for three diseases across four hospitals.
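For readers unfamiliar with the AUROC figures reported above, the metric can be read as the probability that a randomly chosen positive case receives a higher risk score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name `auroc` and the toy labels and scores are illustrative assumptions and have no connection to the hospitals' data.

```python
import numpy as np

def auroc(y_true, scores):
    """AUROC as P(score_pos > score_neg) + 0.5 * P(score_pos == score_neg)."""
    y = np.asarray(y_true).astype(bool)
    s = np.asarray(scores, dtype=float)
    pos, neg = s[y], s[~y]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative example")
    diff = pos[:, None] - neg[None, :]      # every positive-negative pair
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / diff.size)

# Toy example: 2 negatives, 2 positives.
labels = [0, 0, 1, 1]
risk = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, risk))  # 3 of the 4 pairs are ordered correctly -> 0.75
```

The pairwise form uses memory proportional to the number of positive-negative pairs, so for large test sets a rank-based implementation would be the more practical choice.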

hospital, prediction model, risk prediction model, (14 more...)

2101.10268

Country:

Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
Europe > United Kingdom > Wales (0.04)
Europe > Belgium > Flanders > East Flanders > Ghent (0.04)

Genre: Research Report > Experimental Study (0.94)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Nephrology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
(2 more...)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)