AITopics | Accuracy

Collaborating Authors

Accuracy

News Overviews Instructional Materials AI-Alerts Classics

Differential Network Learning Beyond Data Samples

Sekhon, Arshdeep, Wang, Beilun, Wang, Zhe, Qi, Yanjun

arXiv.org Machine LearningApr-23-2020

Learning the change of statistical dependencies between random variables is an essential task for many real-life applications, mostly in the high dimensional low sample regime. In this paper, we propose a novel differential parameter estimator that, in comparison to current methods, simultaneously allows (a) the flexible integration of multiple sources of information (data samples, variable groupings, extra pairwise evidence, etc.), (b) being scalable to a large number of variables, and (c) achieving a sharp asymptotic convergence rate. Our experiments, on more than 100 simulated and two real-world datasets, validate the flexibility of our approach and highlight the benefits of integrating spatial and anatomic information for brain connectome change discovery and epigenetic network identification.

baseline, kdiffnet, knowledge, (15 more...)

arXiv.org Machine Learning

2004.11494

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Virginia (0.04)
Europe > Hungary > Hajdú-Bihar County > Debrecen (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Data Science (0.92)

Add feedback

Rapidly Bootstrapping a Question Answering Dataset for COVID-19

Tang, Raphael, Nogueira, Rodrigo, Zhang, Edwin, Gupta, Nikhil, Cam, Phuong, Cho, Kyunghyun, Lin, Jimmy

arXiv.org Artificial IntelligenceApr-23-2020

We present CovidQA, the beginnings of a question answering dataset specifically designed for COVID-19, built by hand from knowledge gathered from Kaggle's COVID-19 Open Research Dataset Challenge. To our knowledge, this is the first publicly available resource of its type, and intended as a stopgap measure for guiding research until more substantial evaluation resources become available. While this dataset, comprising 124 question-article pairs as of the present version 0.1 release, does not have sufficient examples for supervised machine learning, we believe that it can be helpful for evaluating the zero-shot or transfer capabilities of existing models on topics specifically related to COVID-19. This paper describes our methodology for constructing the dataset and presents the effectiveness of a number of baselines, including term-based techniques and various transformer-based models. The dataset is available at http://covidqa.ai/

dataset, effectiveness, natural language question, (14 more...)

arXiv.org Artificial Intelligence

2004.11339

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
(8 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.72)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.54)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.40)

Add feedback

A Snapshot of the Frontiers of Fairness in Machine Learning

Communications of the ACMApr-22-2020, 00:40:12 GMT

The last decade has seen a vast increase both in the diversity of applications to which machine learning is applied, and to the import of those applications. Machine learning is no longer just the engine behind ad placements and spam filters; it is now used to filter loan applicants, deploy police officers, and inform bail and parole decisions, among other things. The result has been a major concern for the potential for data-driven methods to introduce and perpetuate discriminatory practices, and to otherwise be unfair. And this concern has not been without reason: a steady stream of empirical findings has shown that data-driven methods can unintentionally both encode existing human biases and introduce new ones.7,9,11,60 At the same time, the last two years have seen an unprecedented explosion in interest from the academic community in studying fairness and machine learning. "Fairness and transparency" transformed from a niche topic with a trickle of papers produced every year (at least since the work of Pedresh56 to a major subfield of machine learning, complete with a dedicated archival conference--ACM FAT*). But despite the volume and velocity of published work, our understanding of the fundamental questions related to fairness and machine learning remain in its infancy.

fairness, learning, proceedings, (12 more...)

Communications of the ACM

AI-Alerts: 2020 > 2020-04 > AAAI AI-Alert for Apr 28, 2020 (1.00)

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.46)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.49)
Education (0.46)
Information Technology > Security & Privacy (0.46)
Law > Civil Rights & Constitutional Law (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)

Add feedback

Risk Estimation of SARS-CoV-2 Transmission from Bluetooth Low Energy Measurements

Sattler, Felix, Ma, Jackie, Wagner, Patrick, Neumann, David, Wenzel, Markus, Schäfer, Ralf, Samek, Wojciech, Müller, Klaus-Robert, Wiegand, Thomas

arXiv.org Machine LearningApr-22-2020

Digital contact tracing approaches based on Bluetooth low energy (BLE) have the potential to efficiently contain and delay outbreaks of infectious diseases such as the ongoing SARS-CoV-2 pandemic. In this work we propose a novel machine learning based approach to reliably detect subjects that have spent enough time in close proximity to be at risk of being infected. Our study is an important proof of concept that will aid the battery of epidemiological policies aiming to slow down the rapid spread of COVID-19.

epidemiological model, proximity, time sery, (11 more...)

arXiv.org Machine Learning

2004.11841

Country:

Asia > Singapore (0.15)
Europe > Germany > Berlin (0.04)
Africa > Liberia (0.04)
(2 more...)

Genre: Research Report > New Finding (0.47)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Add feedback

Language-agnostic Multilingual Modeling

Datta, Arindrima, Ramabhadran, Bhuvana, Emond, Jesse, Kannan, Anjuli, Roark, Brian

arXiv.org Machine LearningApr-20-2020

Multilingual Automated Speech Recognition (ASR) systems allow for the joint training of data-rich and data-scarce languages in a single model. This enables data and parameter sharing across languages, which is especially beneficial for the data-scarce languages. However, most state-of-the-art multilingual models require the encoding of language information and therefore are not as flexible or scalable when expanding to newer languages. Language-independent multilingual models help to address this issue, and are also better suited for multicultural societies where several languages are frequently used together (but often rendered with different writing systems). In this paper, we propose a new approach to building a language-agnostic multilingual ASR system which transforms all languages to one writing system through a many-to-one transliteration transducer. Thus, similar sounding acoustics are mapped to a single, canonical target sequence of graphemes, effectively separating the modeling and rendering problems. We show with four Indic languages, namely, Hindi, Bengali, Tamil and Kannada, that the language-agnostic multilingual model achieves up to 10% relative reduction in Word Error Rate (WER) over a language-dependent multilingual model.

multilingual model, speech recognition, training data, (14 more...)

arXiv.org Machine Learning

2004.09571

Country: North America > United States (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)

Add feedback

Unsupervised crop anomaly detection at the parcel-level using optical and SAR images: application to wheat and rapeseed crops

Mouret, Florian, Albughdadi, Mohanad, Duthoit, Sylvie, Kouamé, Denis, Rieu, Hervé Poilvé Guillaume, Tourneret, Jean-Yves

arXiv.org Machine LearningApr-17-2020

This paper proposes a generic approach for crop anomaly detection at the parcel-level based on unsupervised point anomaly detection techniques. The input data is derived from synthetic aperture radar (SAR) and optical images acquired using Sentinel-1 and Sentinel-2 satellites. The proposed strategy consists of four sequential steps: acquisition and preprocessing of optical and SAR images, extraction of optical and SAR indicators, computation of zonal statistics at the parcel-level and point anomaly detection. This paper analyzes different factors that can affect the results of anomaly detection such as the considered features and the anomaly detection algorithm used. The proposed procedure is validated on two crop types in Beauce (France), namely, rapeseed and wheat crops. Two different parcel delineation databases are considered to validate the robustness of the strategy to changes in parcel boundaries.

anomaly, anomaly detection, parcel, (15 more...)

arXiv.org Machine Learning

2004.08431

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.05)
Europe > Czechia > Prague (0.04)
(17 more...)

Genre: Research Report > New Finding (1.00)

Industry: Food & Agriculture > Agriculture (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

A stochastic approach to handle knapsack problems in the creation of ensembles

Hajdu, Andras, Terdik, Gyorgy, Tiba, Attila, Toman, Henrietta

arXiv.org Machine LearningApr-17-2020

Ensemble-based methods are highly popular approaches that increase the accuracy of a decision by aggregating the opinions of individual voters. The common point is to maximize accuracy; however, a natural limitation occurs if incremental costs are also assigned to the individual voters. Consequently, we investigate creating ensembles under an additional constraint on the total cost of the members. This task can be formulated as a knapsack problem, where the energy is the ensemble accuracy formed by some aggregation rules. However, the generally applied aggregation rules lead to a nonseparable energy function, which takes the common solution tools -- such as dynamic programming -- out of action. We introduce a novel stochastic approach that considers the energy as the joint probability function of the member accuracies. This type of knowledge can be efficiently incorporated in a stochastic search process as a stopping rule, since we have the information on the expected accuracy or, alternatively, the probability of finding more accurate ensembles. Experimental analyses of the created ensembles of pattern classifiers and object detectors confirm the efficiency of our approach. Moreover, we propose a novel stochastic search strategy that better fits the energy, compared with general approaches such as simulated annealing.

accuracy, ensemble, selection, (17 more...)

arXiv.org Machine Learning

2004.08101

Country:

Europe > Hungary > Hajdú-Bihar County > Debrecen (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Multi-Objective Evolutionary approach for the Performance Improvement of Learners using Ensembling Feature selection and Discretization Technique on Medical data

Singh, Deepak, Sisodia, Dilip Singh, Singh, Pradeep

arXiv.org Artificial IntelligenceApr-16-2020

Biomedical data is filled with continuous real values; these values in the feature set tend to create problems like underfitting, the curse of dimensionality and increase in misclassification rate because of higher variance. In response, pre-processing techniques on dataset minimizes the side effects and have shown success in maintaining the adequate accuracy. Feature selection and discretization are the two necessary preprocessing steps that were effectively employed to handle the data redundancies in the biomedical data. However, in the previous works, the absence of unified effort by integrating feature selection and discretization together in solving the data redundancy problem leads to the disjoint and fragmented field. This paper proposes a novel multi-objective based dimensionality reduction framework, which incorporates both discretization and feature reduction as an ensemble model for performing feature selection and discretization. Selection of optimal features and the categorization of discretized and non-discretized features from the feature subset is governed by the multi-objective genetic algorithm (NSGA-II). The two objective, minimizing the error rate during the feature selection and maximizing the information gain while discretization is considered as fitness criteria.

algorithm, discretization, feature selection, (12 more...)

arXiv.org Artificial Intelligence

2004.07478

Country:

North America > United States > Wisconsin (0.04)
North America > United States > Virginia (0.04)
Asia > India > Chhattisgarh > Raipur (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Add feedback

Dyslexia and Dysgraphia prediction: A new machine learning approach

Richard, Gilles, Serrurier, Mathieu

arXiv.org Machine LearningApr-15-2020

Learning disabilities like dysgraphia, dyslexia, dyspraxia, etc. interfere with academic achievements but have also long terms consequences beyond the academic time. It is widely admitted that between 5% to 10% of the world population is subject to this kind of disabilities. For assessing such disabilities in early childhood, children have to solve a battery of tests. Human experts score these tests, and decide whether the children require specific education strategy on the basis of their marks. The assessment can be lengthy, costly and emotionally painful. In this paper, we investigate how Artificial Intelligence can help in automating this assessment. Gathering a dataset of handwritten text pictures and audio recordings, both from standard children and from dyslexic and/or dysgraphic children, we apply machine learning techniques for classification in order to analyze the differences between dyslexic/dysgraphic and standard readers/writers and to build a model. The model is trained on simple features obtained by analysing the pictures and the audio files. Our preliminary implementation shows relatively high performances on the dataset we have used. This suggests the possibility to screen dyslexia and dysgraphia via non-invasive methods in an accurate way as soon as enough data are available.

algorithm, dysgraphia, dyslexia, (16 more...)

arXiv.org Machine Learning

2005.06401

Country:

Oceania > Australia (0.04)
North America > United States > District of Columbia > Washington (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Add feedback

Co-eye: A Multi-resolution Symbolic Representation to TimeSeries Diversified Ensemble Classification

Abdallah, Zahraa S., Gaber, Mohamed Medhat

arXiv.org Machine LearningApr-15-2020

Time series classification (TSC) is a challenging task that attracted many researchers in the last few years. One main challenge in TSC is the diversity of domains where time series data come from. Thus, there is no "one model that fits all" in TSC. Some algorithms are very accurate in classifying a specific type of time series when the whole series is considered, while some only target the existence/non-existence of specific patterns/shapelets. Yet other techniques focus on the frequency of occurrences of discriminating patterns/features. This paper presents a new classification technique that addresses the inherent diversity problem in TSC using a nature-inspired method. The technique is stimulated by how flies look at the world through "compound eyes" that are made up of thousands of lenses, called ommatidia. Each ommatidium is an eye with its own lens, and thousands of them together create a broad field of vision. The developed technique similarly uses different lenses and representations to look at the time series, and then combines them for broader visibility. These lenses have been created through hyper-parameterisation of symbolic representations (Piecewise Aggregate and Fourier approximations). The algorithm builds a random forest for each lens, then performs soft dynamic voting for classifying new instances using the most confident eyes, i.e, forests. We evaluate the new technique, coined Co-eye, using the recently released extended version of UCR archive, containing more than 100 datasets across a wide range of domains. The results show the benefits of bringing together different perspectives reflecting on the accuracy and robustness of Co-eye in comparison to other state-of-the-art techniques.

co-eye, dataset, representation, (16 more...)

arXiv.org Machine Learning

2004.06668

Country:

North America > United States > North Carolina (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.87)

Industry:

Health & Medicine (0.46)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback