AITopics | Performance Analysis

Performance Analysis

PeerNomination: Relaxing Exactness for Increased Accuracy in Peer Selection

Mattei, Nicholas, Turrini, Paolo, Zhydkov, Stanislav

arXiv.org Artificial Intelligence | Apr-30-2020

In peer selection, agents must choose a subset of themselves for an award or a prize. As agents are self-interested, we want to design algorithms that are impartial, so that an individual agent cannot affect their own chance of being selected. This problem has broad application in resource allocation and mechanism design and has received substantial attention in the artificial intelligence literature. Here, we present a novel algorithm for impartial peer selection, PeerNomination, and provide a theoretical analysis of its accuracy. Our algorithm possesses various desirable features. In particular, unlike previous algorithms in the literature, it does not require an explicit partitioning of the agents. We show empirically that it achieves higher accuracy than the existing algorithms over several metrics.

agent, algorithm, selection, (15 more...)

arXiv.org Artificial Intelligence

2004.14939

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Macao (0.04)

Genre: Research Report (0.82)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)

Stereotype-Free Classification of Fictitious Faces

Toutiaee, Mohammadhossein, Amirian, Soheyla, Miller, John A., Li, Sheng

arXiv.org Machine Learning | Apr-29-2020

Equal opportunity and fairness are receiving increasing attention in artificial intelligence. Stereotyping is another source of discrimination, one that has not yet been studied in the literature. GAN-made faces would be exposed to such discrimination if they were classified by human perception. Statistical approaches make it possible to eliminate the human impact on the task of classifying fictitious faces. We present a novel approach that uses penalized regression to label GAN-generated synthetic unlabeled images in a stereotype-free way. The proposed approach aids in labeling new data (fictitious output images) by minimizing a penalized version of the least squares cost function between realistic pictures and target pictures.
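As a rough illustration of what "a penalized version of the least squares cost function" means, the sketch below fits a single coefficient with a squared (ridge-style) penalty. The abstract does not say which penalty the authors use, so the ridge penalty, the function name `ridge_slope`, and the toy data are all assumptions made for illustration only.

```python
def ridge_slope(xs, ys, lam):
    """Minimize sum((y - b*x)**2) + lam * b**2 over one coefficient b.

    Hypothetical helper: a one-feature, no-intercept ridge fit, chosen
    only because it has a simple closed form.
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    denom = sxx + lam
    if denom == 0:
        raise ValueError("all inputs are zero and lam is zero")
    # Setting the derivative of the penalized cost to zero gives b = sxy / (sxx + lam).
    return sxy / denom


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
unpenalized = ridge_slope(xs, ys, 0.0)   # ordinary least squares slope
shrunk = ridge_slope(xs, ys, 14.0)       # the penalty pulls the slope toward zero
```

With a penalty weight of zero the fit is ordinary least squares; increasing the weight shrinks the coefficient, which is the general behavior any penalized regression of this kind trades against fit quality.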

discrimination, gender, regression, (13 more...)

arXiv.org Machine Learning

2005.02157

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report > New Finding (0.95)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.31)

Standardizing and Benchmarking Crisis-related Social Media Datasets for Humanitarian Information Processing

Alam, Firoj, Sajjad, Hassan, Imran, Muhammad, Ofli, Ferda

arXiv.org Artificial Intelligence | Apr-29-2020

Time-critical analysis of social media streams is important for humanitarian organizations planning rapid response during disasters. The crisis informatics research community has developed several techniques and systems to process and classify large volumes of crisis-related data posted on social media. However, because the datasets used in the literature are dispersed, it is not possible to compare results and measure the progress made towards better models for crisis informatics. In this work, we attempt to bridge this gap by standardizing various existing crisis-related datasets. We consolidate labels of eight annotated data sources and provide 166.1k and 141.5k tweets for informativeness and humanitarian classification tasks, respectively. The consolidation results in a larger dataset that makes it possible to train more sophisticated models. To that end, we provide baseline results using CNN and BERT models.

dataset, informative request, tweet, (17 more...)

arXiv.org Artificial Intelligence

2004.06774

Country:

North America > United States > Texas (0.15)
North America > Canada > Quebec > Estrie Region > Lac-Mégantic (0.14)
Oceania > Australia > Queensland (0.06)
(28 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Air (1.00)
Health & Medicine > Therapeutic Area (0.68)
Information Technology (0.67)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.67)

Detecting Electric Devices in 3D Images of Bags

Bagnall, Anthony, Southam, Paul, Large, James, Harvey, Richard

arXiv.org Machine Learning | Apr-25-2020

The aviation and transport security industries face the challenge of screening high volumes of baggage for threats and contraband in the minimum time possible. Automation and semi-automation of this procedure offer the potential to increase security by detecting more threats and to improve the customer experience by speeding up the process. Traditional 2D x-ray images are often extremely difficult to examine because bags are tightly packed and contain a wide variety of cluttered and occluded objects. Because of these limitations, major airports are introducing 3D x-ray Computed Tomography (CT) baggage scanning. We investigate whether we can automate the process of detecting electric devices in these 3D images of luggage. Detecting electrical devices is of particular concern as they can be used to conceal explosives. Given the massive volume of luggage that needs to be screened for this threat, the best way to automate the detection is to first filter whether a bag contains an electric device or not, and if it does, to identify the number of devices and their location. We present an algorithm, Unpack, Predict, eXtract, Repack (UXPR), which involves unpacking through segmenting the data at a range of scales using an algorithm known as the Sieve, predicting whether a segment is electrical or not based on the histogram of voxel intensities, then repacking the bag by ensembling the segments and predictions to identify the devices in bags. Through a range of experiments using data provided by ALERT (Awareness and Localization of Explosives-Related Threats), we show that this system can find a high proportion of devices with unsupervised segmentation if a similar device has been seen before, and that it gives promising results for detecting devices not seen at all, based on the properties of their constituent parts.
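The "predict" step described above, deciding whether a segment is electrical from its histogram of voxel intensities, can be sketched roughly as follows. This is not the UXPR implementation: the abstract does not name the classifier, so the 1-nearest-neighbour rule, the L1 distance, the bin settings, and every function name here are assumptions used only to make the idea concrete.

```python
def intensity_histogram(voxels, n_bins, lo, hi):
    """Normalized histogram of voxel intensities over the range [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in voxels:
        idx = int((v - lo) / width)
        # Clamp out-of-range intensities into the first or last bin.
        idx = min(max(idx, 0), n_bins - 1)
        counts[idx] += 1
    total = len(voxels)
    return [c / total for c in counts] if total else [0.0] * n_bins


def l1_distance(h1, h2):
    """Sum of absolute bin-wise differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def predict_segment(segment_hist, labelled_hists):
    """Return the label of the closest labelled histogram (1-nearest neighbour).

    labelled_hists is a list of (histogram, label) pairs.
    """
    closest = min(labelled_hists, key=lambda item: l1_distance(segment_hist, item[0]))
    return closest[1]


# Toy intensities on an assumed 0-100 scale; real CT values would differ.
training = [
    (intensity_histogram([80, 90, 95, 85], 4, 0, 100), "electrical"),
    (intensity_histogram([5, 10, 20, 15], 4, 0, 100), "not electrical"),
]
new_segment = intensity_histogram([70, 88, 92, 40], 4, 0, 100)
label = predict_segment(new_segment, training)
```

A histogram discards the spatial layout of a segment and keeps only its intensity distribution, which is why the same feature can be computed for segments of any shape or size.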

algorithm, classifier, segmentation, (15 more...)

arXiv.org Machine Learning

2005.02163

Country: Europe > United Kingdom > England (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (0.66)
Transportation > Air (0.54)
Health & Medicine > Diagnostic Medicine > Imaging (0.49)
Transportation > Infrastructure & Services > Airport (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(3 more...)

Cross-Validation for Correlated Data

Rabinowicz, Assaf, Rosset, Saharon

arXiv.org Machine Learning | Apr-24-2020

Datasets with correlation structures are common in modern statistical applications in various fields, such as geostatistics (Goovaerts 1999), genetics (Maddison 1990) and ecology (Roberts et al. 2017). Different modeling methods address the correlation structure differently. Some modeling methods, such as Gaussian process regression (Rasmussen and Williams 2006, GPR) and generalized least squares (Hansen 2007, GLS), utilize explicitly the correlation structure for achieving better prediction accuracy. Other predictive models, like random forest (Breiman 2001, RF), gradient boosting machines (Friedman 2002, GBM) and other machine learning models, do not consider explicitly the correlation structure but are still potentially able to utilize the correlation implicitly. The analysis in this paper mainly focuses on correlation that appears due to latent objects, such as random effects and random fields as appear in generalized linear mixed models (Verbeke 1997, GLMM) and generalized Gaussian process regression (Rasmussen and Williams 2006, GGPR) in clustered, temporal and spatial datasets.

correlation structure, cov, generalization error, (15 more...)

arXiv.org Machine Learning

1904.02438

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
North America > United States > California (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.41)

Concept Drift Detection via Equal Intensity k-means Space Partitioning

Liu, Anjin, Lu, Jie, Zhang, Guangquan

arXiv.org Machine Learning | Apr-24-2020

Data streams pose additional challenges to statistical classification tasks because the distributions of the training and target samples may differ as time passes. Such distribution change in streaming data is called concept drift. Numerous histogram-based distribution change detection methods have been proposed to detect drift. Most histograms are built on grid-based or tree-based space partitioning algorithms, which makes the space partitions arbitrary and unexplainable and may cause drift blind spots. There is a need to improve the drift detection accuracy of histogram-based methods in the unsupervised setting. To address this problem, we propose a cluster-based histogram, called equal intensity k-means space partitioning (EI-kMeans). In addition, a heuristic method to improve the sensitivity of drift detection is introduced. The fundamental idea of improving the sensitivity is to minimize the risk of creating partitions in distribution offset regions. Pearson's chi-square test is used as the statistical hypothesis test so that the test statistics remain independent of the sample distribution. The number of bins and their shapes, which strongly influence the ability to detect drift, are determined dynamically from the sample based on an asymptotic constraint in the chi-square test. Accordingly, three algorithms are developed to implement concept drift detection: a greedy centroids initialization algorithm, a cluster amplify-shrink algorithm, and a drift detection algorithm. For drift adaptation, we recommend retraining the learner if a drift is detected. The results of experiments on synthetic and real-world datasets demonstrate the advantages of EI-kMeans and show its efficacy in detecting concept drift.
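The general pattern of histogram-based drift detection with Pearson's chi-square test can be sketched as below. This is not EI-kMeans: the centroids are fixed by hand rather than learned, the data are one-dimensional, and the function names are hypothetical. It only shows how two windows are binned by nearest centroid and compared as a 2 x K contingency table.

```python
def assign_bins(points, centroids):
    """Count how many 1-D points fall nearest to each centroid."""
    counts = [0] * len(centroids)
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        counts[nearest] += 1
    return counts


def chi_square_statistic(ref_counts, new_counts):
    """Pearson chi-square statistic for a 2 x K table of bin counts."""
    n_ref, n_new = sum(ref_counts), sum(new_counts)
    if n_ref == 0 or n_new == 0:
        raise ValueError("both windows must contain at least one point")
    total = n_ref + n_new
    stat = 0.0
    for r, c in zip(ref_counts, new_counts):
        col = r + c
        if col == 0:
            continue  # a bin that is empty in both windows contributes nothing
        for observed, row_total in ((r, n_ref), (c, n_new)):
            expected = row_total * col / total
            stat += (observed - expected) ** 2 / expected
    return stat


centroids = [0.0, 10.0]                      # assumed, hand-picked bin centres
reference = [0.0] * 10 + [10.0] * 20         # reference window
current = [0.0] * 20 + [10.0] * 10           # newer window with shifted mass
stat = chi_square_statistic(assign_bins(reference, centroids),
                            assign_bins(current, centroids))
# With K = 2 bins there is 1 degree of freedom; 3.841 is the 5% critical value.
drift_detected = stat > 3.841
```

The chi-square approximation is usually considered reliable only when expected counts per bin are not too small, which is one reason the choice of bin number and shape matters for this family of methods.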

algorithm, detection, partition, (15 more...)

arXiv.org Machine Learning

doi: 10.1109/TCYB.2020.2983962

2004.11587

Country: Oceania > Australia > New South Wales > Sydney (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Adversarial Machine Learning in Network Intrusion Detection Systems

Alhajjar, Elie, Maxwell, Paul, Bastian, Nathaniel D.

arXiv.org Machine Learning | Apr-23-2020

It is becoming increasingly evident that machine learning algorithms are achieving impressive results in domains in which it is hard to specify a set of rules for their procedures. Examples of this phenomenon include industries like finance [49, 5], transportation [37], education [42, 22], and health care [23], and tasks like image recognition [41, 16, 17], machine translation [43, 7], and speech recognition [46, 24, 53, 50]. Motivated by the ease of adoption and the increased availability of affordable computational power (especially cloud computing services), machine learning algorithms are being explored in almost every commercial application and are offering great promise for the future of automation. With such vast adoption across multiple disciplines, some of their weaknesses are exposed and sometimes exploited by malicious actors. For example, a common challenge to these algorithms is "generalization" or "robustness", which is the ability of the algorithm to maintain performance when dealing with data drawn from a distribution different from the one on which it was trained. For a long period of time, the sole focus of machine learning researchers was improving the performance of machine learning systems (true positive rate, accuracy, etc.). Nowadays, the robustness of these systems can no longer be ignored; many of them have been shown to be highly vulnerable to intentional adversarial attacks.

chromosome, classifier, vector, (14 more...)

arXiv.org Machine Learning

2004.11898

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre:

Overview (0.93)
Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Differential Network Learning Beyond Data Samples

Sekhon, Arshdeep, Wang, Beilun, Wang, Zhe, Qi, Yanjun

arXiv.org Machine Learning | Apr-23-2020

Learning the change of statistical dependencies between random variables is an essential task for many real-life applications, mostly in the high-dimensional, low-sample regime. In this paper, we propose a novel differential parameter estimator that, in comparison to current methods, simultaneously (a) allows the flexible integration of multiple sources of information (data samples, variable groupings, extra pairwise evidence, etc.), (b) scales to a large number of variables, and (c) achieves a sharp asymptotic convergence rate. Our experiments, on more than 100 simulated and two real-world datasets, validate the flexibility of our approach and highlight the benefits of integrating spatial and anatomic information for brain connectome change discovery and epigenetic network identification.

baseline, kdiffnet, knowledge, (15 more...)

arXiv.org Machine Learning

2004.11494

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Virginia (0.04)
Europe > Hungary > Hajdú-Bihar County > Debrecen (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Data Science (0.92)

Rapidly Bootstrapping a Question Answering Dataset for COVID-19

Tang, Raphael, Nogueira, Rodrigo, Zhang, Edwin, Gupta, Nikhil, Cam, Phuong, Cho, Kyunghyun, Lin, Jimmy

arXiv.org Artificial Intelligence | Apr-23-2020

We present CovidQA, the beginnings of a question answering dataset specifically designed for COVID-19, built by hand from knowledge gathered from Kaggle's COVID-19 Open Research Dataset Challenge. To our knowledge, this is the first publicly available resource of its type, and it is intended as a stopgap measure for guiding research until more substantial evaluation resources become available. While this dataset, comprising 124 question-article pairs as of the present version 0.1 release, does not have sufficient examples for supervised machine learning, we believe that it can be helpful for evaluating the zero-shot or transfer capabilities of existing models on topics specifically related to COVID-19. This paper describes our methodology for constructing the dataset and presents the effectiveness of a number of baselines, including term-based techniques and various transformer-based models. The dataset is available at http://covidqa.ai/

dataset, effectiveness, natural language question, (14 more...)

arXiv.org Artificial Intelligence

2004.11339

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
(8 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.72)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.54)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.40)

A Snapshot of the Frontiers of Fairness in Machine Learning

Communications of the ACM | Apr-22-2020, 00:40:12 GMT

The last decade has seen a vast increase both in the diversity of applications to which machine learning is applied and in the import of those applications. Machine learning is no longer just the engine behind ad placements and spam filters; it is now used to filter loan applicants, deploy police officers, and inform bail and parole decisions, among other things. The result has been major concern about the potential for data-driven methods to introduce and perpetuate discriminatory practices, and to otherwise be unfair. And this concern has not been without reason: a steady stream of empirical findings has shown that data-driven methods can unintentionally both encode existing human biases and introduce new ones [7, 9, 11, 60]. At the same time, the last two years have seen an unprecedented explosion in interest from the academic community in studying fairness and machine learning. "Fairness and transparency" transformed from a niche topic with a trickle of papers produced every year (at least since the work of Pedresh [56]) to a major subfield of machine learning, complete with a dedicated archival conference, ACM FAT*. But despite the volume and velocity of published work, our understanding of the fundamental questions related to fairness and machine learning remains in its infancy.

fairness, learning, proceedings, (12 more...)

Communications of the ACM

AI-Alerts: 2020 > 2020-04 > AAAI AI-Alert for Apr 28, 2020 (1.00)

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.46)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.49)
Education (0.46)
Information Technology > Security & Privacy (0.46)
Law > Civil Rights & Constitutional Law (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)
