AITopics | Performance Analysis

Performance Analysis

Boosting and Bagging: How To Develop A Robust Machine Learning Algorithm

#artificialintelligence, Jul-13-2020, 09:56:34 GMT

Machine learning and data science require more than just throwing data into a Python library and using whatever comes out. Data scientists need to understand the data and the processes behind it to implement a successful system. One key to implementation is knowing when a model might benefit from bootstrapping methods, which train many models on resampled data and combine them. The resulting combinations are called ensemble models. Some examples of ensemble models are AdaBoost and Stochastic Gradient Boosting.
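As a rough illustration of the bootstrapping idea mentioned above, the sketch below implements bagging by hand: each model is trained on a sample drawn with replacement from the training data, and the ensemble predicts by majority vote. It is a minimal sketch and not the article's code. The helper names (`fit_stump`, `bagging_fit`, `bagging_predict`), the one-feature threshold classifier, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def fit_stump(points):
    """Fit a one-feature threshold classifier (hypothetical helper).

    points: list of (x, y) pairs with y in {0, 1}.
    Tries every observed x as a threshold, in both directions,
    and keeps the rule with the fewest training errors.
    """
    best = None
    for t in sorted({x for x, _ in points}):
        for polarity in (0, 1):
            # polarity 1: predict 1 when x >= t; polarity 0: predict 1 when x < t
            errors = sum(((x >= t) == bool(polarity)) != bool(y) for x, y in points)
            if best is None or errors < best[0]:
                best = (errors, t, polarity)
    _, t, polarity = best
    return lambda x: int((x >= t) == bool(polarity))


def bagging_fit(points, n_models=25, seed=0):
    """Train n_models stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(points) examples with replacement.
        sample = [rng.choice(points) for _ in points]
        models.append(fit_stump(sample))
    return models


def bagging_predict(models, x):
    """Combine the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Toy data (assumed): small x values are class 0, large x values are class 1.
data = [(0.1, 0), (0.4, 0), (0.35, 0), (0.8, 1),
        (0.9, 1), (0.7, 1), (0.2, 0), (0.6, 1)]
models = bagging_fit(data)
print(bagging_predict(models, 0.05), bagging_predict(models, 0.95))
```

Boosting methods such as AdaBoost differ in that they train models sequentially and reweight the examples earlier models got wrong, rather than resampling independently as done here.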

algorithm, artificial intelligence, machine learning, (14 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.40)

Predicting Sequences of Traversed Nodes in Graphs using Network Models with Multiple Higher Orders

Gote, Christoph, Casiraghi, Giona, Schweitzer, Frank, Scholtes, Ingo

arXiv.org Machine Learning, Jul-13-2020

We propose a novel sequence prediction method for sequential data capturing node traversals in graphs. Our method builds on a statistical modelling framework that combines multiple higher-order network models into a single multi-order model. We develop a technique to fit such multi-order models in empirical sequential data and to select the optimal maximum order. Our framework facilitates both next-element and full sequence prediction given a sequence-prefix of any length. We evaluate our model based on six empirical data sets containing sequences from website navigation as well as public transport systems. The results show that our method out-performs state-of-the-art algorithms for next-element prediction. We further demonstrate the accuracy of our method during out-of-sample sequence prediction and validate that our method can scale to data sets with millions of sequences.
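To give a feel for why higher-order context can matter when predicting the next node in a path, the sketch below counts transitions for every context length up to a maximum order and predicts from the longest context that was seen in training, backing off to shorter contexts otherwise. This back-off scheme is a deliberate simplification and is not the statistical multi-order model or the order-selection technique the abstract describes. The function names and toy paths are assumptions.

```python
from collections import Counter, defaultdict


def fit_counts(sequences, max_order):
    """Count next-element occurrences for every context length 0..max_order.

    counts[k][context][nxt] is how often `nxt` followed the k-element
    tuple `context` in the training sequences.
    """
    counts = {k: defaultdict(Counter) for k in range(max_order + 1)}
    for seq in sequences:
        for i, nxt in enumerate(seq):
            for k in range(max_order + 1):
                if i - k >= 0:
                    counts[k][tuple(seq[i - k:i])][nxt] += 1
    return counts


def predict_next(counts, prefix, max_order):
    """Predict from the longest observed context, backing off to shorter ones."""
    for k in range(min(max_order, len(prefix)), -1, -1):
        context = tuple(prefix[len(prefix) - k:])
        if context in counts[k]:
            return counts[k][context].most_common(1)[0][0]
    return None  # only reached when there was no training data


# Toy paths (assumed): what follows "c" depends on the node visited before it.
paths = [["a", "c", "d"], ["b", "c", "e"], ["a", "c", "d"], ["b", "c", "e"]]
counts = fit_counts(paths, max_order=2)
print(predict_next(counts, ["a", "c"], 2), predict_next(counts, ["b", "c"], 2))
```

In this toy data a first-order model cannot separate the two paths, because "d" and "e" follow "c" equally often, while the second-order context resolves the ambiguity.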

data mining, machine learning, prediction, (15 more...)

2007.06662

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(2 more...)

Genre: Research Report > New Finding (0.88)

Industry: Transportation > Infrastructure & Services (0.48)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Exploiting Uncertainties from Ensemble Learners to Improve Decision-Making in Healthcare AI

Tan, Yingshui, Jin, Baihong, Yue, Xiangyu, Chen, Yuxin, Vincentelli, Alberto Sangiovanni

arXiv.org Machine Learning, Jul-12-2020

Ensemble learning is widely applied in Machine Learning (ML) to improve model performance and to mitigate decision risks. In this approach, predictions from a diverse set of learners are combined to obtain a joint decision. Recently, various methods have been explored in the literature for estimating decision uncertainties using ensemble learning; however, determining which metrics are a better fit for certain decision-making applications remains a challenging task. In this paper, we study the following key research question in the selection of uncertainty metrics: when does one uncertainty metric outperform another? We answer this question via a rigorous analysis of two commonly used uncertainty metrics in ensemble learning, namely ensemble mean and ensemble variance. We show that, under mild assumptions on the ensemble learners, ensemble mean is preferable with respect to ensemble variance as an uncertainty metric for decision making.
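The two metrics named in the abstract can be computed directly from the members' outputs. The sketch below assumes each ensemble member returns a probability for the positive class in a binary task; the function name and the example numbers are hypothetical, and the code only illustrates the two quantities, not the paper's analysis.

```python
from statistics import fmean, pvariance


def ensemble_mean_and_variance(member_probs):
    """Return (mean, variance) of the members' positive-class probabilities.

    member_probs: one probability per ensemble member for a single input.
    The population variance is used, treating the ensemble as the full set.
    """
    return fmean(member_probs), pvariance(member_probs)


# Members agree and are confident: mean far from 0.5, small variance.
confident = ensemble_mean_and_variance([0.90, 0.92, 0.88])

# Members agree but each one is unsure: the variance is zero,
# yet the mean of 0.5 still signals an ambiguous input.
unsure = ensemble_mean_and_variance([0.5, 0.5, 0.5])

# Members disagree strongly: mean near 0.5 and a large variance.
split = ensemble_mean_and_variance([0.1, 0.9])

print(confident, unsure, split)
```

The second case shows one situation where the two metrics behave differently: variance alone reports no uncertainty when every member is equally unsure, while the mean sits at the decision boundary.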

artificial intelligence, data mining, machine learning, (18 more...)

2007.06063

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.93)
Health & Medicine > Diagnostic Medicine (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
(2 more...)

The impact of machine learning and AI on the UK economy

#artificialintelligence, Jul-11-2020, 13:00:51 GMT

A recent virtual event addressed another such issue: the potential impact machines, imbued with artificial intelligence, may have on the economy and the financial system. The event was organised by the Bank of England, in collaboration with CEPR and the Brevan Howard Centre for Financial Analysis at Imperial College. What follows is a summary of some of the recorded presentations. The full catalogue of videos is available on the Bank of England's website. In his presentation, Stuart Russell (University of California, Berkeley), author of the leading textbook on artificial intelligence (AI), gives a broad historical overview of the field since its emergence in the 1950s, followed by insight into more recent developments.

artificial intelligence, machine learning, survey article, (12 more...)

Country:

Europe > United Kingdom > England (0.37)
North America > United States > California > Alameda County > Berkeley (0.24)

Genre: Overview (0.89)

Industry:

Banking & Finance > Economy (1.00)
Government > Regional Government > Europe Government > United Kingdom Government (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

On Improving Hotspot Detection Through Synthetic Pattern-Based Database Enhancement

Reddy, Gaurav Rajavendra, Xanthopoulos, Constantinos, Makris, Yiorgos

arXiv.org Machine Learning, Jul-11-2020

Continuous technology scaling and the introduction of advanced technology nodes in Integrated Circuit (IC) fabrication are constantly exposing new manufacturability issues. One such issue, stemming from complex interaction between design and process, is the problem of design hotspots. Such hotspots are known to vary from design to design and, ideally, should be predicted early and corrected in the design stage itself, as opposed to relying on the foundry to develop process fixes for every hotspot, which would be intractable. In the past, various efforts have been made to address this issue by using a known database of hotspots as the source of information. The majority of these efforts use either Machine Learning (ML) or Pattern Matching (PM) techniques to identify and predict hotspots in new incoming designs. However, almost all of them suffer from high false-alarm rates, mainly because they are oblivious to the root causes of hotspots. In this work, we seek to address this limitation by using a novel database enhancement approach through synthetic pattern generation based on carefully crafted Design of Experiments (DOEs). The effectiveness of the proposed method against the state-of-the-art is evaluated on a 45nm process using industry-standard tools and designs.

classifier, dataset, hotspot, (15 more...)

2007.05879

Country:

Asia > India (0.04)
North America > United States > Texas > Dallas County > Richardson (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Semiconductors & Electronics (0.49)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

Deep Contextual Clinical Prediction with Reverse Distillation

Kodialam, Rohan S., Boiarsky, Rebecca, Sontag, David

arXiv.org Artificial Intelligence, Jul-10-2020

Healthcare providers are increasingly using learned methods to predict and understand long-term patient outcomes in order to make meaningful interventions. However, despite innovations in this area, deep learning models often struggle to match performance of shallow linear models in predicting these outcomes, making it difficult to leverage such techniques in practice. In this work, motivated by the task of clinical prediction from insurance claims, we present a new technique called reverse distillation which pretrains deep models by using high-performing linear models for initialization. We make use of the longitudinal structure of insurance claims datasets to develop Self Attention with Reverse Distillation, or SARD, an architecture that utilizes a combination of contextual embedding, temporal embedding and self-attention mechanisms and most critically is trained via reverse distillation. SARD outperforms state-of-the-art methods on multiple clinical prediction outcomes, with ablation studies revealing that reverse distillation is a primary driver of these improvements.

artificial intelligence, deep learning, machine learning, (19 more...)

2007.05611

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California (0.04)

Genre: Research Report > Promising Solution (0.48)

Industry:

Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Contrastive Training for Improved Out-of-Distribution Detection

Winkens, Jim, Bunel, Rudy, Roy, Abhijit Guha, Stanforth, Robert, Natarajan, Vivek, Ledsam, Joseph R., MacWilliams, Patricia, Kohli, Pushmeet, Karthikesalingam, Alan, Kohl, Simon, Cemgil, Taylan, Eslami, S. M. Ali, Ronneberger, Olaf

arXiv.org Machine Learning, Jul-10-2020

Reliable detection of out-of-distribution (OOD) inputs is increasingly understood to be a precondition for deployment of machine learning systems. This paper proposes and investigates the use of contrastive training to boost OOD detection performance. Unlike leading methods for OOD detection, our approach does not require access to examples labeled explicitly as OOD, which can be difficult to collect in practice. We show in extensive experiments that contrastive training significantly helps OOD detection performance on a number of common benchmarks. By introducing and employing the Confusion Log Probability (CLP) score, which quantifies the difficulty of the OOD detection task by capturing the similarity of inlier and outlier datasets, we show that our method especially improves performance in the 'near OOD' classes -- a particularly challenging setting for previous methods.

artificial intelligence, detection, machine learning, (15 more...)

2007.05566

Genre: Research Report (0.82)

Industry:

Transportation > Ground > Road (0.46)
Automobiles & Trucks (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Predicting Illegal Fishing on the Patagonia Shelf from Oceanographic Seascapes

Woodill, A. John, Kavanaugh, Maria, Harte, Michael, Watson, James R.

arXiv.org Machine Learning, Jul-10-2020

Many of the world's most important fisheries are experiencing increases in illegal fishing, undermining efforts to sustainably conserve and manage fish stocks. A major challenge to ending illegal, unreported, and unregulated (IUU) fishing is improving our ability to identify whether a vessel is fishing illegally and where illegal fishing is likely to occur in the ocean. However, patrolling the oceans is costly, time-consuming, and logistically challenging for maritime authorities. To address this problem, we use vessel tracking data and machine learning to predict illegal fishing on the Patagonian Shelf, one of the world's most productive regions for fisheries. Specifically, we focus on Chinese fishing vessels, which have consistently fished illegally in this region. We combine vessel location data with oceanographic seascapes -- classes of oceanic areas based on oceanographic variables -- as well as other remotely sensed oceanographic variables to train a series of machine learning models of varying levels of complexity. These models are able to predict whether a Chinese vessel is operating illegally with 69-96% confidence, depending on the year and predictor variables used. These results offer a promising step towards preempting illegal activities, rather than reacting to them forensically.

artificial intelligence, fishing, machine learning, (16 more...)

2007.0547

Country:

Africa (0.14)
South America > Argentina (0.05)
Asia > China (0.05)
(6 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Food & Agriculture > Fishing (1.00)
Government > Regional Government > North America Government > United States Government (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.48)

Reactive Soft Prototype Computing for Concept Drift Streams

Raab, Christoph, Heusinger, Moritz, Schleif, Frank-Michael

arXiv.org Machine Learning, Jul-10-2020

The amount of real-time communication between agents in an information system has increased rapidly since the beginning of the decade. This is because the use of these systems, e.g., social media, has become commonplace in today's society. This requires analytical algorithms to learn and predict this stream of information in real-time. The nature of these systems is non-static and can be explained, among other things, by the fast pace of trends. This creates an environment in which algorithms must recognize changes and adapt. Recent work shows vital research in the field, but it mainly lacks stable performance during model adaptation. In this work, a concept drift detection strategy followed by a prototype-based adaptation strategy is proposed. Validated through experimental results on a variety of typical non-static data, our solution provides stable and quick adjustments in times of change.

artificial intelligence, concept drift, machine learning, (15 more...)

doi: 10.1016/j.neucom.2019.11.111

2007.05432

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)
Oceania > Australia > New South Wales (0.04)
(5 more...)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Solving Constrained CASH Problems with ADMM

Ram, Parikshit, Liu, Sijia, Vijaykeerthi, Deepak, Wang, Dakuo, Bouneffouf, Djallel, Bramble, Greg, Samulowitz, Horst, Gray, Alexander G.

arXiv.org Machine Learning, Jul-10-2020

The CASH (Combined Algorithm Selection and Hyperparameter optimization) problem has been widely studied in the context of automated configurations of machine learning (ML) pipelines, and various solvers and toolkits are available. However, CASH solvers do not directly handle black-box constraints such as fairness, robustness or other domain-specific custom constraints. We present our recent approach [Liu et al., 2020] that leverages the ADMM optimization framework to decompose CASH into multiple small problems and demonstrate how ADMM facilitates incorporation of black-box constraints.

constraint, data mining, machine learning, (17 more...)

2006.09635

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
