Goto

Collaborating Authors

 Performance Analysis


Few-shot Learning with LSSVM Base Learner and Transductive Modules

arXiv.org Machine Learning

The performance of meta-learning approaches for few-shot learning generally depends on three aspects: features suitable for comparison, the classifier ( base learner ) suitable for low-data scenarios, and valuable information from the samples to classify. In this work, we make improvements for the last two aspects: 1) although there are many effective base learners, there is a trade-off between generalization performance and computational overhead, so we introduce multi-class least squares support vector machine as our base learner which obtains better generation than existing ones with less computational overhead; 2) further, in order to utilize the information from the query samples, we propose two simple and effective transductive modules which modify the support set using the query samples, i.e., adjusting the support samples basing on the attention mechanism and adding the prototypes of the query set with pseudo labels to the support set as the pseudo support samples. These two modules significantly improve the few-shot classification accuracy, especially for the difficult 1-shot setting. Our model, denoted as FSLSTM (Few-Shot learning with LSsvm base learner and Transductive Modules), achieves state-of-the-art performance on miniImageNet and CIFAR-FS few-shot learning benchmarks.


Learning from Very Few Samples: A Survey

arXiv.org Machine Learning

Few sample learning (FSL) is significant and challenging in the field of machine learning. The capability of learning and generalizing from very few samples successfully is a noticeable demarcation separating artificial intelligence and human intelligence since humans can readily establish their cognition to novelty from just a single or a handful of examples whereas machine learning algorithms typically entail hundreds or thousands of supervised samples to guarantee generalization ability. Despite the long history dated back to the early 2000s and the widespread attention in recent years with booming deep learning technologies, little surveys or reviews for FSL are available until now. In this context, we extensively review 300+ papers of FSL spanning from the 2000s to 2019 and provide a timely and comprehensive survey for FSL. In this survey, we review the evolution history as well as the current progress on FSL, categorize FSL approaches into the generative model based and discriminative model based kinds in principle, and emphasize particularly on the meta learning based FSL approaches. We also summarize several recently emerging extensional topics of FSL and review the latest advances on these topics. Furthermore, we highlight the important FSL applications covering many research hotspots in computer vision, natural language processing, audio and speech, reinforcement learning and robotic, data analysis, etc. Finally, we conclude the survey with a discussion on promising trends in the hope of providing guidance and insights to follow-up researches.


Interpretable Machine Learning Approaches to Prediction of Chronic Homelessness

arXiv.org Artificial Intelligence

A 2016 report claims that annually upwards of 235 000 Canadians endure periods of homelessness, with approximately 35 000 individuals lacking a place to stay each night [1]. Between 2005 and 2014, there was a downward trend in the total number of Canadians using shelters; however, the occupancy rates of shelters has been increasing [1]. One factor accounting for this ongoing decrease in the number of homeless individuals paired with an increase in shelter occupancy is an increase in chronic homelessness. London's Homeless Prevention division identifies an individual as chronically homelessness if they have spent 6 or more months ( 180 days) of the last year in a shelter, which was based on the definition of chronic homelessness outlined by the Canadian government's homelessness strategy directives [2]. In addition to this trend, the demographics of homelessness are changing in Canada. In preceding decades, older, single males are over-represented in the homeless population; in contrast, the homeless population of today is increasingly diverse, with families, women, and youth comprising a greater fraction [1].


Aligning Subjective Ratings in Clinical Decision Making

arXiv.org Machine Learning

While objective indicators are more transparent and robust, the subjective evaluation contains a wealth of expert knowledge and intuition. In this work, we demonstrate the potential of pairwise ranking methods to align the subjective evaluation with objective indicators, creating a new score that combines their advantages and facilitates diagnosis. In a case study on patients at risk for developing Psoriatic Arthritis, we illustrate that the resulting score (1) increases classification accuracy when detecting disease presence/absence, (2) is sparse and (3) provides a nuanced assessment of severity for subsequent analysis.


DART: Data Addition and Removal Trees

arXiv.org Machine Learning

How can we update data for a machine learning model after it has already trained on that data? In this paper, we introduce DART, a variant of random forests that supports adding and removing training data with minimal retraining. Data updates in DART are exact, meaning that adding or removing examples from a DART model yields exactly the same model as retraining from scratch on updated data. DART uses two techniques to make updates efficient. The first is to cache data statistics at each node and training data at each leaf, so that only the necessary subtrees are retrained. The second is to choose the split variable randomly at the upper levels of each tree, so that the choice is completely independent of the data and never needs to change. At the lower levels, split variables are chosen to greedily maximize a split criterion such as Gini index or mutual information. By adjusting the number of random-split levels, DART can trade off between more accurate predictions and more efficient updates. In experiments on ten real-world datasets and one synthetic dataset, we find that DART is orders of magnitude faster than retraining from scratch while sacrificing very little in terms of predictive performance.


Learning Interpretable Characteristic Kernels via Decision Forests

arXiv.org Machine Learning

Decision forests are popular tools for classification and regression. These forests naturally produce proximity matrices measuring how often each pair of observations lies in the same leaf node. It has been demonstrated that these proximity matrices can be thought of as kernels, connecting the decision forest literature to the extensive kernel machine literature. While other kernels are known to have strong theoretical properties such as being characteristic, no similar result is available for any decision forest based kernel. In this manuscript, we prove that the decision forest induced proximity can be made characteristic, which can be used to yield a universally consistent statistic for testing independence. We demonstrate the performance of the induced kernel on a suite of 20 high-dimensional independence test settings. We also show how this learning kernel offers insights into relative feature importance. The decision forest induced kernel typically achieves substantially higher testing power than existing popular methods in statistical tests.


Response to Comment on "No consistent ENSO response to volcanic forcing over the last millennium"

Science

Robock claims that our analysis fails to acknowledge that pan-tropical surface cooling caused by large volcanic eruptions may mask El Niño warming at our central Pacific site, potentially obscuring a volcano–El Niño connection suggested in previous studies. Although observational support for a dynamical response linking volcanic cooling to El Niño remains ambiguous, Robock raises some important questions about our study that we address here. Modeling studies suggest that the El Niño–Southern Oscillation (ENSO) is sensitive to sulfate aerosol forcing associated with explosive volcanism, yet observational support for a dynamical chain of events linking large volcanic cooling to El Niño occurrences remains inconclusive. In Dee et al. (1), we used absolutely dated fossil corals from the central tropical Pacific to test ENSO's response to large volcanic eruptions. Superposed epoch analysis reveals a weak tendency for an El Niño–like response in the year after an eruption, but this response is not statistically significant, nor does it appear after the outsized 1257 Samalas eruption.


Machine Learning Applications in Misuse and Anomaly Detection

arXiv.org Artificial Intelligence

Machine learning and data mining algorithms play important roles in designing intrusion detection systems. Based on their approaches toward the detection of attacks in a network, intrusion detection systems can be broadly categorized into two types. In the misuse detection systems, an attack in a system is detected whenever the sequence of activities in the network matches with a known attack signature. In the anomaly detection approach, on the other hand, anomalous states in a system are identified based on a significant difference in the state transitions of the system from its normal states. This chapter presents a comprehensive discussion on some of the existing schemes of intrusion detection based on misuse detection, anomaly detection and hybrid detection approaches. Some future directions of research in the design of algorithms for intrusion detection are also identified.


Network Traffic Analysis based IoT Device Identification

arXiv.org Artificial Intelligence

Device identification is the process of identifying a device on Internet without using its assigned network or other credentials. The sharp rise of usage in Internet of Things (IoT) devices has imposed new challenges in device identification due to a wide variety of devices, protocols and control interfaces. In a network, conventional IoT devices identify each other by utilizing IP or MAC addresses, which are prone to spoofing. Moreover, IoT devices are low power devices with minimal embedded security solution. To mitigate the issue in IoT devices, fingerprint (DFP) for device identification can be used. DFP identifies a device by using implicit identifiers, such as network traffic (or packets), radio signal, which a device used for its communication over the network. These identifiers are closely related to the device hardware and software features. In this paper, we exploit TCP/IP packet header features to create a device fingerprint utilizing device originated network packets. We present a set of three metrics which separate some features from a packet which contribute actively for device identification. To evaluate our approach, we used publicly accessible two datasets. We observed the accuracy of device genre classification 99.37% and 83.35% of accuracy in the identification of an individual device from IoT Sentinel dataset. However, using UNSW dataset device type identification accuracy reached up to 97.78%.


Large-scale nonlinear Granger causality: A data-driven, multivariate approach to recovering directed networks from short time-series data

arXiv.org Machine Learning

To gain insight into complex systems it is a key challenge to infer nonlinear causal directional relations from observational time-series data. Specifically, estimating causal relationships between interacting components in large systems with only short recordings over few temporal observations remains an important, yet unresolved problem. Here, we introduce a large-scale Nonlinear Granger Causality (lsNGC) approach for inferring directional, nonlinear, multivariate causal interactions between system components from short high-dimensional time-series recordings. By modeling interactions with nonlinear state-space transformations from limited observational data, lsNGC identifies casual relations with no explicit a priori assumptions on functional interdependence between component time-series in a computationally efficient manner. Additionally, our method provides a mathematical formulation revealing statistical significance of inferred causal relations. We extensively study the ability of lsNGC to recovering network structure from two-node to thirty-four node chaotic time-series systems. Our results suggest that lsNGC captures meaningful interactions from limited observational data, where it performs favorably when compared to traditionally used methods. Finally, we demonstrate the applicability of lsNGC to estimating causality in large, real-world systems by inferring directional nonlinear, multivariate causal relationships among a large number of relatively short time-series acquired from functional Magnetic Resonance Imaging (fMRI) data of the human brain.