Editor's note: This post was originally included as an answer to a question posed in our 17 More Must-Know Data Science Interview Questions and Answers series earlier this year. The answer was thorough enough that it was deemed to deserve its own dedicated post. "In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone." Imagine you are playing the game "Who wants to be millionaire?" You have no clue about the question, but you have audience poll and phone a friend life lines.
Ensemble methods have been widely applied in Reinforcement Learning (RL) in order to enhance stability, increase convergence speed, and improve exploration. These methods typically work by employing an aggregation mechanism over actions of different RL algorithms. We show that a variety of these methods can be unified by drawing parallels from committee voting rules in Social Choice Theory. We map the problem of designing an action aggregation mechanism in an ensemble method to a voting problem which, under different voting rules, yield popular ensemble-based RL algorithms like Majority Voting Q-learning or Bootstrapped Q-learning. Our unification framework, in turn, allows us to design new ensemble-RL algorithms with better performance. For instance, we map two diversity-centered committee voting rules, namely Single Non-Transferable Voting Rule and Chamberlin-Courant Rule, into new RL algorithms that demonstrate excellent exploratory behavior in our experiments.
Heterogeneous ensembles built from the predictions of a wide variety and large number of diverse base predictors represent a potent approach to building predictive models for problems where the ideal base/individual predictor may not be obvious. Ensemble selection is an especially promising approach here, not only for improving prediction performance, but also because of its ability to select a collectively predictive subset, often a relatively small one, of the base predictors. In this paper, we present a set of algorithms that explicitly incorporate ensemble diversity, a known factor influencing predictive performance of ensembles, into a reinforcement learning framework for ensemble selection. We rigorously tested these approaches on several challenging problems and associated data sets, yielding that several of them produced more accurate ensembles than those that don't explicitly consider diversity. More importantly, these diversity-incorporating ensembles were much smaller in size, i.e., more parsimonious, than the latter types of ensembles. This can eventually aid the interpretation or reverse engineering of predictive models assimilated into the resultant ensemble(s).
We investigate how random projection can best be used for clustering high dimensional data. Random projection has been shown to have promising theoretical properties. In practice, however, we find that it results in highly unstable clustering performance. Our solution is to use random projection in a cluster ensemble approach. Empirical results show that the proposed approach achieves better and more robust clustering performance compared to not only single runs of random projection/clustering but also clustering with PCA, a traditional data reduction method for high dimensional data. To gain insights into the performance improvement obtained by our ensemble method, we analyze and identify the influence of the quality and the diversity of the individual clustering solutions on the final ensemble performance.
The recent success of Deep Neural Networks (DNNs) has drastically improved the state of the art for many application domains. While achieving high accuracy performance, deploying state-of-the-art DNNs is a challenge since they typically require billions of expensive arithmetic computations. In addition, DNNs are typically deployed in ensemble to boost accuracy performance, which further exacerbates the system requirements. This computational overhead is an issue for many platforms, e.g. data centers and embedded systems, with tight latency and energy budgets. In this article, we introduce flexible DNNs ensemble processing technique, which achieves large reduction in average inference latency while incurring small to negligible accuracy drop. Our technique is flexible in that it allows for dynamic adaptation between quality of results (QoR) and execution runtime. We demonstrate the effectiveness of the technique on AlexNet and ResNet-50 using the ImageNet dataset. This technique can also easily handle other types of networks.