 Decision Tree Learning


EAPS: Edge-Assisted Predictive Sleep Scheduling for 802.11 IoT Stations

arXiv.org Artificial Intelligence

The broad deployment of 802.11 (a.k.a. WiFi) access points and significant enhancement of the energy efficiency of these wireless transceivers have resulted in increasing interest in building 802.11-based IoT systems. Unfortunately, the main energy efficiency mechanisms of 802.11, namely PSM and APSD, fall short when used in IoT applications. PSM increases latency and intensifies channel access contention after each beacon instance, and APSD does not inform stations about when they need to wake up to receive their downlink packets. In this paper, we present a new mechanism, edge-assisted predictive sleep scheduling (EAPS), to adjust the sleep duration of stations while they expect downlink packets. We first implement a Linux-based access point that enables us to collect parameters affecting communication latency. Using this access point, we build a testbed that, in addition to offering traffic pattern customization, replicates the characteristics of real-world environments. We then use multiple machine learning algorithms to predict downlink packet delivery. Our empirical evaluations confirm that when using EAPS the energy consumption of IoT stations is as low as with PSM, whereas the delay of packet delivery is close to the case where the station is always awake.


DriveML: Self-Drive Machine Learning Projects

#artificialintelligence

DriveML implements some of the pillars of an automated machine learning pipeline: (i) automated data preparation, (ii) feature engineering, (iii) model building in a classification context, with techniques such as (a) regularised regression [1], (b) logistic regression [2], (c) random forest [3], (d) decision tree [4] and (e) extreme gradient boosting (xgboost) [5], and finally (iv) model explanation (using lift charts and partial dependency plots). It also provides some additional features such as generating missing at random (MAR) variables and automated exploratory data analysis. Moreover, it exports the model results with the required plots in an HTML vignette report format that follows the best practices of industry and academia.


Learning Optimal Tree Models Under Beam Search

arXiv.org Machine Learning

Retrieving relevant targets from an extremely large target set under computational limits is a common challenge for information retrieval and recommendation systems. Tree models, which formulate targets as leaves of a tree with trainable node-wise scorers, have attracted considerable interest in tackling this challenge due to their logarithmic computational complexity in both training and testing. Tree-based deep models (TDMs) and probabilistic label trees (PLTs) are two representative kinds. Despite many practical successes, existing tree models suffer from a training-testing discrepancy: the deterioration in retrieval performance caused by beam search at test time is not considered in training. This leads to an intrinsic gap between the most relevant targets and those retrieved by beam search, even with optimally trained node-wise scorers. We take a first step towards understanding and analyzing this problem theoretically, and develop the concepts of Bayes optimality under beam search and calibration under beam search as general analytical tools for this purpose. Moreover, to eliminate the discrepancy, we propose a novel algorithm for learning optimal tree models under beam search. Experiments on both synthetic and real data verify the rationality of our theoretical analysis and demonstrate the superiority of our algorithm compared to state-of-the-art methods.
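
The gap between the most relevant target and the one beam search retrieves can be seen on a toy tree. The sketch below is only an illustration, not the paper's algorithm: the `Node` class, the fixed scores and the level-wise `beam_search` function are all hypothetical, and the scores stand in for the output of node-wise scorers on a single query.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    score: float  # hypothetical node-wise scorer output for one fixed query
    children: list = field(default_factory=list)


def beam_search(root, beam_size):
    """Level-wise beam search; returns at most beam_size leaves, best first."""
    beam = [root]
    leaves = []
    while beam:
        candidates = []
        for node in beam:
            if node.children:
                candidates.extend(node.children)
            else:
                leaves.append(node)
        # Keep only the highest-scoring candidates at this level.
        candidates.sort(key=lambda n: n.score, reverse=True)
        beam = candidates[:beam_size]
    leaves.sort(key=lambda n: n.score, reverse=True)
    return leaves[:beam_size]


# Toy tree (made-up scores): the best leaf, b1, sits under the lower-scored parent B.
root = Node("root", 1.0, [
    Node("A", 0.6, [Node("a1", 0.3), Node("a2", 0.2)]),
    Node("B", 0.5, [Node("b1", 0.9), Node("b2", 0.1)]),
])

narrow = [n.name for n in beam_search(root, beam_size=1)]
wide = [n.name for n in beam_search(root, beam_size=2)]
```

With a beam of size 1 the search commits to `A` and never sees `b1`; widening the beam to 2 recovers it. A scorer trained without regard to this pruning has no incentive to rank `B` above `A`.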


Spatio-temporal Sequence Prediction with Point Processes and Self-organizing Decision Trees

arXiv.org Machine Learning

We investigate spatio-temporal prediction and introduce a novel prediction algorithm. Our approach is based on point processes, which we use to model the event arrivals in both space and time. Although we specifically use the Hawkes process, other processes can be readily used, as remarked in the paper. Moreover, we partition the given spatial region into subregions by an adaptive decision tree and model each subregion with individual and interacting point processes. With individual point processes for each subregion, we estimate the time and location of the events using the past event times and locations. Furthermore, thanks to the nonstationary and self-exciting point generation mechanism in the Hawkes process and the adaptive partitioning of the space, we model the data as nonstationary in both time and space. Finally, we provide a gradient-based joint optimization algorithm for the adaptive tree parameters and the point process parameters. With the joint optimization, our algorithm can infer the source statistics and the adaptive partitioning of the region. We also provide a training algorithm for the online setup, where we update the model parameters with newly arrived points. We provide experimental results on both simulated and real-life data, where we compare our approach with the standard approaches and demonstrate significant performance improvements thanks to the adaptive spatial partitioning mechanism and the joint optimization procedure.
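
To make the self-exciting mechanism concrete, here is a minimal sketch of the conditional intensity of a purely temporal Hawkes process. The exponential kernel and the parameter names `mu`, `alpha` and `beta` are assumptions chosen for illustration; the abstract does not state which kernel or parameterization the paper uses, and the spatial component is omitted.

```python
import math


def hawkes_intensity(t, history, mu, alpha, beta):
    """Intensity at time t: mu + sum of alpha * exp(-beta * (t - t_i)) over past events t_i < t.

    mu    -- baseline event rate (assumed constant here)
    alpha -- jump in intensity caused by each past event
    beta  -- exponential decay rate of that excitation
    """
    return mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in history if ti < t)


# Made-up event times: the intensity is elevated shortly after the burst
# and decays back towards the baseline later on.
events = [1.0, 1.2, 1.3]
shortly_after = hawkes_intensity(1.4, events, mu=0.5, alpha=0.8, beta=2.0)
much_later = hawkes_intensity(10.0, events, mu=0.5, alpha=0.8, beta=2.0)
```

Each past event raises the rate of future events, which is what makes the process self-exciting and nonstationary in time.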


The Max-Cut Decision Tree: Improving on the Accuracy and Running Time of Decision Trees

arXiv.org Machine Learning

Decision trees are a widely used method for classification, both by themselves and as the building blocks of multiple different ensemble learning methods. The Max-Cut decision tree involves novel modifications to a standard, baseline model of classification decision tree construction, namely CART with the Gini criterion. One modification involves an alternative splitting metric, maximum cut, based on maximizing the distance between all pairs of observations belonging to separate classes and separate sides of the threshold value. The other modification is to select the decision feature from a linear combination of the input features constructed using Principal Component Analysis (PCA) locally at each node. Our experiments show that this node-based localized PCA with the novel splitting modification can dramatically improve classification, while also significantly decreasing computational time compared to the baseline decision tree. Moreover, our results are most significant when evaluated on data sets with higher dimensions or more classes; for example, on the CIFAR-100 data set they yield a 49% improvement in accuracy while reducing CPU time by 94%. These modifications dramatically advance the capabilities of decision trees for difficult classification tasks.
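
The maximum-cut splitting metric can be sketched for a single feature as follows. This is one plausible reading of the description above, not the paper's exact definition: it sums the absolute difference between every pair of observations that fall on opposite sides of the threshold and carry different class labels. The function names, the brute-force pairwise loop and the midpoint candidate thresholds are assumptions made for the example.

```python
def max_cut_score(xs, ys, threshold):
    """Sum of |x_i - x_j| over pairs split by the threshold with different labels."""
    left = [(x, y) for x, y in zip(xs, ys) if x <= threshold]
    right = [(x, y) for x, y in zip(xs, ys) if x > threshold]
    return sum(abs(xl - xr) for xl, yl in left for xr, yr in right if yl != yr)


def best_threshold(xs, ys):
    """Pick the midpoint between consecutive distinct values with the largest score.

    Assumes xs contains at least two distinct values.
    """
    values = sorted(set(xs))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return max(candidates, key=lambda th: max_cut_score(xs, ys, th))


# Made-up one-dimensional data with two well-separated classes.
xs = [1.0, 2.0, 8.0, 9.0]
ys = [0, 0, 1, 1]
chosen = best_threshold(xs, ys)
chosen_score = max_cut_score(xs, ys, chosen)
```

On this data the threshold that separates the two classes cleanly also maximizes the cut. In the method described above, the feature being split would be a PCA-derived linear combination computed locally at the node rather than a raw input feature.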


How fair can we go in machine learning? Assessing the boundaries of fairness in decision trees

arXiv.org Machine Learning

Beyond the possible misuses of technology, there is an increased awareness that these processes are not neutral and can reproduce and amplify past and current structural inequalities [1, 2]. Within this context, particular interest is paid to the role of machine learning (ML), with well-known examples of models biased against historically discriminated groups [3, 4, 5] or the intersection of these groups [6, 7]. Fairness in ML has emerged as a community initially motivated to develop technological solutions to the disparate impact and treatment by biased algorithms [8, 9, 10, 11, 5], and it is also moving towards a broader, multi-disciplinary understanding of the issues of socio-technological interventions [12, 13, 14, 15]. This work contributes to this field by studying how far bias mitigation can go whilst satisfying the accuracy and transparency of the models, thus providing a tool for a wider understanding of the technological boundaries of socio-technical proposals. Bias mitigation techniques can broadly be divided into three non-exclusive categories [16]: (1) preprocessing, (2) inprocessing, and (3) postprocessing. The preprocessing techniques attempt to learn new representations of data to satisfy fairness definitions. The inprocessing methods involve modifying the classifier algorithm by adding a fairness constraint to the optimization problem. The postprocessing methods aim at removing discriminatory decisions after the model is trained. Normally, in inprocessing approaches the fairness criteria are used as an optimization constraint rather than as a guide to build a more equitable prediction model.


Modelling Agent Policies with Interpretable Imitation Learning

arXiv.org Artificial Intelligence

As we deploy autonomous agents in safety-critical domains, it becomes important to develop an understanding of their internal mechanisms and representations. We outline an approach to imitation learning for reverse-engineering black box agent policies in MDP environments, yielding simplified, interpretable models in the form of decision trees. As part of this process, we explicitly model and learn agents' latent state representations by selecting from a large space of candidate features constructed from the Markov state.


Model family selection for classification using Neural Decision Trees

arXiv.org Machine Learning

Model selection consists in comparing several candidate models according to a metric to be optimized. The process often involves a grid search, or a similar procedure, together with cross-validation, which can be time consuming and provides little information about the dataset itself. In this paper we propose a method to reduce the scope of exploration needed for the task. The idea is to quantify how far it would be necessary to depart from trained instances of a given family, reference models (RMs) carrying 'rigid' decision boundaries (e.g. decision trees), so as to obtain an equivalent or better model. In our approach, this is realized by progressively relaxing the decision boundaries of the initial decision trees (the RMs) as long as this is beneficial in terms of performance measured on an analyzed dataset. More specifically, this relaxation is performed by making use of a neural decision tree, which is a neural network built from DTs. The final model produced by our method carries non-linear decision boundaries. Measuring the performance of the final model and its agreement with its seeding RM can help the user decide which family of models to focus on.


Gradient boosting machine with partially randomized decision trees

arXiv.org Machine Learning

The gradient boosting machine is a powerful ensemble-based machine learning method for solving regression problems. However, one difficulty in using it is a possible discontinuity of the regression function, which arises when regions of the training data are not densely covered by training points. To overcome this difficulty and to reduce the computational complexity of the gradient boosting machine, we propose to apply partially randomized trees, which can be regarded as a special case of extremely randomized trees applied to gradient boosting. The gradient boosting machine with partially randomized trees is illustrated by means of many numerical examples using synthetic and real data.
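
As a rough illustration of combining gradient boosting with randomized split points, the sketch below boosts one-dimensional regression stumps whose cut point is drawn uniformly at random, in the style of extremely randomized trees. It is not the paper's partially randomized tree construction, whose details the abstract does not give; the function names, the squared loss, the learning rate and the toy data are all assumptions.

```python
import random


def fit_random_stump(xs, residuals, rng):
    """Stump with a uniformly random cut point; each side predicts its mean residual."""
    threshold = rng.uniform(min(xs), max(xs))
    left = [r for x, r in zip(xs, residuals) if x <= threshold]
    right = [r for x, r in zip(xs, residuals) if x > threshold]
    left_mean = sum(left) / len(left) if left else 0.0
    right_mean = sum(right) / len(right) if right else 0.0
    return threshold, left_mean, right_mean


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosting(xs, ys, n_rounds=200, learning_rate=0.1, seed=0):
    """Gradient boosting for squared loss: each stump is fitted to the current residuals."""
    rng = random.Random(seed)
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # Residuals are the negative gradient of the squared loss.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_random_stump(xs, residuals, rng)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Made-up data: y = x^2 sampled on a regular grid in [0, 1].
xs = [i / 20 for i in range(21)]
ys = [x * x for x in xs]
mean_only = fit_boosting(xs, ys, n_rounds=0)
model = fit_boosting(xs, ys)
mse_before = mse(mean_only, xs, ys)
mse_after = mse(model, xs, ys)
```

Because no split search is performed, each round costs a single pass over the data, and averaging many stumps with different random cut points tends to smooth the steps of the fitted function.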


FREEtree: A Tree-based Approach for High Dimensional Longitudinal Data With Correlated Features

arXiv.org Machine Learning

This paper proposes FREEtree, a tree-based method for high dimensional longitudinal data with correlated features. Popular machine learning approaches commonly used for variable selection, like Random Forests, do not perform well when there are correlated features and do not account for data observed over time. FREEtree deals with longitudinal data by using a piecewise random effects model. It also exploits the network structure of the features by first clustering them using weighted correlation network analysis (WGCNA). It then conducts a screening step within each cluster of features and a selection step among the surviving features, which provides a relatively unbiased way to select features. By using dominant principal components as regression variables at each leaf and the original features as splitting variables at splitting nodes, FREEtree maintains its interpretability and improves its computational efficiency. The simulation results show that FREEtree outperforms other tree-based methods in terms of prediction accuracy, feature selection accuracy, and the ability to recover the underlying structure.