deep forest
- Asia > China > Jiangsu Province > Nanjing (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- (2 more...)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- (2 more...)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Network Threat Detection: Addressing Class Imbalanced Data with Deep Forest
With the rapid expansion of Internet of Things (IoT) networks, detecting malicious traffic in real-time has become a critical cybersecurity challenge. This research addresses the detection challenges by presenting a comprehensive empirical analysis of machine learning techniques for malware detection using the IoT-23 dataset provided by the Stratosphere Laboratory. We address the significant class imbalance within the dataset through three resampling strategies. We implement and compare a few machine learning techniques. Our findings demonstrate that the combination of appropriate imbalance treatment techniques with ensemble methods, particularly gcForest, achieves better detection performance compared to traditional approaches. This work contributes significantly to the development of more intelligent and efficient automated threat detection systems for IoT environments, helping to secure critical infrastructure against sophisticated cyber attacks while optimizing computational resource usage.
- Information Technology > Security & Privacy (1.00)
- Government > Military > Cyberwarfare (0.54)
Depth is More Powerful than Width with Prediction Concatenation in Deep Forest
Random Forest (RF) is an ensemble learning algorithm proposed by \citet{breiman2001random} that constructs a large number of randomized decision trees individually and aggregates their predictions by naive averaging. The prediction concatenation (PreConc) operation is crucial for the multi-layer feature transformation in deep forest, though little has been known about its theoretical property. In this paper, we analyze the influence of Preconc on the consistency of deep forest. Especially when the individual tree is inconsistent (as in practice, the individual tree is often set to be fully grown, i.e., there is only one sample at each leaf node), we find that the convergence rate of two-layer DF \textit{w.r.t.} the number of trees M can reach \mathcal{O}(1/M 2) under some mild conditions, while the convergence rate of RF is \mathcal{O}(1/M) . Therefore, with the help of PreConc, DF with deeper layer will be more powerful than the shallower layer.
The Role of Depth, Width, and Tree Size in Expressiveness of Deep Forest
Lyu, Shen-Huan, Wu, Jin-Hui, Zheng, Qin-Cheng, Ye, Baoliu
Random forests are classical ensemble algorithms that construct multiple randomized decision trees and aggregate their predictions using naive averaging. \citet{zhou2019deep} further propose a deep forest algorithm with multi-layer forests, which outperforms random forests in various tasks. The performance of deep forests is related to three hyperparameters in practice: depth, width, and tree size, but little has been known about its theoretical explanation. This work provides the first upper and lower bounds on the approximation complexity of deep forests concerning the three hyperparameters. Our results confirm the distinctive role of depth, which can exponentially enhance the expressiveness of deep forests compared with width and tree size. Experiments confirm the theoretical findings.
- North America > United States (0.14)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Using Deep Ensemble Forest for High Resolution Mapping of PM2.5 from MODIS MAIAC AOD in Tehran, Iran
High resolution mapping of PM2.5 concentration over Tehran city is challenging because of the complicated behavior of numerous sources of pollution and the insufficient number of ground air quality monitoring stations. Alternatively, high resolution satellite Aerosol Optical Depth (AOD) data can be employed for high resolution mapping of PM2.5. For this purpose, different data-driven methods have been used in the literature. Recently, deep learning methods have demonstrated their ability to estimate PM2.5 from AOD data. However, these methods have several weaknesses in solving the problem of estimating PM2.5 from satellite AOD data. In this paper, the potential of the deep ensemble forest method for estimating the PM2.5 concentration from AOD data was evaluated. The results showed that the deep ensemble forest method with R2 = 0.74 gives a higher accuracy of PM2.5 estimation than deep learning methods (R2 = 0.67) as well as classic data-driven methods such as random forest (R2 = 0.68). Additionally, the estimated values of PM2.5 using the deep ensemble forest algorithm were used along with ground data to generate a high resolution map of PM2.5. Evaluation of the produced PM2.5 map revealed the good performance of the deep ensemble forest for modeling the variation of PM2.5 in the city of Tehran.
- Asia > Middle East > Iran > Tehran Province > Tehran (0.83)
- Asia > China > Beijing > Beijing (0.04)
- Asia > China > Tianjin Province > Tianjin (0.04)
- (4 more...)
Improve Deep Forest with Learnable Layerwise Augmentation Policy Schedule
Zhu, Hongyu, Liang, Sichu, Hu, Wentao, Li, Fang-Qi, yuan, Yali, Wang, Shi-Lin, Cheng, Guang
As a modern ensemble technique, Deep Forest (DF) employs a cascading structure to construct deep models, providing stronger representational power compared to traditional decision forests. However, its greedy multi-layer learning procedure is prone to overfitting, limiting model effectiveness and generalizability. This paper presents an optimized Deep Forest, featuring learnable, layerwise data augmentation policy schedules. Specifically, We introduce the Cut Mix for Tabular data (CMT) augmentation technique to mitigate overfitting and develop a population-based search algorithm to tailor augmentation intensity for each layer. Additionally, we propose to incorporate outputs from intermediate layers into a checkpoint ensemble for more stable performance. Experimental results show that our method sets new state-of-the-art (SOTA) benchmarks in various tabular classification tasks, outperforming shallow tree ensembles, deep forests, deep neural network, and AutoML competitors. The learned policies also transfer effectively to Deep Forest variants, underscoring its potential for enhancing non-differentiable deep learning modules in tabular signal processing.
- North America > United States > California > San Diego County > Carlsbad (0.04)
- North America > Canada (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > China > Shanghai > Shanghai (0.04)