 Diagnosis


Quantized Training of Gradient Boosting Decision Trees

arXiv.org Artificial Intelligence

Recent years have witnessed significant success in Gradient Boosting Decision Trees (GBDT) for a wide range of machine learning applications. The general consensus about GBDT training algorithms is that gradients and statistics are computed with high-precision floating-point numbers. In this paper, we investigate a fundamental question that has been largely ignored in the literature: how many bits are needed to represent gradients when training GBDT? To answer it, we propose to quantize all the high-precision gradients in a very simple yet effective way in the GBDT training algorithm. Surprisingly, both our theoretical analysis and empirical studies show that the gradient precision needed to avoid hurting performance can be quite low, e.g., 2 or 3 bits. With low-precision gradients, most arithmetic operations in GBDT training can be replaced by integer operations of 8, 16, or 32 bits. These findings may pave the way for much more efficient training of GBDT in several respects: (1) speeding up the computation of gradient statistics in histograms; (2) compressing the communication cost of high-precision statistical information during distributed training; (3) motivating the use and development of hardware architectures that support low-precision computation well for GBDT training. Benchmarked on CPUs, GPUs, and distributed clusters, our simple quantization strategy achieves up to 2$\times$ speedup over SOTA GBDT systems on extensive datasets, demonstrating the effectiveness and potential of low-precision training of GBDT. The code will be released to the official repository of LightGBM.
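
The abstract does not spell out the quantization rule, so the following is only a minimal sketch of one common scheme: scaling gradients into a small signed integer range and applying stochastic rounding. The function names `quantize_gradients` and `dequantized_sum`, the scaling rule, and the default of 3 bits are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def quantize_gradients(grads, bits=3, rng=None):
    """Map float gradients to small signed integers (illustrative sketch).

    Assumption: gradients are scaled so the largest magnitude maps to
    2**(bits - 1) - 1, then rounded stochastically so that the rounding
    is unbiased in expectation.
    """
    rng = rng or random.Random(0)
    levels = 2 ** (bits - 1) - 1          # largest representable magnitude
    max_abs = max(abs(g) for g in grads)
    if max_abs == 0.0:
        return [0] * len(grads), 1.0
    scale = max_abs / levels
    quantized = []
    for g in grads:
        x = g / scale
        low = math.floor(x)
        # Round up with probability equal to the fractional part of x.
        q = low + (1 if rng.random() < (x - low) else 0)
        # Clamp to guard against floating-point overshoot at the extremes.
        quantized.append(int(max(-levels, min(levels, q))))
    return quantized, scale


def dequantized_sum(quantized, scale):
    """Integer accumulation (as in a histogram bin), rescaled once at the end."""
    return sum(quantized) * scale


grads = [0.5, -0.25, 0.1, -0.5, 0.3]
q, scale = quantize_gradients(grads, bits=3)
print(q, dequantized_sum(q, scale), sum(grads))
```

The point of the sketch is that the per-bin accumulation runs entirely on integers, with a single floating-point rescale afterwards, which is the kind of operation the abstract says can replace high-precision arithmetic.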


Efficient anomaly detection method for rooftop PV systems using big data and permutation entropy

arXiv.org Artificial Intelligence

The number of rooftop photovoltaic (PV) systems has significantly increased in recent years around the globe, including in Australia. This trend is anticipated to continue in the next few years. Given their high share of generation in power systems, detecting malfunctions and abnormalities in rooftop PV systems is essential for ensuring their high efficiency and safety. In this paper, we present a novel anomaly detection method for a large number of rooftop PV systems installed in a region using big data and a time series complexity measure called weighted permutation entropy (WPE). This efficient method only uses the historical PV generation data in a given region to identify anomalous PV systems and requires no new sensor or smart device. Using a real-world PV generation dataset, we discuss how the hyperparameters of WPE should be tuned for the purpose. The proposed PV anomaly detection method is then tested on rooftop PV generation data from over 100 South Australian households. The results demonstrate that anomalous systems detected by our method have indeed encountered problems and require a close inspection. The detection and resolution of potential faults would result in better rooftop PV systems, longer lifetimes, and higher returns on investment.
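
For readers unfamiliar with weighted permutation entropy, the sketch below follows the standard definition: each embedded window contributes to its ordinal pattern with a weight equal to the window's variance, and the entropy is taken over the resulting weighted pattern distribution. The function name, the parameter names `order` and `delay`, and their defaults are illustrative assumptions; the abstract does not state which settings the authors chose.

```python
import math
from collections import defaultdict


def weighted_permutation_entropy(series, order=3, delay=1, normalize=True):
    """Weighted permutation entropy of a 1-D sequence (illustrative sketch)."""
    if order < 2:
        raise ValueError("order must be at least 2")
    n_vectors = len(series) - (order - 1) * delay
    if n_vectors <= 0:
        raise ValueError("series too short for the chosen order and delay")

    weight_by_pattern = defaultdict(float)
    total_weight = 0.0
    for start in range(n_vectors):
        window = [series[start + k * delay] for k in range(order)]
        # Ordinal pattern: indices that sort the window (ties keep index order).
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        # Weight: population variance of the window.
        mean = sum(window) / order
        weight = sum((v - mean) ** 2 for v in window) / order
        weight_by_pattern[pattern] += weight
        total_weight += weight

    if total_weight == 0.0:
        return 0.0  # constant signal: no amplitude information at all

    entropy = 0.0
    for w in weight_by_pattern.values():
        if w > 0.0:
            p = w / total_weight
            entropy -= p * math.log(p)
    if normalize:
        entropy /= math.log(math.factorial(order))
    return entropy


print(weighted_permutation_entropy([4, 7, 9, 10, 6, 11, 3], order=3))
```

A strictly monotonic series produces a single ordinal pattern and therefore an entropy of zero, while irregular generation profiles spread weight across many patterns and score higher.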


Some third-party Twitter apps aren't working right now

Engadget

Several third-party clients for Twitter are having trouble communicating with the social network, which is preventing users from logging in. As TechCrunch reports, Tweetbot and Twitterrific have both confirmed that they're having problems and are trying to find the root cause. "We've reached out to Twitter for more details, but haven't heard back," Tweetbot announced. Fenix has also confirmed that its client for Android is experiencing problems but that its iOS app seems to be unaffected. Matteo Villa, the app's developer, said Fenix for Android was suspended with no communication from the company.


Imbalanced Classification In Faulty Turbine Data: New Proximal Policy Optimization

arXiv.org Artificial Intelligence

Detecting faults and applying the best available methods is increasingly important in industrial and real-world systems. We seek the most trustworthy and practical data-based fault detection methods offered by artificial intelligence. In this paper, we propose a framework for fault detection based on reinforcement learning and a policy known as proximal policy optimization. Because fault data are scarce, one significant problem with the traditional policy is its weakness in detecting fault classes, which we address by changing the cost function. Using the modified Proximal Policy Optimization, we can increase performance, overcome data imbalance, and better predict future faults. With our modified policy, all evaluation metrics increase by $3\%$ to $4\%$ compared to the traditional policy in the first benchmark, by $20\%$ to $55\%$ in the second benchmark, and by $6\%$ to $14\%$ in the third benchmark, along with an improvement in performance and prediction speed compared to previous methods.


Unsupervised High Impedance Fault Detection Using Autoencoder and Principal Component Analysis

arXiv.org Artificial Intelligence

Detection of high impedance faults (HIF) has been one of the biggest challenges in the power distribution network. The low current magnitude and diverse characteristics of HIFs make them difficult for over-current relays to detect. Recently, data-driven methods based on machine learning models have been gaining popularity in HIF detection due to their capability to learn complex patterns from data. Most machine learning-based detection methods adopt supervised learning techniques to distinguish HIFs from normal load conditions by performing classification, which relies on a large amount of data collected during HIFs. However, measurements of HIFs are difficult to acquire in the real world. As a result, the reliability and generalization of the classification methods are limited when the load profiles and faults are not present in the training data. Consequently, this paper proposes an unsupervised HIF detection framework using the autoencoder and principal component analysis-based monitoring techniques. The proposed fault detection method detects the HIF by monitoring changes in the correlation structure within the current waveforms that differ from those of normal loads. The performance of the proposed HIF detection method is tested using real data collected from a 4.16 kV distribution system and compared with results from a commercially available solution for HIF detection. The numerical results demonstrate that the proposed method outperforms the commercially available HIF detection technique while maintaining high security by not falsely detecting during load conditions.


On the utility of feature selection in building two-tier decision trees

arXiv.org Artificial Intelligence

Feature selection is frequently used in machine learning when there is a risk of performance degradation due to overfitting or when computational resources are limited. During the feature selection process, the subset of features that are most relevant and least redundant is chosen. In recent years, it has become clear that, in addition to relevance and redundancy, features' complementarity must be considered. Informally, if the features are weak predictors of the target variable separately and strong predictors when combined, then they are complementary. This paper demonstrates that the synergistic effect of complementary features mutually amplifying each other in the construction of two-tier decision trees can be interfered with by another feature, resulting in a decrease in performance. Cross-validation on both synthetic and real datasets, for regression and classification, shows that removing the interfering feature can improve performance by up to 24 times. It has also been found that the less well the domain is learned, the greater the increase in performance. More formally, there is a statistically significant negative rank correlation between performance on the dataset before the interfering feature is eliminated and the performance growth after it is eliminated. It is concluded that this broadens the scope of feature selection methods to cases where data and computational resources are sufficient.
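
The informal definition of complementarity can be illustrated with the textbook XOR case, where each feature alone is uninformative but the pair determines the target exactly. This is a toy example constructed here for illustration, not one of the paper's datasets; the helper `best_accuracy` is a hypothetical name for the accuracy of the best lookup-table predictor restricted to a given feature subset.

```python
from collections import Counter, defaultdict
from itertools import product


def best_accuracy(rows, labels, feature_idx):
    """Accuracy of the best lookup-table predictor using only feature_idx."""
    groups = defaultdict(Counter)
    for row, y in zip(rows, labels):
        key = tuple(row[i] for i in feature_idx)
        groups[key][y] += 1
    # For each distinct feature combination, predict its majority label.
    correct = sum(c.most_common(1)[0][1] for c in groups.values())
    return correct / len(labels)


# All four combinations of two binary features; the target is their XOR.
rows = list(product([0, 1], repeat=2))
labels = [a ^ b for a, b in rows]

acc_a = best_accuracy(rows, labels, [0])      # first feature alone
acc_b = best_accuracy(rows, labels, [1])      # second feature alone
acc_ab = best_accuracy(rows, labels, [0, 1])  # both features together
print(acc_a, acc_b, acc_ab)
```

Each feature alone reaches only chance-level accuracy (0.5), while the pair reaches 1.0, which is why a two-tier tree that splits on one feature and then the other can exploit the combination.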


The Improvement of Decision Tree Construction Algorithm Based On Quantum Heuristic Algorithms

arXiv.org Artificial Intelligence

This work concerns the implementation of a decision tree construction algorithm on a quantum simulator. We consider an algorithm based on a binary criterion and study how far it can be improved with the quantum heuristic QAOA. We implemented both the classical and the quantum versions of this algorithm to compare the trees they build.


Intelligent Feature Extraction, Data Fusion and Detection of Concrete Bridge Cracks: Current Development and Challenges

arXiv.org Artificial Intelligence

As a common appearance defect of concrete bridges, cracks are important indices for bridge structure health assessment. Although there has been much research on crack identification, research on the evolution mechanism of bridge cracks is still far from practical applications. In this paper, the state-of-the-art research on intelligent theories and methodologies for intelligent feature extraction, data fusion and crack detection based on data-driven approaches is comprehensively reviewed. The research is discussed from three aspects: the feature extraction level of the multimodal parameters of bridge cracks, the description level and the diagnosis level of the bridge crack damage states. We focus on previous research concerning the quantitative characterization problems of multimodal parameters of bridge cracks and their implementation in crack identification, while highlighting some of their major drawbacks. In addition, the current challenges and potential future research directions are discussed.


DxFormer: A Decoupled Automatic Diagnostic System Based on Decoder-Encoder Transformer with Dense Symptom Representations

arXiv.org Artificial Intelligence

A diagnosis-oriented dialogue system queries the patient's health condition and makes predictions about possible diseases through continuous interaction with the patient. A few studies use reinforcement learning (RL) to learn the optimal policy from the joint action space of symptoms and diseases. However, existing RL (or non-RL) methods cannot achieve sufficiently good prediction accuracy, which remains far from its upper limit. To address the problem, we propose a decoupled automatic diagnostic framework, DxFormer, which divides the diagnosis process into two steps: symptom inquiry and disease diagnosis, where the transition from symptom inquiry to disease diagnosis is explicitly determined by the stopping criteria. In DxFormer, we treat each symptom as a token, and formalize symptom inquiry and disease diagnosis as a language generation model and a sequence classification model, respectively. We use the inverted version of Transformer, i.e., the decoder-encoder structure, to learn the representation of symptoms by jointly optimizing the reinforce reward and cross entropy loss. Extensive experiments on three public real-world datasets show that our proposed model can effectively learn doctors' clinical experience and achieve state-of-the-art results in terms of symptom recall and diagnostic accuracy.


Causal Explanations of Structural Causal Models

arXiv.org Artificial Intelligence

In explanatory interactive learning (XIL) the user queries the learner, then the learner explains its answer to the user and finally the loop repeats. XIL is attractive for two reasons, (1) the learner becomes better and (2) the user's trust increases. For both reasons to hold, the learner's explanations must be useful to the user and the user must be allowed to ask useful questions. Ideally, both questions and explanations should be grounded in a causal model since they avoid spurious fallacies. Ultimately, we seem to seek a causal variant of XIL. The question part on the user's end we believe to be solved since the user's mental model can provide the causal model. But how would the learner provide causal explanations? In this work we show that existing explanation methods are not guaranteed to be causal even when provided with a Structural Causal Model (SCM). Specifically, we use the popular, proclaimed causal explanation method CXPlain to illustrate how the generated explanations leave open the question of truly causal explanations. Thus as a step towards causal XIL, we propose a solution to the lack of causal explanations. We solve this problem by deriving from first principles an explanation method that makes full use of a given SCM, which we refer to as SC$\textbf{E}$ ($\textbf{E}$ standing for explanation). Since SCEs make use of structural information, any causal graph learner can now provide human-readable explanations. We conduct several experiments including a user study with 22 participants to investigate the virtue of SCE as causal explanations of SCMs.