Goto

Collaborating Authors

 Accuracy


Enhancing Road Safety through Accurate Detection of Hazardous Driving Behaviors with Graph Convolutional Recurrent Networks

arXiv.org Artificial Intelligence

Car accidents remain a significant public safety issue worldwide, with the majority of them attributed to driver errors stemming from inadequate driving knowledge, non-compliance with regulations, and poor driving habits. To improve road safety, Driving Behavior Detection (DBD) systems have been proposed in several studies to identify safe and unsafe driving behavior. Many of these studies have utilized sensor data obtained from the Controller Area Network (CAN) bus to construct their models. However, the use of publicly available sensors is known to reduce the accuracy of detection models, while incorporating vendor-specific sensors into the dataset increases accuracy. To address the limitations of existing approaches, we present a reliable DBD system based on Graph Convolutional Long Short-Term Memory Networks (GConvLSTM) that enhances the precision and practicality of DBD models using public sensors. Additionally, we incorporate non-public sensors to evaluate the model's effectiveness. Our proposed model achieved a high accuracy of 97.5\% for public sensors and an average accuracy of 98.1\% for non-public sensors, indicating its consistency and accuracy in both settings. To enable local driver behavior analysis, we deployed our DBD system on a Raspberry Pi at the network edge, with drivers able to access daily driving condition reports, sensor data, and prediction results through a monitoring dashboard. Furthermore, the dashboard issues voice warnings to alert drivers of hazardous driving conditions. Our findings demonstrate that the proposed system can effectively detect hazardous and unsafe driving behavior, with potential applications in improving road safety and reducing the number of accidents caused by driver errors.


A Unified Evaluation Framework for Novelty Detection and Accommodation in NLP with an Instantiation in Authorship Attribution

arXiv.org Artificial Intelligence

State-of-the-art natural language processing models have been shown to achieve remarkable performance in 'closed-world' settings where all the labels in the evaluation set are known at training time. However, in real-world settings, 'novel' instances that do not belong to any known class are often observed. This renders the ability to deal with novelties crucial. To initiate a systematic research in this important area of 'dealing with novelties', we introduce 'NoveltyTask', a multi-stage task to evaluate a system's performance on pipelined novelty 'detection' and 'accommodation' tasks. We provide mathematical formulation of NoveltyTask and instantiate it with the authorship attribution task that pertains to identifying the correct author of a given text. We use Amazon reviews corpus and compile a large dataset (consisting of 250k instances across 200 authors/labels) for NoveltyTask. We conduct comprehensive experiments and explore several baseline methods for the task. Our results show that the methods achieve considerably low performance making the task challenging and leaving sufficient room for improvement. Finally, we believe our work will encourage research in this underexplored area of dealing with novelties, an important step en route to developing robust systems.


Who Needs Decoders? Efficient Estimation of Sequence-level Attributes

arXiv.org Artificial Intelligence

State-of-the-art sequence-to-sequence models often require autoregressive decoding, which can be highly expensive. However, for some downstream tasks such as out-of-distribution (OOD) detection and resource allocation, the actual decoding output is not needed just a scalar attribute of this sequence. In these scenarios, where for example knowing the quality of a system's output to predict poor performance prevails over knowing the output itself, is it possible to bypass the autoregressive decoding? We propose Non-Autoregressive Proxy (NAP) models that can efficiently predict general scalar-valued sequence-level attributes. Importantly, NAPs predict these metrics directly from the encodings, avoiding the expensive autoregressive decoding stage. We consider two sequence-to-sequence task: Machine Translation (MT); and Automatic Speech Recognition (ASR). In OOD for MT, NAPs outperform a deep ensemble while being significantly faster. NAPs are also shown to be able to predict performance metrics such as BERTScore (MT) or word error rate (ASR). For downstream tasks, such as data filtering and resource optimization, NAPs generate performance predictions that outperform predictive uncertainty while being highly inference efficient.


Is AUC the best measure for practical comparison of anomaly detectors?

arXiv.org Artificial Intelligence

Data Mining and Knowledge Discovery manuscript No. (will be inserted by the editor) Is AUC the best measure for practical comparison of anomaly detectors? Abstract The area under receiver operating characteristics (AUC) is the standard measure for comparison of anomaly detectors. Its advantage is in providing a scalar number that allows a natural ordering and is independent on a threshold, which allows to postpone the choice. In this work, we question whether AUC is a good metric for anomaly detection, or if it gives a false sense of comfort, due to relying on assumptions which are unlikely to hold in practice. Our investigation shows that variations of AUC emphasizing accuracy at low false positive rate seem to be better correlated with the needs of practitioners, but also that we can compare anomaly detectors only in the case when we have representative examples of anomalous samples. This last result is disturbing, as it suggests that in many cases, we should do active or few-show learning instead of pure anomaly detection. While this definition makes sense from the point of view of probability theory, the practitioners are interested only in anomalies of a certain kind. For example, the use of anomaly detection in computer security is motivated by the assumption that attacks are rare, but not every rare event is an attack. Anomaly detection has a long history and there are hundreds of models based on vastly different approaches, such as modifications of the k-nearest neighbors algorithm (Harmeling et al., 2006), random forests Figure 1 Receiver operating characteristic (ROC) curve and the corresponding AUC of a degenerate anomaly detector. FPR / TPR stands for false / true positive rate. Several comparative studies exist - e.g.


Provable Identifiability of Two-Layer ReLU Neural Networks via LASSO Regularization

arXiv.org Artificial Intelligence

LASSO regularization is a popular regression tool to enhance the prediction accuracy of statistical models by performing variable selection through the $\ell_1$ penalty, initially formulated for the linear model and its variants. In this paper, the territory of LASSO is extended to two-layer ReLU neural networks, a fashionable and powerful nonlinear regression model. Specifically, given a neural network whose output $y$ depends only on a small subset of input $\boldsymbol{x}$, denoted by $\mathcal{S}^{\star}$, we prove that the LASSO estimator can stably reconstruct the neural network and identify $\mathcal{S}^{\star}$ when the number of samples scales logarithmically with the input dimension. This challenging regime has been well understood for linear models while barely studied for neural networks. Our theory lies in an extended Restricted Isometry Property (RIP)-based analysis framework for two-layer ReLU neural networks, which may be of independent interest to other LASSO or neural network settings. Based on the result, we advocate a neural network-based variable selection method. Experiments on simulated and real-world datasets show promising performance of the variable selection approach compared with existing techniques.


GRAPE for Fast and Scalable Graph Processing and random walk-based Embedding

arXiv.org Artificial Intelligence

Graph Representation Learning (GRL) methods opened new avenues for addressing complex, real-world problems represented by graphs. However, many graphs used in these applications comprise millions of nodes and billions of edges and are beyond the capabilities of current methods and software implementations. We present GRAPE, a software resource for graph processing and embedding that can scale with big graphs by using specialized and smart data structures, algorithms, and a fast parallel implementation of random walk-based methods. Compared with state-of-the-art software resources, GRAPE shows an improvement of orders of magnitude in empirical space and time complexity, as well as a competitive edge and node label prediction performance. GRAPE comprises about 1.7 million well-documented lines of Python and Rust code and provides 69 node embedding methods, 25 inference models, a collection of efficient graph processing utilities and over 80,000 graphs from the literature and other sources. Standardized interfaces allow seamless integration of third-party libraries, while ready-to-use and modular pipelines permit an easy-to-use evaluation of GRL methods, therefore also positioning GRAPE as a software resource to perform a fair comparison between methods and libraries for graph processing and embedding.


Detecting Concept Drift for the reliability prediction of Software Defects using Instance Interpretation

arXiv.org Artificial Intelligence

In the context of Just-In-Time Software Defect Prediction (JIT-SDP), Concept drift (CD) can occur due to changes in the software development process, the complexity of the software, or changes in user behavior that may affect the stability of the JIT-SDP model over time. Additionally, the challenge of class imbalance in JIT-SDP data poses a potential risk to the accuracy of CD detection methods if rebalancing is implemented. This issue has not been explored to the best of our knowledge. Furthermore, methods to check the stability of JIT-SDP models over time by considering labeled evaluation data have been proposed. However, it should be noted that future data labels may not always be available promptly. We aim to develop a reliable JIT-SDP model using CD point detection directly by identifying changes in the interpretation of unlabeled simplified and resampled data. To evaluate our approach, we first obtained baseline methods based on model performance monitoring to identify CD points on labeled data. We then compared the output of the proposed methods with baseline methods based on performance monitoring of threshold-dependent and threshold-independent criteria using well-known performance measures in CD detection methods, such as accuracy, MDR, MTD, MTFA, and MTR. We also utilize the Friedman statistical test to assess the effectiveness of our approach. As a result, our proposed methods show higher compatibility with baseline methods based on threshold-independent criteria when applied to rebalanced data, and with baseline methods based on threshold-dependent criteria when applied to simple data.


Symbolic Regression on FPGAs for Fast Machine Learning Inference

arXiv.org Artificial Intelligence

The high-energy physics community is investigating the feasibility of deploying machine-learning-based solutions on Field-Programmable Gate Arrays (FPGAs) to improve physics sensitivity while meeting data processing latency limitations. In this contribution, we introduce a novel end-to-end procedure that utilizes a machine learning technique called symbolic regression (SR). It searches equation space to discover algebraic relations approximating a dataset. We use PySR (software for uncovering these expressions based on evolutionary algorithm) and extend the functionality of hls4ml (a package for machine learning inference in FPGAs) to support PySR -generated expressions for resource-constrained production environments. Deep learning models often optimise the top metric by pinning the network size because vast hyperparameter space prevents extensive neural architecture search. Conversely, SR selects a set of models on the Pareto front, which allows for optimising the performanceresource tradeoff directly. By embedding symbolic forms, our implementation can dramatically reduce the computational resources needed to perform critical tasks. We validate our procedure on a physics benchmark: multiclass classification of jets produced in simulated proton-proton collisions at the CERN Large Hadron Collider, and show that we approximate a 3-layer neural network with an inference model that has as low as 5 ns execution time (a reduction by a factor of 13) and over 90% approximation accuracy.


A Comprehensive Survey on Enterprise Financial Risk Analysis from Big Data Perspective

arXiv.org Artificial Intelligence

Enterprise financial risk analysis aims at predicting the future financial risk of enterprises. Due to its wide and significant application, enterprise financial risk analysis has always been the core research topic in the fields of Finance and Management. Based on advanced computer science and artificial intelligence technologies, enterprise risk analysis research is experiencing rapid developments and making significant progress. Therefore, it is both necessary and challenging to comprehensively review the relevant studies. Although there are already some valuable and impressive surveys on enterprise risk analysis from the perspective of Finance and Management, these surveys introduce approaches in a relatively isolated way and lack recent advances in enterprise financial risk analysis. In contrast, this paper attempts to provide a systematic literature survey of enterprise risk analysis approaches from Big Data perspective, which reviews more than 250 representative articles in the past almost 50 years (from 1968 to 2023). To the best of our knowledge, this is the first and only survey work on enterprise financial risk from Big Data perspective. Specifically, this survey connects and systematizes the existing enterprise financial risk studies, i.e. to summarize and interpret the problems, methods, and spotlights in a comprehensive way. In particular, we first introduce the issues of enterprise financial risks in terms of their types,granularity, intelligence, and evaluation metrics, and summarize the corresponding representative works. Then, we compare the analysis methods used to learn enterprise financial risk, and finally summarize the spotlights of the most representative works. Our goal is to clarify current cutting-edge research and its possible future directions to model enterprise risk, aiming to fully understand the mechanisms of enterprise risk generation and contagion.


A Review of Benchmarks for Visual Defect Detection in the Manufacturing Industry

arXiv.org Artificial Intelligence

The field of industrial defect detection using machine learning and deep learning is a subject of active research. Datasets, also called benchmarks, are used to compare and assess research results. There is a number of datasets in industrial visual inspection, of varying quality. Thus, it is a difficult task to determine which dataset to use. Generally speaking, datasets which include a testing set, with precise labeling and made in real-world conditions should be preferred. We propose a study of existing benchmarks to compare and expose their characteristics and their use-cases. A study of industrial metrics requirements, as well as testing procedures, will be presented and applied to the studied benchmarks. We discuss our findings by examining the current state of benchmarks for industrial visual inspection, and by exposing guidelines on the usage of benchmarks.