 Performance Analysis


Intersection over Union (IoU) in Object Detection and Segmentation

#artificialintelligence

Intersection over Union (IoU) is a number between 0 and 1 that quantifies the degree of overlap between two regions, such as bounding boxes: the area of their intersection divided by the area of their union. In object detection and segmentation, IoU measures the overlap between the ground-truth region and the predicted region. If you are a computer vision practitioner or even an enthusiast, you have probably come across the term often. It is typically the first checkpoint for evaluating the accuracy of a model. In simple terms, it is a metric that helps us measure the correctness of a prediction. This blog post gives a detailed and intuitive explanation of the metric.
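As a concrete illustration of the overlap ratio described above, here is a minimal Python sketch for two axis-aligned bounding boxes. The function name `box_iou` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for this example and do not come from the original post.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max).

    The box layout is an assumption for this sketch; other conventions
    (e.g. x, y, width, height) need converting first.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle, clamped at 0 when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    # Two degenerate (zero-area) boxes have no meaningful overlap.
    return intersection / union if union > 0 else 0.0


ground_truth = (0, 0, 10, 10)
prediction = (5, 5, 15, 15)
# Intersection is 5 * 5 = 25, union is 100 + 100 - 25 = 175.
print(box_iou(ground_truth, prediction))
```

Identical boxes give 1.0 and disjoint boxes give 0.0, which is why a threshold on IoU (0.5 is a common choice) is often used to decide whether a detection counts as correct.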


Hop-Count Based Self-Supervised Anomaly Detection on Attributed Networks

arXiv.org Artificial Intelligence

Recent years have witnessed an upsurge of interest in the problem of anomaly detection on attributed networks due to its importance in both research and practice. Although various approaches have been proposed to solve this problem, two major limitations exist: (1) unsupervised approaches usually work much less efficiently due to the lack of a supervisory signal, and (2) existing anomaly detection methods only use local contextual information to detect anomalous nodes, e.g., one- or two-hop information, but ignore the global contextual information. Since anomalous nodes differ from normal nodes in structure and attributes, it is intuitive that the distance between anomalous nodes and their neighbors should be larger than that between normal nodes and their neighbors if we remove the edges connecting anomalous and normal nodes. Thus, hop counts based on both global and local contextual information can serve as indicators of anomaly. Motivated by this intuition, we propose a hop-count based model (HCM) to detect anomalies by modeling both local and global contextual information. To make better use of hop counts for anomaly identification, we propose to use hop-count prediction as a self-supervised task. We design two anomaly scores based on the hop-count prediction of the HCM model to identify anomalies. In addition, we employ Bayesian learning to train the HCM model, capturing uncertainty in the learned parameters and avoiding overfitting. Extensive experiments on real-world attributed networks demonstrate that our proposed model is effective in anomaly detection.
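To make the notion of a hop count concrete, the following Python sketch computes the hop count (shortest-path length in edges) from one node to every reachable node with a breadth-first search. It is not the authors' HCM model, which learns to predict hop counts; the function name `hop_counts`, the adjacency-dictionary format, and the toy graph are assumptions for illustration only.

```python
from collections import deque


def hop_counts(adjacency, source):
    """Return {node: hops from source} for every node reachable from source.

    `adjacency` maps each node to an iterable of its neighbors
    (an assumed input format for this sketch).
    """
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in hops:  # first visit is the shortest path in BFS
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


# Hypothetical toy graph: a chain a-b-c-d plus an isolated node e.
graph = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
    "e": [],
}
print(hop_counts(graph, "a"))  # {'a': 0, 'b': 1, 'c': 2, 'd': 3}
```

Unreachable nodes, such as `e` here, are simply absent from the result, which matches the intuition in the abstract that removing edges increases the distance between a node and its former neighbors.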


PhilaeX: Explaining the Failure and Success of AI Models in Malware Detection

arXiv.org Artificial Intelligence

The explanation of an AI model's prediction, when used to support decision making in cyber security, is of critical importance. It is especially so when the model's incorrect prediction can lead to severe damage or even the loss of lives and critical assets. However, most existing AI models lack the ability to provide explanations of their prediction results, despite their strong performance in most scenarios. In this work, we propose a novel explainable AI method, called PhilaeX, that provides a heuristic means to identify the optimized subset of features that forms the complete explanation of an AI model's prediction. It identifies the features that lead to the model's borderline prediction, and those with positive individual contributions are extracted. The feature attributions are then quantified through the optimization of a Ridge regression model. We verify the explanation fidelity through two experiments. First, we assess our method's capability to correctly identify the activated features in adversarial samples of Android malware, using the feature attribution values from PhilaeX. Second, the deduction and augmentation tests are used to assess the fidelity of the explanations. The results show that PhilaeX is able to explain different types of classifiers correctly, with higher-fidelity explanations than state-of-the-art methods such as LIME and SHAP.


Depth image conversion model based on CycleGAN for growing tomato truss identification - Plant Methods

#artificialintelligence

On tomato plants, the flowering truss is a group or cluster of smaller stems where flowers and fruit develop, while the growing truss is the most extended part of the stem. Because the state of the growing truss reacts sensitively to the surrounding environment, it is essential to control its growth in the early stages. With the recent development of information and artificial intelligence technology in agriculture, a previous study developed a real-time acquisition and evaluation method for images using robots. Furthermore, we used image processing to locate the growing truss to extract growth information. Among the different vision algorithms, the CycleGAN algorithm was used to generate and transform unpaired images using generated learning images. In this study, we developed a robot-based system for simultaneously acquiring RGB and depth images of the growing truss of the tomato plant. The segmentation performance for approximately 35 samples was compared via false negative (FN) and false positive (FP) indicators. For the depth camera image, we obtained FN and FP values of 17.55 ± 3.01% and 17.76 ± 3.55%, respectively. For the CycleGAN algorithm, we obtained FN and FP values of 19.24 ± 1.45% and 18.24 ± 1.54%, respectively. When segmentation was performed via image processing using the depth image and CycleGAN, the mean intersection over union (mIoU) was 63.56 ± 8.44% and 69.25 ± 4.42%, respectively, indicating that the CycleGAN algorithm can identify the desired growing truss of the tomato plant with high precision. The on-site feasibility of the image extraction technique using CycleGAN was confirmed when the image scanning robot drove in a straight line through a tomato greenhouse. In the future, the proposed approach is expected to be used in vision technology to scan tomato growth indicators in greenhouses using an unmanned robot platform.
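The mIoU figures above summarize how well predicted segmentation masks match reference masks. The Python sketch below shows one common way to compute such a score, assuming binary masks stored as nested lists and assuming mIoU means the average of per-sample IoU values. The abstract does not state the exact definition used in the study, so the function names, the mask format, and the empty-mask convention are all assumptions for illustration.

```python
def mask_iou(pred, truth):
    """IoU of two binary masks given as equally sized lists of rows of 0/1."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def mean_iou(mask_pairs):
    """Average IoU over (prediction, ground truth) mask pairs."""
    scores = [mask_iou(pred, truth) for pred, truth in mask_pairs]
    return sum(scores) / len(scores)


# Hypothetical 2x2 masks, not data from the study.
pairs = [
    ([[1, 1], [0, 0]], [[1, 0], [1, 0]]),  # 1 shared pixel, 3 in union
    ([[1, 0], [0, 1]], [[1, 0], [0, 1]]),  # identical masks
]
print(mean_iou(pairs))  # (1/3 + 1) / 2
```

Reporting the mean together with its spread across samples, as the abstract does with the ± values, shows how consistent the segmentation is and not just how good it is on average.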


Prediction of Dilatory Behavior in eLearning: A Comparison of Multiple Machine Learning Models

arXiv.org Machine Learning

Procrastination, the irrational delay of tasks, is a common occurrence in online learning. Potential negative consequences include a higher risk of drop-outs, increased stress, and reduced mood. Due to the rise of learning management systems and learning analytics, indicators of such behavior can be detected, enabling predictions of future procrastination and other dilatory behavior. However, research focusing on such predictions is scarce. Moreover, studies involving different types of predictors and comparisons between the predictive performance of various methods are virtually non-existent. In this study, we aim to fill these research gaps by analyzing the performance of multiple machine learning algorithms when predicting the delayed or timely submission of online assignments in a higher education setting with two categories of predictors: subjective, questionnaire-based variables and objective, log-data based indicators extracted from a learning management system. The results show that models with objective predictors consistently outperform models with subjective predictors, and a combination of both variable types performs slightly better. For each of these three options, a different approach prevailed (Gradient Boosting Machines for the subjective, Bayesian multilevel models for the objective, and Random Forest for the combined predictors). We conclude that careful attention should be paid to the selection of predictors and algorithms before implementing such models in learning management systems.


Text classification for online conversations with machine learning on AWS

#artificialintelligence

Online conversations are ubiquitous in modern life, spanning industries from video games to telecommunications. This has led to an exponential growth in the amount of online conversation data, which has helped in the development of state-of-the-art natural language processing (NLP) systems like chatbots and natural language generation (NLG) models. Over time, various NLP techniques for text analysis have also evolved. This creates a need for a fully managed service that can be integrated into applications using API calls without extensive machine learning (ML) expertise. AWS offers pre-trained AI services like Amazon Comprehend, which can effectively handle NLP use cases involving classification, text summarization, entity recognition, and more to gather insights from text.


Congratulations to the winners of the FAccT2022 distinguished paper awards!

AIHub

It is therefore critical that we question vague conceptions of the field as value-neutral or universally beneficial, and investigate what specific values the field is advancing. In this paper, we present a rigorous examination of the values of the field by quantitatively and qualitatively analyzing 100 highly cited ML papers published at premier ML conferences, ICML and NeurIPS. We annotate key features of papers which reveal their values: how they justify their choice of project, which aspects they uplift, their consideration of potential negative consequences, and their institutional affiliations and funding sources. We find that societal needs are typically very loosely connected to the choice of project, if mentioned at all, and that consideration of negative consequences is extremely rare. We identify 67 values that are uplifted in machine learning research, and, of these, we find that papers most frequently justify and assess themselves based on performance, generalization, efficiency, researcher understanding, novelty, and building on previous work. We present extensive textual evidence and analysis of how these values are operationalized. Notably, we find that each of these top values is currently being defined and applied with assumptions and implications generally supporting the centralization of power. Finally, we find increasingly close ties between these highly cited papers and tech companies and elite universities.


3D Machine Learning 201 Guide: Point Cloud Semantic Segmentation

#artificialintelligence

Having the skills and the knowledge to attack every aspect of point cloud processing opens up many ideas and development doors. It is like a toolbox for 3D research creativity and development agility. And at the core, there is this incredible Artificial Intelligence space that targets 3D scene understanding. It is particularly relevant due to its importance for many applications, such as self-driving cars, autonomous robots, 3D mapping, virtual reality, and the Metaverse. And if you are an automation geek like me, it is hard to resist the temptation to have new paths to answer these challenges! This tutorial aims to give you what I consider the essential footing to do just that: the knowledge and code skills for developing 3D Point Cloud Semantic Segmentation systems. But actually, how can we apply semantic segmentation? And how challenging is 3D Machine Learning? Let me present a clear, in-depth 201 hands-on course focused on 3D Machine Learning.


Towards a Data-Driven Requirements Engineering Approach: Automatic Analysis of User Reviews

arXiv.org Artificial Intelligence

We are concerned with data-driven requirements engineering, and in particular with the consideration of user reviews. These online reviews are a rich source of information for extracting new needs and improvement requests. In this work, we provide an automated analysis using CamemBERT, a state-of-the-art language model for French. We created a multi-label classification dataset of 6000 user reviews from three applications in the Health & Fitness field. The results are encouraging and suggest that it is possible to automatically identify the reviews that contain requests for new features. The dataset is available at: https://github.com/Jl-wei/APIA2022-French-user-reviews-classification-dataset.


How to Build an Online Machine Learning App With Python

#artificialintelligence

Machine learning is rapidly becoming as ubiquitous as data itself. Wherever there is an abundance of data, machine learning is somehow intertwined. After all, what utility would data have if we were not able to use it to predict something about the future? Luckily, there is a plethora of toolkits and frameworks that have made it rather simple to deploy ML in Python. Specifically, scikit-learn (sklearn) has done a terrifically effective job of making ML accessible to developers.