AITopics

2111.09076

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Regacho, Lord Christian Carl H., Matsushita, Ai, Ceniza-Canillo, Angie M.

Automated Identification of Disaster News For Crisis Management Using Machine Learning

arXiv.org Artificial Intelligence Jan-24-2023

Many news sources, including fake news outlets, covered Typhoon Rai (known locally as Typhoon Odette). This study addresses the issue by building a model that distinguishes between legitimate and illegitimate news articles. To this end, we chose the following machine learning algorithms: Logistic Regression, Random Forest, and Multinomial Naive Bayes. Bag of Words (BOW), TF-IDF, and lemmatization were implemented in the model. The models were trained and tested on 160 datasets gathered from legitimate and illegitimate sources. By combining all the machine learning techniques, the Combined BOW model reached an accuracy of 91.07%, precision of 88.33%, recall of 94.64%, and F1 score of 91.38%, and the Combined TF-IDF model reached an accuracy of 91.18%, precision of 86.89%, recall of 94.64%, and F1 score of 90.60%.
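The reported F1 scores can be checked against the reported precision and recall, since F1 is the harmonic mean of the two. The sketch below is only a consistency check on the figures quoted in the abstract; the function name `f1_score` is our own and does not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract, converted to fractions.
bow_f1 = f1_score(0.8833, 0.9464)
tfidf_f1 = f1_score(0.8689, 0.9464)

print(f"Combined BOW F1:    {bow_f1 * 100:.2f}%")
print(f"Combined TF-IDF F1: {tfidf_f1 * 100:.2f}%")
```

Both computed values agree with the F1 scores in the abstract to two decimal places (91.38% and 90.60%).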

information retrieval, machine learning, news article, (19 more...)

2301.09896

Country:

Asia > Philippines > Visayas > Central Visayas > Province of Cebu > City of Cebu (0.05)
Europe > Andorra > Canillo > Canillo (0.05)
North America > Canada (0.04)
(2 more...)

Genre: Research Report > New Finding (0.39)

Industry: Media > News (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.57)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.53)
(3 more...)

#artificialintelligence Jan-23-2023, 15:08:08 GMT

From Data Collection to Model Deployment: 6 Stages of a Data Science Project - KDnuggets

Additionally, chances are you won't be working with a single dataset, so merging data is also a common operation. Extracting meaningful information from data becomes easier if you visualize it. In Python, there are many libraries you can use to visualize your data. You should use this stage to detect outliers and correlated predictors. If undetected, they will decrease your machine learning model's performance.
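As a rough illustration of the outlier and correlated-predictor checks mentioned above, here is a minimal sketch using only the Python standard library. The article does not prescribe a method; Tukey's IQR rule, the 0.9 correlation threshold, and the toy data are all assumptions chosen for the example.

```python
import itertools
import math
import statistics


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical toy data: one column with an obvious outlier ...
response_times = [10, 12, 11, 13, 12, 11, 95]
outliers = iqr_outliers(response_times)

# ... and three predictors, two of which measure the same quantity.
predictors = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [30.0, 22.0, 41.0, 27.0, 35.0],
}
correlated_pairs = [
    (a, b)
    for a, b in itertools.combinations(predictors, 2)
    if abs(pearson(predictors[a], predictors[b])) > 0.9  # assumed threshold
]

print(outliers)
print(correlated_pairs)
```

In practice a dataframe library would usually handle both steps, but the logic is the same: flag extreme values per column, then flag pairs of predictors whose correlation is close to 1 in absolute value.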

data mining, information, machine learning, (19 more...)

#artificialintelligence

Industry: Education (0.47)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

#artificialintelligence Jan-23-2023, 12:45:12 GMT

Biased AI, a Look Under the Hood. What exactly is going on in AI systems…

To better understand the background to this problem, let us first introduce some fundamentals of machine learning. Compared with traditional programming, one major difference is that the reasoning behind the algorithm's decision-making is not defined by hard-coded rules explicitly programmed by a human, but is instead learned from example data: thousands, sometimes millions, of parameters are optimised without human intervention to capture a generalised pattern in the data. The resulting model can then make predictions on new, unseen data with high accuracy. To illustrate the concept, let's consider a sample scenario about fraud detection in insurance claims. Verifying the legitimacy of an insurance claim is essential to prevent abuse.

artificial intelligence, fraudulent claim, machine learning, (16 more...)

#artificialintelligence

Industry: Banking & Finance > Insurance (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

A Survey on Actionable Knowledge

Arefin, Sayed Erfan

Actionable Knowledge Discovery (AKD) is a crucial aspect of data mining that is gaining popularity and being applied in a wide range of domains. This is because AKD can extract valuable insights and information, also known as knowledge, from large datasets. The goal of this paper is to examine different research studies that focus on various domains and have different objectives. The paper will review and discuss the methods used in these studies in detail. AKD is a process of identifying and extracting actionable insights from data, which can be used to make informed decisions and improve business outcomes. It is a powerful tool for uncovering patterns and trends in data that can be used for various applications such as customer relationship management, marketing, and fraud detection. The research studies reviewed in this paper will explore different techniques and approaches for AKD in different domains, such as healthcare, finance, and telecommunications. The paper will provide a thorough analysis of the current state of AKD in the field and will review the main methods used by various research studies. Additionally, the paper will evaluate the advantages and disadvantages of each method and will discuss any novel solutions presented in the field. Overall, this paper aims to provide a comprehensive overview of the methods and techniques used in AKD and the impact they have on different domains.

knowledge management, machine learning, pattern recognition, (18 more...)

2301.09317

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada (0.04)
Asia > Taiwan (0.04)
(7 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Leisure & Entertainment > Games (1.00)
Government (1.00)
Information Technology > Services (0.68)
(2 more...)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Enterprise Applications (1.00)
Information Technology > Data Science > Data Mining (1.00)
(9 more...)

Franks, Billy Joe, Dinkelmann, Benjamin, Fellenz, Sophie, Kloft, Marius

Ordinal Regression for Difficulty Estimation of StepMania Levels

StepMania is a popular open-source clone of a rhythm-based video game. As is common in popular games, there is a large number of community-designed levels. It is often difficult for players and level authors to determine the difficulty level of such community contributions. In this work, we formalize and analyze the difficulty prediction task on StepMania levels as an ordinal regression (OR) task. We standardize a more extensive and diverse selection of this data resulting in five data sets, two of which are extensions of previous work. We evaluate many competitive OR and non-OR models, demonstrating that neural network-based models significantly outperform the state of the art and that StepMania-level data makes for an excellent test bed for deep OR models. We conclude with a user experiment showing our trained models' superiority over human labeling.

artificial intelligence, gpop gull speirmix pattern 0, machine learning, (13 more...)

2301.09485

Country:

Europe > Netherlands > Limburg > Maastricht (0.04)
Europe > Germany > Rhineland-Palatinate > Landau (0.04)
Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Keller, Piotr, Dawood, Muhammad, Minhas, Fayyaz ul Amir Afsar

Maximum Mean Discrepancy Kernels for Predictive and Prognostic Modeling of Whole Slide Images

How similar are two images? In computational pathology, where Whole Slide Images (WSIs) of digitally scanned tissue samples from patients can be multi-gigapixels in size, determination of degree of similarity between two WSIs is a challenging task with a number of practical applications. In this work, we explore a novel strategy based on kernelized Maximum Mean Discrepancy (MMD) analysis for determination of pairwise similarity between WSIs. The proposed approach works by calculating MMD between two WSIs using kernels over deep features of image patches. This allows representation of an entire dataset of WSIs as a kernel matrix for WSI level clustering, weakly-supervised prediction of TP-53 mutation status in breast cancer patients from their routine WSIs as well as survival analysis with state of the art prediction performance. We believe that this work will open up further avenues for application of WSI-level kernels for predictive and prognostic tasks in computational pathology.
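To make the idea of an MMD-based similarity between two sets of patch features concrete, the following sketch computes a biased estimate of the squared MMD with an RBF kernel. It is an illustration under stated assumptions, not the paper's implementation: the RBF kernel, the `gamma` value, the biased estimator, and the two-dimensional toy "features" are all our own choices.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """RBF (Gaussian) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between two sets of feature vectors."""
    return (
        mean_kernel(xs, xs, gamma)
        + mean_kernel(ys, ys, gamma)
        - 2 * mean_kernel(xs, ys, gamma)
    )


# Hypothetical patch features for two slides (real deep features would be
# much higher-dimensional and far more numerous).
slide_a = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]
slide_b = [[3.0, 3.0], [3.1, 3.0], [3.0, 3.1]]

same = mmd_squared(slide_a, slide_a)
different = mmd_squared(slide_a, slide_b)
print(same, different)
```

A set compared with itself gives a value of (numerically) zero, while sets drawn from clearly different regions of feature space give a larger value. Evaluating such a quantity for every pair of slides is one way a slide-level kernel matrix could be assembled.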

artificial intelligence, data mining, machine learning, (16 more...)

2301.09624

Country:

Europe > United Kingdom (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > Experimental Study (0.94)

Industry: Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.36)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.89)

Sharan, Lalith, Kelm, Halvar, Romano, Gabriele, Karck, Matthias, De Simone, Raffaele, Engelhardt, Sandy

mvHOTA: A multi-view higher order tracking accuracy metric to measure spatial and temporal associations in multi-point detection

Multi-point tracking is a challenging task that involves detecting points in the scene and tracking them across a sequence of frames. Computing detection-based measures like the F-measure on a frame-by-frame basis is not sufficient to assess the overall performance, as it does not interpret performance in the temporal domain. The main evaluation metric available comes from Multi-object tracking (MOT) methods to benchmark performance on datasets such as KITTI with the recently proposed higher order tracking accuracy (HOTA) metric, which is capable of providing a better description of the performance over metrics such as MOTA, DetA, and IDF1. While the HOTA metric takes into account temporal associations, it does not provide a tailored means to analyse the spatial associations of a dataset in a multi-camera setup. Moreover, there are differences in evaluating the detection task for points when compared to objects (point distances vs. bounding box overlap). Therefore, in this work, we propose a multi-view higher order tracking metric (mvHOTA) to determine the accuracy of multi-point (multi-instance and multi-class) tracking methods, while taking into account temporal and spatial associations. mvHOTA can be interpreted as the geometric mean of detection, temporal, and spatial associations, thereby providing equal weighting to each of the factors. We demonstrate the use of this metric to evaluate the tracking performance on an endoscopic point detection dataset from a previously organised surgical data science challenge. Furthermore, we compare with other adjusted MOT metrics for this use-case, discuss the properties of mvHOTA, and show how the proposed multi-view Association and the Occlusion index (OI) facilitate analysis of methods with respect to handling of occlusions. The code is available at https://github.com/Cardio-AI/mvhota.
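The geometric-mean interpretation mentioned in the abstract can be illustrated with a short sketch. This is not the reference implementation from the linked repository: the function name, the argument names, and the assumption that each component is already a single score in [0, 1] are ours, and the sketch ignores how the component scores are actually computed.

```python
def geometric_mean_score(detection: float, temporal_assoc: float, spatial_assoc: float) -> float:
    """Geometric mean of three component scores, each assumed to lie in [0, 1]."""
    scores = (detection, temporal_assoc, spatial_assoc)
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("each component score must lie in [0, 1]")
    return (detection * temporal_assoc * spatial_assoc) ** (1.0 / 3.0)


# Hypothetical component scores for two tracking methods.
balanced = geometric_mean_score(0.8, 0.8, 0.8)
lopsided = geometric_mean_score(1.0, 1.0, 0.0)
print(balanced, lopsided)
```

Because the three factors enter symmetrically, each is weighted equally, and a method that fails completely on one component scores zero overall regardless of the other two.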

artificial intelligence, machine learning, spatial association, (18 more...)

doi: 10.1080/21681163.2022.2159535

2206.09372

Country:

Europe > Germany > Hesse > Darmstadt Region > Wiesbaden (0.05)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Germany > Bremen > Bremen (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Education > Assessment & Standards (0.76)
Health & Medicine > Diagnostic Medicine > Imaging (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

Vouk, Boštjan, Guid, Matej, Robnik-Šikonja, Marko

Feature construction using explanations of individual predictions

Feature construction can contribute to comprehensibility and performance of machine learning models. Unfortunately, it usually requires exhaustive search in the attribute space or time-consuming human involvement to generate meaningful features. We propose a novel heuristic approach for reducing the search space based on aggregation of instance-based explanations of predictive models. The proposed Explainable Feature Construction (EFC) methodology identifies groups of co-occurring attributes exposed by popular explanation methods, such as IME and SHAP. We empirically show that reducing the search to these groups significantly reduces the time of feature construction using logical, relational, Cartesian, numerical, and threshold num-of-N and X-of-N constructive operators. An analysis on 10 transparent synthetic datasets shows that EFC effectively identifies informative groups of attributes and constructs relevant features. Using 30 real-world classification datasets, we show significant improvements in classification accuracy for several classifiers and demonstrate the feasibility of the proposed feature construction even for large datasets. Finally, EFC generated interpretable features on a real-world problem from the financial industry, which were confirmed by a domain expert.

artificial intelligence, data mining, machine learning, (21 more...)

doi: 10.1016/j.engappai.2023.105823

2301.09631

Country:

Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > United Kingdom > England (0.04)
Asia > Vietnam (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.67)

Industry:

Education (0.92)
Health & Medicine > Therapeutic Area > Oncology (0.68)
Banking & Finance > Credit (0.68)
(2 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.87)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.69)
(5 more...)

Kumari, Kavita, Rieger, Phillip, Fereidooni, Hossein, Jadliwala, Murtuza, Sadeghi, Ahmad-Reza

BayBFed: Bayesian Backdoor Defense for Federated Learning

Federated learning (FL) allows participants to jointly train a machine learning model without sharing their private data with others. However, FL is vulnerable to poisoning attacks such as backdoor attacks. Consequently, a variety of defenses have recently been proposed, which have primarily utilized intermediary states of the global model (i.e., logits) or distance of the local models (i.e., L2-norm) from the global model to detect malicious backdoors. However, as these approaches directly operate on client updates, their effectiveness depends on factors such as clients' data distribution or the adversary's attack strategies. In this paper, we introduce a novel and more generic backdoor defense framework, called BayBFed, which proposes to utilize probability distributions over client updates to detect malicious updates in FL: it computes a probabilistic measure over the clients' updates to keep track of any adjustments made in the updates, and uses a novel detection algorithm that can leverage this probabilistic measure to efficiently detect and filter out malicious updates. Thus, it overcomes the shortcomings of previous approaches that arise due to the direct usage of client updates, as our probabilistic measure will include all aspects of the local client training strategies. BayBFed utilizes two Bayesian Non-Parametric extensions: (i) a Hierarchical Beta-Bernoulli process to draw a probabilistic measure given the clients' updates, and (ii) an adaptation of the Chinese Restaurant Process (CRP), which we refer to as CRP-Jensen, which leverages this probabilistic measure to detect and filter out malicious updates. We extensively evaluate our defense approach on five benchmark datasets: CIFAR10, Reddit, IoT intrusion detection, MNIST, and FMNIST, and show that it can effectively detect and eliminate malicious updates in FL without deteriorating the benign performance of the global model.
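For readers unfamiliar with the Chinese Restaurant Process, the sketch below samples from the standard CRP: each new customer joins an existing table with probability proportional to its size, or opens a new table with probability proportional to a concentration parameter `alpha`. This is the textbook process only, not the CRP-Jensen adaptation described in the abstract; the function name, `alpha` value, and seed are assumptions for illustration.

```python
import random


def chinese_restaurant_process(n_customers: int, alpha: float, seed: int = 0):
    """Sample table assignments from a standard CRP with concentration alpha > 0."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rng = random.Random(seed)
    table_sizes = []   # table_sizes[k] = number of customers at table k
    assignments = []   # assignments[i] = table index chosen by customer i
    for _ in range(n_customers):
        # Existing tables are weighted by their size; a new table by alpha.
        weights = table_sizes + [alpha]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_sizes):
            table_sizes.append(1)      # open a new table
        else:
            table_sizes[table] += 1    # join an existing table
        assignments.append(table)
    return assignments, table_sizes


assignments, table_sizes = chinese_restaurant_process(n_customers=50, alpha=1.0)
print(table_sizes)
```

The number of tables is not fixed in advance, which is what makes the process attractive for grouping items, such as client updates, when the number of groups is unknown.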

artificial intelligence, deep learning, machine learning, (19 more...)

2301.09508

Country:

North America > United States > Texas (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Virginia (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)