AITopics

2011.01183

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > Florida > Hillsborough County > University (0.04)
Europe > Germany (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Everett, Derek, Nguyen, Andre T., Richards, Luke E., Raff, Edward

Improving Out-of-Distribution Detection via Epistemic Uncertainty Adversarial Training

arXiv.org Artificial IntelligenceSep-9-2022

The quantification of uncertainty is important for the adoption of machine learning, especially to reject out-of-distribution (OOD) data back to human experts for review. Yet progress has been slow, as a balance must be struck between computational efficiency and the quality of uncertainty estimates. For this reason many use deep ensembles of neural networks or Monte Carlo dropout for reasonable uncertainty estimates at relatively minimal compute and memory. Surprisingly, when we focus on the real-world applicable constraint of $\leq 1\%$ false positive rate (FPR), prior methods fail to reliably detect OOD samples as such. Notably, even Gaussian random noise fails to trigger these popular OOD techniques. We help to alleviate this problem by devising a simple adversarial training scheme that incorporates an attack of the epistemic uncertainty predicted by the dropout ensemble. We demonstrate this method improves OOD detection performance on standard data (i.e., not adversarially crafted), and improves the standardized partial AUC from near-random guessing performance to $\geq 0.75$.

ensemble, epistemic uncertainty, experiment, (15 more...)

2209.03148

Country:

North America > United States > Maryland > Baltimore County (0.04)
North America > United States > Maryland > Baltimore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

arXiv.org Artificial IntelligenceSep-8-2022

Patient-specific modelling, simulation and real-time processing for respiratory diseases

Nousias, Stavros

Asthma is a common chronic disease of the respiratory system causing significant disability and societal burden. It affects more than 300 million people worldwide, while more than 100 million people will likely have asthma by 2025. The price of asthma varies greatly from nation to nation. Mean yearly cost can be estimated to 1900 EUR in Europe and $3100 in the United States. Managing asthma involves controlling symptoms, preventing exacerbations, and maintaining lung function. Improved asthma control is reduces the risk of exacerbations and lung function impairment while reducing the direct costs of asthma care and indirect costs associated with reduced productivity. Understanding the complex dynamics of the pulmonary system and the lung's response to disease is fundamental to the advancement of Asthma treatment. Computational models of the respiratory system seek to provide a theoretical framework to understand the interaction between structure and function. Their application can improve pulmonary medicine by a patient-specific approach to medicinal methodologies optimizing the delivery given the personalized geometry and personalized ventilation patterns. A three-fold objective is addressed within this dissertation. The first part refers to the comprehension of pulmonary pathophysiology and the mechanics of Asthma and subsequently of constrictive pulmonary conditions in general. The second part refers to the design and implementation of tools that facilitate personalized medicine to improve delivery and effectiveness. Finally, the third part refers to the self-management of the condition, meaning that medical personnel and patients have access to tools and methods that allow the first party to easily track the course of the condition and the second party, i.e. the patient to easily self-manage it alleviating the significant burden from the health system.

data mining, environmental parameter and medication monitoring, machine learning, (22 more...)

2207.01082

Country:

North America > United States (1.00)
Europe (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.92)
Research Report > Promising Solution (0.67)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases > Asthma (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology (1.00)
(4 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
(11 more...)

Juarez, Marc, Yeom, Samuel, Fredrikson, Matt

Black-Box Audits for Group Distribution Shifts

arXiv.org Artificial IntelligenceSep-8-2022

When a model informs decisions about people, distribution shifts can create undue disparities. However, it is hard for external entities to check for distribution shift, as the model and its training set are often proprietary. In this paper, we introduce and study a black-box auditing method to detect cases of distribution shift that lead to a performance disparity of the model across demographic groups. By extending techniques used in membership and property inference attacks -- which are designed to expose private information from learned models -- we demonstrate that an external auditor can gain the information needed to identify these distribution shifts solely by querying the model. Our experimental results on real-world datasets show that this approach is effective, achieving 80--100% AUC-ROC in detecting shifts involving the underrepresentation of a demographic group in the training set. Researchers and investigative journalists can use our tools to perform non-collaborative audits of proprietary models and expose cases of underrepresentation in the training datasets.

attack model, audit, target model, (16 more...)

2209.0362

Country:

North America > United States > California (0.14)
North America > United States > Michigan (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (0.93)
Transportation > Air (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Olukoga, Toluwalase A., Feng, Yin

A Case Study on the Classification of Lost Circulation Events During Drilling using Machine Learning Techniques on an Imbalanced Large Dataset

This study presents machine learning models that forecast and categorize lost circulation severity preemptively using a large class imbalanced drilling dataset. We demonstrate reproducible core techniques involved in tackling a large drilling engineering challenge utilizing easily interpretable machine learning approaches. We utilized a 65,000+ records data with class imbalance problem from Azadegan oilfield formations in Iran. Eleven of the dataset's seventeen parameters are chosen to be used in the classification of five lost circulation events. To generate classification models, we used six basic machine learning algorithms and four ensemble learning methods. Linear Discriminant Analysis (LDA), Logistic Regression (LR), Support Vector Machines (SVM), Classification and Regression Trees (CART), k-Nearest Neighbors (KNN), and Gaussian Naive Bayes (GNB) are the six fundamental techniques. We also used bagging and boosting ensemble learning techniques in the investigation of solutions for improved predicting performance. The performance of these algorithms is measured using four metrics: accuracy, precision, recall, and F1-score. The F1-score weighted to represent the data imbalance is chosen as the preferred evaluation criterion. The CART model was found to be the best in class for identifying drilling fluid circulation loss events with an average weighted F1-score of 0.9904 and standard deviation of 0.0015. Upon application of ensemble learning techniques, a Random Forest ensemble of decision trees showed the best predictive performance. It identified and classified lost circulation events with a perfect weighted F1-score of 1.0. Using Permutation Feature Importance (PFI), the measured depth was found to be the most influential factor in accurately recognizing lost circulation events while drilling.

artificial intelligence, dataset, machine learning, (17 more...)

2209.01607

Country: Asia > Middle East > Iran (0.54)

Genre: Research Report > New Finding (0.68)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Siddharth, Thiru, Kirar, Bhupendra Singh, Agrawal, Dheeraj Kumar

Plant Species Classification Using Transfer Learning by Pretrained Classifier VGG-19

arXiv.org Machine LearningSep-7-2022

Deep learning is currently the most important branch of machine learning, with applications in speech recognition, computer vision, image classification, and medical imaging analysis. Plant recognition is one of the areas where image classification can be used to identify plant species through their leaves. Botanists devote a significant amount of time to recognizing plant species by personally inspecting. This paper describes a method for dissecting color images of Swedish leaves and identifying plant species. To achieve higher accuracy, the task is completed using transfer learning with the help of pre-trained classifier VGG-19. The four primary processes of classification are image preprocessing, image augmentation, feature extraction, and recognition, which are performed as part of the overall model evaluation. The VGG-19 classifier grasps the characteristics of leaves by employing pre-defined hidden layers such as convolutional layers, max pooling layers, and fully connected layers, and finally uses the soft-max layer to generate a feature representation for all plant classes. The model obtains knowledge connected to aspects of the Swedish leaf dataset, which contains fifteen tree classes, and aids in predicting the proper class of an unknown plant with an accuracy of 99.70% which is higher than previous research works reported.

accuracy, artificial intelligence, machine learning, (19 more...)

arXiv.org Machine Learning

2209.03076

Country:

Asia > India > Madhya Pradesh > Bhopal (0.06)
Asia > Singapore (0.04)
Europe > Sweden > Östergötland County > Linköping (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area (0.86)
Health & Medicine > Diagnostic Medicine > Imaging (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

A Machine Learning Analysis of Impact of the Covid-19 Pandemic on Alcohol Consumption Habit Changes Among Healthcare Workers in the U.S

Rezapour, Mostafa

In this paper, we discuss the impact of the Covid-19 pandemic on alcohol consumption habit changes among healthcare workers in the United States. We utilize multiple supervised and unsupervised machine learning methods and models such as Decision Trees, Logistic Regression, Naive Bayes classifier, k-Nearest Neighbors, Support Vector Machines, Multilayer perceptron, XGBoost, CatBoost, LightGBM, Chi-Squared Test and mutual information method on a mental health survey data obtained from the University of Michigan Inter-University Consortium for Political and Social Research to find out relationships between COVID-19 related negative effects and alcohol consumption habit changes among healthcare workers. Our findings suggest that COVID-19-related school closures, COVID-19-related work schedule changes and COVID-related news exposure may lead to an increase in alcohol use among healthcare workers in the United States.

alcohol consumption habit change, covid-19 pandemic, healthcare worker, (11 more...)

2112.06261

Country:

North America > United States > North Carolina > Forsyth County > Winston-Salem (0.14)
Asia > China > Hubei Province > Wuhan (0.05)
North America > United States > Ohio (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.55)
(2 more...)

Kolluri, Aashish, Baluta, Teodora, Hooi, Bryan, Saxena, Prateek

LPGNet: Link Private Graph Networks for Node Classification

Classification tasks on labeled graph-structured data have many important applications ranging from social recommendation to financial modeling. Deep neural networks are increasingly being used for node classification on graphs, wherein nodes with similar features have to be given the same label. Graph convolutional networks (GCNs) are one such widely studied neural network architecture that perform well on this task. However, powerful link-stealing attacks on GCNs have recently shown that even with black-box access to the trained model, inferring which links (or edges) are present in the training graph is practical. In this paper, we present a new neural network architecture called LPGNet for training on graphs with privacy-sensitive edges. LPGNet provides differential privacy (DP) guarantees for edges using a novel design for how graph edge structure is used during training. We empirically show that LPGNet models often lie in the sweet spot between providing privacy and utility: They can offer better utility than "trivially" private architectures which use no edge information (e.g., vanilla MLPs) and better resilience against existing link-stealing attacks than vanilla GCNs which use the full edge structure. LPGNet also offers consistently better privacy-utility tradeoffs than DPGCN, which is the state-of-the-art mechanism for retrofitting differential privacy into conventional GCNs, in most of our evaluated datasets.

graph, lpgnet, node, (13 more...)

2205.03105

Country:

North America > United States > District of Columbia > Washington (0.05)
Europe > Spain (0.04)
Asia > Singapore > Central Region > Singapore (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report (0.81)

Industry:

Information Technology > Services (1.00)
Information Technology > Security & Privacy (1.00)
Media (0.67)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Naseri, Mohammad, Han, Yufei, Mariconti, Enrico, Shen, Yun, Stringhini, Gianluca, De Cristofaro, Emiliano

Cerberus: Exploring Federated Prediction of Security Events

Modern defenses against cyberattacks increasingly rely on proactive approaches, e.g., to predict the adversary's next actions based on past events. Building accurate prediction models requires knowledge from many organizations; alas, this entails disclosing sensitive information, such as network structures, security postures, and policies, which might often be undesirable or outright impossible. In this paper, we explore the feasibility of using Federated Learning (FL) to predict future security events. To this end, we introduce Cerberus, a system enabling collaborative training of Recurrent Neural Network (RNN) models for participating organizations. The intuition is that FL could potentially offer a middle-ground between the non-private approach where the training data is pooled at a central server and the low-utility alternative of only training local models. We instantiate Cerberus on a dataset obtained from a major security company's intrusion prevention product and evaluate it vis-a-vis utility, robustness, and privacy, as well as how participants contribute to and benefit from the system. Overall, our work sheds light on both the positive aspects and the challenges of using FL for this task and paves the way for deploying federated approaches to predictive security.

cerberus, participant, security event, (13 more...)

2209.0305

Country:

North America > United States > Virginia (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.34)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Andres, Georges, Casiraghi, Giona, Vaccario, Giacomo, Schweitzer, Frank

Reconstructing signed relations from interaction data

Positive and negative relations play an essential role in human behavior and shape the communities we live in. Despite their importance, data about signed relations is rare and commonly gathered through surveys. Interaction data is more abundant, for instance, in the form of proximity or communication data. So far, though, it could not be utilized to detect signed relations. In this paper, we show how the underlying signed relations can be extracted with such data. Employing a statistical network approach, we construct networks of signed relations in four communities. We then show that these relations correspond to the ones reported in surveys. Additionally, the inferred relations allow us to study the homophily of individuals with respect to gender, religious beliefs, and financial backgrounds. We evaluate the importance of triads in the signed network to study group cohesion.

dataset, interaction, relation, (15 more...)

2209.03219

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > France (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
(3 more...)