Calgary
On the Robustness of Random Forest Against Untargeted Data Poisoning: An Ensemble-Based Approach
Anisetti, Marco, Ardagna, Claudio A., Balestrucci, Alessandro, Bena, Nicola, Damiani, Ernesto, Yeun, Chan Yeob
Machine learning is becoming ubiquitous. From finance to medicine, machine learning models are boosting decision-making processes and even outperforming humans in some tasks. This huge progress in terms of prediction quality does not, however, find a counterpart in the security of such models and corresponding predictions, where perturbations of fractions of the training set (poisoning) can seriously undermine the model accuracy. Research on poisoning attacks and defenses has received increasing attention in the last decade, leading to several promising solutions aiming to increase the robustness of machine learning. Among them, ensemble-based defenses, where different models are trained on portions of the training set and their predictions are then aggregated, provide strong theoretical guarantees at the price of a linear overhead. Surprisingly, ensemble-based defenses, which do not pose any restrictions on the base model, have not been applied to increase the robustness of random forest models. The work in this paper aims to fill this gap by designing and implementing a novel hash-based ensemble approach that protects random forest against untargeted, random poisoning attacks. An extensive experimental evaluation measures the performance of our approach against a variety of attacks, as well as its sustainability in terms of resource consumption and performance, and compares it with a traditional monolithic model based on random forest. A final discussion presents our main findings and compares our approach with existing poisoning defenses targeting random forests.
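The abstract describes the general shape of an ensemble-based defense: training points are split into portions, one model is trained per portion, and the predictions are aggregated. The following is a minimal sketch of the two model-agnostic pieces, hash-based partitioning and majority voting. It is not the authors' implementation. The function names `partition_by_hash` and `majority_vote`, the use of SHA-256, and the tie-breaking rule are all assumptions made for illustration, and the base learner (a random forest in the paper) is left out.

```python
import hashlib
from collections import Counter


def partition_by_hash(samples, n_partitions):
    """Assign each (features, label) sample to a partition by hashing its content.

    Hypothetical helper: the hash choice (SHA-256 of the sample's repr) is an
    assumption. The point is that the assignment depends only on the sample
    itself, so a poisoned point can influence just one partition.
    """
    partitions = [[] for _ in range(n_partitions)]
    for features, label in samples:
        digest = hashlib.sha256(repr((features, label)).encode("utf-8")).hexdigest()
        partitions[int(digest, 16) % n_partitions].append((features, label))
    return partitions


def majority_vote(predictions):
    """Return the most frequent label; ties go to the smallest label (assumed rule)."""
    counts = Counter(predictions)
    best = max(counts.values())
    return min(label for label, count in counts.items() if count == best)
```

In use, one base model would be trained on each list returned by `partition_by_hash`, and `majority_vote` would be applied to the per-model predictions for a test point.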
An investigation into the impact of deep learning model choice on sex and race bias in cardiac MR segmentation
Lee, Tiarna, Puyol-Antón, Esther, Ruijsink, Bram, Aitcheson, Keana, Shi, Miaojing, King, Andrew P.
In medical imaging, artificial intelligence (AI) is increasingly being used to automate routine tasks. However, these algorithms can exhibit and exacerbate biases which lead to disparate performances between protected groups. We investigate the impact of model choice on how imbalances in subject sex and race in training datasets affect AI-based cine cardiac magnetic resonance image segmentation. We evaluate three convolutional neural network-based models and one vision transformer model. We find significant sex bias in three of the four models and racial bias in all of the models. However, the severity and nature of the bias vary between the models, highlighting the importance of model choice when attempting to train fair AI-based segmentation models for medical imaging tasks.
Decoupled Structure for Improved Adaptability of End-to-End Models
Deng, Keqi, Woodland, Philip C.
Although end-to-end (E2E) trainable automatic speech recognition (ASR) has shown great success by jointly learning acoustic and linguistic information, it still suffers from the effect of domain shifts, thus limiting potential applications. The E2E ASR model implicitly learns an internal language model (LM) which characterises the training distribution of the source domain, and the E2E trainable nature makes the internal LM difficult to adapt to the target domain with text-only data. To solve this problem, this paper proposes decoupled structures for attention-based encoder-decoder (Decoupled-AED) and neural transducer (Decoupled-Transducer) models, which can achieve flexible domain adaptation in both offline and online scenarios while maintaining robust intra-domain performance. To this end, the acoustic and linguistic parts of the E2E model decoder (or prediction network) are decoupled, making the linguistic component (i.e. internal LM) replaceable. When encountering a domain shift, the internal LM can be directly replaced during inference by a target-domain LM, without re-training or using domain-specific paired speech-text data. Experiments for E2E ASR models trained on the LibriSpeech-100h corpus showed that the proposed decoupled structure gave 15.1% and 17.2% relative word error rate reductions on the TED-LIUM 2 and AESRC2020 corpora while still maintaining performance on intra-domain data.
Reframing the Brain Age Prediction Problem to a More Interpretable and Quantitative Approach
Gianchandani, Neha, Dibaji, Mahsa, Bento, Mariana, MacDonald, Ethan, Souza, Roberto
Deep learning models have achieved state-of-the-art results in estimating brain age, which is an important brain health biomarker, from magnetic resonance (MR) images. However, most of these models only provide a global age prediction, and rely on techniques such as saliency maps to interpret their results. These saliency maps highlight regions in the input image that were significant for the model's predictions, but they are hard to interpret, and saliency map values are not directly comparable across different samples. In this work, we reframe the age prediction problem from MR images to an image-to-image regression problem where we estimate the brain age for each brain voxel in MR images. We compare voxel-wise age prediction models against global age prediction models and their corresponding saliency maps. The results indicate that voxel-wise age prediction models are more interpretable, since they provide spatial information about the brain aging process, and they benefit from being quantitative.
FECoM: A Step towards Fine-Grained Energy Measurement for Deep Learning
Rajput, Saurabhsingh, Widmayer, Tim, Shang, Ziyuan, Kechagia, Maria, Sarro, Federica, Sharma, Tushar
With the increasing usage, scale, and complexity of Deep Learning (DL) models, their rapidly growing energy consumption has become a critical concern. Promoting green development and energy awareness at different granularities is the need of the hour to limit carbon emissions of DL systems. However, the lack of standard and repeatable tools to accurately measure and optimize energy consumption at a fine granularity (e.g., at method level) hinders progress in this area. In this paper, we introduce FECoM (Fine-grained Energy Consumption Meter), a framework for fine-grained DL energy consumption measurement. Specifically, FECoM provides researchers and developers with a mechanism to profile DL APIs. FECoM addresses the challenges of measuring energy consumption at a fine-grained level by using static instrumentation and considering various factors, including computational load and temperature stability. We assess FECoM's capability to measure fine-grained energy consumption for one of the most popular open-source DL frameworks, namely TensorFlow. Using FECoM, we also investigate the impact of parameter size and execution time on energy consumption, enriching our understanding of TensorFlow APIs' energy profiles. Furthermore, we elaborate on the considerations, issues, and challenges that one needs to consider while designing and implementing a fine-grained energy consumption measurement tool. We hope this work will facilitate further advances in DL energy measurement and the development of energy-aware practices for DL systems.
RECOMED: A Comprehensive Pharmaceutical Recommendation System
Zomorodi, Mariam, Ghodsollahee, Ismail, Martin, Jennifer H., Talley, Nicholas J., Salari, Vahid, Plawiak, Pawel, Rahimi, Kazem, Acharya, U. Rajendra
A comprehensive pharmaceutical recommendation system was designed based on patient and drug features extracted from Drugs.com and Druglib.com. First, data from these databases were combined, and a dataset of patient and drug information was built. Second, the patients and drugs were clustered, and then the recommendation was performed using different ratings provided by patients and, importantly, the knowledge obtained from patient and drug specifications, while considering drug interactions. To the best of our knowledge, we are the first group to consider patients' conditions and history in the proposed approach for selecting a specific medicine appropriate for that particular user. Our approach applies artificial intelligence (AI) models for the implementation. Sentiment analysis using natural language processing approaches is employed in pre-processing, along with neural network-based methods and recommender system algorithms for modeling the system. In our work, patients' conditions and drug features are used for making two models based on matrix factorization. Then we used drug interaction to filter drugs with severe or mild interactions with other drugs. We developed a deep learning model for recommending drugs by using data from 2304 patients as a training set, and then we used data from 660 patients as our validation set. After that, we used knowledge from critical information about drugs and combined the outcome of the model into a knowledge-based system with the rules obtained from constraints on taking medicine.
Quantum State Tomography using Quantum Machine Learning
Innan, Nouhaila, Siddiqui, Owais Ishtiaq, Arora, Shivang, Ghosh, Tamojit, Koçak, Yasemin Poyraz, Paragas, Dominic, Galib, Abdullah Al Omar, Khan, Muhammad Al-Zafar, Bennai, Mohamed
Quantum State Tomography (QST) is a fundamental technique in Quantum Information Processing (QIP) for reconstructing unknown quantum states. However, conventional QST methods are limited by the number of measurements required, which makes them impractical for large-scale quantum systems. To overcome this challenge, we propose the integration of Quantum Machine Learning (QML) techniques to enhance the efficiency of QST. In this paper, we conduct a comprehensive investigation into various approaches for QST, encompassing both classical and quantum methodologies. We also implement different QML approaches for QST and demonstrate their effectiveness on various simulated and experimental quantum systems, including multi-qubit networks. Our results show that our QML-based QST approach can achieve high fidelity (98%) with significantly fewer measurements than conventional methods, making it a promising tool for practical QIP applications.
Meta-learning enhanced next POI recommendation by leveraging check-ins from auxiliary cities
Wang, Jinze, Zhang, Lu, Sun, Zhu, Ong, Yew-Soon
Most existing point-of-interest (POI) recommenders aim to capture user preference by employing city-level user historical check-ins, thus facilitating users' exploration of the city. However, the scarcity of city-level user check-ins poses a significant challenge to user preference learning. Although prior studies attempt to mitigate this challenge by exploiting various context information, e.g., spatio-temporal information, they neglect to transfer the knowledge (i.e., common behavioral patterns) from other relevant cities (i.e., auxiliary cities). In this paper, we investigate the effect of knowledge distilled from auxiliary cities and thus propose a novel Meta-learning Enhanced next POI Recommendation framework (MERec). MERec incorporates the correlation of check-in behaviors among various cities into the meta-learning paradigm to help infer user preference in the target city, following the principle of "paying more attention to more correlated knowledge". In particular, a city-level correlation strategy is devised to attentively capture common patterns among cities, so as to transfer more relevant knowledge from more correlated cities. Extensive experiments verify the superiority of the proposed MERec against state-of-the-art algorithms.
Improved Privacy-Preserving PCA Using Optimized Homomorphic Matrix Multiplication
Principal Component Analysis (PCA) is a pivotal technique widely utilized in the realms of machine learning and data analysis. It aims to reduce the dimensionality of a dataset while minimizing the loss of information. In recent years, there have been endeavors to utilize homomorphic encryption in privacy-preserving PCA algorithms for secure cloud computing. These approaches commonly employ a PCA routine known as PowerMethod, which takes the covariance matrix as input and generates an approximate eigenvector corresponding to the primary component of the dataset. However, their performance is constrained by the absence of an efficient homomorphic covariance matrix computation circuit and an accurate homomorphic vector normalization strategy in the PowerMethod algorithm. In this study, we propose a novel approach to privacy-preserving PCA that addresses these limitations, resulting in superior efficiency, accuracy, and scalability compared to previous approaches. We attain such efficiency and precision through the following contributions: (i) We implement space and speed optimization techniques for a homomorphic matrix multiplication method, specifically tailored for parallel computing scenarios.
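The PowerMethod routine mentioned above takes a covariance matrix and returns an approximate eigenvector for the primary component. As a rough illustration, here is a plaintext (unencrypted) power iteration in pure Python. It does not reproduce the paper's homomorphic circuit or its normalization strategy. The function names, the starting vector, and the fixed iteration count are assumptions chosen for the sketch.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1) of a list of equal-length rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def power_method(matrix, iterations=100):
    """Approximate the dominant eigenvector by repeated multiply-and-normalize.

    Plaintext sketch only: under homomorphic encryption the normalization
    step (a square root and a division) is the hard part, and is not shown.
    """
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d  # assumed starting vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # v lies in the null space; no further progress is possible
        v = [x / norm for x in w]
    return v
```

For data lying exactly on the line y = 2x, such as `[(1, 2), (2, 4), (3, 6), (4, 8)]`, `power_method(covariance_matrix(rows))` returns a unit vector proportional to (1, 2).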
Automated Test Case Generation Using Code Models and Domain Adaptation
Hashtroudi, Sepehr, Shin, Jiho, Hemmati, Hadi, Wang, Song
State-of-the-art automated test generation techniques, such as search-based testing, are usually ignorant about what a developer would create as a test case. Therefore, they typically create tests that are not human-readable and may not necessarily detect all types of complex bugs that developer-written tests would. In this study, we leverage Transformer-based code models to generate unit tests that can complement search-based test generation. Specifically, we use CodeT5, i.e., a state-of-the-art large code model, and fine-tune it on the test generation downstream task. For our analysis, we use the Methods2test dataset for fine-tuning CodeT5 and Defects4j for project-level domain adaptation and evaluation. The main contribution of this study is proposing a fully automated testing framework that leverages developer-written tests and available code models to generate compilable, human-readable unit tests. Results show that our approach can generate new test cases that cover lines that were not covered by developer-written tests. Using domain adaptation, we can also increase line coverage of the model-generated unit tests by 49.9% and 54% in terms of mean and median (compared to the model without domain adaptation). We can also use our framework as a complementary solution alongside common search-based methods to increase the overall coverage with mean and median of 25.3% and 6.3%. It can also increase the mutation score of search-based methods by killing extra mutants (up to 64 new mutants were killed per project in our experiments).