Accuracy
Artificial Intelligence Empowered Multiple Access for Ultra Reliable and Low Latency THz Wireless Networks
Boulogeorgos, Alexandros-Apostolos A., Yaqub, Edwin, Desai, Rachana, Sanguanpuak, Tachporn, Katzouris, Nikos, Lazarakis, Fotis, Alexiou, Angeliki, Di Renzo, Marco
Terahertz (THz) wireless networks are expected to catalyze the beyond fifth generation (B5G) era. However, the directional nature and line-of-sight demand of THz links, together with the ultra-dense deployment of THz networks, create a number of challenges for the medium access control (MAC) layer. In more detail, the need to rethink user association and resource allocation strategies by incorporating artificial intelligence (AI) capable of providing "real-time" solutions in complex and frequently changing environments becomes evident. Moreover, to satisfy the ultra-reliability and low-latency demands of several B5G applications, novel mobility management approaches are required. Motivated by this, this article presents a holistic MAC layer approach that enables intelligent user association and resource allocation, as well as flexible and adaptive mobility management, while maximizing the system's reliability through blockage minimization. In more detail, a fast and centralized joint user association, radio resource allocation, and blockage avoidance scheme based on a novel metaheuristic-machine learning framework is documented, which maximizes THz network performance while reducing the association latency by approximately three orders of magnitude. To support mobility management and blockage avoidance within the access point (AP) coverage area, a deep reinforcement learning (DRL) approach for beam selection is discussed. Finally, to support user mobility between the coverage areas of neighboring APs, a proactive hand-over mechanism based on AI-assisted fast channel prediction is reported.
Advancing Human-AI Complementarity: The Impact of User Expertise and Algorithmic Tuning on Joint Decision Making
Inkpen, Kori, Chappidi, Shreya, Mallari, Keri, Nushi, Besmira, Ramesh, Divya, Michelucci, Pietro, Mandava, Vani, Vepřek, Libuše Hannah, Quinn, Gabrielle
Human-AI collaboration for decision-making strives to achieve team performance that exceeds the performance of humans or AI alone. However, many factors can impact the success of Human-AI teams, including a user's domain expertise, mental models of an AI system, trust in recommendations, and more. This work examines users' interaction with three simulated algorithmic models, all with similar accuracy but different tuning on their true positive and true negative rates. Our study examined user performance in a non-trivial blood vessel labeling task where participants indicated whether a given blood vessel was flowing or stalled. Our results show that while recommendations from an AI-Assistant can aid user decision making, factors such as users' baseline performance relative to the AI and complementary tuning of AI error types significantly impact overall team performance. Novice users improved, but not to the accuracy level of the AI. Highly proficient users were generally able to discern when they should follow the AI recommendation and typically maintained or improved their performance. Mid-performers, who had a similar level of accuracy to the AI, were most variable in terms of whether the AI recommendations helped or hurt their performance. In addition, we found that users' perception of the AI's performance relative to their own also had a significant impact on whether their accuracy improved when given AI recommendations. This work provides insights on the complexity of factors related to Human-AI collaboration and provides recommendations on how to develop human-centered AI algorithms to complement users in decision-making tasks.
Fair Machine Learning in Healthcare: A Review
Feng, Qizhang, Du, Mengnan, Zou, Na, Hu, Xia
Benefiting from the digitization of healthcare data and the development of computing power, machine learning methods are increasingly used in the healthcare domain. Fairness problems have been identified in machine learning for healthcare, resulting in an unfair allocation of limited healthcare resources or excessive health risks for certain groups. Therefore, addressing the fairness problems has recently attracted increasing attention from the healthcare community. However, the intersection of machine learning for healthcare and fairness in machine learning remains understudied. In this review, we build the bridge by exposing fairness problems, summarizing possible biases, sorting out mitigation methods and pointing out challenges along with opportunities for the future.
Towards Explainable Meta-Learning for DDoS Detection
Zhou, Qianru, Li, Rongzhen, Xu, Lei, Nallanathan, Arumugam, Yang, Jian, Fu, Anmin
The Internet is the most complex machine humankind has ever built, and defending it from intrusions is even more complex. With the ever-increasing number of new intrusions, intrusion detection relies more and more on Artificial Intelligence. Interpretability and transparency of the machine learning model are the foundation of trust in AI-driven intrusion detection results. Current interpretable Artificial Intelligence technologies in intrusion detection are heuristic, and are therefore neither accurate nor sufficient. This paper proposes a rigorous, interpretable, Artificial Intelligence driven intrusion detection approach based on an artificial immune system. Details of the rigorous interpretation calculation process for a decision tree model are presented. Prime implicant explanations for benign traffic flows are given in detail as rules for negative selection in the cyber immune system. Experiments are carried out on real-life traffic.
AI accelerates AML processes across financial services
Financial regulators across Europe continue to levy steep enforcement fines against banks for failures to comply with know-your-customer (KYC) and anti-money laundering (AML) regulations. At the end of 2021, the Financial Conduct Authority (FCA) fined two of the UK's largest banks, HSBC and NatWest, a total of £328.95 million ($436.1 million) for failings in their money laundering processes. Meanwhile, members of the European Parliament are calling for cryptocurrencies to be governed by the European Commission's Anti-Money Laundering Authority, as illicit organisations continue to find new methods for laundering money through the financial system. Money laundering is a process that criminals use to hide the illegal source of their funds. By passing money through multiple, sometimes complex, transfers and transactions, the money is "cleaned" of its illegitimate origin and made to appear as legitimate business profits.
Classification Models: Supervised Machine Learning in Python
Describe the input and output of a classification model. Prepare data with feature engineering techniques. Tackle both binary and multiclass classification problems. Implement Support Vector Machines, Naive Bayes, Decision Tree, Random Forest, K-Nearest Neighbors, Neural Networks, and logistic regression models in Python. Use a variety of performance metrics such as confusion matrix, accuracy, precision, recall, ROC curve and AUC score. Artificial intelligence and machine learning are touching our everyday lives in more and more ways. There's an endless supply of industries and applications that machine learning can make more efficient and intelligent. Supervised machine learning is the underlying method behind a large part of this.
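To make the metrics named above concrete, here is a minimal sketch in plain Python of how accuracy, precision and recall follow from the four cells of a binary confusion matrix. The function names (`confusion_counts`, `classification_metrics`) and the toy labels are illustrative assumptions, not material from the course itself.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count the four cells of a binary confusion matrix.

    Hypothetical helper for illustration; `positive` marks the label
    treated as the positive class.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if truth == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Derive accuracy, precision and recall from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    # Toy labels, invented purely for this example.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
    print(confusion_counts(y_true, y_pred))
    print(classification_metrics(y_true, y_pred))
```

In practice a library such as scikit-learn would normally be used for these metrics; the hand-written version only shows where the numbers come from.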
Entity Anchored ICD Coding
DeYoung, Jay, Shing, Han-Chin, Kong, Luyang, Winestock, Christopher, Shivade, Chaitanya
Medical coding is a complex task, requiring assignment of a subset of over 72,000 ICD codes to a patient's notes. Modern natural language processing approaches to these tasks have been challenged by the length of the input and size of the output space. We limit our model inputs to a small window around medical entities found in our documents. From those local contexts, we build contextualized representations of both ICD codes and entities, and aggregate over these representations to form document-level predictions. In contrast to existing methods which use a representation fixed either in size or by codes seen in training, we represent ICD codes by encoding the code description with local context. We discuss metrics appropriate to deploying coding systems in practice. We show that our approach is superior to existing methods in both standard and deployable measures, including performance on rare and unseen codes. 1 Introduction Medical coding is the complex task aimed at generating a coded summary of clinical information associated with a patient.
Predictive Data Calibration for Linear Correlation Significance Testing
Patil, Kaustubh R., Eickhoff, Simon B., Langner, Robert
Inferring linear relationships lies at the heart of many empirical investigations. A measure of linear dependence should correctly evaluate the strength of the relationship as well as qualify whether it is meaningful for the population. Pearson's correlation coefficient (PCC), the de facto measure for bivariate relationships, is known to be lacking in both regards. The estimated strength $r$ may be wrong due to limited sample size and non-normality of the data. In the context of statistical significance testing, erroneous interpretation of a $p$-value as a posterior probability leads to Type I errors, a general issue with significance testing that extends to PCC. Such errors are exacerbated when testing multiple hypotheses simultaneously. To tackle these issues, we propose a machine-learning-based predictive data calibration method which essentially conditions the data samples on the expected linear relationship. Calculating PCC using calibrated data yields a calibrated $p$-value that can be interpreted as a posterior probability, together with a calibrated $r$ estimate, a desired outcome not provided by other methods. Furthermore, the ensuing independent interpretation of each test might eliminate the need for multiple testing correction. We provide empirical evidence favouring the proposed method using several simulations and application to real-world data.
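For readers unfamiliar with the quantity being calibrated, the following is a minimal sketch of the standard sample Pearson correlation coefficient $r$ in plain Python. It is not the authors' calibration method, which the abstract does not specify in detail; the function name `pearson_r` is an assumption made for illustration.

```python
import math


def pearson_r(x, y):
    """Standard sample Pearson correlation coefficient.

    Illustrative helper only: r = S_xy / sqrt(S_xx * S_yy), where the S
    terms are sums of products of deviations from the sample means.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return s_xy / math.sqrt(s_xx * s_yy)


if __name__ == "__main__":
    # A perfectly linear toy relationship; r should be (numerically) 1.
    print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```

With only a handful of samples, as in this toy input, the estimate can vary widely between draws, which is the small-sample weakness the abstract refers to.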
Membership Inference Attacks Against Self-supervised Speech Models
Tseng, Wei-Cheng, Kao, Wei-Tsung, Lee, Hung-yi
Recently, adapting the idea of self-supervised learning (SSL) to continuous speech has started gaining attention. SSL models pre-trained on a huge amount of unlabeled audio can generate general-purpose representations that benefit a wide variety of speech processing tasks. Despite their ubiquitous deployment, however, the potential privacy risks of these models have not been well investigated. In this paper, we present the first privacy analysis of several SSL speech models using Membership Inference Attacks (MIA) under black-box access. The experimental results show that these pre-trained models are vulnerable to MIA and prone to membership information leakage, with high Area Under the Curve (AUC) at both the utterance level and the speaker level. Furthermore, we also conduct several ablation studies to understand the factors that contribute to the success of MIA.
DendroMap: Visual Exploration of Large-Scale Image Datasets for Machine Learning with Treemaps
Bertucci, Donald, Hamid, Md Montaser, Anand, Yashwanthi, Ruangrotsakun, Anita, Tabatabai, Delyar, Perez, Melissa, Kahng, Minsuk
ML practitioners often explore image datasets by generating a grid of images or projecting high-dimensional representations of images into 2-D using dimensionality reduction techniques (e.g., t-SNE). However, neither approach effectively scales to large datasets because images are ineffectively organized and interactions are insufficiently supported. To address these challenges, we develop DendroMap by adapting Treemaps, a well-known visualization technique. DendroMap effectively organizes images by extracting hierarchical cluster structures from high-dimensional representations of images. It enables users to make sense of the overall distributions of datasets and interactively zoom into specific areas of interest at multiple levels of abstraction. Our case studies with widely used image datasets for deep learning demonstrate that users can discover insights about datasets and trained models by examining the diversity of images, identifying underperforming subgroups, and analyzing classification errors. We conducted a user study that evaluates the effectiveness of DendroMap in grouping and searching tasks by comparing it with a gridified version of t-SNE and found that participants preferred DendroMap.