 Overview


machine learning in public health

#artificialintelligence

This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Another prominent example in this regard came from DeepMind's publication of the possible protein structures associated with the COVID-19 virus (SARS-CoV-2) using their AlphaFold system. For example, our process of vetting results in the Global Burden of Disease Study [2] included the visual inspection of thousands of plots showing data together with model estimates. Our experience developing methods for computer certification of verbal autopsy has bolstered our belief that using an explainable approach, even with a reduction in accuracy, can be superior. Qualified practitioners are in short supply. There is increasing awareness that health … enhancing the ability to see and navigate in a procedure. Going beyond the conventional long-haul process, AI techniques are increasingly being applied to accelerate the fundamental processes of early-stage candidate selection and mechanism discovery. This could be the biggest impact of AI tools as it can potentially transform the quality of life for billions of people around the world. These technologies are also being used in the following ways: Preventing crime: AI and machine learning help authorities track and manage the huge amount of data generated by public surveillance devices, and analyze that data in real time for anomalies and threats.


A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned and Perspectives

arXiv.org Artificial Intelligence

Modern natural language processing (NLP) methods employ self-supervised pretraining objectives such as masked language modeling to boost the performance of various application tasks. These pretraining methods are frequently extended with recurrence, adversarial or linguistic property masking, and more recently with contrastive learning objectives. Contrastive self-supervised training objectives enabled recent successes in image representation pretraining by learning to contrast input-input pairs of augmented images as either similar or dissimilar. However, in NLP, automated creation of text input augmentations is still very challenging because a single token can invert the meaning of a sentence. For this reason, some contrastive NLP pretraining methods contrast over input-label pairs, rather than over input-input pairs, using methods from metric learning and energy-based models. In this survey, we summarize recent self-supervised and supervised contrastive NLP pretraining methods and describe where they are used to improve language modeling, few- or zero-shot learning, pretraining data efficiency and specific NLP end-tasks. We introduce key contrastive learning concepts with lessons learned from prior research and structure works by applications and cross-field relations. Finally, we point to open challenges and future directions for contrastive NLP to encourage bringing contrastive NLP pretraining closer to recent successes in image representation pretraining.
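To make the idea of contrasting pairs as similar or dissimilar concrete, here is a minimal sketch of an InfoNCE-style loss for a single anchor. It is an illustration only, not a method taken from the survey: the function names, the toy 2-dimensional vectors, the use of cosine similarity and the temperature value are all assumptions.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    # Cosine similarity between two non-zero vectors.
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor (hypothetical helper).

    Returns -log of the softmax probability assigned to the positive
    among the positive and all negatives.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]

anchor = [1.0, 0.0]
# Positive close to the anchor, negatives far away: low loss.
loss_good = info_nce(anchor, [0.9, 0.1], [[-1.0, 0.2], [0.0, 1.0]])
# Positive far from the anchor, one negative close to it: high loss.
loss_bad = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.2]])
print(loss_good < loss_bad)
```

The same loss applies to input-label contrast if the positive and the negatives are label embeddings instead of augmented inputs.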


Benchmarking and Survey of Explanation Methods for Black Box Models

arXiv.org Artificial Intelligence

The widespread adoption of black-box models in Artificial Intelligence has increased the need for explanation methods that reveal how these opaque models reach specific decisions. Retrieving explanations is fundamental to unveiling possible biases and resolving practical or ethical issues. The literature now offers many methods that return different kinds of explanations. We provide a categorization of explanation methods based on the type of explanation returned. We present the most recent and widely used explainers, and we show a visual comparison among explanations and a quantitative benchmarking.


A Sufficient Statistic for Influence in Structured Multiagent Environments

Journal of Artificial Intelligence Research

Making decisions in complex environments is a key challenge in artificial intelligence (AI). Situations involving multiple decision makers are particularly complex, leading to computational intractability of principled solution methods. A body of work in AI has tried to mitigate this problem by trying to distill interaction to its essence: how does the policy of one agent influence another agent? If we can find more compact representations of such influence, this can help us deal with the complexity, for instance by searching the space of influences rather than the space of policies. However, so far these notions of influence have been restricted in their applicability to special cases of interaction. In this paper we formalize influence-based abstraction (IBA), which facilitates the elimination of latent state factors without any loss in value, for a very general class of problems described as factored partially observable stochastic games (fPOSGs). On the one hand, this generalizes existing descriptions of influence, and thus can serve as the foundation for improvements in scalability and other insights in decision making in complex multiagent settings. On the other hand, since the presence of other agents can be seen as a generalization of single agent settings, our formulation of IBA also provides a sufficient statistic for decision making under abstraction for a single agent. We also give a detailed discussion of the relations to such previous works, identifying new insights and interpretations of these approaches. In these ways, this paper deepens our understanding of abstraction in a wide range of sequential decision making settings, providing the basis for new approaches and algorithms for a large class of problems.


Set-valued classification -- overview via a unified framework

arXiv.org Machine Learning

The multi-class classification problem is among the most popular and well-studied statistical frameworks. Modern multi-class datasets can be extremely ambiguous, and single-output predictions fail to deliver satisfactory performance. By allowing predictors to predict a set of label candidates, set-valued classification offers a natural way to deal with this ambiguity. Several formulations of set-valued classification are available in the literature and each of them leads to different prediction strategies. The present survey aims to review popular formulations using a unified statistical framework. The proposed framework encompasses previously considered formulations, leads to new ones, and makes it possible to understand the underlying trade-offs of each formulation. We provide infinite-sample optimal set-valued classification strategies and review a general plug-in principle to construct data-driven algorithms. The exposition is supported by examples and pointers to both theoretical and practical contributions. Finally, we provide experiments on real-world datasets comparing these approaches in practice and providing general practical guidelines.
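As a rough illustration of predicting a set of label candidates, the sketch below applies one simple plug-in rule: given estimated class probabilities, return the smallest set of top-ranked labels whose total probability reaches a target level. This is only one possible formulation and is not claimed to be one of the survey's; the function name, the `coverage` parameter and the example probabilities are assumptions.

```python
def prediction_set(probabilities, coverage=0.9):
    """Smallest set of top-ranked labels whose estimated mass reaches `coverage`.

    `probabilities` maps each label to an estimated class probability
    (hypothetical output of any probabilistic classifier).
    """
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    chosen, mass = [], 0.0
    for label, p in ranked:
        chosen.append(label)
        mass += p
        if mass >= coverage:
            break
    return chosen

# An ambiguous input yields a larger set than a clear-cut one.
ambiguous = prediction_set({"cat": 0.55, "lynx": 0.40, "dog": 0.05})
confident = prediction_set({"cat": 0.97, "lynx": 0.02, "dog": 0.01})
print(ambiguous, confident)
```

The trade-off is visible directly: raising `coverage` lowers the chance of missing the true label but makes the returned sets larger and less informative.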


Designing Explanations for Group Recommender Systems

arXiv.org Artificial Intelligence

Explanations are used in recommender systems for various reasons. Users have to be supported in making (high-quality) decisions more quickly. Developers of recommender systems want to convince users to purchase specific items. Users should better understand how the recommender system works and why a specific item has been recommended. Users should also develop a more in-depth understanding of the item domain. Consequently, explanations are designed in order to achieve specific goals such as increasing the transparency of a recommendation or increasing a user's trust in the recommender system. In this paper, we provide an overview of existing research related to explanations in recommender systems, and specifically discuss aspects relevant to group recommendation scenarios. In this context, we present different ways of explaining and visualizing recommendations determined on the basis of preference aggregation strategies.
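A small sketch may help show how a preference aggregation strategy can drive both a group recommendation and its explanation. It uses two commonly described strategies, averaging and "least misery" (the group score is the lowest individual rating). The ratings, names and explanation wording are invented for illustration and are not taken from the paper.

```python
# Hypothetical ratings on a 1-5 scale for three group members.
ratings = {
    "alice": {"museum": 5, "beach": 2, "hike": 4},
    "bob": {"museum": 2, "beach": 4, "hike": 4},
    "carol": {"museum": 5, "beach": 3, "hike": 3},
}

def average(scores):
    return sum(scores) / len(scores)

def least_misery(scores):
    # The group is only as happy as its least happy member.
    return min(scores)

def recommend(ratings, strategy):
    """Aggregate individual ratings per item and return (best item, group score)."""
    items = next(iter(ratings.values())).keys()
    group_scores = {
        item: strategy([user_ratings[item] for user_ratings in ratings.values()])
        for item in items
    }
    best = max(group_scores, key=group_scores.get)
    return best, group_scores[best]

def explain(ratings, strategy, strategy_name):
    best, score = recommend(ratings, strategy)
    return f"'{best}' is recommended because it has the highest {strategy_name} rating in the group ({score:.1f})."

print(explain(ratings, average, "average"))
print(explain(ratings, least_misery, "minimum"))
```

With these toy ratings the two strategies recommend different items, which is why an explanation that names the strategy can matter to the group.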


Is AI-enabled radiomics the next frontier in oncology?

#artificialintelligence

Many professionals in healthcare information technology might not yet be familiar with radiomics, but the emerging technology could potentially have a big impact on the future of cancer care. Radiomics might be most easily explained as "imaging analytics." The technology uses AI and machine learning to extract high-dimensional data from standard medical images such as CT, MRI and PET scans to provide more than 1,500 data points that deliver new insights about the tumors or lesions in those images that cannot be obtained via traditional approaches. Healthcare IT News asked Rose Higgins, CEO of radiomics pioneer HealthMyne, for a primer on radiomics, and to discuss its place in the realm of health IT. Q: Please explain what radiomics is, and how it works.


Position Information in Transformers: An Overview

arXiv.org Artificial Intelligence

Transformers are arguably the main workhorse in recent Natural Language Processing research. By definition a Transformer is invariant with respect to reorderings of the input. However, language is inherently sequential and word order is essential to the semantics and syntax of an utterance. In this paper, we provide an overview of common methods to incorporate position information into Transformer models. The objectives of this survey are to i) showcase that position information in Transformers is a vibrant and extensive research area; ii) enable the reader to compare existing methods by providing a unified notation and meaningful clustering; iii) indicate what characteristics of an application should be taken into account when selecting a position encoding; iv) provide stimuli for future research. The Transformer model as introduced by Vaswani et al. (2017) has been found to perform well for many tasks, such as machine translation or language modeling. With the rise of pretrained language models (PLMs) (Peters et al., 2018; Howard & Ruder, 2018; Devlin et al., 2019; Brown et al., 2020), Transformer models have become even more popular. As a result, they are at the core of many state-of-the-art natural language processing (NLP) models. A Transformer model consists of several layers, or blocks. Each layer is a self-attention (Vaswani et al., 2017) module followed by a feed-forward layer. Layer normalization and residual connections are additional components of a layer.
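One widely known way to inject position information is the fixed sinusoidal encoding of Vaswani et al. (2017), which is added to the token embeddings before the first layer. The sketch below is a minimal pure-Python version; the toy 4-dimensional embeddings and the helper names are assumptions made for illustration, and the survey covers many other schemes.

```python
import math

def sinusoidal_position_encoding(position, d_model):
    """Sinusoidal encoding: sin on even dimensions, cos on odd dimensions."""
    encoding = []
    for i in range(0, d_model, 2):
        angle = position / (10000 ** (i / d_model))
        encoding.append(math.sin(angle))
        if i + 1 < d_model:
            encoding.append(math.cos(angle))
    return encoding

# Hypothetical token embeddings, chosen arbitrarily.
embeddings = {
    "dog": [0.1, 0.3, 0.0, 0.2],
    "bites": [0.5, 0.1, 0.4, 0.0],
    "man": [0.2, 0.6, 0.1, 0.3],
}

def encode(tokens, use_positions):
    """Return the sorted collection of input vectors, so that token order
    can only matter through the position encodings."""
    vectors = []
    for position, token in enumerate(tokens):
        vector = embeddings[token]
        if use_positions:
            pe = sinusoidal_position_encoding(position, len(vector))
            vector = [e + p for e, p in zip(vector, pe)]
        vectors.append(tuple(vector))
    return sorted(vectors)

a = ["dog", "bites", "man"]
b = ["man", "bites", "dog"]
print(encode(a, False) == encode(b, False))  # same vectors without positions
print(encode(a, True) == encode(b, True))    # different once positions are added
```

Without position encodings the two sentences present the model with the same collection of input vectors; with them, each token vector also carries where it occurred.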


Handling Epistemic and Aleatory Uncertainties in Probabilistic Circuits

arXiv.org Artificial Intelligence

When collaborating with an AI system, we need to assess when to trust its recommendations. If we mistakenly trust it in regions where it is likely to err, catastrophic failures may occur, hence the need for Bayesian approaches for probabilistic reasoning in order to determine the confidence (or epistemic uncertainty) in the probabilities in light of the training data. We propose an approach to overcome the independence assumption behind most of the approaches dealing with a large class of probabilistic reasoning that includes Bayesian networks as well as several instances of probabilistic logic. We provide an algorithm for Bayesian learning from sparse, albeit complete, observations, and for deriving inferences and their confidences keeping track of the dependencies between variables when they are manipulated within the unifying computational formalism provided by probabilistic circuits. Each leaf of such circuits is labelled with a beta-distributed random variable that provides us with an elegant framework for representing uncertain probabilities. We achieve better estimation of epistemic uncertainty than state-of-the-art approaches, including highly engineered ones, while being able to handle general circuits and with just a modest increase in the computational effort compared to using point probabilities.
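To illustrate what a beta-distributed random variable at a leaf expresses, the following sketch computes the mean and variance of a Beta posterior from observation counts. It covers a single, isolated probability only and does not implement the paper's algorithm or its dependency tracking through a circuit; the uniform Beta(1, 1) prior and the function names are assumptions.

```python
def beta_posterior(successes, failures, prior_a=1.0, prior_b=1.0):
    """Beta posterior parameters after binary observations (assumed uniform prior)."""
    return prior_a + successes, prior_b + failures

def beta_mean(a, b):
    # Point estimate of the uncertain probability.
    return a / (a + b)

def beta_variance(a, b):
    # Spread of the estimate: a simple proxy for epistemic uncertainty.
    return (a * b) / ((a + b) ** 2 * (a + b + 1))

few = beta_posterior(successes=3, failures=1)       # 4 observations
many = beta_posterior(successes=300, failures=100)  # 400 observations

print(beta_mean(*few), beta_variance(*few))
print(beta_mean(*many), beta_variance(*many))
```

Both cases observe the same 75% success rate, but the variance shrinks sharply with more data, which is the kind of confidence information a point probability cannot carry.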


Attention Models for Point Clouds in Deep Learning: A Survey

arXiv.org Artificial Intelligence

Recently, the advancement of 3D point clouds in deep learning has attracted intensive research in different application domains such as computer vision and robotic tasks. However, creating robust, discriminative feature representations from unordered and irregular point clouds is challenging. In this paper, our ultimate goal is to provide a comprehensive overview of point cloud feature representations that use attention models. More than 75 key contributions from the last three years are summarized in this survey, including 3D object detection, 3D semantic segmentation, 3D pose estimation, point cloud completion, etc. We provide a detailed characterization of (1) the role of attention mechanisms, (2) the usability of attention models in different tasks, and (3) the development trends of key technologies.