The Week in Business: Microsoft's Big Bet on A.I.

NYT > Business Day

Microsoft's often-overlooked search engine, Bing, is mounting a comeback with ChatGPT, the suddenly ubiquitous chatbot capable of composing song lyrics, writing academic essays and answering all manner of questions. The new version of Bing was released to a limited group of users on Tuesday. The revamped product is part of Microsoft's $13 billion investment in OpenAI, the artificial intelligence lab behind ChatGPT that Microsoft is betting on to stay competitive with its big tech rivals like Google, Apple and Meta. But those companies are also racing to incorporate the new technology into their own software. A day before the unveiling of the new Bing, Google announced that it would soon release an experimental chatbot called Bard for its own search engine, which is much more widely used than Bing.


Transfer Learning for Bayesian Optimization: A Survey

arXiv.org Artificial Intelligence

A wide spectrum of design and decision problems, including parameter tuning, A/B testing and drug design, are intrinsically instances of black-box optimization. Bayesian optimization (BO) is a powerful tool that models and optimizes such expensive "black-box" functions. However, at the beginning of optimization, vanilla Bayesian optimization methods often suffer from slow convergence due to inaccurate modeling based on few trials. To address this issue, researchers in the BO community propose to incorporate the spirit of transfer learning to accelerate the optimization process, borrowing strength from past tasks (source tasks) to speed up the current optimization problem (target task). This survey paper first summarizes transfer learning methods for Bayesian optimization from four perspectives: initial points design, search space design, surrogate model, and acquisition function. It then highlights the methodological aspects and technical details of each approach. Finally, it showcases a wide range of applications and proposes promising future directions.


Multi-dimensional discrimination in Law and Machine Learning -- A comparative overview

arXiv.org Artificial Intelligence

AI-driven decision-making can lead to discrimination against certain individuals or social groups based on protected characteristics/attributes such as race, gender, or age. The domain of fairness-aware machine learning focuses on methods and algorithms for understanding, mitigating, and accounting for bias in AI/ML models. Still, thus far, the vast majority of the proposed methods assess fairness based on a single protected attribute, e.g. only gender or race. In reality, though, human identities are multi-dimensional, and discrimination can occur based on more than one protected characteristic, leading to the so-called "multi-dimensional discrimination" or "multi-dimensional fairness" problem. While well-elaborated in legal literature, the multi-dimensionality of discrimination is less explored in the machine learning community. Recent approaches in this direction mainly follow the so-called intersectional fairness definition from the legal domain, whereas other notions like additive and sequential discrimination are less studied or not considered thus far. In this work, we overview the different definitions of multi-dimensional discrimination/fairness in the legal domain as well as how (if at all) they have been transferred/operationalized in the fairness-aware machine learning domain. By juxtaposing these two domains, we draw the connections, identify the limitations, and point out open research directions.
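The gap between single-attribute and intersectional assessment can be illustrated with a small Python sketch. It is not taken from the paper: the toy records, the function names, and the choice of a statistical-parity gap (difference between the highest and lowest positive-decision rate across groups) as the fairness measure are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Tuple

# Hypothetical record layout: (gender, race, decision), decision is 0 or 1.
Record = Tuple[str, str, int]


def positive_rates(
    records: List[Record], key: Callable[[Record], Hashable]
) -> Dict[Hashable, float]:
    """Share of positive decisions within each group defined by `key`."""
    totals: Dict[Hashable, int] = defaultdict(int)
    positives: Dict[Hashable, int] = defaultdict(int)
    for rec in records:
        group = key(rec)
        totals[group] += 1
        positives[group] += rec[2]
    return {g: positives[g] / totals[g] for g in totals}


def parity_gap(rates: Dict[Hashable, float]) -> float:
    """Difference between the best-treated and worst-treated group."""
    return max(rates.values()) - min(rates.values())


# Toy data (invented): decisions are balanced for each single attribute,
# but two of the four intersectional subgroups never receive a positive one.
records: List[Record] = [
    ("F", "A", 1), ("F", "A", 1),
    ("F", "B", 0), ("F", "B", 0),
    ("M", "A", 0), ("M", "A", 0),
    ("M", "B", 1), ("M", "B", 1),
]

gap_gender = parity_gap(positive_rates(records, lambda r: r[0]))
gap_race = parity_gap(positive_rates(records, lambda r: r[1]))
gap_intersection = parity_gap(positive_rates(records, lambda r: (r[0], r[1])))
```

On this toy data the single-attribute gaps are both zero while the intersectional gap is maximal, which is the kind of case a single-attribute audit would miss.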


Deep Learning in Healthcare: An In-Depth Analysis

arXiv.org Artificial Intelligence

Deep learning (DL), along with continual advances in computational processing and cloud technologies, has given us powerful analytical tools and techniques over the past decade and enabled us to apply them in various fields of study. Health informatics is no exception; on the contrary, it is the discipline that generates the largest amount of data today and can benefit from DL the most. Extracting features and finding complex patterns in a huge amount of raw data and transforming them into knowledge is a challenging task. In addition, researchers have proposed various DL architectures over the years to tackle different problems. In this paper, we provide a review of DL models and their broad application in bioinformatics and healthcare, categorized by their architecture. We also go over some of the key challenges that still exist and can arise while conducting DL research.


A Survey on Spectral Graph Neural Networks

arXiv.org Artificial Intelligence

Graph neural networks (GNNs) have attracted considerable attention from the research community. It is well established that GNNs are usually roughly divided into spatial and spectral methods. Although spectral GNNs play an important role in both graph signal processing and graph representation learning, existing studies are biased toward spatial approaches, and there is no comprehensive review of spectral GNNs so far. In this paper, we summarize the recent development of spectral GNNs, including model, theory, and application. Specifically, we first discuss the connection between spatial GNNs and spectral GNNs, which shows that spectral GNNs can capture global information and have better expressiveness and interpretability. Next, we categorize existing spectral GNNs according to the spectrum information they use, i.e., eigenvalues or eigenvectors. In addition, we review major theoretical results and applications of spectral GNNs, followed by a quantitative experiment to benchmark some popular spectral GNNs. Finally, we conclude the paper with some future directions.


Explaining text classifiers through progressive neighborhood approximation with realistic samples

arXiv.org Artificial Intelligence

The importance of neighborhood construction in local explanation methods has already been highlighted in the literature. Several attempts have been made to improve neighborhood quality for high-dimensional data, for example, texts, by adopting generative models. Although the generators produce more realistic samples, the intuitive sampling approaches in the existing solutions leave the latent space underexplored. To overcome this problem, our work, focusing on local model-agnostic explanations for text classifiers, proposes a progressive approximation approach that refines the neighborhood of a to-be-explained decision with a careful two-stage interpolation using counterfactuals as landmarks. We explicitly specify the two properties that should be satisfied by generative models, the reconstruction ability and the locality-preserving property, to guide the selection of generators for local explanation methods. Moreover, noticing the opacity of generative models during the study, we propose another method that implements progressive neighborhood approximation with probability-based editions as an alternative to the generator-based solution. The explanation results from both methods consist of word-level and instance-level explanations benefiting from the realistic neighborhood. Through exhaustive experiments, we qualitatively and quantitatively demonstrate the effectiveness of the two proposed methods.


A Human-Centered Review of Algorithms in Decision-Making in Higher Education

arXiv.org Artificial Intelligence

The use of algorithms for decision-making in higher education is steadily growing, promising cost-savings to institutions and personalized service for students but also raising ethical challenges around surveillance, fairness, and interpretation of data. To address the lack of systematic understanding of how these algorithms are currently designed, we reviewed an extensive corpus of papers proposing algorithms for decision-making in higher education. We categorized them based on input data, computational method, and target outcome, and then investigated the interrelations of these factors with the application of human-centered lenses: theoretical, participatory, or speculative design. We found that the models are trending towards deep learning, and increased use of student personal data and protected attributes, with the target scope expanding towards automated decisions. However, despite the associated decrease in interpretability and explainability, current development predominantly fails to incorporate human-centered lenses. We discuss the challenges with these trends and advocate for a human-centered approach.


UGAE: A Novel Approach to Non-exponential Discounting

arXiv.org Artificial Intelligence

The discounting mechanism in Reinforcement Learning determines the relative importance of future and present rewards. While exponential discounting is widely used in practice, non-exponential discounting methods that align with human behavior are often desirable for creating human-like agents. However, non-exponential discounting methods cannot be directly applied in modern on-policy actor-critic algorithms. To address this issue, we propose Universal Generalized Advantage Estimation (UGAE), which allows for the computation of GAE advantage values with arbitrary discounting. Additionally, we introduce Beta-weighted discounting, a continuous interpolation between exponential and hyperbolic discounting, to increase flexibility in choosing a discounting method. To showcase the utility of UGAE, we provide an analysis of the properties of various discounting methods. We also show experimentally that agents with non-exponential discounting trained via UGAE outperform variants trained with Monte Carlo advantage estimation. Through analysis of various discounting methods and experiments, we demonstrate the superior performance of UGAE with Beta-weighted discounting over the Monte Carlo baseline on standard RL benchmarks. UGAE is simple and easily integrated into any advantage-based algorithm as a replacement for the standard recursive GAE.
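As a rough illustration of what exponential and non-exponential discounting mean in practice, the Python sketch below computes discounted returns and Monte Carlo advantage estimates (the baseline named in the abstract) for an arbitrary discount function. It is not the UGAE estimator and not the paper's Beta-weighted discounting. The function names, the hyperbolic parameter `k`, and the assumption that the discount weight depends only on the delay between the current step and the reward are illustrative choices.

```python
from typing import Callable, List, Sequence

Discount = Callable[[int], float]  # maps a delay in steps to a weight


def exponential(gamma: float) -> Discount:
    """Standard exponential discounting: weight gamma ** delay."""
    return lambda delay: gamma ** delay


def hyperbolic(k: float) -> Discount:
    """Hyperbolic discounting: weight 1 / (1 + k * delay)."""
    return lambda delay: 1.0 / (1.0 + k * delay)


def discounted_return(rewards: Sequence[float], discount: Discount) -> float:
    """Sum of rewards, each weighted by the discount at its delay."""
    return sum(discount(delay) * r for delay, r in enumerate(rewards))


def monte_carlo_advantages(
    rewards: Sequence[float], values: Sequence[float], discount: Discount
) -> List[float]:
    """Advantage at step t: discounted return from t minus the value estimate.

    Assumption: the delay is measured from the current step t, so the same
    discount curve is re-applied at every step.
    """
    return [
        discounted_return(rewards[t:], discount) - values[t]
        for t in range(len(rewards))
    ]


rewards = [1.0, 1.0, 1.0]
values = [0.0, 0.0, 0.0]  # hypothetical critic outputs
adv_exp = monte_carlo_advantages(rewards, values, exponential(0.5))
adv_hyp = monte_carlo_advantages(rewards, values, hyperbolic(1.0))
```

With exponential discounting the return obeys a one-step recursion, which is what standard recursive GAE relies on; a general discount function such as the hyperbolic one does not factor this way, so the sketch recomputes the full sum at each step.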


Spatio-Temporal Graph Neural Networks: A Survey

arXiv.org Artificial Intelligence

Graph Neural Networks have gained huge interest in the past few years. These powerful algorithms extended deep learning models to non-Euclidean space and achieved state-of-the-art performance in various applications, including recommender systems and social networks. However, this performance rests on the assumption of a static graph structure, which limits the performance of Graph Neural Networks when the data varies with time. Spatiotemporal Graph Neural Networks are an extension of Graph Neural Networks that take the time factor into account. Recently, various Spatiotemporal Graph Neural Network algorithms have been proposed and have achieved superior performance compared to other deep learning algorithms in several time-dependent applications. This survey discusses interesting topics related to Spatiotemporal Graph Neural Networks, including algorithms, applications, and open challenges.


Pushing the Accuracy-Group Robustness Frontier with Introspective Self-play

arXiv.org Artificial Intelligence

Standard empirical risk minimization (ERM) training can produce deep neural network (DNN) models that are accurate on average but under-perform in under-represented population subgroups, especially when there are imbalanced group distributions in the long-tailed training data. Therefore, approaches that improve the accuracy-group robustness trade-off frontier of a DNN model (i.e. improving worst-group accuracy without sacrificing average accuracy, or vice versa) are of crucial importance. Uncertainty-based active learning (AL) can potentially improve the frontier by preferentially sampling underrepresented subgroups to create a more balanced training dataset. However, the quality of uncertainty estimates from modern DNNs tends to degrade in the presence of spurious correlations and dataset bias, compromising the effectiveness of AL for sampling tail groups. In this work, we propose Introspective Self-play (ISP), a simple approach to improve the uncertainty estimation of a deep neural network under dataset bias, by adding an auxiliary introspection task requiring a model to predict the bias for each data point in addition to the label. We show that ISP provably improves the bias-awareness of the model representation and the resulting uncertainty estimates. On two real-world tabular and language tasks, ISP serves as a simple "plug-in" for AL model training, consistently improving both the tail-group sampling rate and the final accuracy-fairness trade-off frontier of popular AL methods.
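The uncertainty-based sampling step that active learning relies on can be sketched in a few lines of Python. This is a generic illustration, not the ISP method: using predictive entropy as the uncertainty score, the function names, and the toy probabilities are all assumptions.

```python
import math
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(pool_probs: Sequence[Sequence[float]], k: int) -> List[int]:
    """Indices of the k unlabeled examples with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical model outputs for four unlabeled examples (binary task).
pool_probs = [
    [0.90, 0.10],
    [0.50, 0.50],
    [0.99, 0.01],
    [0.60, 0.40],
]
to_label = select_most_uncertain(pool_probs, k=2)
```

The selection is only as good as the probabilities fed into it: if the model is confidently wrong on tail-group examples, those examples receive low entropy and are never picked, which is the failure mode the abstract attributes to dataset bias.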