A Study of Automatic Metrics for the Evaluation of Natural Language Explanations Artificial Intelligence

As transparency becomes key for robotics and AI, it will be necessary to evaluate the methods through which transparency is provided, including automatically generated natural language (NL) explanations. Here, we explore parallels between the generation of such explanations and the much-studied field of evaluation of Natural Language Generation (NLG). Specifically, we investigate which of the NLG evaluation measures map well to explanations. We present the ExBAN corpus: a crowd-sourced corpus of NL explanations for Bayesian Networks. We run correlations comparing human subjective ratings with NLG automatic measures. We find that embedding-based automatic NLG evaluation methods, such as BERTScore and BLEURT, correlate more strongly with human ratings than word-overlap metrics, such as BLEU and ROUGE. This work has implications for Explainable AI and transparent robotic and autonomous systems.
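The comparison of human ratings with automatic measures can be sketched as a rank correlation. The abstract does not say which coefficient is used, so Spearman's rho is an assumption here, and the function names and all numbers below are made up for illustration (they are not ExBAN data or real metric scores).

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for five explanations (illustrative only).
human_ratings = [1, 2, 3, 4, 5]
metric_a = [0.10, 0.35, 0.30, 0.70, 0.90]  # tracks the human ordering closely
metric_b = [0.50, 0.20, 0.60, 0.10, 0.40]  # mostly unrelated ordering

rho_a = spearman(human_ratings, metric_a)
rho_b = spearman(human_ratings, metric_b)
```

In this toy setup, a metric whose ranking of explanations agrees with the human ranking gets a rho near 1, which is the sense in which one metric "correlates more strongly" than another.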

Artificial Intelligence for Social Good: A Survey Artificial Intelligence

Its impact is drastic and real: YouTube's AI-driven recommendation system would present sports videos for days if one happens to watch a live baseball game on the platform [1]; email writing becomes much faster with machine learning (ML) based auto-completion [2]; many businesses have adopted natural language processing based chatbots as part of their customer services [3]. AI has also greatly advanced human capabilities in complex decision-making processes ranging from determining how to allocate security resources to protect airports [4] to games such as poker [5] and Go [6]. All such tangible and stunning progress suggests that an "AI summer" is happening. As some put it, "AI is the new electricity" [7]. Meanwhile, in the past decade, an emerging theme in the AI research community has been the so-called "AI for social good" (AI4SG): researchers aim to develop AI methods and tools that address problems at the societal level and improve the well-being of society.

Using Social Network Information in Bayesian Truth Discovery Machine Learning

We investigate the problem of truth discovery based on opinions from multiple agents who may be unreliable or biased. We consider the case where agents' reliabilities or biases are correlated if they belong to the same community, which defines a group of agents with similar opinions regarding a particular event. An agent can belong to different communities for different events, and these communities are unknown \emph{a priori}. We incorporate knowledge of the agents' social network in our truth discovery framework and develop Laplace variational inference methods to estimate agents' reliabilities, communities, and the event states. We also develop a stochastic variational inference method to scale our model to large social networks. Simulations and experiments on real data suggest that when observations are sparse, our proposed methods perform better than several other inference methods, including majority voting, the popular Bayesian Classifier Combination (BCC) method, and the Community BCC method.
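For orientation, the simplest baseline named above, majority voting, can be sketched as follows. This is only the baseline, not the proposed variational method, and the data layout (a dictionary of events mapping agents to reported states) and all names are assumptions made for this example.

```python
from collections import Counter


def majority_vote(opinions):
    """Estimate each event's state as the most frequently reported value.

    opinions: dict mapping event -> dict mapping agent -> reported state.
    Ties are broken in favor of the state reported first (Counter order).
    """
    estimates = {}
    for event, reports in opinions.items():
        counts = Counter(reports.values())
        estimates[event] = counts.most_common(1)[0][0]
    return estimates


# Hypothetical observations: agents are unreliable and do not all report
# on every event (sparse observations).
opinions = {
    "event_1": {"alice": 1, "bob": 1, "carol": 0},
    "event_2": {"alice": 0, "dave": 0},
    "event_3": {"bob": 1},
}

estimates = majority_vote(opinions)
```

Majority voting weights every agent equally and ignores community structure, which is why it can degrade when observations are sparse or when a biased community dominates the reports for an event.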

"I can assure you [$\ldots$] that it's going to be all right" -- A definition, case for, and survey of algorithmic assurances in human-autonomy trust relationships Machine Learning

As technology becomes more advanced, those who design, use, and are otherwise affected by it want to know that it will perform correctly, to understand why it does what it does, and to know how to use it appropriately. In essence, they want to be able to trust the systems that are being designed. In this survey we present assurances, the means by which users can understand how to trust this technology. Trust between humans and autonomy is reviewed, and the implications for the design of assurances are highlighted. A survey of research that has been performed with respect to assurances is presented, and several key ideas are extracted in order to refine the definition of assurances. Several directions for future research are identified and discussed.

A Survey of Available Corpora for Building Data-Driven Dialogue Systems Artificial Intelligence

During the past decade, several areas of speech and language understanding have witnessed substantial breakthroughs from the use of data-driven models. In the area of dialogue systems, the trend is less obvious, and most practical systems are still built through significant engineering and expert knowledge. Nevertheless, several recent results suggest that data-driven approaches are feasible and quite promising. To facilitate research in this area, we have carried out a wide survey of publicly available datasets suitable for data-driven learning of dialogue systems. We discuss important characteristics of these datasets, how they can be used to learn diverse dialogue strategies, and their other potential uses. We also examine methods for transfer learning between datasets and the use of external knowledge. Finally, we discuss appropriate choice of evaluation metrics for the learning objective.

Efficient Bayesian Learning in Social Networks with Gaussian Estimators Machine Learning

We consider a group of Bayesian agents who try to estimate a state of the world $\theta$ through interaction on a social network. Each agent $v$ initially receives a private measurement of $\theta$: a number $S_v$ picked from a Gaussian distribution with mean $\theta$ and standard deviation one. Then, in each discrete time iteration, each agent reveals its estimate of $\theta$ to its neighbors, and, observing its neighbors' actions, updates its belief using Bayes' Law. This process aggregates information efficiently, in the sense that all the agents converge to the belief that they would have, had they access to all the private measurements. We show that this process is computationally efficient, so that each agent's calculation can be easily carried out. We also show that on any graph the process converges after at most $2N \cdot D$ steps, where $N$ is the number of agents and $D$ is the diameter of the network. Finally, we show that on trees and on distance-transitive graphs the process converges after $D$ steps, and that it preserves privacy, so that agents learn very little about the private signal of most other agents, despite the efficient aggregation of information. Our results extend those in an unpublished manuscript of the first and last authors.
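The benchmark belief that agents converge to, the one they would hold with access to all private measurements, can be made concrete under one extra assumption: a flat (improper) prior on $\theta$, which the abstract does not state. With unit-variance Gaussian measurements, the pooled posterior is then Gaussian with mean equal to the average of the $S_v$ and variance $1/N$. The function name and sample values below are hypothetical.

```python
def pooled_belief(measurements):
    """Posterior mean and variance of theta given all private measurements.

    Assumes each measurement is N(theta, 1) and a flat prior on theta,
    so the posterior is N(sample mean, 1 / N).
    """
    n = len(measurements)
    mean = sum(measurements) / n
    variance = 1.0 / n
    return mean, variance


# Hypothetical private measurements S_v for four agents.
private_measurements = [0.8, 1.3, 0.9, 1.0]
mean, variance = pooled_belief(private_measurements)
```

This is only the target of the aggregation; the iterative exchange of estimates between neighbors, and the $2N \cdot D$ convergence bound, are not simulated here.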