Collaborating Authors

Information Technology: Overviews

Low-Code Programming Models

Communications of the ACM

Low-code has the potential to empower more people to automate tasks by creating computer programs.

Prediction of Social Dynamic Agents and Long-Tailed Learning Challenges: A Survey

Journal of Artificial Intelligence Research

Autonomous robots that can perform common tasks like driving, surveillance, and chores have the greatest potential for impact, due to how frequently they would be used, and the greatest potential for risk, due to their direct interaction with humans. These tasks take place in open-ended environments where humans socially interact and pursue their goals in complex and diverse ways. To operate in such environments, these systems must predict human behavior, especially when it is unexpected and potentially dangerous. We therefore summarize trends in the tasks, modeling methods, datasets, and social interaction modules aimed at predicting the future location of dynamic, socially interactive agents. Furthermore, we describe long-tailed learning techniques from classification and regression problems that can be applied to prediction problems. To our knowledge, this is the first work to review social interaction modeling within prediction, and long-tailed learning techniques within regression and prediction.
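One family of long-tailed techniques the survey covers for regression-style targets is density-based reweighting: samples whose target values fall in sparsely populated regions of the label space receive larger loss weights. A minimal sketch, where the binning scheme, function name, and normalization are illustrative choices rather than any specific paper's method:

```python
from collections import Counter

def inverse_frequency_weights(labels, num_bins=10, lo=0.0, hi=1.0):
    """Assign each sample a weight inversely proportional to how
    densely populated its target-value bin is, so rare (tail)
    targets contribute more to the training loss."""
    width = (hi - lo) / num_bins
    bins = [min(int((y - lo) / width), num_bins - 1) for y in labels]
    counts = Counter(bins)
    weights = [1.0 / counts[b] for b in bins]
    # Normalize so the weights average to 1, keeping the loss scale stable.
    mean_w = sum(weights) / len(weights)
    return [w / mean_w for w in weights]

# Four "head" samples near 0.1 and one "tail" sample near 1.0.
w = inverse_frequency_weights([0.05, 0.07, 0.08, 0.09, 0.95])
```

The tail sample ends up with a weight several times larger than each head sample, which is the intended effect; smoothed variants (e.g. kernel density estimates over the labels) are common refinements.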

Reports of the Association for the Advancement of Artificial Intelligence's 2023 Summer Symposium Series

Interactive AI Magazine

The Association for the Advancement of Artificial Intelligence's Inaugural Summer Symposium Series was held at Singapore EXPO in Singapore, July 17-19, 2023. There were five symposia in the summer program: Second Symposium on Human Partnership with Medical AI: Design, Operationalization, and Ethics, AI x Metaverse, Building Connections: From Human-Human to Human-AI Collaboration, Artificial Intelligence for FinTech (AI4FinTech), and Embodied Intelligence. Building on the success of the inaugural symposium held in 2021, the second symposium on Human Partnership with Medical AI delved deeper into the critical components of Trust, Ethics, and Security in the design and operationalization of clinical AI. This year, the event aimed to continue the discussions and collaborations that started in the previous symposium and to explore new avenues of clinical utility, trustworthiness, robustness, and responsible AI. The symposium brought together researchers, clinicians, policymakers, and stakeholders from various domains to discuss the challenges and opportunities of AI-human partnership, share their latest research and insights, and develop actionable strategies for creating trustworthy, ethical, and secure AI systems.

YouTube's new AI feature helps you decide what to watch next


If a new feature from YouTube tests well, AI could soon help you decide which video to watch next. In an announcement on its official blog, the company detailed how, starting this week, it is using artificial intelligence to create video summaries that appear on both search and watch pages. The summaries, YouTube says, will make it easier for a user to learn what a video is about and decide whether or not it's a good watch. These summaries, available on a "limited number" of English-language videos, are only intended to provide brief overviews of a few lines each. The AI-produced summaries will not replace the actual descriptions written by video creators.

How to DP-fy ML: A Practical Guide to Machine Learning with Differential Privacy

Journal of Artificial Intelligence Research

Machine Learning (ML) models are ubiquitous in real-world applications and are a constant focus of research. Modern ML models have become more complex, deeper, and harder to reason about. At the same time, the community has started to realize the importance of protecting the privacy of the training data that goes into these models. Differential Privacy (DP) has become a gold standard for making formal statements about data anonymization. However, while some adoption of DP has happened in industry, attempts to apply DP to real world complex ML models are still few and far between. The adoption of DP is hindered by limited practical guidance of what DP protection entails, what privacy guarantees to aim for, and the difficulty of achieving good privacy-utility-computation trade-offs for ML models. Tricks for tuning and maximizing performance are scattered among papers or stored in the heads of practitioners, particularly with respect to the challenging task of hyperparameter tuning. Furthermore, the literature seems to present conflicting evidence on how and whether to apply architectural adjustments and which components are “safe” to use with DP. In this survey paper, we attempt to create a self-contained guide that gives an in-depth overview of the field of DP ML. We aim to assemble information about achieving the best possible DP ML model with rigorous privacy guarantees. Our target audience is both researchers and practitioners. Researchers interested in DP for ML will benefit from a clear overview of current advances and areas for improvement. We also include theory-focused sections that highlight important topics such as privacy accounting and convergence. For a practitioner, this survey provides a background in DP theory and a clear step-by-step guide for choosing an appropriate privacy definition and approach, implementing DP training, potentially updating the model architecture, and tuning hyperparameters. 
For both researchers and practitioners, consistently and fully reporting privacy guarantees is critical, so we propose a set of specific best practices for stating guarantees. With sufficient computation and a sufficiently large training set or supplemental non-private data, both good accuracy (that is, almost as good as a non-private model) and good privacy can often be achieved. And even when computation and dataset size are limited, there are advantages to training with even a weak (but still finite) formal DP guarantee. Hence, we hope this work will facilitate more widespread deployments of DP ML models.
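The core training recipe such guides walk through, DP-SGD, reduces to a few operations per step: clip each example's gradient to a fixed L2 norm, sum the clipped gradients, and add Gaussian noise calibrated to that norm. A stripped-down sketch in plain Python; real deployments use a DP library (e.g. Opacus or TensorFlow Privacy) together with a privacy accountant, and the function names below are our own:

```python
import math
import random

def clip_gradient(grad, clip_norm):
    """Scale a per-example gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm > clip_norm:
        return [g * clip_norm / norm for g in grad]
    return list(grad)

def dp_average_gradients(per_example_grads, clip_norm, noise_multiplier, rng):
    """One DP-SGD aggregation step: clip each per-example gradient, sum,
    add Gaussian noise with std noise_multiplier * clip_norm per
    coordinate, then average over the batch."""
    dim = len(per_example_grads[0])
    clipped = [clip_gradient(g, clip_norm) for g in per_example_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    noisy = [s + rng.gauss(0.0, noise_multiplier * clip_norm) for s in summed]
    n = len(per_example_grads)
    return [x / n for x in noisy]

# A gradient of norm 5 is rescaled to unit norm.
grad = clip_gradient([3.0, 4.0], clip_norm=1.0)
# With noise_multiplier=0 the step degenerates to plain clipped averaging.
avg = dp_average_gradients([[3.0, 4.0], [0.0, 0.0]], 1.0,
                           noise_multiplier=0.0, rng=random.Random(0))
```

The clipping norm and noise multiplier are exactly the privacy-utility hyperparameters whose tuning the survey discusses; the formal (epsilon, delta) guarantee follows from accounting over the number of noisy steps taken.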

A Model to Support Collective Reasoning: Formalization, Analysis and Computational Assessment

Journal of Artificial Intelligence Research

In this paper we propose a new model to represent human debates and methods to obtain collective conclusions from them. This model overcomes two drawbacks of existing approaches. First, our model does not assume that participants agree on the structure of the debate. It does this by allowing participants to express their opinion about all aspects of the debate. Second, our model does not assume that participants' opinions are rational, an assumption that significantly limits current approaches. Instead, we define a weaker notion of rationality that characterises coherent opinions, and we consider different scenarios based on the coherence of individual opinions and the level of consensus. We provide a formal analysis of different opinion aggregation functions that compute a collective decision based on the individual opinions and the debate structure. In particular, we demonstrate that aggregated opinions can be coherent even if there is a lack of consensus and individual opinions are not coherent. We conclude with an empirical evaluation demonstrating that collective opinions can be computed efficiently for real-sized debates.
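As a toy illustration of what an opinion aggregation function does, here is a per-statement majority aggregator over yes/no opinions. The paper's actual functions additionally account for the debate structure and a coherence notion, which this sketch ignores; the data layout and names are ours:

```python
def majority_aggregate(opinions):
    """Aggregate individual opinions (each a dict mapping a statement
    to True/False) into a collective verdict per statement by simple
    majority; exact ties come out as None (undecided)."""
    tally = {}
    for opinion in opinions:
        for statement, agrees in opinion.items():
            yes, no = tally.get(statement, (0, 0))
            tally[statement] = (yes + 1, no) if agrees else (yes, no + 1)
    return {s: (True if yes > no else False if no > yes else None)
            for s, (yes, no) in tally.items()}

collective = majority_aggregate([
    {"a": True, "b": False},
    {"a": True, "b": True},
    {"a": False},  # participants need not opine on every statement
])
```

Statement "a" is collectively accepted (2 vs. 1) while "b" is left undecided (1 vs. 1); the paper's contribution is precisely to characterize when richer aggregators of this kind yield coherent collective opinions.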

The Complexity of Matching Games: A Survey

Journal of Artificial Intelligence Research

Matching games naturally generalize assignment games, a well-known class of cooperative games. Interest in matching games has grown recently due to some breakthrough results and new applications. This state-of-the-art survey provides an overview of matching games and extensions, such as b-matching games and partitioned matching games; the latter originating from the emerging area of international kidney exchange. In this survey we focus on computational complexity aspects of various game-theoretical solution concepts, such as the core, nucleolus and Shapley value, when the input is restricted to a matching game or one of its variants.
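One of the solution concepts the survey treats, the Shapley value, can be computed by brute force for tiny games by averaging each player's marginal contribution over all arrival orders. The factorial blow-up of this enumeration is exactly why the complexity results surveyed matter. A self-contained sketch on a two-player game where only the matched pair creates value:

```python
from itertools import permutations

def shapley_values(players, value):
    """Shapley value by enumeration: for each ordering of players,
    credit each player with the marginal worth it adds on arrival,
    then average over all orderings. `value` maps a frozenset
    coalition to its worth."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}

# Two players on one edge of weight 1: only the pair has worth.
v = lambda s: 1.0 if len(s) == 2 else 0.0
phi = shapley_values(["u", "w"], v)
```

By symmetry both endpoints receive half the edge weight, matching the intuition that the Shapley value splits a matched pair's surplus evenly.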

Towards Green Automated Machine Learning: Status Quo and Future Directions

Journal of Artificial Intelligence Research

Automated machine learning (AutoML) strives for the automatic configuration of machine learning algorithms and their composition into an overall (software) solution — a machine learning pipeline — tailored to the learning task (dataset) at hand. Over the last decade, AutoML has developed into an independent research field with hundreds of contributions. At the same time, AutoML has been criticized for its high resource consumption, as many approaches rely on the (costly) evaluation of many machine learning pipelines, as well as on expensive large-scale experiments across many datasets and approaches. In the spirit of recent work on Green AI, this paper proposes Green AutoML, a paradigm for making the whole AutoML process more environmentally friendly. To this end, we first elaborate on how to quantify the environmental footprint of an AutoML tool. We then summarize strategies for designing and benchmarking AutoML tools with respect to their “greenness”, i.e., sustainability. Finally, we elaborate on how to be transparent about the environmental footprint and on what kinds of research incentives could direct the community toward more sustainable AutoML research. As part of this, we propose a sustainability checklist, covering all core aspects of Green AutoML, to be attached to every AutoML paper.
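Quantifying the environmental footprint of an AutoML run typically reduces to estimating energy consumed and multiplying by the grid's carbon intensity. A back-of-the-envelope sketch, where every default below (power draw, PUE, grid intensity) is an illustrative placeholder rather than a measured value; tools such as CodeCarbon automate the measurement side:

```python
def carbon_footprint_kg(gpu_hours, gpu_power_watts=300.0, pue=1.5,
                        grid_intensity_kg_per_kwh=0.4):
    """Rough CO2 estimate for a compute job: energy drawn by the
    hardware, scaled up by the data center's power usage
    effectiveness (PUE), times the grid's carbon intensity
    in kg CO2 per kWh."""
    energy_kwh = gpu_hours * gpu_power_watts / 1000.0 * pue
    return energy_kwh * grid_intensity_kg_per_kwh

# 10 GPU-hours at 300 W, PUE 1.5, 0.4 kg/kWh: 4.5 kWh and 1.8 kg CO2.
estimate = carbon_footprint_kg(10)
```

Even this crude model makes the survey's point concrete: an AutoML search that evaluates hundreds of candidate pipelines multiplies this figure accordingly, which is what a sustainability checklist would have authors report.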

Repairing the Cracked Foundation: A Survey of Obstacles in Evaluation Practices for Generated Text

Journal of Artificial Intelligence Research

Evaluation practices in natural language generation (NLG) have many known flaws, but improved evaluation approaches are rarely widely adopted. This issue has become more urgent, since neural generation models have improved to the point where their outputs can often no longer be distinguished based on the surface-level features that older metrics rely on. This paper surveys the issues with human and automatic model evaluations, and with commonly used datasets in NLG, that have been pointed out over the past 20 years. We summarize, categorize, and discuss how researchers have been addressing these issues and what their findings mean for the current state of model evaluations. Building on those insights, we lay out a long-term vision for evaluation research and propose concrete steps for researchers to improve their evaluation processes. Finally, we analyze 66 generation papers from recent NLP conferences, assessing how well they already follow these suggestions, and identify which areas require more drastic changes to the status quo.
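A recurring recommendation in this literature is to report chance-corrected agreement for human evaluations rather than raw agreement. Cohen's kappa for two annotators is the standard choice; a minimal sketch (it assumes the annotators are not in perfect chance agreement, which would make the denominator zero):

```python
from collections import Counter

def cohens_kappa(ann1, ann2):
    """Chance-corrected agreement between two annotators labeling the
    same items: (observed agreement - expected agreement under
    chance) / (1 - expected agreement)."""
    n = len(ann1)
    observed = sum(a == b for a, b in zip(ann1, ann2)) / n
    c1, c2 = Counter(ann1), Counter(ann2)
    # Expected agreement if each annotator labeled independently
    # according to their own marginal label distribution.
    expected = sum((c1[label] / n) * (c2[label] / n)
                   for label in set(ann1) | set(ann2))
    return (observed - expected) / (1.0 - expected)

kappa = cohens_kappa(["good", "good", "bad", "bad"],
                     ["good", "bad", "bad", "bad"])
```

Here raw agreement is 0.75 but kappa is only 0.5, illustrating why the raw figure alone overstates annotator reliability.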

FlexiBERT: Are Current Transformer Architectures too Homogeneous and Rigid?

Journal of Artificial Intelligence Research

The existence of a plethora of language models makes the problem of selecting the best one for a custom task challenging. Most state-of-the-art methods leverage transformer-based models (e.g., BERT) or their variants. However, training such models and exploring their hyperparameter space is computationally expensive. Prior work proposes several neural architecture search (NAS) methods that employ performance predictors (e.g., surrogate models) to address this issue; however, such works limit analysis to homogeneous models that use a fixed dimensionality throughout the network, which leads to sub-optimal architectures. To address this limitation, we propose a suite of heterogeneous and flexible models, namely FlexiBERT, whose varied encoder layers draw on a diverse set of possible operations and different hidden dimensions. For better-posed surrogate modeling in this expanded design space, we propose a new graph-similarity-based embedding scheme. We also propose a novel NAS policy, called BOSHNAS, that leverages this new scheme, Bayesian modeling, and second-order optimization to quickly train and use a neural surrogate model that converges to the optimal architecture. A comprehensive set of experiments shows that the proposed policy, when applied to the FlexiBERT design space, pushes the performance frontier upwards compared to traditional models. FlexiBERT-Mini, one of our proposed models, has 3% fewer parameters than BERT-Mini and achieves an 8.9% higher GLUE score. A FlexiBERT model with performance equivalent to that of the best homogeneous model is 2.6× smaller. FlexiBERT-Large, another proposed model, attains state-of-the-art results, outperforming the baseline models by at least 5.7% on the GLUE benchmark.