Overview
Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges
Bischl, Bernd, Binder, Martin, Lang, Michel, Pielok, Tobias, Richter, Jakob, Coors, Stefan, Thomas, Janek, Ullmann, Theresa, Becker, Marc, Boulesteix, Anne-Laure, Deng, Difan, Lindauer, Marius
Most machine learning algorithms are configured by one or several hyperparameters that must be carefully chosen and often considerably impact performance. To avoid a time consuming and unreproducible manual trial-and-error process to find well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods, e.g., based on resampling error estimation for supervised machine learning, can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband and racing. It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization.
Forgetting in Answer Set Programming -- A Survey
Gonçalves, Ricardo, Knorr, Matthias, Leite, João
Forgetting - or variable elimination - is an operation that allows the removal, from a knowledge base, of middle variables no longer deemed relevant. In recent years, many different approaches for forgetting in Answer Set Programming have been proposed, in the form of specific operators, or classes of such operators, commonly following different principles and obeying different properties. Each such approach was developed to somehow address some particular view on forgetting, aimed at obeying a specific set of properties deemed desirable in such view, but a comprehensive and uniform overview of all the existing operators and properties is missing. In this paper, we thoroughly examine existing properties and (classes of) operators for forgetting in Answer Set Programming, drawing a complete picture of the landscape of these classes of forgetting operators, which includes many novel results on relations between properties and operators, including considerations on concrete operators to compute results of forgetting and computational complexity. Our goal is to provide guidance to help users in choosing the operator most adequate for their application requirements.
A Survey on Data Augmentation for Text Classification
Bayer, Markus, Kaufhold, Marc-André, Reuter, Christian
Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines. While it is useful for increasing the generalization capabilities of a model, it can also address many other challenges and problems, from overcoming a limited amount of training data over regularizing the objective to limiting the amount data used to protect privacy. Based on a precise description of the goals and applications of data augmentation (C1) and a taxonomy for existing works (C2), this survey is concerned with data augmentation methods for textual classification and aims to achieve a concise and comprehensive overview for researchers and practitioners (C3). Derived from the taxonomy, we divided more than 100 methods into 12 different groupings and provide state-of-the-art references expounding which methods are highly promising (C4). Finally, research perspectives that may constitute a building block for future work are given (C5).
Two-thirds of Americans, 227 million, play video games. For many games were an escape, stress relief in pandemic
Yes, we did play more video games during the coronavirus pandemic. Hey, when you are asked to stay at home and social distance as a way to stop or at least slow the spread of COVID-19, who could blame you for bingeing on "Animal Crossing," "Call of Duty" or "Fortnite." More than half of players (55%) said they played more games during the pandemic, and most players (90%) said they will continue playing after the country opens up, according to a survey of 4,000 U.S. adults conducted by market research firm Ipsos in February 2021 for the Entertainment Software Association. For players during the pandemic, video games were a source of stress relief (55%) and distraction (48%), the survey found. Video games also served as an escape and a break for children, 71% of parents surveyed said.
Indian Legal NLP Benchmarks : A Survey
Kalamkar, Prathamesh, D., Janani Venugopalan Ph., D, Vivek Raghavan Ph.
Availability of challenging benchmarks is the key to advancement of AI in a specific field.Since Legal Text is significantly different than normal English text, there is a need to create separate Natural Language Processing benchmarks for Indian Legal Text which are challenging and focus on tasks specific to Legal Systems. This will spur innovation in applications of Natural language Processing for Indian Legal Text and will benefit AI community and Legal fraternity. We review the existing work in this area and propose ideas to create new benchmarks for Indian Legal Natural Language Processing.
A Classification of Artificial Intelligence Systems for Mathematics Education
Van Vaerenbergh, Steven, Pérez-Suay, Adrián
This chapter provides an overview of the different Artificial Intelligence (AI) systems that are being used in contemporary digital tools for Mathematics Education (ME). It is aimed at researchers in AI and Machine Learning (ML), for whom we shed some light on the specific technologies that are being used in educational applications; and at researchers in ME, for whom we clarify: i) what the possibilities of the current AI technologies are, ii) what is still out of reach and iii) what is to be expected in the near future. We start our analysis by establishing a high-level taxonomy of AI tools that are found as components in digital ME applications. Then, we describe in detail how these AI tools, and in particular ML, are being used in two key applications, specifically AI-based calculators and intelligent tutoring systems. We finish the chapter with a discussion about student modeling systems and their relationship to artificial general intelligence.
Region attention and graph embedding network for occlusion objective class-based micro-expression recognition
Mao, Qirong, Zhou, Ling, Zheng, Wenming, Shao, Xiuyan, Huang, Xiaohua
Micro-expression recognition (\textbf{MER}) has attracted lots of researchers' attention in a decade. However, occlusion will occur for MER in real-world scenarios. This paper deeply investigates an interesting but unexplored challenging issue in MER, \ie, occlusion MER. First, to research MER under real-world occlusion, synthetic occluded micro-expression databases are created by using various mask for the community. Second, to suppress the influence of occlusion, a \underline{R}egion-inspired \underline{R}elation \underline{R}easoning \underline{N}etwork (\textbf{RRRN}) is proposed to model relations between various facial regions. RRRN consists of a backbone network, the Region-Inspired (\textbf{RI}) module and Relation Reasoning (\textbf{RR}) module. More specifically, the backbone network aims at extracting feature representations from different facial regions, RI module computing an adaptive weight from the region itself based on attention mechanism with respect to the unobstructedness and importance for suppressing the influence of occlusion, and RR module exploiting the progressive interactions among these regions by performing graph convolutions. Experiments are conducted on handout-database evaluation and composite database evaluation tasks of MEGC 2018 protocol. Experimental results show that RRRN can significantly explore the importance of facial regions and capture the cooperative complementary relationship of facial regions for MER. The results also demonstrate RRRN outperforms the state-of-the-art approaches, especially on occlusion, and RRRN acts more robust to occlusion.
Explainable AI: current status and future directions
Gohel, Prashant, Singh, Priyanka, Mohanty, Manoranjan
Explainable Artificial Intelligence (XAI) is an emerging area of research in the field of Artificial Intelligence (AI). XAI can explain how AI obtained a particular solution (e.g., classification or object detection) and can also answer other "wh" questions. This explainability is not possible in traditional AI. Explainability is essential for critical applications, such as defense, health care, law and order, and autonomous driving vehicles, etc, where the know-how is required for trust and transparency. A number of XAI techniques so far have been purposed for such applications. This paper provides an overview of these techniques from a multimedia (i.e., text, image, audio, and video) point of view. The advantages and shortcomings of these techniques have been discussed, and pointers to some future directions have also been provided.
Likelihood estimation of sparse topic distributions in topic models and its applications to Wasserstein document distance calculations
Bing, Xin, Bunea, Florentina, Strimas-Mackey, Seth, Wegkamp, Marten
This paper studies the estimation of high-dimensional, discrete, possibly sparse, mixture models in topic models. The data consists of observed multinomial counts of $p$ words across $n$ independent documents. In topic models, the $p\times n$ expected word frequency matrix is assumed to be factorized as a $p\times K$ word-topic matrix $A$ and a $K\times n$ topic-document matrix $T$. Since columns of both matrices represent conditional probabilities belonging to probability simplices, columns of $A$ are viewed as $p$-dimensional mixture components that are common to all documents while columns of $T$ are viewed as the $K$-dimensional mixture weights that are document specific and are allowed to be sparse. The main interest is to provide sharp, finite sample, $\ell_1$-norm convergence rates for estimators of the mixture weights $T$ when $A$ is either known or unknown. For known $A$, we suggest MLE estimation of $T$. Our non-standard analysis of the MLE not only establishes its $\ell_1$ convergence rate, but reveals a remarkable property: the MLE, with no extra regularization, can be exactly sparse and contain the true zero pattern of $T$. We further show that the MLE is both minimax optimal and adaptive to the unknown sparsity in a large class of sparse topic distributions. When $A$ is unknown, we estimate $T$ by optimizing the likelihood function corresponding to a plug in, generic, estimator $\hat{A}$ of $A$. For any estimator $\hat{A}$ that satisfies carefully detailed conditions for proximity to $A$, the resulting estimator of $T$ is shown to retain the properties established for the MLE. The ambient dimensions $K$ and $p$ are allowed to grow with the sample sizes. Our application is to the estimation of 1-Wasserstein distances between document generating distributions. We propose, estimate and analyze new 1-Wasserstein distances between two probabilistic document representations.
Recent Advances in Leveraging Human Guidance for Sequential Decision-Making Tasks
Zhang, Ruohan, Torabi, Faraz, Warnell, Garrett, Stone, Peter
With respect to artificial learning agents in particular, humans must provide some specification of what the agent should learn to perform. One method by which humans typically provide this specification is by designing a stationary reward function. This function provides a reward to the agent when it correctly performs the desired task and, perhaps, punishment when the agent does not. Artificial learning agents may then approach the task-learning process using reinforcement learning (RL) techniques (Sutton and Barto, 2018) that seek to find a policy (i.e., an explicit function that the agent uses to make decisions) that allows the agent to gather as much reward as possible. Another popular way in which humans specify tasks for artificial agents to learn is by demonstrating the task themselves. Typically, this is accomplished by having the human perform the task while the learning agent observes the actions that the human takes (e.g., the human physically moving a robot arm). In these cases, artificial agents may use approaches from imitation learning (IL) (Schaal, 1999; Argall et al., 2009; Osa et al., 2018) in order to find policies that allow them to perform the demonstrated task.