South America
StreaMRAK a Streaming Multi-Resolution Adaptive Kernel Algorithm
Oslandsbotn, Andreas, Kereta, Zeljko, Naumova, Valeriya, Freund, Yoav, Cloninger, Alexander
Kernel ridge regression (KRR) is a popular scheme for non-linear non-parametric learning. However, existing implementations of KRR require that all the data is stored in the main memory, which severely limits the use of KRR in contexts where data size far exceeds the memory size. Such applications are increasingly common in data mining, bioinformatics, and control. A powerful paradigm for computing on data sets that are too large for memory is the streaming model of computation, where we process one data sample at a time, discarding each sample before moving on to the next one. In this paper, we propose StreaMRAK - a streaming version of KRR. StreaMRAK improves on existing KRR schemes by dividing the problem into several levels of resolution, which allows continual refinement to the predictions. The algorithm reduces the memory requirement by continuously and efficiently integrating new samples into the training model. With a novel sub-sampling scheme, StreaMRAK reduces memory and computational complexities by creating a sketch of the original data, where the sub-sampling density is adapted to the bandwidth of the kernel and the local dimensionality of the data. We present a showcase study on two synthetic problems and the prediction of the trajectory of a double pendulum. The results show that the proposed algorithm is fast and accurate.
SS-BERT: Mitigating Identity Terms Bias in Toxic Comment Classification by Utilising the Notion of "Subjectivity" and "Identity Terms"
Zhao, Zhixue, Zhang, Ziqi, Hopfgartner, Frank
Toxic comment classification models are often found biased toward identity terms which are terms characterizing a specific group of people such as "Muslim" and "black". Such bias is commonly reflected in false-positive predictions, i.e. non-toxic comments with identity terms. In this work, we propose a novel approach to tackle such bias in toxic comment classification, leveraging the notion of subjectivity level of a comment and the presence of identity terms. We hypothesize that when a comment is made about a group of people that is characterized by an identity term, the likelihood of that comment being toxic is associated with the subjectivity level of the comment, i.e. the extent to which the comment conveys personal feelings and opinions. Building upon the BERT model, we propose a new structure that is able to leverage these features, and thoroughly evaluate our model on 4 datasets of varying sizes and representing different social media platforms. The results show that our model can consistently outperform BERT and a SOTA model devised to address identity term bias in a different way, with a maximum improvement in F1 of 2.43% and 1.91% respectively.
The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers
Csordás, Róbert, Irie, Kazuki, Schmidhuber, Jürgen
Recently, many datasets have been proposed to test the systematic generalization ability of neural networks. The companion baseline Transformers, typically trained with default hyper-parameters from standard tasks, are shown to fail dramatically. Here we demonstrate that by revisiting model configurations as basic as scaling of embeddings, early stopping, relative positional embedding, and Universal Transformer variants, we can drastically improve the performance of Transformers on systematic generalization. We report improvements on five popular datasets: SCAN, CFQ, PCFG, COGS, and Mathematics dataset. Our models improve accuracy from 50% to 85% on the PCFG productivity split, and from 35% to 81% on COGS. On SCAN, relative positional embedding largely mitigates the EOS decision problem (Newman et al., 2020), yielding 100% accuracy on the length split with a cutoff at 26. Importantly, performance differences between these models are typically invisible on the IID data split. This calls for proper generalization validation sets for developing neural networks that generalize systematically. We publicly release the code to reproduce our results.
Image In painting Applied to Art Completing Escher's Print Gallery
Cipolina-Kun, Lucia, Caenazzo, Simone, Mazzei, Gaston, Menon, Aditya Srinivas
This extended abstract presents the first stages of a research on in-painting suited for art reconstruction. We introduce M.C Eschers Print Gallery lithography as a use case example. This artwork presents a void on its center and additionally, it follows a challenging mathematical structure that needs to be preserved by the in-painting method. We present our work so far and our future line of research.
CX-ToM: Counterfactual Explanations with Theory-of-Mind for Enhancing Human Trust in Image Recognition Models
Akula, Arjun R., Wang, Keze, Liu, Changsong, Saba-Sadiya, Sari, Lu, Hongjing, Todorovic, Sinisa, Chai, Joyce, Zhu, Song-Chun
We propose CX-ToM, short for counterfactual explanations with theory-of mind, a new explainable AI (XAI) framework for explaining decisions made by a deep convolutional neural network (CNN). In contrast to the current methods in XAI that generate explanations as a single shot response, we pose explanation as an iterative communication process, i.e. dialog, between the machine and human user. More concretely, our CX-ToM framework generates sequence of explanations in a dialog by mediating the differences between the minds of machine and human user. To do this, we use Theory of Mind (ToM) which helps us in explicitly modeling human's intention, machine's mind as inferred by the human as well as human's mind as inferred by the machine. Moreover, most state-of-the-art XAI frameworks provide attention (or heat map) based explanations. In our work, we show that these attention based explanations are not sufficient for increasing human trust in the underlying CNN model. In CX-ToM, we instead use counterfactual explanations called fault-lines which we define as follows: given an input image I for which a CNN classification model M predicts class c_pred, a fault-line identifies the minimal semantic-level features (e.g., stripes on zebra, pointed ears of dog), referred to as explainable concepts, that need to be added to or deleted from I in order to alter the classification category of I by M to another specified class c_alt. We argue that, due to the iterative, conceptual and counterfactual nature of CX-ToM explanations, our framework is practical and more natural for both expert and non-expert users to understand the internal workings of complex deep learning models. Extensive quantitative and qualitative experiments verify our hypotheses, demonstrating that our CX-ToM significantly outperforms the state-of-the-art explainable AI models.
Fairness via AI: Bias Reduction in Medical Information
Dori-Hacohen, Shiri, Montenegro, Roberto, Murai, Fabricio, Hale, Scott A., Sung, Keen, Blain, Michela, Edwards-Johnson, Jennifer
Most Fairness in AI research focuses on exposing biases in AI systems. A broader lens on fairness reveals that AI can serve a greater aspiration: rooting out societal inequities from their source. Specifically, we focus on inequities in health information, and aim to reduce bias in that domain using AI. The AI algorithms under the hood of search engines and social media, many of which are based on recommender systems, have an outsized impact on the quality of medical and health information online. Therefore, embedding bias detection and reduction into these recommender systems serving up medical and health content online could have an outsized positive impact on patient outcomes and wellbeing. In this position paper, we offer the following contributions: (1) we propose a novel framework of Fairness via AI, inspired by insights from medical education, sociology and antiracism; (2) we define a new term, bisinformation, which is related to, but distinct from, misinformation, and encourage researchers to study it; (3) we propose using AI to study, detect and mitigate biased, harmful, and/or false health information that disproportionately hurts minority groups in society; and (4) we suggest several pillars and pose several open problems in order to seed inquiry in this new space. While part (3) of this work specifically focuses on the health domain, the fundamental computer science advances and contributions stemming from research efforts in bias reduction and Fairness via AI have broad implications in all areas of society.
FBCNN: A Deep Neural Network Architecture for Portable and Fast Brain-Computer Interfaces
Bassi, Pedro R. A. S., Attux, Romis
Objective: To propose a novel deep neural network (DNN) architecture -- the filter bank convolutional neural network (FBCNN) -- to improve SSVEP classification in single-channel BCIs with small data lengths. Methods: We propose two models: the FBCNN-2D and the FBCNN-3D. The FBCNN-2D utilizes a filter bank to create sub-band components of the electroencephalography (EEG) signal, which it transforms using the fast Fourier transform (FFT) and analyzes with a 2D CNN. The FBCNN-3D utilizes the same filter bank, but it transforms the sub-band components into spectrograms via short-time Fourier transform (STFT), and analyzes them with a 3D CNN. We made use of transfer learning. To train the FBCNN-3D, we proposed a new technique, called inter-dimensional transfer learning, to transfer knowledge from a 2D DNN to a 3D DNN. Our BCI was conceived so as not to require calibration from the final user: therefore, the test subject data was separated from training and validation. Results: The mean test accuracy was 85.7% for the FBCCA-2D and 85% for the FBCCA-3D. Mean F1-Scores were 0.858 and 0.853. Alternative classification methods, SVM, FBCCA and a CNN, had mean accuracy of 79.2%, 80.1% and 81.4%, respectively. Conclusion: The FBCNNs surpassed traditional SSVEP classification methods in our simulated BCI, by a considerable margin (about 5% higher accuracy). Transfer learning and inter-dimensional transfer learning made training much faster and more predictable. Significance: We proposed a new and flexible type of DNN, which had a better performance than standard methods in SSVEP classification for portable and fast BCIs.
Super-resolution data assimilation
Barthélémy, Sébastien, Brajard, Julien, Bertino, Laurent, Counillon, François
Increasing the resolution of a model can improve the performance of a data assimilation system: first because model field are in better agreement with high resolution observations, then the corrections are better sustained and, with ensemble data assimilation, the forecast error covariances are improved. However, resolution increase is associated with a cubical increase of the computational costs. Here we are testing an approach inspired from images super-resolution techniques and called "Super-resolution data assimilation" (SRDA). Starting from a low-resolution forecast, a neural network (NN) emulates a high-resolution field that is then used to assimilate high-resolution observations. We apply the SRDA to a quasi-geostrophic model representing simplified surface ocean dynamics, with a model resolution up to four times lower than the reference high-resolution and we use the Ensemble Kalman Filter data assimilation method. We show that SRDA outperforms the low-resolution data assimilation approach and a SRDA version with cubic spline interpolation instead of NN. The NN's ability to anticipate the systematic differences between low and high resolution model dynamics explains the enhanced performance, for example by correcting the difference of propagation speed of eddies. Increasing the computational cost by 55\% above the LR data assimilation system (using a 25-members ensemble), the SRDA reduces the errors by 40\% making the performance very close to the HR system (16\% larger, compared to 92\% larger for the LR EnKF). The reliability of the ensemble system is not degraded by SRDA.
Representation Learning for Efficient and Effective Similarity Search and Recommendation
How data is represented and operationalized is critical for building computational solutions that are both effective and efficient. A common approach is to represent data objects as binary vectors, denoted \textit{hash codes}, which require little storage and enable efficient similarity search through direct indexing into a hash table or through similarity computations in an appropriate space. Due to the limited expressibility of hash codes, compared to real-valued representations, a core open challenge is how to generate hash codes that well capture semantic content or latent properties using a small number of bits, while ensuring that the hash codes are distributed in a way that does not reduce their search efficiency. State of the art methods use representation learning for generating such hash codes, focusing on neural autoencoder architectures where semantics are encoded into the hash codes by learning to reconstruct the original inputs of the hash codes. This thesis addresses the above challenge and makes a number of contributions to representation learning that (i) improve effectiveness of hash codes through more expressive representations and a more effective similarity measure than the current state of the art, namely the Hamming distance, and (ii) improve efficiency of hash codes by learning representations that are especially suited to the choice of search method. The contributions are empirically validated on several tasks related to similarity search and recommendation.
Stimuli-Aware Visual Emotion Analysis
Yang, Jingyuan, Li, Jie, Wang, Xiumei, Ding, Yuxuan, Gao, Xinbo
Visual emotion analysis (VEA) has attracted great attention recently, due to the increasing tendency of expressing and understanding emotions through images on social networks. Different from traditional vision tasks, VEA is inherently more challenging since it involves a much higher level of complexity and ambiguity in human cognitive process. Most of the existing methods adopt deep learning techniques to extract general features from the whole image, disregarding the specific features evoked by various emotional stimuli. Inspired by the \textit{Stimuli-Organism-Response (S-O-R)} emotion model in psychological theory, we proposed a stimuli-aware VEA method consisting of three stages, namely stimuli selection (S), feature extraction (O) and emotion prediction (R). First, specific emotional stimuli (i.e., color, object, face) are selected from images by employing the off-the-shelf tools. To the best of our knowledge, it is the first time to introduce stimuli selection process into VEA in an end-to-end network. Then, we design three specific networks, i.e., Global-Net, Semantic-Net and Expression-Net, to extract distinct emotional features from different stimuli simultaneously. Finally, benefiting from the inherent structure of Mikel's wheel, we design a novel hierarchical cross-entropy loss to distinguish hard false examples from easy ones in an emotion-specific manner. Experiments demonstrate that the proposed method consistently outperforms the state-of-the-art approaches on four public visual emotion datasets. Ablation study and visualizations further prove the validity and interpretability of our method.