Estimating Amazon Carbon Stock Using AI-based Remote Sensing

Communications of the ACM

Forests are the major terrestrial ecosystem responsible for carbon sequestration and storage. The Amazon rainforest is the world's largest tropical rainforest, encompassing up to 2,124,000 square miles across nine South American countries. The majority of that area (69%) lies in Brazil. Amazonia holds about 20% of the total carbon contained in the world's terrestrial vegetation [1,5,7]. But rampant deforestation, driven by illegal logging, mining, cattle ranching, and soy plantations, threatens the vast region.


Understanding and Improving Fast Adversarial Training

arXiv.org Machine Learning

A recent line of work has focused on making adversarial training computationally efficient for deep learning models. In particular, Wong et al. (2020) showed that $\ell_\infty$-adversarial training with the fast gradient sign method (FGSM) can fail due to a phenomenon called "catastrophic overfitting", in which the model loses its robustness within a single epoch of training. We show that adding a random step to FGSM, as proposed in Wong et al. (2020), does not prevent catastrophic overfitting, and that randomness is not important per se; its main role is simply to reduce the magnitude of the perturbation. Moreover, we show that catastrophic overfitting is not inherent to deep and overparametrized networks, but can occur in a single-layer convolutional network with a few filters. In an extreme case, even a single filter can make the network highly non-linear locally, which is the main reason why FGSM training fails. Based on this observation, we propose a new regularization method, GradAlign, that prevents catastrophic overfitting by explicitly maximizing the gradient alignment inside the perturbation set and improves the quality of the FGSM solution. As a result, GradAlign makes it possible to apply FGSM training successfully to larger $\ell_\infty$-perturbations as well, and reduces the gap to multi-step adversarial training. The code of our experiments is available at https://github.com/tml-epfl/understanding-fast-adv-training.
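To make the two ingredients above concrete, here is a minimal pure-Python sketch of a single FGSM step and of one way to measure gradient alignment. It is an illustration under stated assumptions, not the authors' implementation (their code is in the repository linked above): the logistic-regression model, the function names, and the choice of cosine similarity between the gradient at the clean input and at a uniformly perturbed input are all assumed here.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def input_gradient(w, b, x, y):
    """Gradient of the logistic loss w.r.t. the input x for a linear model (toy stand-in)."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]

def fgsm_perturb(x, grad, eps):
    """Single-step FGSM: shift each coordinate by eps in the direction of the gradient sign."""
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def alignment_penalty(grad_fn, x, eps, rng):
    """Assumed form of an alignment penalty: 1 - cos(grad at x, grad at x + eta),
    with eta drawn uniformly from the l_inf ball of radius eps."""
    eta = [rng.uniform(-eps, eps) for _ in x]
    g_clean = grad_fn(x)
    g_noisy = grad_fn([xi + ei for xi, ei in zip(x, eta)])
    return 1.0 - cosine(g_clean, g_noisy)

# Hypothetical toy values, chosen only for illustration.
w, b = [1.0, -2.0], 0.0
x, y = [0.5, -0.2], 1
grad = input_gradient(w, b, x, y)
x_adv = fgsm_perturb(x, grad, eps=0.1)
penalty = alignment_penalty(lambda z: input_gradient(w, b, z, y), x, 0.1, random.Random(0))
```

For this linear toy model the input gradient always points along `w`, so the penalty is zero up to rounding. That matches the abstract's point that local non-linearity is what makes the single-step solution unreliable: a penalty of this kind only becomes active when the gradient direction changes inside the perturbation set.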


Efficiently Mitigating Classification Bias via Transfer Learning

arXiv.org Machine Learning

Prediction bias in machine learning models refers to unintended model behaviors that discriminate against inputs mentioning or produced by certain groups; for example, hate speech classifiers predict more false positives for neutral text mentioning specific social groups. Mitigating bias for each task or domain is inefficient, as it requires repetitive model training, data annotation (e.g., demographic information), and evaluation. In pursuit of a more accessible solution, we propose the Upstream Bias Mitigation for Downstream Fine-Tuning (UBM) framework, which mitigates one or multiple bias factors in downstream classifiers by transfer learning from an upstream model. In the upstream bias mitigation stage, explanation regularization and adversarial training are applied to mitigate multiple bias factors. In the downstream fine-tuning stage, the classifier layer of the model is re-initialized, and the entire model is fine-tuned to downstream tasks in potentially novel domains without any further bias mitigation. We expect downstream classifiers to be less biased by transfer learning from de-biased upstream models. We conduct extensive experiments varying the similarity between the source and target data, as well as varying the number of dimensions of bias (e.g., discrimination against specific social groups or dialects). Our results indicate the proposed UBM framework can effectively reduce bias in downstream classifiers.
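The false-positive example in the abstract can be turned into a simple measurement. The sketch below is an assumption made for illustration, not the evaluation protocol of the paper: it compares the false positive rate on neutral (label 0) texts that mention a group with the rate on neutral texts that do not. The function names and the toy data are hypothetical.

```python
def false_positive_rate(labels, preds):
    """Share of true negatives (label 0) that the classifier flags as positive."""
    negative_preds = [p for y, p in zip(labels, preds) if y == 0]
    if not negative_preds:
        raise ValueError("no negative examples to compute a false positive rate")
    return sum(negative_preds) / len(negative_preds)

def fpr_gap(labels, preds, mentions_group):
    """Difference in false positive rate: texts mentioning the group minus the rest."""
    in_group = [(y, p) for y, p, g in zip(labels, preds, mentions_group) if g]
    others = [(y, p) for y, p, g in zip(labels, preds, mentions_group) if not g]
    fpr_in = false_positive_rate(*zip(*in_group))
    fpr_out = false_positive_rate(*zip(*others))
    return fpr_in - fpr_out

# Hypothetical toy data: 1 = flagged as hate speech, 0 = neutral.
labels = [0, 0, 0, 0, 1, 1]
preds = [1, 1, 0, 0, 1, 0]
mentions_group = [True, True, False, False, True, False]
gap = fpr_gap(labels, preds, mentions_group)
```

A gap near zero would suggest that neutral text is treated similarly whether or not it mentions the group; a large positive gap is the kind of bias the abstract describes.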



Adam with Bandit Sampling for Deep Learning

arXiv.org Machine Learning

Adam is a widely used optimization method for training deep learning models. It computes individual adaptive learning rates for different parameters. In this paper, we propose a generalization of Adam, called Adambs, that allows us to also adapt to different training examples based on their importance to the model's convergence. To achieve this, we maintain a distribution over all examples, selecting a mini-batch in each iteration by sampling according to this distribution, which we update using a multi-armed bandit algorithm. This ensures that examples that are more beneficial to the model training are sampled with higher probabilities. We theoretically show that Adambs improves the convergence rate of Adam, achieving $O(\sqrt{\frac{\log n}{T}})$ instead of $O(\sqrt{\frac{n}{T}})$ in some cases. Experiments on various models and datasets demonstrate Adambs's fast convergence in practice.
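The sampling loop described above can be sketched as follows. This is a minimal illustration, not the Adambs algorithm itself: the abstract does not say which bandit algorithm or reward signal is used, so the EXP3-style exponential-weights update, the `reward` values, and the class and parameter names (`BanditSampler`, `lr`, `mix`) are all assumptions.

```python
import math
import random

class BanditSampler:
    """Keeps a sampling distribution over n training examples (EXP3-style, assumed)."""

    def __init__(self, n, lr=0.1, mix=0.1, seed=0):
        self.log_w = [0.0] * n      # log-weights, one per training example
        self.lr = lr                # step size of the exponential-weights update
        self.mix = mix              # uniform mixing keeps every probability > 0
        self.rng = random.Random(seed)

    def probabilities(self):
        m = max(self.log_w)         # subtract the max for numerical stability
        exp_w = [math.exp(w - m) for w in self.log_w]
        total = sum(exp_w)
        n = len(exp_w)
        return [(1 - self.mix) * e / total + self.mix / n for e in exp_w]

    def sample(self, batch_size):
        """Draw a mini-batch of indices (with replacement) and their probabilities."""
        p = self.probabilities()
        idx = self.rng.choices(range(len(p)), weights=p, k=batch_size)
        return idx, [p[i] for i in idx]

    def update(self, indices, probs, rewards):
        """Importance-weighted update: only sampled examples receive feedback."""
        for i, p_i, r in zip(indices, probs, rewards):
            self.log_w[i] += self.lr * r / p_i

# Hypothetical usage: example 0 is always "useful" (reward 1), the others never are.
sampler = BanditSampler(n=4, seed=0)
for _ in range(50):
    idx, probs = sampler.sample(batch_size=2)
    sampler.update(idx, probs, [1.0 if i == 0 else 0.0 for i in idx])
final_probs = sampler.probabilities()
```

In a real training loop the reward would come from the optimizer (for instance, some measure of how much an example contributed to the update), and the sampled indices would select the mini-batch passed to Adam.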


Differentiable Open-Ended Commonsense Reasoning

arXiv.org Artificial Intelligence

Current commonsense reasoning research mainly focuses on developing models that use commonsense knowledge to answer multiple-choice questions. However, systems designed to answer multiple-choice questions may not be useful in applications that do not provide a small list of possible candidate answers to choose from. As a step towards making commonsense reasoning research more realistic, we propose to study open-ended commonsense reasoning (OpenCSR) -- the task of answering a commonsense question without any pre-defined choices, using as a resource only a corpus of commonsense facts written in natural language. The task is challenging due to a much larger decision space, and because many commonsense questions require multi-hop reasoning. We propose an efficient differentiable model for multi-hop reasoning over knowledge facts, named DrFact. We evaluate our approach on a collection of re-formatted, open-ended versions of popular tests targeting commonsense reasoning, and show that our approach outperforms strong baseline methods by a large margin.


Implicit Variational Inference: the Parameter and the Predictor Space

arXiv.org Artificial Intelligence

Having access to accurate confidence levels along with the predictions makes it possible to determine whether making a decision is worth the risk. Under the Bayesian paradigm, the posterior distribution over parameters is used to capture model uncertainty, valuable information that can be translated into predictive uncertainty. However, computing the posterior distribution for high-capacity predictors, such as neural networks, is generally intractable, making approximate methods such as variational inference a promising alternative. While most methods perform inference in the space of parameters, we explore the benefits of carrying out inference directly in the space of predictors. Relying on a family of distributions given by a deep generative neural network, we present two ways of carrying out variational inference: one in parameter space, one in predictor space. Importantly, the latter requires us to choose a distribution of inputs, thereby allowing us at the same time to explicitly address the question of out-of-distribution uncertainty. We explore from various perspectives the implications of working in the predictor space induced by neural networks as opposed to the parameter space, focusing mainly on the quality of uncertainty estimation for data lying outside of the training distribution. We compare posterior approximations obtained with these two methods to several standard methods and present results showing that variational approximations learned in the predictor space compare favorably with those trained in the parameter space.


High-confidence approach for artificial intelligence-based models

#artificialintelligence

They call it artificial intelligence--not because the intelligence is somehow fake. It's real intelligence, but it's still made by humans. That means AI--a power tool that can add speed, efficiency, insight and accuracy to a researcher's work--has many limitations. It's only as good as the methods and data it has been given. On its own, it doesn't know if information is missing, how much weight to give differing kinds of information or whether the data it draws on is incorrect or corrupted.


Saudi Arabia signs MoUs with IBM, Alibaba and Huawei on AI

#artificialintelligence

SDAIA and Alibaba Cloud announced an MoU to partner in supporting Saudi Arabia's path to developing smart cities through AI, SPA said. "Saudi Arabia's Vision 2030 has clear goals to transform KSA cities into smart ones by unlocking the value of city data as a national asset to realize Vision 2030 aspirations," said Abdullah Bin Sharaf Alghandi, President of SDAIA. SDAIA and Huawei signed an MoU on recognising Arabic language and characters using AI technology, with the help of researchers from the kingdom and Huawei, according to SDAIA's Twitter account. Saudi Arabia's Vision 2030 reform plan is a package of economic and social policies designed to free the kingdom from dependence on oil exports. SDAIA is seeking IBM's help in developing "real use cases" of AI in health, energy and other sectors, as well as training through a strategic relationship, it said.


Top 7 COOLEST Technology Innovations Inspired by Nature

#artificialintelligence

Whenever we hear the words "innovation" or "creativity", we typically think of technology, R&D labs, cutting-edge corporations, and prestigious academic institutions. Despite the ingenuity and engineering ability humans have demonstrated over the past millennia, time and again we fall short on "creativity" when compared to Mother Nature. The examples of how insights from nature can improve, inspire and innovate technology are endless. One ambitious example is an artificial photosynthesis project from 2016, in which an artificial leaf split water into hydrogen and oxygen and, combined with modified bacteria, converted the hydrogen into liquid fuel ten times as efficiently as plants. Out of the hundreds of such nature-inspired innovations, we thought it would be interesting to round up a few awe-inspiring examples.