South America
Truly Sparse Neural Networks at Scale
Curci, Selima, Mocanu, Decebal Constantin, Pechenizkiyi, Mykola
Recently, sparse training methods have started to be established as a de facto approach for training and inference efficiency in artificial neural networks. Yet, this efficiency is just in theory. In practice, everyone uses a binary mask to simulate sparsity since the typical deep learning software and hardware are optimized for dense matrix operations. In this paper, we take an orthogonal approach, and we show that we can train truly sparse neural networks to harvest their full potential. To achieve this goal, we introduce three novel contributions, specially designed for sparse neural networks: (1) a parallel training algorithm and its corresponding sparse implementation from scratch, (2) an activation function with non-trainable parameters to favour the gradient flow, and (3) a hidden neurons importance metric to eliminate redundancies. All in one, we are able to break the record and to train the largest neural network ever trained in terms of representational power -- reaching the bat brain size. The results show that our approach has state-of-the-art performance while opening the path for an environmentally friendly artificial intelligence era.
A Data-Driven Approach to Violin Making
Gonzalez, Sebastian, Salvi, Davide, Baeza, Daniel, Antonacci, Fabio, Sarti, Augusto
Of all the characteristics of a violin, those that concern its shape are probably the most important ones, as the violin maker has complete control over them. Contemporary violin making, however, is still based more on tradition than understanding, and a definitive scientific study of the specific relations that exist between shape and vibrational properties is yet to come and sorely missed. In this article, using standard statistical learning tools, we show that the modal frequencies of violin tops can, in fact, be predicted from geometric parameters, and that artificial intelligence can be successfully applied to traditional violin making. We also study how modal frequencies vary with the thicknesses of the plate (a process often referred to as {\em plate tuning}) and discuss the complexity of this dependency. Finally, we propose a predictive tool for plate tuning, which takes into account material and geometric parameters.
Directive Explanations for Actionable Explainability in Machine Learning Applications
Singh, Ronal, Dourish, Paul, Howe, Piers, Miller, Tim, Sonenberg, Liz, Velloso, Eduardo, Vetere, Frank
This paper investigates the prospects of using directive explanations to assist people in achieving recourse of machine learning decisions. Directive explanations list which specific actions an individual needs to take to achieve their desired outcome. If a machine learning model makes a decision that is detrimental to an individual (e.g. denying a loan application), then it needs to both explain why it made that decision and also explain how the individual could obtain their desired outcome (if possible). At present, this is often done using counterfactual explanations, but such explanations generally do not tell individuals how to act. We assert that counterfactual explanations can be improved by explicitly providing people with actions they could use to achieve their desired goal. This paper makes two contributions. First, we present the results of an online study investigating people's perception of directive explanations. Second, we propose a conceptual model to generate such explanations. Our online study showed a significant preference for directive explanations ($p<0.001$). However, the participants' preferred explanation type was affected by multiple factors, such as individual preferences, social factors, and the feasibility of the directives. Our findings highlight the need for a human-centred and context-specific approach for creating directive explanations.
Towards a reinforcement learning de novo genome assembler
Padovani, Kleber, Xavier, Roberto, Carvalho, Andre, Reali, Anna, Chateau, Annie, Alves, Ronnie
The use of reinforcement learning has proven to be very promising for solving complex activities without human supervision during their learning process. However, their successful applications are predominantly focused on fictional and entertainment problems - such as games. Based on the above, this work aims to shed light on the application of reinforcement learning to solve this relevant real-world problem, the genome assembly. By expanding the only approach found in the literature that addresses this problem, we carefully explored the aspects of intelligent agent learning, performed by the Q-learning algorithm, to understand its suitability to be applied in scenarios whose characteristics are more similar to those faced by real genome projects. The improvements proposed here include changing the previously proposed reward system and including state space exploration optimization strategies based on dynamic pruning and mutual collaboration with evolutionary computing. These investigations were tried on 23 new environments with larger inputs than those used previously. All these environments are freely available on the internet for the evolution of this research by the scientific community. The results suggest consistent performance progress using the proposed improvements, however, they also demonstrate the limitations of them, especially related to the high dimensionality of state and action spaces. We also present, later, the paths that can be traced to tackle genome assembly efficiently in real scenarios considering recent, successfully reinforcement learning applications - including deep reinforcement learning - from other domains dealing with high-dimensional inputs.
A metaheuristic for crew scheduling in a pickup-and-delivery problem with time windows
Lucci, Mauro, Severín, Daniel, Zabala, Paula
A vehicle routing and crew scheduling problem (VRCSP) consists of simultaneously planning the routes of a fleet of vehicles and scheduling the crews, where the vehicle-crew correspondence is not fixed through time. This allows a greater planning flexibility and a more efficient use of the fleet, but in counterpart, a high synchronisation is demanded. In this work, we present a VRCSP where pickup-and-delivery requests with time windows have to be fulfilled over a given planning horizon by using trucks and drivers. Crews can be composed of 1 or 2 drivers and any of them can be relieved in a given set of locations. Moreover, they are allowed to travel among locations with non-company shuttles, at an additional cost that is minimised. As our problem considers distinct routes for trucks and drivers, we have an additional flexibility not contemplated in other previous VRCSP given in the literature where a crew is handled as an indivisible unit. We tackle this problem with a two-stage sequential approach: a set of truck routes is computed in the first stage and a set of driver routes consistent with the truck routes is obtained in the second one. We design and evaluate the performance of a metaheuristic based algorithm for the latter stage. Our algorithm is mainly a GRASP with a perturbation procedure that allows reusing solutions already found in case the search for new solutions becomes difficult. This procedure together with other to repair infeasible solutions allow us to find high-quality solutions on instances of 100 requests spread across 15 cities with a fleet of 12-32 trucks (depending on the planning horizon) in less than an hour. We also conclude that the possibility of carrying an additional driver leads to a decrease of the cost of external shuttles by about 60% on average with respect to individual crews and, in some cases, to remove this cost completely.
Policy Analysis using Synthetic Controls in Continuous-Time
Bellot, Alexis, van der Schaar, Mihaela
Counterfactual estimation using synthetic controls is one of the most successful recent methodological developments in causal inference. Despite its popularity, the current description only considers time series aligned across units and synthetic controls expressed as linear combinations of observed control units. We propose a continuous-time alternative that models the latent counterfactual path explicitly using the formalism of controlled differential equations. This model is directly applicable to the general setting of irregularly-aligned multivariate time series and may be optimized in rich function spaces - thereby substantially improving on some limitations of existing approaches.
A General Framework for the Logical Representation of Combinatorial Exchange Protocols
Mittelmann, Munyque, Bouveret, Sylvain, Perrussel, Laurent
The goal of this paper is to propose a framework for representing and reasoning about the rules governing a combinatorial exchange. Such a framework is at first interest as long as we want to build up digital marketplaces based on auction, a widely used mechanism for automated transactions. Combinatorial exchange is the most general case of auctions, mixing the double and combinatorial variants: agents bid to trade bundles of goods. Hence the framework should fulfill two requirements: (i) it should enable bidders to express their bids on combinations of goods and (ii) it should allow describing the rules governing some market, namely the legal bids, the allocation and payment rules. To do so, we define a logical language in the spirit of the Game Description Language: the Combinatorial Exchange Description Language is the first language for describing combinatorial exchange in a logical framework. The contribution is two-fold: first, we illustrate the general dimension by representing different kinds of protocols, and second, we show how to reason about auction properties in this machine-processable language.
Evaluating the Interpretability of Generative Models by Interactive Reconstruction
Ross, Andrew Slavin, Chen, Nina, Hang, Elisa Zhao, Glassman, Elena L., Doshi-Velez, Finale
For machine learning models to be most useful in numerous sociotechnical systems, many have argued that they must be human-interpretable. However, despite increasing interest in interpretability, there remains no firm consensus on how to measure it. This is especially true in representation learning, where interpretability research has focused on "disentanglement" measures only applicable to synthetic datasets and not grounded in human factors. We introduce a task to quantify the human-interpretability of generative model representations, where users interactively modify representations to reconstruct target instances. On synthetic datasets, we find performance on this task much more reliably differentiates entangled and disentangled models than baseline approaches. On a real dataset, we find it differentiates between representation learning methods widely believed but never shown to produce more or less interpretable models. In both cases, we ran small-scale think-aloud studies and large-scale experiments on Amazon Mechanical Turk to confirm that our qualitative and quantitative results agreed.
"Is depression related to cannabis?": A knowledge-infused model for Entity and Relation Extraction with Limited Supervision
Roy, Kaushik, Lokala, Usha, Khandelwal, Vedant, Sheth, Amit
With strong marketing advocacy of the benefits of cannabis use for improved mental health, cannabis legalization is a priority among legislators. However, preliminary scientific research does not conclusively associate cannabis with improved mental health. In this study, we explore the relationship between depression and consumption of cannabis in a targeted social media corpus involving personal use of cannabis with the intent to derive its potential mental health benefit. We use tweets that contain an association among three categories annotated by domain experts - Reason, Effect, and Addiction. The state-of-the-art Natural Langauge Processing techniques fall short in extracting these relationships between cannabis phrases and the depression indicators. We seek to address the limitation by using domain knowledge; specifically, the Drug Abuse Ontology for addiction augmented with Diagnostic and Statistical Manual of Mental Disorders lexicons for mental health. Because of the lack of annotations due to the limited availability of the domain experts' time, we use supervised contrastive learning in conjunction with GPT-3 trained on a vast corpus to achieve improved performance even with limited supervision. Experimental results show that our method can significantly extract cannabis-depression relationships better than the state-of-the-art relation extractor. High-quality annotations can be provided using a nearest neighbor approach using the learned representations that can be used by the scientific community to understand the association between cannabis and depression better.
Single Image Non-uniform Blur Kernel Estimation via Adaptive Basis Decomposition
Carbajal, Guillermo, Vitoria, Patricia, Delbracio, Mauricio, Musé, Pablo, Lezama, José
Characterizing and removing motion blur caused by camera shake or object motion remains an important task for image restoration. In recent years, removal of motion blur in photographs has seen impressive progress in the hands of deep learning-based methods, trained to map directly from blurry to sharp images. Characterization of motion blur, on the other hand, has received less attention and progress in model-based methods for restoration lags behind that of data-driven end-to-end approaches. In this paper, we propose a general, non-parametric model for dense non-uniform motion blur estimation. Given a blurry image, we estimate a set of adaptive basis kernels as well as the mixing coefficients at pixel level, producing a per-pixel map of motion blur. This rich but efficient forward model of the degradation process allows the utilization of existing tools for solving inverse problems. We show that our method overcomes the limitations of existing non-uniform motion blur estimation and that it contributes to bridging the gap between model-based and data-driven approaches for deblurring real photographs.