
How to accelerate and compress neural networks with quantization

#artificialintelligence

Neural networks are very resource-intensive algorithms: they incur significant computational costs and also consume a lot of memory. Even though commercially available computational resources increase day by day, optimizing the training and inference of deep neural networks is extremely important. If we run our models in the cloud, we want to minimize infrastructure costs and the carbon footprint. When we run our models on the edge, network optimization becomes even more significant: on smartphones or embedded devices, hardware limitations are immediately apparent.
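The core idea of quantization can be illustrated with a minimal sketch: map float32 values onto 8-bit integers plus a scale and offset, cutting memory 4x at the cost of a bounded rounding error. This is a generic affine-quantization sketch, not the specific scheme from the article.

```python
import numpy as np

def quantize_uint8(x):
    # Affine (asymmetric) quantization: map [min, max] onto the uint8 range.
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0 or 1.0   # avoid division by zero for constant tensors
    q = np.round((x - lo) / scale).astype(np.uint8)
    return q, scale, lo

def dequantize(q, scale, lo):
    return q.astype(np.float32) * scale + lo

weights = np.random.default_rng(0).standard_normal(1000).astype(np.float32)
q, scale, lo = quantize_uint8(weights)
restored = dequantize(q, scale, lo)
print(np.max(np.abs(restored - weights)) <= scale)  # True: error bounded by one step
```

The stored tensor shrinks from 4 bytes per weight to 1 byte plus two scalars, and the reconstruction error is at most half a quantization step.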


Memorizing vs. Understanding (read: Data vs. Knowledge)

#artificialintelligence

So how can I get the result of the arithmetic expression, e? Well, there are two ways: (i) if I'm lucky, and lazy (think: efficiency), I could have the value of e stored (as data) in some hashtable (a data dictionary) where I can use a key to pick up the value of e anytime I need it (figure 1); or (ii) I could compute the value of e from the procedures of arithmetic. The first method, let's call it the data/memorization method, does not require us to know how to compute e. If the value of e is not memorized (and stored in some data storage), then the only way to get the value of e is to know that adding m to n is essentially adding n 1's to m, and that multiplying m by n is adding m to itself n times (and thus 'multiplication' can be defined only after the more primitive function 'addition' is defined). Crucially, then, the first method is limited to the data I have seen and memorized (i.e., stored in memory), while the second method does not have this limitation -- in fact, once I know the procedures of addition and multiplication (and other operations), I'm ready for an infinite number of expressions. So we could, at this early juncture, describe the first method as "knowing what (is the value)" and the second method as "knowing how (to compute the value)" -- the first is fast (not to mention easy) but limited to the data I have seen and memorized (stored). The second is not limited to the data we have seen, but requires detailed knowledge (knowing how) of the procedures.
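The "knowing how" method described above can be sketched directly in code: addition as repeated incrementing, and multiplication defined only in terms of that more primitive addition.

```python
def add(m, n):
    # Adding m and n is adding n 1's to m.
    for _ in range(n):
        m = m + 1
    return m

def mul(m, n):
    # Multiplying m by n is adding m to itself n times;
    # mul can only be defined after the more primitive add.
    total = 0
    for _ in range(n):
        total = add(total, m)
    return total

print(mul(add(2, 3), 4))  # (2 + 3) * 4 = 20
```

Nothing here is memorized: any expression over these operations can be evaluated, which is exactly the sense in which the procedural method is not limited to data already seen.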


Differential ML on TensorFlow and Colab

#artificialintelligence

Brian Huge and I just posted a working paper following six months of research and development on function approximation by artificial intelligence (AI) in Danske Bank. One major finding was that training machine learning (ML) models for regression (i.e., predicting values rather than classes) improves considerably when differential labels -- derivatives of labels with respect to inputs -- are available. Given those differential labels, we can write simple, yet unreasonably effective training algorithms, capable of learning accurate function approximations with remarkable speed and accuracy from small datasets, in a stable manner, without need of additional regularization or optimization of hyperparameters, e.g. by cross-validation. In this post, we briefly summarize these algorithms under the name differential machine learning, highlighting the main intuitions and benefits and commenting on the TensorFlow implementation code. All the details are found in the working paper, the online appendices and the Colab notebooks.
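The core trick -- fitting values and their derivatives jointly -- can be shown with a toy sketch far simpler than the paper's TensorFlow code. Here a cubic model is fit to both value labels and differential labels by stacking the two sets of equations into one least-squares problem (an illustrative stand-in, not the authors' algorithm):

```python
import numpy as np

# Training data: values y = sin(x) and differential labels dy/dx = cos(x).
x = np.linspace(-1.0, 1.0, 20)
y, dy = np.sin(x), np.cos(x)

# Cubic model y ~ w0 + w1*x + w2*x^2 + w3*x^3; its derivative is also linear in w.
V = np.column_stack([np.ones_like(x), x, x**2, x**3])                     # value rows
D = np.column_stack([np.zeros_like(x), np.ones_like(x), 2*x, 3*x**2])    # derivative rows

# Stack value equations and differential equations into one least-squares fit.
A = np.vstack([V, D])
b = np.concatenate([y, dy])
w, *_ = np.linalg.lstsq(A, b, rcond=None)

xt = np.linspace(-1.0, 1.0, 101)
pred = w[0] + w[1]*xt + w[2]*xt**2 + w[3]*xt**3
print(np.max(np.abs(pred - np.sin(xt))))  # small max error on [-1, 1]
```

The derivative rows act like extra, highly informative training points, which is the intuition behind the speed and stability gains reported for small datasets.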


Improving Nash Social Welfare Approximations

Journal of Artificial Intelligence Research

We consider the problem of fairly allocating a set of indivisible goods among n agents. Various fairness notions have been proposed within the rapidly growing field of fair division, but the Nash social welfare (NSW) serves as a focal point. In part, this follows from the ‘unreasonable’ fairness guarantees provided, in the sense that a max NSW allocation meets multiple other fairness metrics simultaneously, all while satisfying a standard economic concept of efficiency, Pareto optimality. However, existing approximation algorithms fail to satisfy all of the remarkable fairness guarantees offered by a max NSW allocation, instead targeting only the specific NSW objective. We address this issue by presenting, in strongly polynomial time, an allocation that is a 2-approximation to the max NSW, Prop-1, 1/(2n)-MMS, and Pareto optimal. Our techniques are based on a market interpretation of a fractional max NSW allocation. We present novel definitions of fairness concepts in terms of market prices, and design a new scheme to round a market equilibrium into an integral allocation in a way that provides most of the fairness properties of an integral max NSW allocation.
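For readers outside fair division, the objective being approximated is the standard one: the Nash social welfare of an allocation A = (A_1, ..., A_n) is the geometric mean of the agents' utilities,

```latex
\mathrm{NSW}(A) = \left( \prod_{i=1}^{n} u_i(A_i) \right)^{1/n},
```

and a max NSW allocation maximizes this quantity over all integral allocations.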


Algebraic Approach to Directed Rough Sets

arXiv.org Artificial Intelligence

In the relational approach to general rough sets, ideas of directed relations are supplemented with additional conditions for multiple algebraic approaches in this research paper. The relations are also specialized to representations of general parthood that are upper-directed, reflexive and antisymmetric, yielding a better-behaved groupoidal semantics over the set of roughly equivalent objects, due to the first author. Another distinct algebraic semantics over the set of approximations, and a new knowledge interpretation, are also introduced by her. Because of the minimal conditions imposed on the relations, neighborhood granulations are used in the construction of all approximations (granular and pointwise). Necessary and sufficient conditions for the lattice of local upper approximations to be completely distributive are proved by the second author. These results are related to formal concept analysis. Applications to student-centered learning and decision making are also outlined.


An Extension of LIME with Improvement of Interpretability and Fidelity

arXiv.org Artificial Intelligence

While deep learning makes significant achievements in Artificial Intelligence (AI), the lack of transparency has limited its broad application in various vertical domains. Explainability is not only a gateway between AI and the real world, but also a powerful tool for detecting flaws in models and bias in data. Local Interpretable Model-agnostic Explanation (LIME) is a widely accepted technique that explains the prediction of any classifier faithfully by learning an interpretable model locally around the predicted instance. As an extension of LIME, this paper proposes a high-interpretability and high-fidelity local explanation method, known as Local Explanation using feature Dependency Sampling and Nonlinear Approximation (LEDSNA). Given an instance being explained, LEDSNA enhances interpretability by feature sampling with intrinsic dependency. Besides, LEDSNA improves local explanation fidelity by approximating the nonlinear boundary of the local decision. We evaluate our method with classification tasks in both the image domain and the text domain. Experiments show that LEDSNA's explanation of the black-box model achieves much better performance than original LIME in terms of interpretability and fidelity.
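The LIME idea that LEDSNA builds on can be sketched in a few lines: perturb the instance, query the black box, and fit a proximity-weighted linear model whose coefficients serve as local feature attributions. The black-box function and kernel width below are illustrative assumptions, not from either paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def black_box(X):
    # Hypothetical nonlinear model to be explained (an assumption, not LIME's).
    return np.sin(X[:, 0]) + X[:, 1] ** 2

x0 = np.array([0.5, 1.0])                        # instance being explained
Z = x0 + 0.1 * rng.standard_normal((500, 2))     # local perturbations
y = black_box(Z)

# Proximity kernel: samples closer to x0 get more weight.
w = np.exp(-np.sum((Z - x0) ** 2, axis=1) / 0.02)

# Weighted linear fit: the coefficients are the local feature attributions.
A = np.column_stack([np.ones(len(Z)), Z - x0])
sw = np.sqrt(w)
coef, *_ = np.linalg.lstsq(sw[:, None] * A, sw * y, rcond=None)
print(coef[1:])  # close to the local gradient [cos(0.5), 2.0]
```

LEDSNA's refinements target exactly the two weak points of this sketch: the independent perturbation sampling and the purely linear local surrogate.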


High-dimensional macroeconomic forecasting using message passing algorithms

arXiv.org Machine Learning

As a response to the increasing linkages between the macroeconomy and the financial sector, as well as the expanding interconnectedness of the global economy, empirical macroeconomic models have increased both in complexity and size. For that reason, estimation of modern models that inform macroeconomic decisions - such as linear and nonlinear versions of dynamic stochastic general equilibrium (DSGE) and vector autoregressive (VAR) models - often relies on Bayesian inference via powerful Markov chain Monte Carlo (MCMC) methods. However, existing posterior simulation algorithms cannot scale up to very high dimensions due to the computational inefficiency and the larger numerical error associated with repeated sampling via Monte Carlo; see Angelino et al. (2016) for a thorough review of such computational issues from a machine learning and high-dimensional data perspective. In that respect, while Bayesian inference is a natural probabilistic framework for learning about parameters by utilizing all information in the data likelihood and prior, computational restrictions might make it less suitable for supporting real-time decision-making in very high dimensions. This paper introduces to the econometric literature the framework of factor graphs (Kschischang et al., 2001) for the purpose of designing computationally efficient, and easy to maintain, Bayesian estimation algorithms. The focus is not only on "faster" posterior inference broadly interpreted, but on designing algorithms with such low complexity that they are future-proof and can be used in high-dimensional econometric problems with possibly thousands or millions of coefficients.
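The appeal of factor graphs is that exact marginals can be computed by passing local messages instead of sampling. A minimal sum-product sketch on a three-variable chain (a generic textbook example, not the paper's econometric model) shows the mechanics:

```python
import numpy as np

# Chain factor graph x1 - f12 - x2 - f23 - x3 over binary variables.
rng = np.random.default_rng(0)
f12 = rng.random((2, 2))   # factor over (x1, x2)
f23 = rng.random((2, 2))   # factor over (x2, x3)

# Sum-product messages toward x2; leaf variables send all-ones messages.
m_f12_to_x2 = f12.T @ np.ones(2)   # sum out x1
m_f23_to_x2 = f23 @ np.ones(2)     # sum out x3

belief_x2 = m_f12_to_x2 * m_f23_to_x2
belief_x2 /= belief_x2.sum()

# Brute-force marginal over the full joint, for comparison.
joint = np.einsum("ij,jk->ijk", f12, f23)
brute = joint.sum(axis=(0, 2))
brute /= brute.sum()

print(np.allclose(belief_x2, brute))  # True: messages reproduce the exact marginal
```

The message-passing cost grows with the number of factors rather than with the size of the joint state space, which is the scaling property the paper exploits for models with thousands or millions of coefficients.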


Verification of Markov Decision Processes with Risk-Sensitive Measures

arXiv.org Artificial Intelligence

We develop a method for computing policies in Markov decision processes with risk-sensitive measures subject to temporal logic constraints. Specifically, we use a particular risk-sensitive measure from cumulative prospect theory, which has been previously adopted in psychology and economics. The nonlinear transformation of the probabilities and utility functions yields a nonlinear programming problem, which typically makes computation of optimal policies challenging. We show that this nonlinear weighting function can be accurately approximated by the difference of two convex functions. This observation enables efficient policy computation using convex-concave programming. We demonstrate the effectiveness of the approach on several scenarios.
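To see what kind of nonlinearity is being approximated, the commonly used Tversky-Kahneman (1992) probability weighting function from cumulative prospect theory is easy to evaluate; whether this is the exact form used in the paper is an assumption here, and gamma = 0.61 is the TK estimate for gains, used purely for illustration.

```python
def tk_weight(p, gamma=0.61):
    # Tversky-Kahneman (1992) probability weighting function.
    # Small probabilities are overweighted, large ones underweighted.
    if p in (0.0, 1.0):
        return p
    num = p ** gamma
    den = (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)
    return num / den

print(tk_weight(0.1))  # greater than 0.1: small probabilities are overweighted
```

The characteristic inverse-S shape of this function (neither convex nor concave on [0, 1]) is precisely why the paper decomposes it into a difference of two convex functions before applying convex-concave programming.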


Weaponizing Machine Learning

#artificialintelligence

Unfortunately this talk is not focused on technical security aspects, but it gives you a clear view of how Machine Learning can be used in security applications. You will read a resume of a single DEFCON talk, but you can go deeper by looking here. There is already software that uses Machine Learning for defensive purposes, like firewalls with anomalous-traffic detection, so we'll focus on the offensive purposes, in order to create or find already existing tools in this category. Dan Petro and Ben Morris from Bishop Fox created a tool named "DeepHack" which uses ML to accomplish SQL injection attacks. SQL is the language used to query a database in order to add/remove/edit the information stored in it as records.

