The Digital Twin and P&L of One JD Supra

#artificialintelligence

Innovation in compliance can come in many forms. One such form was described by Vincent M. Walden, Managing Director at Alvarez and Marsal Holdings, LLC (A&M), in his article entitled "Profit & Loss-of-One" (P&L-of-One). In it, Walden detailed how he and his then colleagues at Ernst & Young (EY) worked in conjunction with the General Electric (GE) compliance function to "improve compliance by using forensic data analytics to provide behavioral insights to their compliance program." They did this through the innovative use of "digital twins," which Walden described as "digital replicas of physical assets that organizations can use for multiple purposes such as the maintenance of power generation equipment, jet engines and heavy machinery." In a more expansive definition, the consulting firm Gartner, Inc. described "digital twins" as dynamic software models of physical things or systems.


DARPA Demonstrates "Competition" Tool at Combatant Command DefenceTalk

#artificialintelligence

Service members at U.S. Indo-Pacific Command headquarters in Hawaii recently tested a prototype DARPA system designed to help military analysts and planners determine if observed events – such as increased force movements, cyber intrusions, and civil unrest – are unconnected occurrences, or if they're part of an adversary's coordinated campaign to achieve strategic objectives in a geographic region. Operational representatives from the command's intelligence and operations divisions spent three days in December trying out DARPA's COMPASS tool suite. COMPASS, which stands for Collection and Monitoring via Planning for Active Situational Scenarios, analyzes large streams of data to uncover competition campaigns, and displays results that represent the evidence and the analysis behind each hypothesis. COMPASS seeks to leverage advanced AI and other technologies to help commanders make more effective decisions regarding a competitor's complex, multi-layered competition activity. Competition refers to actions – both non-violent and violent – designed to achieve geopolitical goals without provoking full-blown armed conflict.


5 AI policy questions our presidential candidates must address

#artificialintelligence

Our 2020 presidential candidates will be questioned about their stance on artificial intelligence (AI) policy, especially with regard to the job displacement AI could cause in manufacturing, transportation, and other industries. Over-regulation of AI could hand technical superiority to countries like China and Russia, leading to a ripple effect on America's GDP and even threatening national security. But under-regulation could lead to a massive consolidation of power among a handful of American technology companies, millions of jobs lost without replacement planning, and algorithms that show bias based on age, race, gender, and more. We're certain to hear statements about upskilling -- the process of helping displaced workers acquire new skills so they can find other employment -- and about taxing robots to slow down job loss. But the candidates will need to offer up more than a few soundbites.


Symmetry & critical points for a model shallow neural network

arXiv.org Machine Learning

A detailed analysis is given of a family of critical points determining spurious minima for a model student-teacher 2-layer neural network, with ReLU activation function, and a natural $\Gamma = S_k \times S_k$-symmetry. For a $k$-neuron shallow network of this type, analytic equations are given which, for example, determine the critical points of the spurious minima described by Safran and Shamir (2018) for $6 \le k \le 20$. These critical points have isotropy (conjugate to) the diagonal subgroup $\Delta S_{k-1}\subset \Delta S_k$ of $\Gamma$. It is shown that critical points of this family can be expressed as an infinite series in $1/\sqrt{k}$ (for large enough $k$) and, as an application, the critical values decay like $a k^{-1}$, where $a \approx 0.3$. Other non-trivial families of critical points are also described with isotropy conjugate to $\Delta S_{k-1}, \Delta S_k$ and $\Delta (S_2\times S_{k-2})$ (the latter giving spurious minima for $k\ge 9$). The methods used depend on symmetry breaking, bifurcation, and algebraic geometry, notably Artin's implicit function theorem, and are applicable to other families of critical points that occur in this network.


Markovian Score Climbing: Variational Inference with KL(p||q)

arXiv.org Machine Learning

Modern variational inference (VI) uses stochastic gradients to avoid intractable expectations, enabling large-scale probabilistic inference in complex models. VI posits a family of approximating distributions $q$ and then finds the member of that family that is closest to the exact posterior $p$. Traditionally, VI algorithms minimize the "exclusive KL" KL$(q\|p)$, often for computational convenience. Recent research, however, has also focused on the "inclusive KL" KL$(p\|q)$, which has good statistical properties that make it more appropriate for certain inference problems. This paper develops a simple algorithm for reliably minimizing the inclusive KL. Consider a valid MCMC method, a Markov chain whose stationary distribution is $p$. The algorithm we develop iteratively samples the chain $z[k]$, and then uses those samples to follow the score function of the variational approximation, $\nabla \log q(z[k])$, with a Robbins-Monro step-size schedule. This method, which we call Markovian score climbing (MSC), converges to a local optimum of the inclusive KL. It does not suffer from the systematic errors inherent in existing methods, such as Reweighted Wake-Sleep and Neural Adaptive Sequential Monte Carlo, which lead to bias in their final estimates. In a variant that ties the variational approximation directly to the Markov chain, MSC further provides a new algorithm that melds VI and MCMC. We illustrate convergence on a toy model and demonstrate the utility of MSC on Bayesian probit regression for classification as well as a stochastic volatility model for financial data.
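The loop described in this abstract (sample the chain, then step along the score of $q$ with a decaying step size) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the paper's setup: the target $p$ is a one-dimensional Gaussian, $q$ is a Gaussian with parameters `m` and `log_s`, the Markov kernel is plain random-walk Metropolis-Hastings (any kernel with stationary distribution $p$ fits the abstract's description), and the step size is $1/(k+10)$, one schedule that satisfies the Robbins-Monro conditions.

```python
import numpy as np

def log_p(z, mu=2.0, sigma=0.5):
    # Unnormalized log density of the toy target p (an assumption for this sketch).
    return -0.5 * ((z - mu) / sigma) ** 2

def mh_step(z, rng, step=1.0):
    # One random-walk Metropolis-Hastings step; its stationary distribution is p.
    proposal = z + step * rng.normal()
    if np.log(rng.uniform()) < log_p(proposal) - log_p(z):
        return proposal
    return z

def score_q(z, m, log_s):
    # Gradient of log q(z) for q = Normal(m, exp(log_s)^2) w.r.t. (m, log_s).
    s = np.exp(log_s)
    return (z - m) / s**2, ((z - m) / s) ** 2 - 1.0

def msc(num_iters=30000, seed=0):
    rng = np.random.default_rng(seed)
    z, m, log_s = 0.0, 0.0, 0.0
    for k in range(num_iters):
        z = mh_step(z, rng)                 # advance the Markov chain z[k]
        eps = 1.0 / (k + 10)                # Robbins-Monro step size
        d_m, d_log_s = score_q(z, m, log_s)
        m += eps * d_m                      # follow the score of q at z[k]
        log_s += eps * d_log_s
    return m, np.exp(log_s)

if __name__ == "__main__":
    m, s = msc()
    print(f"fitted mean={m:.3f}, fitted std={s:.3f}")
```

For a Gaussian $q$, the fixed point of these updates matches the mean and variance of $p$, so on this toy target the fitted values should land near 2.0 and 0.5.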


FedSel: Federated SGD under Local Differential Privacy with Top-k Dimension Selection

arXiv.org Machine Learning

As massive data are produced from small gadgets, federated learning on mobile devices has become an emerging trend. In the federated setting, Stochastic Gradient Descent (SGD) is widely used to train various machine learning models. To prevent privacy leakage from gradients that are calculated on users' sensitive data, local differential privacy (LDP) has recently been considered as a privacy guarantee in federated SGD. However, the existing solutions have a dimension dependency problem: the injected noise is substantially proportional to the dimension $d$. In this work, we propose a two-stage framework, FedSel, for federated SGD under LDP to relieve this problem. Our key idea is that not all dimensions are equally important, so we privately select Top-k dimensions according to their contributions in each iteration of federated SGD. Specifically, we propose three private dimension selection mechanisms and adapt the gradient accumulation technique to stabilize the learning process with noisy updates. We also theoretically analyze the privacy, accuracy, and time complexity of FedSel, which outperforms the state-of-the-art solutions. Experiments on real-world and synthetic datasets verify the effectiveness and efficiency of our framework.
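To make the two-stage idea concrete, here is a hedged sketch of a "select one dimension privately, then perturb its value" report. It is not one of the three selection mechanisms the abstract mentions, whose details are not given here. As stand-ins, stage one uses an exponential-mechanism-style choice that favors the Top-k coordinates by magnitude, and stage two uses a standard one-dimensional randomized sign mechanism on a value clipped to $[-1, 1]$. The function names and the budget split `eps1`/`eps2` are hypothetical.

```python
import numpy as np

def select_dimension(grad, k, eps1, rng):
    # Stage 1 (stand-in): Top-k coordinates get weight e^{eps1}, others weight 1.
    # The normalizer is the same for every input, so output probabilities for
    # any two gradients differ by at most a factor of e^{eps1}.
    d = grad.shape[0]
    top_k = np.argsort(-np.abs(grad))[:k]
    weights = np.ones(d)
    weights[top_k] = np.exp(eps1)
    return int(rng.choice(d, p=weights / weights.sum()))

def perturb_value(x, eps2, rng):
    # Stage 2 (stand-in): report +scale or -scale so that the expectation
    # equals the clipped input; the two output probabilities differ by at
    # most a factor of e^{eps2} across inputs.
    x = float(np.clip(x, -1.0, 1.0))
    e = np.exp(eps2)
    scale = (e + 1.0) / (e - 1.0)
    p_plus = 0.5 + x / (2.0 * scale)
    return scale if rng.uniform() < p_plus else -scale

def private_sparse_report(grad, k, eps1, eps2, rng):
    # One user's report: a privately chosen index and its perturbed value.
    j = select_dimension(grad, k, eps1, rng)
    return j, perturb_value(grad[j], eps2, rng)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grad = np.array([0.9, -0.8, 0.1, 0.0, 0.05, -0.02, 0.03, 0.01, 0.0, 0.04])
    print(private_sparse_report(grad, k=2, eps1=2.0, eps2=1.0, rng=rng))
```

The point of the sketch is that the noise added in stage two no longer scales with the full dimension $d$, because only one coordinate is reported per round; the cost is the randomness of the selection step.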


A Pitfall of Learning from User-generated Data: In-depth Analysis of Subjective Class Problem

arXiv.org Machine Learning

Research in the field of supervised learning algorithms implicitly assumes that training data is labeled by domain experts or at least semi-professional labelers accessible through crowdsourcing services like Amazon Mechanical Turk. With the advent of the Internet, data has become abundant, and a large number of machine-learning-based systems have started being trained with user-generated data, using categorical data as true labels. However, little work has been done in the area of supervised learning with user-defined labels, where users are not necessarily experts and might be motivated to provide incorrect labels in order to improve their own utility from the system. In this article, we propose two types of classes in user-defined labels, subjective class and objective class, and show that the objective classes are as reliable as if they were provided by domain experts, whereas the subjective classes are subject to bias and manipulation by the user. We define this as the subjective class issue and provide a framework for detecting subjective labels in a dataset without querying an oracle. Using this framework, data mining practitioners can detect a subjective class at an early stage of their projects and avoid wasting their time and resources by tackling the subjective class problem with traditional machine learning techniques.


Efficient Gaussian Process Bandits by Believing only Informative Actions

arXiv.org Machine Learning

Bayesian optimization is a framework for global search via maximum a posteriori updates rather than simulated annealing, and has gained prominence for decision-making under uncertainty. In this work, we cast Bayesian optimization as a multi-armed bandit problem, where the payoff function is sampled from a Gaussian process (GP). Further, we focus on action selections via upper confidence bound (UCB) or expected improvement (EI) due to their prevalent use in practice. Prior works using GPs for bandits cannot allow the iteration horizon $T$ to be large, as the complexity of computing the posterior parameters scales cubically with the number of past observations. To circumvent this computational burden, we propose a simple statistical test: only incorporate an action into the GP posterior when its conditional entropy exceeds an $\epsilon$ threshold. Doing so permits us to derive sublinear regret bounds for GP bandit algorithms, up to factors depending on the compression parameter $\epsilon$, for both discrete and continuous action sets. Moreover, the complexity of the GP posterior remains provably finite. Experimentally, we observe state-of-the-art accuracy and complexity tradeoffs for GP bandit algorithms applied to global optimization, suggesting the merits of compressed GPs in bandit settings.
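The entropy test can be sketched on a small discrete problem. The choices below are assumptions for illustration rather than the paper's configuration: a unit-variance RBF kernel, UCB selection with a fixed `beta`, and "conditional entropy" taken to be the differential entropy of the Gaussian predictive distribution of the noisy reward at the chosen action, $\tfrac{1}{2}\log\big(2\pi e(\sigma^2(x) + \sigma_n^2)\big)$. An observation is kept only when that entropy exceeds `eps`.

```python
import numpy as np

def rbf(a, b, length=0.2):
    # Unit-variance RBF kernel between two 1-D arrays of points.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def posterior(X, y, Xs, noise):
    # GP posterior mean and variance at Xs given retained data (X, y).
    if len(X) == 0:
        return np.zeros(len(Xs)), np.ones(len(Xs))
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    sol = np.linalg.solve(K, Ks)
    mean = sol.T @ y
    var = 1.0 - np.sum(Ks * sol, axis=0)
    return mean, np.maximum(var, 1e-12)

def compressed_gp_ucb(f, actions, T=100, eps=-2.0, noise=1e-4, beta=2.0, seed=0):
    rng = np.random.default_rng(seed)
    X, y = np.empty(0), np.empty(0)
    for _ in range(T):
        mean, var = posterior(X, y, actions, noise)
        i = int(np.argmax(mean + beta * np.sqrt(var)))       # UCB action
        reward = f(actions[i]) + np.sqrt(noise) * rng.normal()
        entropy = 0.5 * np.log(2 * np.pi * np.e * (var[i] + noise))
        if entropy > eps:                                     # informative: keep it
            X = np.append(X, actions[i])
            y = np.append(y, reward)
    mean, _ = posterior(X, y, actions, noise)
    return actions[int(np.argmax(mean))], len(X)

if __name__ == "__main__":
    actions = np.linspace(0.0, 1.0, 50)
    best, kept = compressed_gp_ucb(lambda x: np.sin(3.0 * x), actions)
    print(f"best action ~ {best:.3f}, retained {kept} of 100 observations")
```

Because an action's posterior variance drops below the threshold once it has been observed, each action is stored at most once in this sketch, so the size of the linear system stays bounded by the number of actions instead of growing with $T$.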