Goto

Collaborating Authors

 Optimization


A Word is Worth A Thousand Dollars: Adversarial Attack on Tweets Fools Stock Predictions

arXiv.org Artificial Intelligence

More and more investors and machine learning models rely on social media (e.g., Twitter and Reddit) to gather real-time information and sentiment to predict stock price movements. Although text-based models are known to be vulnerable to adversarial attacks, whether stock prediction models have similar vulnerability is underexplored. In this paper, we experiment with a variety of adversarial attack configurations to fool three stock prediction victim models. We address the task of adversarial generation by solving combinatorial optimization problems with semantics and budget constraints. Our results show that the proposed attack method can achieve consistent success rates and cause significant monetary loss in trading simulation by simply concatenating a Figure 1: An example of word-replacement adversarial perturbed but semantically similar tweet.


Evolving objective function for improved variational quantum optimization

#artificialintelligence

A promising approach to useful computational quantum advantage is to use variational quantum algorithms for optimization problems. Crucial for the performance of these algorithms is to ensure that the algorithm converges with high probability to a near-optimal solution in a small time. In Barkoutsos et al. [Quantum 4, 256 (2020)], an alternative class of objective functions, called conditional value at risk (CVaR), was introduced and it was shown that they perform better than standard objective functions. Here we extend that work by introducing an evolving objective function, which we call ascending-CVaR and that can be used for any optimization problem. We test our proposed objective function in an emulation environment, using as case studies three different optimization problems: MaxCut, number partitioning, and portfolio optimization. We examine multiple instances of different sizes and analyze the performance using the variational quantum eigensolver with hardware-efficient ansatz and the quantum approximate optimization algorithm. We show that ascending-CVaR in all cases performs better than standard objective functions or the constant CVaR of Barkoutsos et al. [Quantum 4, 256 (2020)] and that it can be used as a heuristic for avoiding suboptimal minima. Our proposal achieves higher overlap with the ideal state in all problems, whether we consider easy or hard instances---on average, it gives up to ten times greater overlap at portfolio optimization and number partitioning, while it gives an 80% improvement at MaxCut. In the hard instances we consider, for the number partitioning problem, standard objective functions fail to find the correct solution in almost all cases, CVaR finds the correct solution at 60% of the cases, while ascending-CVaR finds the correct solution in 95% of the cases.


Functional Generalized Empirical Likelihood Estimation for Conditional Moment Restrictions

arXiv.org Machine Learning

Moment restrictions identify a parameter of interest by restricting the expectation value of so-called moment functions, which depend on the parameter and random variables representing the underlying noisy data generating process. Important problems in causal inference, economics, and generally robust machine learning can be cast in this form [Newey, 1993, Ai and Chen, 2003, Bennett and Kallus, 2020b, Dikkala et al., 2020]. Particularly challenging are problems formulated as conditional moment restrictions (CMR), which constrain the conditional expectation of the moment function. Such problems appear, e.g., in instrumental variable (IV) regression [Newey and Powell, 2003, Angrist and Pischke, 2008], where the expectation of the residual of the prediction conditioned on so-called instruments is restricted to be zero. Other applications are policy learning [Bennett and Kallus, 2020a] and off-policy evaluation in reinforcement learning [Kallus and Uehara, 2020, Bennett et al., 2021, Chen et al., 2021] and double/debiased machine learning [Chernozhukov et al., 2016, 2017, 2018]. As conditional moment restrictions are difficult to handle directly, a common approach is to transform them into an infinite number of corresponding unconditional moment restrictions Bierens [1982]. Generalizing the corresponding estimation methods from the finite dimensional case to the infinite case is an active area of research [Carrasco and Florens, 2000, Carrasco et al., 2007, Chaussรฉ, 2012, Carrasco and Kotchoni, 2017, Muandet et al., 2020, Bennett and Kallus, 2020b, Zhang et al., 2021]. One of the most popular approaches to learning with moment restrictions is Hansen's celebrated generalized method of moments (GMM) [Hansen, 1982]. In order to improve the small sample properties of GMM estimators, alternative methods have been proposed and are generally known as generalized empirical likelihood (GEL) estimators [Smith, 1997, 2005, Newey and Smith, 2004].


Comprehensive Reactive Safety: No Need For A Trajectory If You Have A Strategy

arXiv.org Artificial Intelligence

Abstract-- Safety guarantees in motion planning for autonomous driving typically involve certifying the trajectory to be collision-free under any motion of the uncontrollable participants in the environment, such as the human-driven vehicles on the road. As a result they usually employ a conservative bound on the behavior of such participants, such as reachability analysis. We point out that planning trajectories to rigorously avoid the entirety of the reachable regions is unnecessary and too restrictive, because observing the environment in the future will allow us to prune away most of them; disregarding this ability to react to future updates could prohibit solutions to scenarios that are easily navigated by human drivers. We propose to account for the autonomous vehicle's reactions to future environment changes by a novel safety framework, Comprehensive Reactive Safety. Validated in simulations in several urban driving scenarios such as unprotected left turns and lane merging, the resulting planning algorithm called Reactive ILQR demonstrates strong negotiation capabilities and better safety at the same time.


Uniform Manifold Approximation with Two-phase Optimization

arXiv.org Artificial Intelligence

We introduce Uniform Manifold Approximation with Two-phase Optimization (UMATO), a dimensionality reduction (DR) technique that improves UMAP to capture the global structure of high-dimensional data more accurately. In UMATO, optimization is divided into two phases so that the resulting embeddings can depict the global structure reliably while preserving the local structure with sufficient accuracy. In the first phase, hub points are identified and projected to construct a skeletal layout for the global structure. In the second phase, the remaining points are added to the embedding preserving the regional characteristics of local areas. Through quantitative experiments, we found that UMATO (1) outperformed widely used DR techniques in preserving the global structure while (2) producing competitive accuracy in representing the local structure. We also verified that UMATO is preferable in terms of robustness over diverse initialization methods, number of epochs, and subsampling techniques.


Event Collapse in Contrast Maximization Frameworks

arXiv.org Artificial Intelligence

Contrast maximization (CMax) is a framework that provides state-of-the-art results on several event-based computer vision tasks, such as ego-motion or optical flow estimation. However, it may suffer from a problem called event collapse, which is an undesired solution where events are warped into too few pixels. As prior works have largely ignored the issue or proposed workarounds, it is imperative to analyze this phenomenon in detail. Our work demonstrates event collapse in its simplest form and proposes collapse metrics by using first principles of space-time deformation based on differential geometry and physics. We experimentally show on publicly available datasets that the proposed metrics mitigate event collapse and do not harm well-posed warps. To the best of our knowledge, regularizers based on the proposed metrics are the only effective solution against event collapse in the experimental settings considered, compared with other methods. We hope that this work inspires further research to tackle more complex warp models.


FD-GATDR: A Federated-Decentralized-Learning Graph Attention Network for Doctor Recommendation Using EHR

arXiv.org Artificial Intelligence

In the past decade, with the development of big data technology, an increasing amount of patient information has been stored as electronic health records (EHRs). Leveraging these data, various doctor recommendation systems have been proposed. Typically, such studies process the EHR data in a flat-structured manner, where each encounter was treated as an unordered set of features. Nevertheless, the heterogeneous structured information such as service sequence stored in claims shall not be ignored. This paper presents a doctor recommendation system with time embedding to reconstruct the potential connections between patients and doctors using heterogeneous graph attention network. Besides, to address the privacy issue of patient data sharing crossing hospitals, a federated decentralized learning method based on a minimization optimization model is also proposed. The graph-based recommendation system has been validated on a EHR dataset. Compared to baseline models, the proposed method improves the AUC by up to 6.2%. And our proposed federated-based algorithm not only yields the fictitious fusion center's performance but also enjoys a convergence rate of O(1/T).


Orthogonal Gromov-Wasserstein Discrepancy with Efficient Lower Bound

arXiv.org Machine Learning

Comparing structured data from possibly different metric-measure spaces is a fundamental task in machine learning, with applications in, e.g., graph classification. The Gromov-Wasserstein (GW) discrepancy formulates a coupling between the structured data based on optimal transportation, tackling the incomparability between different structures by aligning the intra-relational geometries. Although efficient \emph{local} solvers such as conditional gradient and Sinkhorn are available, the inherent non-convexity still prevents a tractable evaluation, and the existing lower bounds are not tight enough for practical use. To address this issue, we take inspiration from the connection with the quadratic assignment problem, and propose the orthogonal Gromov-Wasserstein (OGW) discrepancy as a surrogate of GW. It admits an efficient and \emph{closed-form} lower bound with $\mathcal{O}(n^3)$ complexity, and directly extends to the fused Gromov-Wasserstein (FGW) distance, incorporating node features into the coupling. Extensive experiments on both the synthetic and real-world datasets show the tightness of our lower bounds, and both OGW and its lower bounds efficiently deliver accurate predictions and satisfactory barycenters for graph sets.


Bregman Proximal Langevin Monte Carlo via Bregman--Moreau Envelopes

arXiv.org Machine Learning

We propose efficient Langevin Monte Carlo algorithms for sampling distributions with nonsmooth convex composite potentials, which is the sum of a continuously differentiable function and a possibly nonsmooth function. We devise such algorithms leveraging recent advances in convex analysis and optimization methods involving Bregman divergences, namely the Bregman--Moreau envelopes and the Bregman proximity operators, and in the Langevin Monte Carlo algorithms reminiscent of mirror descent. The proposed algorithms extend existing Langevin Monte Carlo algorithms in two aspects -- the ability to sample nonsmooth distributions with mirror descent-like algorithms, and the use of the more general Bregman--Moreau envelope in place of the Moreau envelope as a smooth approximation of the nonsmooth part of the potential. A particular case of the proposed scheme is reminiscent of the Bregman proximal gradient algorithm. The efficiency of the proposed methodology is illustrated with various sampling tasks at which existing Langevin Monte Carlo methods are known to perform poorly.


Local Area Routes for Vehicle Routing Problems

arXiv.org Artificial Intelligence

We consider an approach for improving the efficiency of column generation (CG) methods for solving vehicle routing problems. We introduce Local Area (LA) route relaxations, an alternative/complement to the commonly used ng-route relaxations and Decremental State Space Relaxations (DSSR) inside of CG formulations. LA routes are a subset of ng-routes and a super-set of elementary routes. Normally, the pricing stage of CG must produce elementary routes, which are routes without repeated customers, using processes which can be computationally expensive. Non-elementary routes visit at least one customer more than once, creating a cycle. LA routes relax the constraint of being an elementary route in such a manner as to permit efficient pricing. LA routes are best understood in terms of ng-route relaxations. Ng-routes are routes which are permitted to have non-localized cycles in space; this means that at least one intermediate customer (called a breaker) in the cycle must consider the starting customer in the cycle to be spatially far away. LA routes are described using a set of special indexes corresponding to customers on the route ordered from the start to the end of the route. LA route relaxations further restrict the set of permitted cycles beyond that of ng-routes by additionally enforcing that the breaker must be a located at a special index where the set of special indexes is defined recursively as follows. The first special index in the route is at index 1 meaning that it is associated with the first customer in the route. The k'th special index corresponds to the first customer after the k-1'th special index, that is not considered to be a neighbor of (considered spatially far from) the customer located at the k-1'th special index. We demonstrate that LA route relaxations can significantly improve the computational speed of pricing when compared to the standard DSSR.