Estimation of the yield curve for Costa Rica using combinatorial optimization metaheuristics applied to nonlinear regression
Quiros-Granados, Andres, Trejos-Zelaya, Javier
The term structure of interest rates, or yield curve, is a function relating the interest rate to its term. The nonlinear regression models of Nelson-Siegel and Svensson were used to estimate the yield curve using a sample of historical data supplied by the National Stock Exchange of Costa Rica. The optimization problem involved in estimating the model parameters is addressed using four well-known combinatorial optimization metaheuristics: Ant colony optimization, Genetic algorithm, Particle swarm optimization and Simulated annealing. The aim of the study is to improve the local minima obtained by a classical quasi-Newton optimization method using a descent direction. Good results are achieved with at least two metaheuristics: Particle swarm optimization and Simulated annealing.
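As a rough illustration of the estimation problem described in this abstract, the sketch below writes out the standard Nelson-Siegel curve and fits it to data by minimizing the sum of squared errors with a generic simulated annealing loop. It is not the authors' implementation: the function names, the cooling schedule, the step size, and the synthetic maturities and yields are all assumptions made for the example.

```python
import math
import random

def nelson_siegel(tau, beta0, beta1, beta2, lam):
    """Nelson-Siegel yield at maturity tau > 0; lam > 0 is the decay scale."""
    x = tau / lam
    slope = (1.0 - math.exp(-x)) / x   # loading on the slope factor beta1
    curvature = slope - math.exp(-x)   # loading on the curvature factor beta2
    return beta0 + beta1 * slope + beta2 * curvature

def sse(params, maturities, yields):
    """Sum of squared errors between model and observed yields."""
    b0, b1, b2, lam = params
    if lam <= 0:                       # reject infeasible decay scales
        return math.inf
    return sum((nelson_siegel(t, b0, b1, b2, lam) - y) ** 2
               for t, y in zip(maturities, yields))

def simulated_annealing(cost, start, steps=5000, temp0=1.0,
                        cooling=0.999, step_size=0.1, seed=0):
    """Generic simulated annealing with Gaussian moves (assumed settings)."""
    rng = random.Random(seed)
    current, current_cost = list(start), cost(start)
    best, best_cost = list(current), current_cost
    temp = temp0
    for _ in range(steps):
        candidate = [p + rng.gauss(0.0, step_size) for p in current]
        cand_cost = cost(candidate)
        delta = cand_cost - current_cost
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / temp), which shrinks as temp cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temp *= cooling
    return best, best_cost

# Synthetic data for illustration only (not the Costa Rican sample).
maturities = [0.25, 0.5, 1, 2, 3, 5, 7, 10]
true_params = (0.06, -0.02, 0.01, 1.5)
yields = [nelson_siegel(t, *true_params) for t in maturities]

start = [0.05, 0.0, 0.0, 1.0]
best, best_cost = simulated_annealing(
    lambda p: sse(p, maturities, yields), start)
print(best, best_cost)
```

Because the loop tracks the best point visited, the returned cost is never worse than the cost at the starting point. In practice the starting point could be the local minimum found by a quasi-Newton method, which matches the stated aim of improving on such minima.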
Noisy, sparse, nonlinear: Navigating the Bermuda Triangle of physical inference with deep filtering
Poelking, Carl, Amar, Yehia, Lapkin, Alexei, Colwell, Lucy
Capturing the microscopic interactions that determine molecular reactivity poses a challenge across the physical sciences. Even a basic understanding of the underlying reaction mechanisms can substantially accelerate materials and compound design, including the development of new catalysts or drugs. Given the difficulties routinely faced by both experimental and theoretical investigations that aim to improve our mechanistic understanding of a reaction, recent advances have focused on data-driven routes to derive structure-property relationships directly from high-throughput screens. However, even these high-quality, high-volume data are noisy, sparse and biased -- placing them in a regime where machine-learning is extremely challenging. Here we show that a statistical approach based on deep filtering of nonlinear feature networks results in physicochemical models that are more robust, transparent and generalize better than standard machine-learning architectures. Using diligent descriptor design and data post-processing, we exemplify the approach using both literature and fresh data on asymmetric catalytic hydrogenation, Palladium-catalyzed cross-coupling reactions, and drug-drug synergy. We illustrate how the sparse models uncovered by the filtering help us formulate physicochemical reaction "pharmacophores", investigate experimental bias and derive strategies for mechanism detection and classification.
Shared Visual Abstractions
This paper presents abstract art created by neural networks and broadly recognizable across various computer vision systems. The existence of abstract forms that trigger specific labels independent of neural architecture or training set suggests that convolutional neural networks build shared visual representations for the categories they understand. Computer vision classifiers encountering these drawings often produce strong responses for specific labels, in extreme cases stronger than for any example from the validation set. By surveying human subjects we confirm that these abstract artworks are also broadly recognizable by people, suggesting that the visual representations triggered by these drawings are shared across human and computer vision systems.
Eliminating artefacts in Polarimetric Images using Deep Learning
Paranjpye, Dhruv, Mahabal, Ashish, Ramaprakash, A. N., Panopoulou, Gina, Cleary, Kieran, Readhead, Anthony, Blinov, Dmitry, Tassis, Kostas
MNRAS 000, 1-7 (2019), preprint 20 November 2019. Polarization measurements done using Imaging Polarimeters such as the Robotic Polarimeter are very sensitive to the presence of artefacts in images. Artefacts can range from internal reflections in a telescope to satellite trails that could contaminate an area of interest in the image. With the advent of wide-field polarimetry surveys, it is imperative to develop methods that automatically flag artefacts in images. In this paper, we implement a Convolutional Neural Network to identify the most dominant artefacts in the images. We find that our model can successfully classify sources with 98% true positive and 97% true negative rates. Such models, combined with transfer learning, will give us a running start in artefact elimination for near-future surveys like WALOP. Key words: deep learning - image classification - artefact detection - polarimetry. 1 INTRODUCTION: RoboPol (Ramaprakash et al. 2019) is a four-channel optical polarimeter installed on the 1.3m telescope at the Skinakas Observatory in Crete, Greece that is primarily used for polarimetry of point sources in the R band.
Bayesian sparse convex clustering via global-local shrinkage priors
Shimamura, Kaito, Kawano, Shuichi
Sparse convex clustering clusters observations and performs variable selection simultaneously within the framework of convex clustering. Although the weighted $L_1$ norm is usually employed as the regularization term in sparse convex clustering, it increases the dependence on the data and reduces the estimation accuracy if the sample size is not sufficient. To tackle these problems, this paper proposes a Bayesian sparse convex clustering method based on the ideas of the Bayesian lasso and global-local shrinkage priors. We introduce Gibbs sampling algorithms for our method using scale mixtures of normals. The effectiveness of the proposed methods is shown in simulation studies and a real data analysis.
Where is the Bottleneck of Adversarial Learning with Unlabeled Data?
Zhang, Jingfeng, Han, Bo, Niu, Gang, Liu, Tongliang, Sugiyama, Masashi
Deep neural networks (DNNs) are incredibly brittle due to adversarial examples. To robustify DNNs, adversarial training was proposed, which requires large-scale but well-labeled data. However, it is quite expensive to annotate large-scale data well. To compensate for this shortage, several seminal works utilize large-scale unlabeled data. In this paper, we observe that these seminal works do not perform well, since the quality of pseudo labels on unlabeled data is quite poor, especially when the amount of unlabeled data is significantly larger than that of labeled data. We believe that the quality of pseudo labels is the bottleneck of adversarial learning with unlabeled data. To tackle this bottleneck, we leverage deep co-training, which trains two deep networks and encourages the two networks to diverge by exploiting each peer's adversarial examples. Based on deep co-training, we propose robust co-training (RCT) for adversarial learning with unlabeled data. We conduct comprehensive experiments on the CIFAR-10 and SVHN datasets. Empirical results demonstrate that our RCT can significantly outperform baselines (e.g., robust self-training (RST)) in both standard test accuracy and robust test accuracy across different datasets, different network structures, and different types of adversarial training.
TITAN: A Spatiotemporal Feature Learning Framework for Traffic Incident Duration Prediction
Fu, Kaiqun, Ji, Taoran, Zhao, Liang, Lu, Chang-Tien
Identifying critical incident stages and reasonably predicting traffic incident duration are essential in traffic incident management. In this paper, we propose a traffic incident duration prediction model that simultaneously predicts the impact of traffic incidents and identifies the critical groups of temporal features via a multi-task learning framework. First, we formulate a sparsity optimization problem that extracts low-level temporal features based on traffic speed readings and then generalizes higher-level features as phases of traffic incidents. Second, we propose novel constraints on feature similarity that exploit prior knowledge about the spatial connectivity of the road network to predict the incident duration. The proposed problem is challenging to solve due to the orthogonality constraints, non-convex objective, and non-smooth penalties. We develop an algorithm based on the alternating direction method of multipliers (ADMM) framework to solve the proposed formulation. Extensive experiments and comparisons to other models on real-world traffic data and traffic incident records justify the efficacy of our model.
Robust Triple-Matrix-Recovery-Based Auto-Weighted Label Propagation for Classification
Zhang, Huan, Zhang, Zhao, Zhao, Mingbo, Ye, Qiaolin, Zhang, Min, Wang, Meng
The graph-based semi-supervised label propagation algorithm has delivered impressive classification results. However, the estimated soft labels typically contain mixed signs and noise, which cause inaccurate predictions due to the lack of suitable constraints. Moreover, available methods typically calculate the weights and estimate the labels in the original input space, which typically contains noise and corruption. Thus, the encoded similarities and manifold smoothness may be inaccurate for label estimation. In this paper, we present effective schemes for resolving these issues and propose a novel and robust semi-supervised classification algorithm, namely, the triple-matrix-recovery-based robust auto-weighted label propagation framework (ALP-TMR). Our ALP-TMR introduces a triple matrix recovery mechanism to remove noise or mixed signs from the estimated soft labels and improve the robustness to noise and outliers in the steps of assigning weights and predicting the labels simultaneously. Our method can jointly recover the underlying clean data, clean labels and clean weighting spaces by decomposing the original data, predicted soft labels or weights into a clean part plus an error part by fitting noise. In addition, ALP-TMR integrates the auto-weighting process by minimizing reconstruction errors over the recovered clean data and clean soft labels, which can encode the weights more accurately to improve both data representation and classification. By classifying samples in the recovered clean label and weight spaces, one can potentially improve the label prediction results. The results of extensive experiments demonstrated the satisfactory performance of our ALP-TMR.
Evaluating task-agnostic exploration for fixed-batch learning of arbitrary future tasks
Dasagi, Vibhavari, Lee, Robert, Bruce, Jake, Leitner, Jürgen
Deep reinforcement learning has been shown to solve challenging tasks where large amounts of training experience are available, usually obtained online while learning the task. Robotics is a significant potential application domain for many of these algorithms, but generating robot experience in the real world is expensive, especially when each task requires a lengthy online training procedure. Off-policy algorithms can in principle learn arbitrary tasks from a diverse enough fixed dataset. In this work, we evaluate popular exploration methods by generating robotics datasets for the purpose of learning to solve tasks completely offline without any further interaction in the real world. We present results on three popular continuous control tasks in simulation, as well as continuous control of a high-dimensional real robot arm. Code documenting all algorithms, experiments, and hyper-parameters is available at https://github.com/qutrobotlearning/batchlearning.
Predictive properties of forecast combination, ensemble methods, and Bayesian predictive synthesis
Takanashi, Kosaku, McAlinn, Kenichiro
This paper studies the theoretical predictive properties of classes of forecast combination methods. The study is motivated by the recently developed Bayesian framework for synthesizing predictive densities: Bayesian predictive synthesis. A novel strategy based on continuous time stochastic processes is proposed and developed, where the combined predictive error processes are expressed as stochastic differential equations, evaluated using Ito's lemma. We show that a subclass of synthesis functions under Bayesian predictive synthesis, which we categorize as non-linear synthesis, entails an extra term that "corrects" the bias from misspecification and dependence in the predictive error process, effectively improving forecasts. We examine the theoretical properties and show that this subclass improves the expected squared forecast error over any linear combination, averaging, or ensemble of forecasts, under mild conditions. We discuss the conditions under which this subclass outperforms others, and its implications for developing forecast combination methods. A finite sample simulation study is presented to illustrate our results.
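For context on the linear combinations that serve as the baseline in this abstract, the sketch below shows the textbook inverse-variance weighting of forecasts. It assumes unbiased forecasts with uncorrelated errors, which is exactly the kind of idealization the abstract says fails under misspecification and dependence. It does not implement Bayesian predictive synthesis or the non-linear synthesis subclass; all function names and numbers are hypothetical.

```python
def inverse_variance_weights(error_variances):
    """Weights proportional to 1/variance, normalized to sum to one."""
    precisions = [1.0 / v for v in error_variances]
    total = sum(precisions)
    return [p / total for p in precisions]

def combine(forecasts, weights):
    """Linear combination of point forecasts."""
    return sum(w * f for w, f in zip(weights, forecasts))

def combined_variance(error_variances, weights):
    """Error variance of the combination, valid only for uncorrelated errors."""
    return sum(w * w * v for w, v in zip(weights, error_variances))

# Two hypothetical forecasters with error variances 1.0 and 4.0.
variances = [1.0, 4.0]
weights = inverse_variance_weights(variances)
print(weights)                                # weights favour the more precise forecaster
print(combine([2.0, 3.0], weights))           # combined point forecast
print(combined_variance(variances, weights))  # lower than either input variance
```

Under these assumptions the combined error variance equals the reciprocal of the summed precisions, so it is smaller than the variance of the best individual forecast. With correlated or biased errors this guarantee no longer holds.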