Oceania
Adaptive Portfolio by Solving Multi-armed Bandit via Thompson Sampling
Zhu, Mengying, Zheng, Xiaolin, Wang, Yan, Li, Yuyuan, Liang, Qianqiao
As the cornerstone of modern portfolio theory, Markowitz's mean-variance optimization is considered a major model adopted in portfolio management. However, due to the difficulty of estimating its parameters, it cannot be applied to all periods. In some cases, naive strategies such as Equally-weighted and Value-weighted portfolios can even achieve better performance. Under these circumstances, we can treat multiple classic strategies as strategic arms in a multi-armed bandit, which naturally establishes a connection with the portfolio selection problem. This can also help to maximize the rewards in the bandit algorithm through the trade-off between exploration and exploitation. In this paper, we present a portfolio bandit strategy through Thompson sampling which aims to make online portfolio choices by effectively exploiting the performances among multiple arms. Also, by constructing multiple strategic arms, we can obtain the optimal investment portfolio to adapt to different investment periods. Moreover, we devise a novel reward function based on users' different investment risk preferences, which can be adaptive to various investment styles. Our experimental results demonstrate that our proposed portfolio strategy has marked superiority across representative real-world market datasets in terms of extensive evaluation criteria.
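The idea of treating strategies as bandit arms can be illustrated with a minimal sketch of generic Beta-Bernoulli Thompson sampling. This is not the paper's algorithm or its risk-aware reward function: the binary reward ("the chosen strategy did well this period"), the fixed per-arm success probabilities, and the names `thompson_select` and `run_bandit` are all assumptions made for illustration.

```python
import random


def thompson_select(successes, failures, rng):
    """Draw one sample from each arm's Beta posterior and pick the largest."""
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)


def run_bandit(win_probs, n_rounds, seed=0):
    """Simulate Thompson sampling over hypothetical strategy arms.

    win_probs[i] is the assumed (hidden) probability that strategy i
    yields a reward of 1 in a period; it stands in for real market data.
    Returns how often each arm was chosen.
    """
    rng = random.Random(seed)
    k = len(win_probs)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(n_rounds):
        arm = thompson_select(successes, failures, rng)
        reward = 1 if rng.random() < win_probs[arm] else 0
        successes[arm] += reward      # update the Beta posterior of the arm
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    # Three hypothetical strategies, e.g. mean-variance, equally-weighted,
    # value-weighted; the probabilities are made up.
    print(run_bandit([0.2, 0.5, 0.8], n_rounds=2000))
```

Early rounds spread choices across arms (exploration); as the posteriors sharpen, the arm with the highest estimated reward is chosen most often (exploitation).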
The Canonical Distortion Measure for Vector Quantization and Function Approximation
To measure the quality of a set of vector quantization points, a means of measuring the distance between a random point and its quantization is required. Common metrics such as the {\em Hamming} and {\em Euclidean} metrics, while mathematically simple, are inappropriate for comparing natural signals such as speech or images. In this paper it is shown how an {\em environment} of functions on an input space $X$ induces a {\em canonical distortion measure} (CDM) on $X$. The depiction "canonical" is justified because it is shown that optimizing the reconstruction error of $X$ with respect to the CDM gives rise to optimal piecewise constant approximations of the functions in the environment. The CDM is calculated in closed form for several different function classes. An algorithm for training neural networks to implement the CDM is presented along with some encouraging experimental results.
The Similarity-Consensus Regularized Multi-view Learning for Dimension Reduction
Meng, Xiangzhu, Wang, Huibing, Feng, Lin
During the last decades, learning a low-dimensional space with discriminative information for dimension reduction (DR) has gained a surge of interest. However, these DR methods struggle to achieve satisfactory performance when facing features from multiple views. In multi-view learning problems, one instance can be represented by multiple heterogeneous features, which are highly related but sometimes look different from each other. In addition, correlations between features from multiple views often vary greatly, which challenges the capability of multi-view learning methods. Consequently, constructing a multi-view learning framework with generalization and scalability, which could take advantage of multi-view information as much as possible, is extremely necessary but challenging. To achieve this goal, this paper proposes a novel multi-view learning framework based on similarity consensus, which makes full use of correlations among multi-view features while considering the scalability and robustness of the framework. It aims to straightforwardly extend existing DR methods into the multi-view learning domain by preserving the similarity between different views to capture the low-dimensional embedding. Two schemes based on pairwise-consensus and centroid-consensus are separately proposed to force multiple views to learn from each other, and then an iterative alternating strategy is developed to obtain the optimal solution. The proposed method is evaluated on 5 benchmark datasets, and comprehensive experiments show that our proposed multi-view framework can yield comparable and promising performance relative to previous approaches proposed in the recent literature.
Learning Model Bias
In this paper the problem of {\em learning} appropriate domain-specific bias is addressed. It is shown that this can be achieved by learning many related tasks from the same domain, and a theorem is given bounding the number of tasks that must be learnt. A corollary of the theorem is that if the tasks are known to possess a common {\em internal representation} or {\em preprocessing} then the number of examples required per task for good generalisation when learning $n$ tasks simultaneously scales like $O(a + \frac{b}{n})$, where $O(a)$ is a bound on the minimum number of examples required to learn a single task, and $O(a + b)$ is a bound on the number of examples required to learn each task independently. An experiment providing strong qualitative support for the theoretical results is reported.
Conjugate Gradients for Kernel Machines
Bartels, Simon, Hennig, Philipp
Regularized least-squares (kernel-ridge / Gaussian process) regression is a fundamental algorithm of statistics and machine learning. Because generic algorithms for the exact solution have cubic complexity in the number of datapoints, large datasets require resorting to approximations. In this work, the computation of the least-squares prediction is itself treated as a probabilistic inference problem. We propose a structured Gaussian regression model on the kernel function that uses projections of the kernel matrix to obtain a low-rank approximation of the kernel and the matrix. A central result is an enhanced way to use the method of conjugate gradients for the specific setting of least-squares regression as encountered in machine learning. Our method improves the approximation of the kernel ridge regressor / Gaussian process posterior mean over vanilla conjugate gradients and allows computation of the posterior variance and the log marginal likelihood (evidence) without further overhead.
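For orientation, the vanilla conjugate-gradients baseline that the abstract compares against can be sketched as below. This is only the textbook method applied to the kernel ridge system $(K + \sigma^2 I)\alpha = y$, not the probabilistic variant the paper proposes. The RBF kernel, the values of `noise` and `lengthscale`, and all function names are assumptions chosen for illustration.

```python
import math


def rbf_kernel_matrix(xs, lengthscale=1.0):
    """Gram matrix of a 1-D RBF kernel (an assumed kernel choice)."""
    return [[math.exp(-((a - b) ** 2) / (2 * lengthscale ** 2)) for b in xs]
            for a in xs]


def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradients(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive definite A, starting from x = 0."""
    n = len(b)
    max_iter = max_iter or 10 * n
    x = [0.0] * n
    r = list(b)              # residual b - A x, with x = 0
    p = list(r)              # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if math.sqrt(rs_old) < tol:
            break
        Ap = matvec(A, p)
        step = rs_old / dot(p, Ap)
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        # New direction is conjugate to the previous ones with respect to A.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def fit_kernel_ridge(xs, ys, noise=0.1, lengthscale=1.0):
    """Return the regularized system matrix and the CG solution alpha."""
    K = rbf_kernel_matrix(xs, lengthscale)
    n = len(xs)
    A = [[K[i][j] + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    return A, conjugate_gradients(A, ys)


if __name__ == "__main__":
    xs = [0.0, 0.5, 1.0, 1.5, 2.0]
    ys = [math.sin(x) for x in xs]
    A, alpha = fit_kernel_ridge(xs, ys)
    print(alpha)
```

Each iteration costs one matrix-vector product, so stopping after a few iterations gives an approximate solution at quadratic rather than cubic cost in the number of datapoints.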
Attention on Abstract Visual Reasoning
Hahne, Lukas, Lüddecke, Timo, Wörgötter, Florentin, Kappel, David
Attention mechanisms have been boosting the performance of deep learning models on a wide range of applications, ranging from speech understanding to program induction. However, despite experiments from psychology which suggest that attention plays an essential role in visual reasoning, the full potential of attention mechanisms has so far not been explored to solve abstract cognitive tasks on image data. In this work, we propose a hybrid network architecture, grounded on self-attention and relational reasoning. We call this new model Attention Relation Network (ARNe). ARNe combines features from the recently introduced Transformer and the Wild Relation Network (WReN). We test ARNe on the Procedurally Generated Matrices (PGMs) datasets for abstract visual reasoning. ARNe outperforms the WReN model on this task by 11.28 ppt. Relational concepts between objects are learned efficiently, demanding only 35% of the training samples to surpass the reported accuracy of the baseline model. Our proposed hybrid model represents an alternative for learning abstract relations using self-attention and demonstrates that the Transformer network is also well suited for abstract visual reasoning.
Distributional Clustering: A distribution-preserving clustering method
Krishna, Arvind, Mak, Simon, Joseph, Roshan
One key use of k-means clustering is to identify cluster prototypes which can serve as representative points for a dataset. However, a drawback of using k-means cluster centers as representative points is that such points distort the distribution of the underlying data. This can be highly disadvantageous in problems where the representative points are subsequently used to gain insights on the data distribution, as these points do not mimic the distribution of the data. To this end, we propose a new clustering method called "distributional clustering", which ensures cluster centers capture the distribution of the underlying data. We first prove the asymptotic convergence of the proposed cluster centers to the data generating distribution, then present an efficient algorithm for computing these cluster centers in practice. Finally, we demonstrate the effectiveness of distributional clustering on synthetic and real datasets.
App developers in Uganda use TensorFlow to spot armyworm damage in maize
Fall armyworm, the larval life stage of a fall armyworm moth, impacts maize crops worldwide but particularly in countries like Uganda, where agricultural businesses employ 70% of the population. Studies show the potential impact is between 8.3 and 20.6 million tons per year, with the fallout amounting to between $2.48 million and $6.19 million per year. The threat of devastating losses prompted developers participating in a Google Developer Group in Mbale to create an Android app -- FlatButter -- that identifies signs of fall armyworm damage in maize crops. It's been featured on a national TV station in Uganda and highlighted by the Food and Agriculture Organization of the United Nations, as well as by Google in a short film published today. "The vast damage and yield losses in maize production, due to FAW, got the attention of global organizations, who are calling for innovators to help," wrote Hansu Mobile and Intelligent Innovations CEO Nsubuga Hassan, who led the team that developed the app.
From Microbiology to Machine Learning with Springboard
Microbiology and MBA grad JK started to learn about big data and machine learning in his job, but wanted to learn more about data science in a structured environment. He enrolled in Springboard's Machine Learning Career Track to learn about ML and AI online. JK tells us how he balanced his full-time job with the Springboard bootcamp (hint: he didn't sleep much), and how networking at conferences helped him land his new job as a Data Engineer at KPMG! What is your educational and career background? I didn't come from a computer science (CS) background. My undergrad was in microbiology, immunology and molecular genetics. I then completed an MBA with a concentration in Accounting and Finance, working at the Australian Chamber of Commerce in Korea. And that's where I got a taste of some CS database work.
Cases challenging mobile phone detection cameras could clog NSW courts, MPs warn
New South Wales courts could be flooded with tens of thousands of cases every year if the NSW government moves ahead with plans to roll out cameras that use artificial intelligence to detect drivers using their mobile phones, a parliamentary committee has warned. The state parliament is considering legislation that would allow mobile phone detection cameras to be placed around NSW to capture drivers using their mobile phones while behind the wheel. The government estimates that there were at least 158 casualties on NSW roads between 2012 and 2018 involving mobile phones. Under the plan, two cameras are used at each location, with one at an angle to capture people with phones to their ears, and a second placed to capture people using their phones in their laps. Every car passing through the location is snapped, and Transport for NSW says it then deploys artificial intelligence to determine which drivers were using their mobiles.