Supervised Learning
End-to-End Learning for Structured Prediction Energy Networks
Belanger, David, Yang, Bishan, McCallum, Andrew
Structured Prediction Energy Networks (SPENs) are a simple, yet expressive family of structured prediction models (Belanger and McCallum, 2016). An energy function over candidate structured outputs is given by a deep network, and predictions are formed by gradient-based optimization. This paper presents end-to-end learning for SPENs, where the energy function is discriminatively trained by back-propagating through gradient-based prediction. In our experience, the approach is substantially more accurate than the structured SVM method of Belanger and McCallum (2016), as it allows us to use more sophisticated non-convex energies. We provide a collection of techniques for improving the speed, accuracy, and memory requirements of end-to-end SPENs, and demonstrate the power of our method on 7-Scenes image denoising and CoNLL-2005 semantic role labeling tasks. In both, inexact minimization of non-convex SPEN energies is superior to baseline methods that use simplistic energy functions that can be minimized exactly.
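The gradient-based prediction step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the hand-written chain energy stands in for the deep energy network, the names `energy`, `energy_grad` and `predict` are hypothetical, and the sketch covers only prediction by projected gradient descent, not the end-to-end training that back-propagates through these steps.

```python
def energy(y, unary, coupling):
    """Toy energy (assumed for illustration, not the paper's network):
    a linear unary term plus a smoothness penalty between neighbouring labels."""
    local = sum(u * yi for u, yi in zip(unary, y))
    pairwise = sum((y[i] - y[i + 1]) ** 2 for i in range(len(y) - 1))
    return local + coupling * pairwise


def energy_grad(y, unary, coupling):
    """Analytic gradient of the toy energy with respect to y."""
    grad = list(unary)
    for i in range(len(y) - 1):
        d = 2.0 * coupling * (y[i] - y[i + 1])
        grad[i] += d
        grad[i + 1] -= d
    return grad


def predict(unary, coupling=0.1, steps=100, lr=0.1):
    """Approximately minimise the energy over y in [0, 1]^L
    by projected gradient descent, starting from y = 0.5."""
    y = [0.5] * len(unary)
    for _ in range(steps):
        g = energy_grad(y, unary, coupling)
        # Gradient step followed by clipping back into the unit box.
        y = [min(1.0, max(0.0, yi - lr * gi)) for yi, gi in zip(y, g)]
    return y


if __name__ == "__main__":
    scores = [-1.0, -1.0, 1.0]  # negative score favours label 1
    print(predict(scores))
```

In an end-to-end setting, the loop in `predict` would be unrolled and differentiated so that the parameters of the energy receive gradients from the final prediction.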
Summer transfer window: Record set to be broken in Premier League spending spree
Swansea's Gylfi Sigurdsson has been valued at £50m, Everton have spent £90m and Manchester United bought Romelu Lukaku for £75m - so is Newcastle boss Rafael Benitez right to call this summer's transfer window "a little bit crazy"? Premier League clubs' spending has already surpassed £500m since the end of last season - and business analysts Deloitte say they are on course to set another new record by 31 August. Teams spent a record £1.165bn last summer, rising to £1.38bn after the January window. Football finance expert Rob Wilson says the market "hyper-inflation" means anyone selling to an English club is adding "at least 40%, if not 50%, to the deal". And football agent Jon Smith says a £30m transfer - such as goalkeeper Jordan Pickford's move from Sunderland to Everton - is "the new norm".
L.A. County median home price breaks record set during last decade's housing boom
In summer 2007, the Los Angeles County median home price hit an all-time high of $550,000. It soon plunged as the housing bubble burst and the national economy crashed. Now the median, the point where half the homes sold for more and half for less, has finally passed the heights of 10 years ago -- the result of an improving economy, historically low mortgage rates and a shortage of listings. According to a report released Wednesday from real estate firm CoreLogic, the county's median price in May rose 6.8% from a year earlier to reach $560,500 as sales jumped 4.8%. When adjusted for inflation, May's median remains 11% below the 2007 high, though the nominal record comes amid fresh concerns over the high cost of housing in California.
Renewable Energy Record Set in U.S.
Solar panels stand at the Ivanpah Solar Electric Generating System in the Mojave Desert near Primm, Nevada in 2014. California and Arizona by far generate the most electricity with solar power in the U.S. The U.S. set a new renewable energy milestone in March, in data released Wednesday. For the first time, wind and solar accounted for 10 percent of all electricity generation, with wind comprising 8 percent and solar coming in at 2 percent. The report was published by the U.S. Energy Information Administration (EIA), which collects and disseminates environmental data that is used to inform policymakers. Wind and solar generation typically peaks in the spring and fall when there is less energy demand, and the EIA expects April to continue the record-setting 10 percent trend.
Learning from networked examples
Wang, Yuyi, Ramon, Jan, Guo, Zheng-Chu
Many machine learning algorithms are based on the assumption that training examples are drawn independently. However, this assumption no longer holds when learning from a networked sample, because two or more training examples may share some common objects, and hence share the features of these shared objects. We show that the classic approach of ignoring this problem can have a harmful effect on the accuracy of statistics, and then consider alternatives. One of these is to use only independent examples, discarding other information. However, this is clearly suboptimal. We analyze sample error bounds in this networked setting, providing significantly improved results. An important component of our approach is formed by efficient sample weighting schemes, which lead to novel concentration inequalities.
Poincaré Embeddings for Learning Hierarchical Representations
Nickel, Maximilian, Kiela, Douwe
Representation learning has become an invaluable approach for learning from symbolic data such as text and graphs. However, while complex symbolic datasets often exhibit a latent hierarchical structure, state-of-the-art methods typically learn embeddings in Euclidean vector spaces, which do not account for this property. For this purpose, we introduce a new approach for learning hierarchical representations of symbolic data by embedding them into hyperbolic space -- or more precisely into an n-dimensional Poincaré ball. Due to the underlying hyperbolic geometry, this allows us to learn parsimonious representations of symbolic data by simultaneously capturing hierarchy and similarity. We introduce an efficient algorithm to learn the embeddings based on Riemannian optimization and show experimentally that Poincaré embeddings outperform Euclidean embeddings significantly on data with latent hierarchies, both in terms of representation capacity and in terms of generalization ability.
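As a small illustration of the geometry mentioned in the abstract, the sketch below computes the standard distance on the Poincaré ball, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). The function name `poincare_distance` and the example points are assumptions made for illustration; the sketch covers only the distance, not the Riemannian optimization used to learn the embeddings.

```python
import math


def poincare_distance(u, v):
    """Distance between two points inside the open unit ball
    under the Poincaré ball model of hyperbolic space."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


if __name__ == "__main__":
    origin = (0.0, 0.0)
    # Points with the same Euclidean gap are farther apart near the boundary.
    print(poincare_distance(origin, (0.5, 0.0)))
    print(poincare_distance((0.4, 0.0), (0.9, 0.0)))
```

Because distances grow rapidly toward the boundary, points near the edge of the ball have room for many mutually distant neighbours, which is one intuition for why tree-like data fits this space well.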
#NuggsforCarter: Teen reaches all-time retweet record
Carter Wilkerson, 16, of Reno took on a Wendy's challenge to get 18 million retweets for free chicken nuggets for a year. On Tuesday, May 9, 2017, he beat the record set by Ellen DeGeneres, reaching the top of the Twitter game with the most retweets ever (3.441 million as of 11:24 a.m.). He'll get his chicken nuggets, plus $100,000 to a national charity.
Scalable and Flexible Multiview MAX-VAR Canonical Correlation Analysis
Fu, Xiao, Huang, Kejun, Hong, Mingyi, Sidiropoulos, Nicholas D., So, Anthony Man-Cho
Generalized canonical correlation analysis (GCCA) aims at finding latent low-dimensional common structure from multiple views (feature vectors in different domains) of the same entities. Unlike principal component analysis (PCA), which handles a single view, (G)CCA is able to integrate information from different feature spaces. Here we focus on MAX-VAR GCCA, a popular formulation which has recently gained renewed interest in multilingual processing and speech modeling. The classic MAX-VAR GCCA problem can be solved optimally via eigen-decomposition of a matrix that compounds the (whitened) correlation matrices of the views; but this solution has serious scalability issues, and is not directly amenable to incorporating pertinent structural constraints such as non-negativity and sparsity on the canonical components. We posit regularized MAX-VAR GCCA as a non-convex optimization problem and propose an alternating optimization (AO)-based algorithm to handle it. Our algorithm alternates between inexact solutions of a regularized least squares subproblem and a manifold-constrained non-convex subproblem, thereby achieving substantial memory and computational savings. An important benefit of our design is that it can easily handle structure-promoting regularization. We show that the algorithm globally converges to a critical point at a sublinear rate, and approaches a global optimal solution at a linear rate when no regularization is considered. Judiciously designed simulations and large-scale word embedding tasks are employed to showcase the effectiveness of the proposed algorithm.
Conditional Similarity Networks
Veit, Andreas, Belongie, Serge, Karaletsos, Theofanis
What makes images similar? To measure the similarity between images, they are typically embedded in a feature-vector space, in which their distances preserve the relative dissimilarity. However, when learning such similarity embeddings, the simplifying assumption is commonly made that images are only compared according to one unique measure of similarity. A main reason for this is that contradicting notions of similarity cannot be captured in a single space. To address this shortcoming, we propose Conditional Similarity Networks (CSNs) that learn embeddings differentiated into semantically distinct subspaces that capture the different notions of similarity. CSNs jointly learn a disentangled embedding, where features for different similarities are encoded in separate dimensions, as well as masks that select and reweight relevant dimensions to induce a subspace that encodes a specific similarity notion. We show that our approach learns interpretable image representations with visually relevant semantic subspaces. Further, when evaluating on triplet questions from multiple similarity notions, our model even outperforms the accuracy obtained by training individual specialized networks for each notion separately.
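The idea of masks that select dimensions for one similarity notion can be sketched with a toy example. The embeddings, the fixed binary masks, the margin value, and the names `masked_distance` and `triplet_loss` are all assumptions made for illustration; in the approach described above, both the embedding and the masks are learned jointly.

```python
import math


def masked_distance(a, b, mask):
    """Euclidean distance after reweighting each dimension by the mask."""
    return math.sqrt(sum((m * (x - y)) ** 2 for x, y, m in zip(a, b, mask)))


def triplet_loss(anchor, positive, negative, mask, margin=0.2):
    """Hinge loss asking the anchor to be closer to the positive than to
    the negative, measured only in the subspace selected by the mask."""
    d_pos = masked_distance(anchor, positive, mask)
    d_neg = masked_distance(anchor, negative, mask)
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    # Hypothetical 4-d embeddings: first two dims encode colour, last two shape.
    red_shoe = [1.0, 0.0, 1.0, 0.0]
    red_boot = [1.0, 0.0, 0.0, 1.0]
    blue_shoe = [0.0, 1.0, 1.0, 0.0]
    colour_mask = [1.0, 1.0, 0.0, 0.0]
    shape_mask = [0.0, 0.0, 1.0, 1.0]
    # Under the colour mask the triplet is satisfied; under the shape mask it is not.
    print(triplet_loss(red_shoe, red_boot, blue_shoe, colour_mask))
    print(triplet_loss(red_shoe, red_boot, blue_shoe, shape_mask))
```

The same triplet is judged differently depending on the mask, which is how a single embedding can serve contradicting notions of similarity.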