AITopics

1902.11102

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Industry: Telecommunications (0.46)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

arXiv.org Machine LearningFeb-27-2019

Ranking in Genealogy: Search Results Fusion at Ancestry

Jiang, Peng, Yang, Yingrui, Bierner, Gann, Li, Fengjie Alex, Wang, Ruhan, Moghtaderi, Azadeh

Genealogy research is the study of family history using available resources such as historical records. Ancestry provides its customers with one of the world's largest online genealogical index with billions of records from a wide range of sources, including vital records such as birth and death certificates, census records, court and probate records among many others. Search at Ancestry aims to return relevant records from various record types, allowing our subscribers to build their family trees, research their family history, and make meaningful discoveries about their ancestors from diverse perspectives. In a modern search engine designed for genealogical study, the appropriate ranking of search results to provide highly relevant information represents a daunting challenge. In particular, the disparity in historical records makes it inherently difficult to score records in an equitable fashion. Herein, we provide an overview of our solutions to overcome such record disparity problems in the Ancestry search engine. Specifically, we introduce customized coordinate ascent (customized CA) to speed up ranking within a specific record type. We then propose stochastic search (SS) that linearly combines ranked results federated across contents from various record types. Furthermore, we propose a novel information retrieval metric, normalized cumulative entropy (NCE), to measure the diversity of results. We demonstrate the effectiveness of these two algorithms in terms of relevance (by NDCG) and diversity (by NCE) if applicable in the offline experiments using real customer data at Ancestry.

artificial intelligence, information retrieval, natural language, (20 more...)

1903.00099

Country:

North America > United States (0.46)
Asia (0.46)

Genre: Research Report (0.82)

Industry: Law (0.48)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)

Online Framework for Demand-Responsive Stochastic Route Optimization

Peled, Inon, Lee, Kelvin, Jiang, Yu, Dauwels, Justin, Pereira, Francisco C.

This study develops an online predictive optimization framework for operating a fleet of autonomous vehicles to enhance mobility in an area, where there exists a latent spatio-temporal distribution of demand for commuting between locations. The proposed framework integrates demand prediction and supply optimization in the network design problem. For demand prediction, our framework estimates a marginal demand distribution for each Origin-Destination pair of locations through Quantile Regression, using counts of crowd movements as a proxy for demand. The framework then combines these marginals into a joint demand distribution by constructing a Gaussian copula, which captures the structure of correlation between different Origin-Destination pairs. For supply optimization, we devise a demand-responsive service, based on linear programming, in which route structure and frequency vary according to the predicted demand. We evaluate our framework using a dataset of movement counts, aggregated from WiFi records of a university campus in Denmark, and the results show that our framework outperforms conventional methods for route optimization, which do not utilize the full predictive distribution.

deep learning, neural network, optimization, (22 more...)

1902.09745

Country:

Europe > Denmark (0.34)
Asia (0.14)
Europe > Netherlands (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Transportation > Infrastructure & Services (0.94)
Energy > Oil & Gas (0.67)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

A Dictionary-Based Generalization of Robust PCA Part II: Applications to Hyperspectral Demixing

Rambhatla, Sirisha, Li, Xingguo, Ren, Jineng, Haupt, Jarvis

We consider the task of localizing targets of interest in a hyperspectral (HS) image based on their spectral signature(s), by posing the problem as two distinct convex demixing task(s). With applications ranging from remote sensing to surveillance, this task of target detection leverages the fact that each material/object possesses its own characteristic spectral response, depending upon its composition. However, since $\textit{signatures}$ of different materials are often correlated, matched filtering-based approaches may not be apply here. To this end, we model a HS image as a superposition of a low-rank component and a dictionary sparse component, wherein the dictionary consists of the $\textit{a priori}$ known characteristic spectral responses of the target we wish to localize, and develop techniques for two different sparsity structures, resulting from different model assumptions. We also present the corresponding recovery guarantees, leveraging our recent theoretical results from a companion paper. Finally, we analyze the performance of the proposed approach via experimental evaluations on real HS datasets for a classification task, and compare its performance with related techniques.

artificial intelligence, d-rpca, machine learning, (18 more...)

1902.10238

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
Europe > Italy (0.14)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry: Energy (0.48)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Sensing and Signal Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Iyer, Rishabh, Bilmes, Jeff

A Memoization Framework for Scaling Submodular Optimization to Large Scale Problems

arXiv.org Artificial IntelligenceFeb-26-2019

We are motivated by large scale submodular optimization problems, where standard algorithms that treat the submodular functions in the \emph{value oracle model} do not scale. In this paper, we present a model called the \emph{precomputational complexity model}, along with a unifying memoization based framework, which looks at the specific form of the given submodular function. A key ingredient in this framework is the notion of a \emph{precomputed statistic}, which is maintained in the course of the algorithms. We show that we can easily integrate this idea into a large class of submodular optimization problems including constrained and unconstrained submodular maximization, minimization, difference of submodular optimization, optimization with submodular constraints and several other related optimization problems. Moreover, memoization can be integrated in both discrete and continuous relaxation flavors of algorithms for these problems. We demonstrate this idea for several commonly occurring submodular functions, and show how the precomputational model provides significant speedups compared to the value oracle model. Finally, we empirically demonstrate this for large scale machine learning problems of data subset selection and summarization.

algorithm, artificial intelligence, machine learning, (14 more...)

arXiv.org Artificial Intelligence

1902.10176

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Connecticut > New Haven County > New Haven (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Asia > Japan > Kyūshū & Okinawa > Okinawa (0.04)

Genre: Research Report (0.40)

Industry: Education > Focused Education > Special Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Adaptive Gradient Methods with Dynamic Bound of Learning Rate

Luo, Liangchen, Xiong, Yuanhao, Liu, Yan, Sun, Xu

Adaptive optimization methods such as AdaGrad, RMSprop and Adam have been proposed to achieve a rapid training process with an element-wise scaling term on learning rates. Though prevailing, they are observed to generalize poorly compared with SGD or even fail to converge due to unstable and extreme learning rates. Recent work has put forward some algorithms such as AMSGrad to tackle this issue but they failed to achieve considerable improvement over existing methods. In our paper, we demonstrate that extreme learning rates can lead to poor performance. We provide new variants of Adam and AMSGrad, called AdaBound and AMSBound respectively, which employ dynamic bounds on learning rates to achieve a gradual and smooth transition from adaptive methods to SGD and give a theoretical proof of convergence. We further conduct experiments on various popular tasks and models, which is often insufficient in previous work. Experimental results show that new variants can eliminate the generalization gap between adaptive methods and SGD and maintain higher learning speed early in training at the same time. Moreover, they can bring significant improvement over their prototypes, especially on complex deep networks. The implementation of the algorithm can be found at https://github.com/Luolc/AdaBound .

algorithm, artificial intelligence, machine learning, (20 more...)

1902.09843

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)

Quadratic Decomposable Submodular Function Minimization: Theory and Practice

Li, Pan, He, Niao, Milenkovic, Olgica

We introduce a new convex optimization problem, termed quadratic decomposable submodular function minimization (QDSFM), which allows to model a number of learning tasks on graphs and hypergraphs. The problem exhibits close ties to decomposable submodular function minimization (DSFM), yet is much more challenging to solve. We approach the problem via a new dual strategy and formulate an objective that can be optimized through a number of double-loop algorithms. The outer-loop uses either random coordinate descent (RCD) or alternative projection (AP) methods, for both of which we prove linear convergence rates. The inner-loop computes projections onto cones generated by base polytopes of the submodular functions, via the modified min-norm-point or Frank-Wolfe algorithm. We also describe two new applications of QDSFM: hypergraph-adapted PageRank and semi-supervised learning. The proposed hypergraph-based PageRank algorithm can be used for local hypergraph partitioning, and comes with provable performance guarantees. For hypergraph-adapted semi-supervised learning, we provide numerical experiments demonstrating the efficiency of our QDSFM solvers and their significant improvements on prediction accuracy when compared to state-of-the-art methods. Keywords: Submodular functions, Lovász extensions, Random coordinate descend, Frank-Wolf method, PageRank.

artificial intelligence, hypergraph, machine learning, (14 more...)

1902.10132

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Fused Lasso for Feature Selection using Structural Information

Cui, Lixin, Bai, Lu, Hancock, Edwin R.

Feature selection has been proven a powerful preprocessing step for high-dimensional data analysis. However, most state-of-the-art methods suffer from two major drawbacks. First, they usually overlook the structural correlation information between pairwise samples, which may encapsulate useful information for refining the performance of feature selection. Second, they usually consider candidate feature relevancy equivalent to selected feature relevancy, and some less relevant features may be misinterpreted as salient features. To overcome these issues, we propose a new fused lasso for feature selection using structural information. Our idea is based on converting the original vectorial features into structure-based feature graph representations to incorporate structural relationship between samples, and defining a new evaluation measure to compute the joint significance of pairwise feature combinations in relation to the target feature graph. Furthermore, we formulate the corresponding feature subset selection problem into a least square regression model associated with a fused lasso regularizer to simultaneously maximize the joint relevancy and minimize the redundancy of the selected features. To effectively solve the challenging optimization problem, an iterative algorithm is developed to identify the most discriminative features. Experiments demonstrate the effectiveness of the proposed approach.

artificial intelligence, information, machine learning, (18 more...)

1902.09947

Country: North America > United States (0.69)

Genre: Research Report (0.70)

Industry: Health & Medicine (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Shiraishi, Tatsuya, Le, Tam, Kashima, Hisashi, Yamada, Makoto

Topological Bayesian Optimization with Persistence Diagrams

arXiv.org Machine LearningFeb-25-2019

Finding an optimal parameter of a black-box function is important for searching stable material structures and finding optimal neural network structures, and Bayesian optimization algorithms are widely used for the purpose. However, most of existing Bayesian optimization algorithms can only handle vector data and cannot handle complex structured data. In this paper, we propose the topological Bayesian optimization, which can efficiently find an optimal solution from structured data using \emph{topological information}. More specifically, in order to apply Bayesian optimization to structured data, we extract useful topological information from a structure and measure the proper similarity between structures. To this end, we utilize persistent homology, which is a topological data analysis method that was recently applied in machine learning. Moreover, we propose the Bayesian optimization algorithm that can handle multiple types of topological information by using a linear combination of kernels for persistence diagrams. Through experiments, we show that topological information extracted by persistent homology contributes to a more efficient search for optimal structures compared to the random search baseline and the graph Bayesian optimization algorithm.

artificial intelligence, machine learning, optimization, (16 more...)

1902.09722

Country: Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)

Genre: Research Report (0.50)

Industry: Transportation (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.30)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.30)

Magnússon, Sindri, Shokri-Ghadikolaei, Hossein, Li, Na

On Maintaining Linear Convergence of Distributed Learning and Optimization under Limited Communication

arXiv.org Machine LearningFeb-25-2019

In parallel and distributed machine learning multiple nodes or processors coordinate to solve large problems. To do this, nodes need to compress important algorithm information to bits so it can be communicated. The goal of this paper is to explore how we can maintain the convergence of distributed algorithms under such compression. In particular, we consider a general class of linearly convergent parallel/distributed algorithms and illustrate how we can design quantizers compressing the communicated information to few bits while still preserving the linear convergence. We illustrate our results on learning algorithms using different communication structures, such as decentralized algorithms where a single master coordinates information from many workers and fully distributed algorithms where only neighbors in a communication graph can communicate. We also numerically implement our results in distributed learning on smartphones using real-world data.

algorithm, artificial intelligence, machine learning, (15 more...)

1902.11163

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.87)

Industry: Information Technology (0.46)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)