
Low-Cost Learning via Active Data Procurement

arXiv.org Machine Learning

We design mechanisms for online procurement of data held by strategic agents for machine learning tasks. The challenge is to use past data to actively price future data and to give learning guarantees even when an agent's cost for revealing her data may depend arbitrarily on the data itself. We achieve this goal by showing how to convert a large class of no-regret algorithms into online posted-price and learning mechanisms. Our results in a sense parallel classic sample complexity guarantees, but with the key resource being money rather than quantity of data: with a budget constraint $B$, we give robust risk (predictive error) bounds on the order of $1/\sqrt{B}$. Because we use an active approach, we can often guarantee significantly better performance by leveraging correlations between costs and data. Our algorithms and analysis go through a model of no-regret learning with $T$ arriving (cost, data) pairs and a budget constraint of $B$. Our regret bounds for this model are on the order of $T/\sqrt{B}$, and we give lower bounds of the same order.
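
To make the interaction model concrete, here is a minimal sketch of a budgeted online posted-price procurement loop. The pricing rule and learner update are placeholders (`post_price` and `update_model` are hypothetical callables), not the paper's actual no-regret conversion.

```python
def procure_and_learn(stream, budget, post_price, update_model, model):
    """stream yields (cost, data) pairs from arriving agents.

    post_price and update_model are assumed callables standing in for
    the paper's constructions; this only illustrates the protocol.
    """
    spent = 0.0
    for t, (cost, data) in enumerate(stream):
        price = post_price(model, t)           # price is posted before the cost is seen
        if cost <= price and spent + price <= budget:
            spent += price                     # agent accepts the posted price
            model = update_model(model, data)  # learner trains on the purchased point
        # otherwise the point is not purchased and the learner sees nothing
    return model
```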


Robust and computationally feasible community detection in the presence of arbitrary outlier nodes

arXiv.org Machine Learning

Community detection, which aims to cluster $N$ nodes of a given graph into $r$ distinct groups based on the observed undirected edges, is an important problem in network data analysis. In this paper, the popular stochastic block model (SBM) is extended to the generalized stochastic block model (GSBM), which allows for adversarial outlier nodes that are connected to the other nodes in the graph in an arbitrary way. Under this model, we introduce a procedure using convex optimization followed by the $k$-means algorithm with $k=r$. Both theoretical and numerical properties of the method are analyzed. A theoretical guarantee is given for the procedure to accurately detect the communities with a small misclassification rate in the setting where the number of clusters can grow with $N$. This result matches the best-known guarantee in the literature on computationally feasible community detection in the SBM without outliers. Numerical results show that our method is both computationally fast and robust to different kinds of outliers, while some popular fast community detection algorithms, such as spectral clustering applied to adjacency matrices or graph Laplacians, may fail to retrieve the major clusters due to a small fraction of outliers. We apply a slight modification of our method to a political blogs data set, showing that it performs well in practice and is comparable to existing computationally feasible methods in the literature. To the best of the authors' knowledge, our result is the first in the literature on clustering with a growing number of communities under the GSBM in the presence of arbitrary outlier nodes.
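
As a rough illustration of the pipeline (substituting a plain rank-$r$ eigendecomposition for the paper's convex program, so this is an illustrative simplification rather than the proposed method), denoising the adjacency matrix and then running $k$-means with $k=r$ might look like this:

```python
# A minimal spectral stand-in for the denoise-then-cluster pipeline.
import numpy as np
from sklearn.cluster import KMeans

def detect_communities(A, r):
    """A: symmetric (N, N) adjacency matrix; r: number of communities."""
    vals, vecs = np.linalg.eigh(A)                 # eigenvalues in ascending order
    top = vecs[:, np.argsort(np.abs(vals))[-r:]]   # r leading eigenvectors
    return KMeans(n_clusters=r, n_init=10).fit_predict(top)
```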


Caffe con Troll: Shallow Ideas to Speed Up Deep Learning

arXiv.org Machine Learning

We present Caffe con Troll (CcT), a fully compatible end-to-end version of the popular framework Caffe with rebuilt internals. We built CcT to examine the performance characteristics of training and deploying general-purpose convolutional neural networks across different hardware architectures. We find that, by employing standard batching optimizations for CPU training, we achieve a 4.5x throughput improvement over Caffe on popular networks like CaffeNet. Moreover, with these improvements, the end-to-end training throughput for CNNs is directly proportional to the FLOPS delivered by the CPU, which enables us to train CNNs efficiently on hybrid CPU-GPU systems.
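
A common form of the CPU batching optimization mentioned above is im2col lowering, which unrolls input patches into a matrix so the convolution becomes a single large GEMM that a BLAS library can parallelize. The sketch below is a generic illustration of that trick, not CcT's actual internals.

```python
import numpy as np

def conv2d_im2col(x, w):
    """x: (C, H, W) input; w: (F, C, K, K) filters; stride 1, no padding."""
    C, H, W = x.shape
    F, _, K, _ = w.shape
    out_h, out_w = H - K + 1, W - K + 1
    cols = np.empty((C * K * K, out_h * out_w))
    for i in range(out_h):
        for j in range(out_w):
            cols[:, i * out_w + j] = x[:, i:i + K, j:j + K].ravel()
    out = w.reshape(F, -1) @ cols                  # one big matrix multiply
    return out.reshape(F, out_h, out_w)
```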


MCODE: Multivariate Conditional Outlier Detection

arXiv.org Machine Learning

Outlier detection aims to identify unusual data instances that deviate from expected patterns. Outlier detection is particularly challenging when outliers are context dependent and when they are defined by unusual combinations of multiple outcome variable values. In this paper, we develop and study a new conditional outlier detection approach for multivariate outcome spaces that works by (1) transforming the conditional detection problem into an outlier detection problem in a new (unconditional) space and (2) defining outlier scores by analyzing the data in the new space. Our approach relies on the classifier chain decomposition of the multi-dimensional classification problem, which lets us transform the output space into a probability vector, with one probability for each dimension of the output space. Outlier scores applied to these transformed vectors are then used to detect outliers. Experiments on multiple multi-dimensional classification problems with different outlier injection rates show that our methodology is robust and able to successfully identify outliers when they are either sparse (manifested in one or very few dimensions) or dense (affecting multiple dimensions).
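
An illustrative sketch of the transform-then-score idea follows: fit a classifier chain, map each instance's observed label vector to per-dimension probabilities, and score by improbability. The negative log-likelihood scoring rule here is an assumption for illustration, not necessarily MCODE's exact score.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import ClassifierChain

def conditional_outlier_scores(X, Y):
    """X: (n, d) features; Y: (n, m) binary outcome matrix."""
    chain = ClassifierChain(LogisticRegression(max_iter=1000), random_state=0)
    chain.fit(X, Y)                                # in practice, score held-out data
    p1 = chain.predict_proba(X)                    # P(Y_j = 1 | x, chain)
    p_obs = np.where(Y == 1, p1, 1.0 - p1)         # probability of the observed value
    return -np.log(np.clip(p_obs, 1e-12, 1.0)).sum(axis=1)  # higher = more outlying
```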


Petuum: A New Platform for Distributed Machine Learning on Big Data

arXiv.org Machine Learning

What is a systematic way to efficiently apply a wide spectrum of advanced ML programs to industrial-scale problems, using Big Models (up to 100s of billions of parameters) on Big Data (up to terabytes or petabytes)? Modern parallelization strategies employ fine-grained operations and scheduling beyond the classic bulk-synchronous processing paradigm popularized by MapReduce, or even specialized graph-based execution that relies on graph representations of ML programs. This variety of approaches tends to pull systems and algorithm design in different directions, and it remains difficult to find a universal platform applicable to a wide range of ML programs at scale. We propose a general-purpose framework that systematically addresses data- and model-parallel challenges in large-scale ML by observing that many ML programs are fundamentally optimization-centric and admit error-tolerant, iterative-convergent algorithmic solutions. This presents unique opportunities for an integrative system design, such as bounded-error network synchronization and dynamic scheduling based on ML program structure. We demonstrate the efficacy of these system designs versus well-known implementations of modern ML algorithms, allowing ML programs to run in much less time and at considerably larger model sizes, even on modestly sized compute clusters.
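
Bounded-error synchronization of the kind mentioned above is often realized as a stale synchronous parallel (SSP) clock: a worker may advance only while it is within a fixed staleness of the slowest worker. The toy class below illustrates just that invariant; all names are hypothetical and bear no relation to Petuum's actual API.

```python
class SSPClock:
    """Toy stale-synchronous coordination: fast workers wait for stragglers."""

    def __init__(self, n_workers, staleness):
        self.clocks = [0] * n_workers
        self.staleness = staleness

    def can_proceed(self, worker_id):
        # proceed iff this worker is at most `staleness` clocks ahead of the slowest
        return self.clocks[worker_id] - min(self.clocks) <= self.staleness

    def tick(self, worker_id):
        if not self.can_proceed(worker_id):
            raise RuntimeError("worker must wait for stragglers")
        self.clocks[worker_id] += 1
```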


Information Gathering in Networks via Active Exploration

arXiv.org Artificial Intelligence

How should we gather information in a network, where each node's visibility is limited to its local neighborhood? This problem arises in numerous real-world applications, such as surveying and task routing in social networks, team formation in collaborative networks, and experimental design with dependency constraints. Often the informativeness of a set of nodes can be quantified via a submodular utility function. Existing approaches for submodular optimization, however, require that the set of all selectable nodes be known ahead of time, which is often unrealistic. In contrast, we propose a novel model where we start our exploration from an initial node, and new nodes become visible and available for selection only once one of their neighbors has been chosen. We then present NetExp, a general algorithm for this problem, and provide theoretical bounds on its performance that depend on structural properties of the underlying network. We evaluate our methodology on various simulated problem instances as well as on data collected from a social question answering system deployed within a large enterprise.
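
A natural baseline in this model (not necessarily NetExp itself) is neighborhood-constrained greedy selection: keep a frontier of visible nodes and repeatedly add the one with the largest marginal gain of the submodular utility f.

```python
def explore(graph, start, f, budget):
    """graph: dict node -> set of neighbors; f: set -> utility (submodular)."""
    selected = {start}
    visible = set(graph[start]) - selected
    while len(selected) < budget and visible:
        # pick the visible node with the largest marginal gain
        best = max(visible, key=lambda v: f(selected | {v}) - f(selected))
        selected.add(best)
        visible |= graph[best]          # the new node's neighbors become visible
        visible -= selected
    return selected
```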


Social Trust Prediction via Max-norm Constrained 1-bit Matrix Completion

arXiv.org Machine Learning

Social trust prediction addresses the significant problem of exploring interactions among users in social networks. Naturally, this problem can be formulated in the matrix completion framework, with each entry indicating trust or distrust. However, there are two challenges for the social trust problem: 1) the observed data consist of sign (1-bit) measurements; 2) they are typically sampled non-uniformly. Most previous matrix completion methods do not handle these two issues well. Motivated by recent progress on the max-norm, we propose to solve the problem with a 1-bit max-norm constrained formulation. Since the max-norm is not easy to optimize, we utilize a reformulation of the max-norm which facilitates an efficient projected gradient descent algorithm. We demonstrate the superiority of our formulation on two benchmark datasets.
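
One common way to make such a formulation concrete is to factor $X = UV^\top$, take gradient steps on a logistic loss over the observed signs, and project the rows of $U$ and $V$ onto Euclidean balls, which is the standard max-norm reformulation that makes projection cheap. The sketch below follows that recipe under assumed hyperparameters; it is not necessarily the paper's exact algorithm.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def one_bit_mc(Y, mask, rank=10, radius=2.0, lr=0.1, iters=500, seed=0):
    """Y in {-1,+1}^(m,n); mask: 1 on observed entries; returns completed matrix."""
    rng = np.random.default_rng(seed)
    m, n = Y.shape
    U = rng.normal(scale=0.1, size=(m, rank))
    V = rng.normal(scale=0.1, size=(n, rank))
    for _ in range(iters):
        X = U @ V.T
        G = mask * (-Y) * sigmoid(-Y * X)          # gradient of logistic loss w.r.t. X
        U, V = U - lr * (G @ V), V - lr * (G.T @ U)
        for M in (U, V):                           # project rows onto a ball of given radius
            norms = np.linalg.norm(M, axis=1, keepdims=True)
            np.divide(M, np.maximum(norms / radius, 1.0), out=M)
    return U @ V.T
```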


Spectral Norm of Random Kernel Matrices with Applications to Privacy

arXiv.org Machine Learning

Kernel methods are an extremely popular set of techniques used for many important machine learning and data analysis applications. In addition to having good practical performance, these methods are supported by a well-developed theory. Kernel methods use an implicit mapping of the input data into a high-dimensional feature space defined by a kernel function, i.e., a function returning the inner product between the images of two data points in the feature space. Central to any kernel method is the kernel matrix, which is built by evaluating the kernel function on a given sample dataset. In this paper, we initiate the study of non-asymptotic spectral theory of random kernel matrices. These are $n \times n$ random matrices whose $(i,j)$-th entry is obtained by evaluating the kernel function on $x_i$ and $x_j$, where $x_1, \ldots, x_n$ are $n$ independent random high-dimensional vectors. Our main contribution is to obtain tight upper bounds on the spectral norm (largest eigenvalue) of random kernel matrices constructed from commonly used kernel functions based on polynomials and the Gaussian radial basis function. As an application of these results, we provide lower bounds on the distortion needed for releasing the coefficients of kernel ridge regression under attribute privacy, a general privacy notion which captures a large class of privacy definitions. Kernel ridge regression is a standard method for performing non-parametric regression that regularly outperforms traditional regression approaches in various domains. Our privacy distortion lower bounds are the first for any kernel technique, and our analysis assumes realistic scenarios for the input, unlike all previous lower bounds for other release problems, which hold only under very restrictive input settings.
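
For intuition, the quantity being bounded is easy to simulate: draw $n$ independent random vectors, form the RBF kernel matrix, and take its largest eigenvalue (the matrix is positive semidefinite, so this equals the spectral norm). The dimensions and bandwidth below are arbitrary illustrative choices.

```python
import numpy as np

def rbf_kernel_spectral_norm(n=200, d=500, gamma=1.0 / 500, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d)) / np.sqrt(d)       # x_1, ..., x_n
    sq = np.sum(X**2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2 * X @ X.T   # pairwise squared distances
    K = np.exp(-gamma * D2)                        # RBF kernel matrix
    return np.linalg.eigvalsh(K)[-1]               # largest eigenvalue = spectral norm
```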


Learning of Behavior Trees for Autonomous Agents

arXiv.org Artificial Intelligence

Defining an accurate system model for an Automated Planner (AP) is often impractical, especially for real-world problems. Moreover, off-the-shelf planners fail to scale up and are domain dependent. These drawbacks are inherited from conventional transition systems, such as Finite State Machines (FSMs), that describe the action-plan execution generated by the AP. Behavior Trees (BTs), on the other hand, represent a valid alternative to FSMs, with advantages in terms of modularity, reactiveness, scalability, and domain independence. In this paper, we propose a model-free AP framework using Genetic Programming (GP) to derive an optimal BT for an autonomous agent to achieve a given goal in unknown (but fully observable) environments. We illustrate the proposed framework with experiments conducted on the open-source Mario AI benchmark, automatically generating BTs that play the game character Mario to complete levels of varying difficulty, including enemies and obstacles.
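
To ground the terminology, here is a minimal behavior tree interpreter (omitting the usual RUNNING status and the GP search over trees): a sequence fails on the first failing child, while a selector succeeds on the first succeeding child.

```python
SUCCESS, FAILURE = "success", "failure"

class Action:
    """Leaf node wrapping a condition or action: fn(state) -> SUCCESS/FAILURE."""
    def __init__(self, fn):
        self.fn = fn
    def tick(self, state):
        return self.fn(state)

class Sequence:
    """Succeeds only if every child succeeds, ticked left to right."""
    def __init__(self, *children):
        self.children = children
    def tick(self, state):
        for child in self.children:
            if child.tick(state) == FAILURE:
                return FAILURE
        return SUCCESS

class Selector:
    """Succeeds as soon as one child succeeds (a fallback node)."""
    def __init__(self, *children):
        self.children = children
    def tick(self, state):
        for child in self.children:
            if child.tick(state) == SUCCESS:
                return SUCCESS
        return FAILURE
```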


Semantic Enrichment of Mobile Phone Data Records Using Background Knowledge

arXiv.org Artificial Intelligence

Every day, billions of mobile network events (i.e., call detail records, CDRs) are generated by cellular phone operators. Latent in these data are insights about human actions and behaviors, whose discovery is important because context-aware applications and services hold the key to user-driven, intelligent services that can enhance everyday life, for example in social and economic development, urban planning, and preventive health. The major challenge in this area is that interpreting such a large stream of data requires a deep understanding of the context of mobile network events, drawn from available background knowledge. This article addresses context awareness given heterogeneous and uncertain mobile network event data that lack reliable information on the context of the underlying activity. The contribution of this research is a model that combines logical and statistical reasoning to infer human activity in qualitative terms from open geographical data, with the aim of improving human behavior recognition from CDRs. We use open geographical data, OpenStreetMap (OSM), as a proxy for predicting the content of human activity in an area. A user study performed in Trento shows that predicted (top-level) human activities match the survey data with around 93% overall accuracy. A more extensive validation, predicting a more specific economic type of human activity, was performed in Barcelona using credit card transaction data. The analysis shows that appropriately normalized data on points of interest (POIs) is a good proxy for predicting human economic activities, with 84% accuracy on average. The model thus proves effective for predicting the context of human activity from cell phone data records that otherwise lack contextual information.
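
As a toy illustration of the POI-as-proxy idea (the normalization scheme and category labels here are assumptions for illustration, not the paper's method), one can normalize per-area POI category counts against city-wide totals and take the dominant category as the predicted activity:

```python
import numpy as np

def predict_activity(poi_counts, categories):
    """poi_counts: (areas, categories) raw OSM POI counts per area."""
    col_totals = poi_counts.sum(axis=0, keepdims=True)   # city-wide prevalence per category
    normalized = poi_counts / np.maximum(col_totals, 1)  # damp ubiquitous POI types
    return [categories[i] for i in normalized.argmax(axis=1)]
```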