Information Technology
ATTac-2000: An Adaptive Autonomous Bidding Agent
Kearns, M., Littman, M. L., Singh, S., Stone, P.
The First Trading Agent Competition (TAC) was held from June 22nd to July 8th, 2000. TAC was designed to create a benchmark problem in the complex domain of e-marketplaces and to motivate researchers to apply unique approaches to a common task. This article describes ATTac-2000, the first-place finisher in TAC. ATTac-2000 uses a principled bidding strategy that includes several elements of adaptivity. In addition to the success at the competition, isolated empirical results are presented indicating the robustness and effectiveness of ATTac-2000's adaptive strategy.
b-Bit Minwise Hashing for Large-Scale Linear SVM
Li, Ping, Moore, Joshua, Konig, Christian
In this paper, we propose to (seamlessly) integrate b-bit minwise hashing with linear SVM to substantially improve the training (and testing) efficiency using much smaller memory, with essentially no loss of accuracy. Theoretically, we prove that the resemblance matrix, the minwise hashing matrix, and the b-bit minwise hashing matrix are all positive definite matrices (kernels). Interestingly, our proof for the positive definiteness of the b-bit minwise hashing kernel naturally suggests a simple strategy to integrate b-bit hashing with linear SVM. Our technique is particularly useful when the data can not fit in memory, which is an increasingly critical issue in large-scale machine learning. Our preliminary experimental results on a publicly available webspam dataset (350K samples and 16 million dimensions) verified the effectiveness of our algorithm. For example, the training time was reduced to merely a few seconds. In addition, our technique can be easily extended to many other linear and nonlinear machine learning applications such as logistic regression.
Mean field for Markov Decision Processes: from Discrete to Continuous Optimization
Gast, Nicolas, Gaujal, Bruno, Boudec, Jean-Yves Le
We study the convergence of Markov Decision Processes made of a large number of objects to optimization problems on ordinary differential equations (ODE). We show that the optimal reward of such a Markov Decision Process, satisfying a Bellman equation, converges to the solution of a continuous Hamilton-Jacobi-Bellman (HJB) equation based on the mean field approximation of the Markov Decision Process. We give bounds on the difference of the rewards, and a constructive algorithm for deriving an approximating solution to the Markov Decision Process from a solution of the HJB equations. We illustrate the method on three examples pertaining respectively to investment strategies, population dynamics control and scheduling in queues are developed. They are used to illustrate and justify the construction of the controlled ODE and to show the gain obtained by solving a continuous HJB equation rather than a large discrete Bellman equation.
Happy Movie: A Group Recommender Application in Facebook
Quijano-Sรกnchez, Lara (Universidad Complutense de Madrid) | Recio-Garcia, Juan A. (Universidad Complutense de Madrid) | Dรญaz-Agudo, Belรฉn (Universidad Complutense de Madrid) | Jimenez-Diaz, Guillermo (Universidad Complutense de Madrid)
In this paper we introduce our recommender Happy Movie, a Facebook application for movie recommendation to groups. This system exploits information about the social relationships and behaviour of the users to provide better recommendations. Our previous works have shown that social factors improve the recommendation results. However it required many questionnaires to be filled for obtaining the social information, so we have moved to a social network environment where this information is easily available.
Building Integrated Opinion Delivery Environment
Galitsky, Boris (University of Girona) | Rose, Josep Lluis de la (Universitat de Girona) | Dobrocsi, Gabor (University of Miskolc Miskolc )
We introduce a search engine and information retrieval system for providing access to opinion data. Natural language technology of generalization of syntactic parse trees is introduced as a similarity measure between subjects of textual opinions to link them on the fly. Information extraction algorithm for automatic summarization of web pages in the format of Google sponsored links is presented. We outline the usability of the implemented system, integrated opinion delivery environment (IODE).
Geotagging Tweets Using Their Content
Paradesi, Sharon Myrtle (Massachusetts Institute of Technology)
Harnessing rich, but unstructured information on social networks in real-time and showing it to relevant audience based on its geographic location is a major challenge. The system developed, TwitterTagger, geotags tweets and shows them to users based on their current physical location. Experimental validation shows a performance improvement of three orders by TwitterTagger compared to that of the baseline model.
Supporting End-User Authoring of Alternate Reality Games with Cross-Location Compatibility
Hajarnis, Sanjeet (Georgia Institute of Technology) | Barve, Chinmay (Georgia Institute of Technology) | Karnik, Devika (Georgia Institute of Technology) | Riedl, Mark (Georgia Institute of Technology)
A typical ARG consists of a Puppet Master who issues that have historically prevented ARGs from designs the game and informs players of the unfolding of mainstream adoption. A generic game engine runs on a the story. With the advent of smart-phones with GPS, geo-location enabled mobile device enables users to play ARGs progressively make use of the actual world as the any game modeled as a dependency graph of game content.
Generating Similar Graphs From Spherical Features
Lunga, Dalton, Kirshner, Sergey
We propose a novel model for generating graphs similar to a given example graph. Unlike standard approaches that compute features of graphs in Euclidean space, our approach obtains features on a surface of a hypersphere. We then utilize a von Mises-Fisher distribution, an exponential family distribution on the surface of a hypersphere, to define a model over possible feature values. While our approach bears similarity to a popular exponential random graph model (ERGM), unlike ERGMs, it does not suffer from degeneracy, a situation when a significant probability mass is placed on unrealistic graphs. We propose a parameter estimation approach for our model, and a procedure for drawing samples from the distribution. We evaluate the performance of our approach both on the small domain of all 8-node graphs as well as larger real-world social networks.
All-at-once Optimization for Coupled Matrix and Tensor Factorizations
Acar, Evrim, Kolda, Tamara G., Dunlavy, Daniel M.
Joint analysis of data from multiple sources has the potential to improve our understanding of the underlying structures in complex data sets. For instance, in restaurant recommendation systems, recommendations can be based on rating histories of customers. In addition to rating histories, customers' social networks (e.g., Facebook friendships) and restaurant categories information (e.g., Thai or Italian) can also be used to make better recommendations. The task of fusing data, however, is challenging since data sets can be incomplete and heterogeneous, i.e., data consist of both matrices, e.g., the person by person social network matrix or the restaurant by category matrix, and higher-order tensors, e.g., the "ratings" tensor of the form restaurant by meal by person. In this paper, we are particularly interested in fusing data sets with the goal of capturing their underlying latent structures. We formulate this problem as a coupled matrix and tensor factorization (CMTF) problem where heterogeneous data sets are modeled by fitting outer-product models to higher-order tensors and matrices in a coupled manner. Unlike traditional approaches solving this problem using alternating algorithms, we propose an all-at-once optimization approach called CMTF-OPT (CMTF-OPTimization), which is a gradient-based optimization approach for joint analysis of matrices and higher-order tensors. We also extend the algorithm to handle coupled incomplete data sets. Using numerical experiments, we demonstrate that the proposed all-at-once approach is more accurate than the alternating least squares approach.
Stochastic blockmodels with growing number of classes
Choi, David S., Wolfe, Patrick J., Airoldi, Edoardo M.
We present asymptotic and finite-sample results on the use of stochastic blockmodels for the analysis of network data. We show that the fraction of misclassified network nodes converges in probability to zero under maximum likelihood fitting when the number of classes is allowed to grow as the root of the network size and the average network degree grows at least poly-logarithmically in this size. We also establish finite-sample confidence bounds on maximum-likelihood blockmodel parameter estimates from data comprising independent Bernoulli random variates; these results hold uniformly over class assignment. We provide simulations verifying the conditions sufficient for our results, and conclude by fitting a logit parameterization of a stochastic blockmodel with covariates to a network data example comprising a collection of Facebook profiles, resulting in block estimates that reveal residual structure.