AITopics

1401.0086

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.86)

Desai, Kalpit V, Ranjan, Roopesh

Insights from the Wikipedia Contest (IEEE Contest for Data Mining 2011)

arXiv.org Machine Learning Jan-7-2014

The Wikimedia Foundation has recently observed that newly joining editors on Wikipedia are increasingly failing to integrate into the Wikipedia editors' community, i.e. the community is becoming increasingly harder to penetrate [1]. To sustain healthy growth of the community, the Wikimedia Foundation aims to quantitatively understand the factors that determine editing behavior and to explain why most new editors become inactive soon after joining. As a step towards this broader goal, the Wikimedia Foundation sponsored the ICDM (IEEE International Conference for Data Mining) contest [2] for the year 2011. The objective for the participants was to develop models that predict the number of edits an editor will make in the next five months, based on the editor's editing history. Here we describe the approach we followed for developing predictive models towards this goal, the results that we obtained and the modeling insights that we gained from this exercise. In addition, towards the broader goal of the Wikimedia Foundation, we also summarize the factors that emerged during our model building exercise as powerful predictors of future editing activity.

data mining, machine learning, predictor, (20 more...)

1405.7393

Country: North America > United States (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Data Science > Data Mining (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

arXiv.org Artificial Intelligence Jan-7-2014

Cortical prediction markets

Balduzzi, David

We investigate cortical learning from the perspective of mechanism design. First, we show that discretizing standard models of neurons and synaptic plasticity leads to rational agents maximizing simple scoring rules. Second, our main result is that the scoring rules are proper, implying that neurons faithfully encode expected utilities in their synaptic weights and encode high-scoring outcomes in their spikes. Third, with this foundation in hand, we propose a biologically plausible mechanism whereby neurons backpropagate incentives which allows them to optimize their usefulness to the rest of cortex. Finally, experiments show that networks that backpropagate incentives can learn simple tasks.

Keywords: incentives for cooperation, multiagent learning, biologically-inspired approaches, prediction markets

1. Introduction

How does the brain encode information about the environment into its structure [26]? Inspired by recent work in prediction markets, this paper investigates cortical learning and the neural code from the perspective of mechanism design [15, 18, 2, 3, 1]. To the best of our knowledge it is the first paper to do so.

artificial intelligence, machine learning, neuron, (17 more...)

1401.1465

Country: Europe (0.28)

Genre: Research Report (1.00)

Industry: Banking & Finance > Trading > Prediction Market (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.48)

Kiwaki, Taichi, Makino, Takaki, Aihara, Kazuyuki

Approximated Infomax Early Stopping: Revisiting Gaussian RBMs on Natural Images

We pursue an early stopping technique that helps Gaussian Restricted Boltzmann Machines (GRBMs) to gain good natural image representations in terms of overcompleteness and data fitting. GRBMs are widely considered unsuitable models for natural images because they gain non-overcomplete representations which include uniform filters that do not represent useful image features. Contrary to this common perspective, we have recently found that GRBMs first gain and subsequently lose useful filters during their training. We attribute this phenomenon to a tradeoff between overcompleteness of GRBM representations and data fitting. To gain GRBM representations that are overcomplete and fit data well, we propose a measure for GRBM representation quality, approximated mutual information, and an early stopping technique based on this measure. The proposed method boosts performance of classifiers trained on GRBM representations.

artificial intelligence, machine learning, representation, (15 more...)

1312.5412

Country:

Asia > Japan (0.15)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.36)

Differentially Private Data Releasing for Smooth Queries with Synthetic Database Output

Jin, Chi, Wang, Ziteng, Huang, Junliang, Zhong, Yiqiao, Wang, Liwei

Machine learning is often conducted on datasets containing sensitive information, such as medical records, commercial data, etc. The benefit of learning from such data is tremendous. But when releasing sensitive data, one must take privacy into consideration and trade off accuracy against the privacy loss of the individuals in the database. In this paper we study differential privacy [11], which has become a standard concept of privacy. Differential privacy guarantees that almost nothing new can be learned from the database that contains one specific individual's information compared with that from the database without that individual's information. More concretely, a mechanism which releases information about the database is said to preserve differential privacy if the change of a single database element does not affect the probability distribution of the output significantly. Therefore differential privacy provides strong guarantees against attacks; the risk to any individual of submitting her information to the database is very small.
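To make the definition above concrete, here is a minimal sketch of the classic Laplace mechanism applied to a counting query. This is a standard textbook construction, not the synthetic-database method the abstract refers to. The names `laplace_noise` and `private_count`, the toy records, and the choice of `epsilon` are all assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Return a noisy count of the records that satisfy `predicate`."""
    true_count = sum(1 for r in records if predicate(r))
    # Changing a single record changes a count by at most 1 (sensitivity 1),
    # so Laplace noise with scale 1/epsilon gives epsilon-differential privacy.
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    ages = [23, 35, 41, 58, 62, 67, 70]  # hypothetical sensitive records
    print(private_count(ages, lambda a: a >= 60, epsilon=0.5, rng=rng))
```

A smaller `epsilon` means a larger noise scale, which is the accuracy versus privacy tradeoff mentioned in the abstract.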

artificial intelligence, machine learning, mechanism, (19 more...)

1401.0987

Country: North America > United States (0.46)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Loh, Po-Ling, Wainwright, Martin J.

Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses

We investigate the relationship between the structure of a discrete graphical model and the support of the inverse of a generalized covariance matrix. We show that for certain graph structures, the support of the inverse covariance matrix of indicator variables on the vertices of a graph reflects the conditional independence structure of the graph. Our work extends results that have previously been established only in the context of multivariate Gaussian graphical models, thereby addressing an open question about the significance of the inverse covariance matrix of a non-Gaussian distribution. The proof exploits a combination of ideas from the geometry of exponential families, junction tree theory and convex analysis. These population-level results have various consequences for graph selection methods, both known and novel, including a novel method for structure estimation for missing or corrupted observations. We provide nonasymptotic guarantees for such methods and illustrate the sharpness of these predictions via simulations.

artificial intelligence, graph, machine learning, (14 more...)

doi: 10.1214/13-AOS1162

1212.0478

Country: North America > United States > California > Alameda County (0.28)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.95)

Yang, Jiyan, Meng, Xiangrui, Mahoney, Michael W.

Quantile Regression for Large-scale Applications

Quantile regression is a method to estimate the quantiles of the conditional distribution of a response variable, and as such it permits a much more accurate portrayal of the relationship between the response variable and observed covariates than methods such as Least-squares or Least Absolute Deviations regression. It can be expressed as a linear program, and, with appropriate preprocessing, interior-point methods can be used to find a solution for moderately large problems. Dealing with very large problems, e.g., involving data up to and beyond the terabyte regime, remains a challenge. Here, we present a randomized algorithm that runs in nearly linear time in the size of the input and that, with constant probability, computes a $(1+\epsilon)$ approximate solution to an arbitrary quantile regression problem. As a key step, our algorithm computes a low-distortion subspace-preserving embedding with respect to the loss function of quantile regression. Our empirical evaluation illustrates that our algorithm is competitive with the best previous work on small to medium-sized problems, and that in addition it can be implemented in MapReduce-like environments and applied to terabyte-sized problems.
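As a rough illustration of the loss function of quantile regression mentioned above, the sketch below implements the standard pinball (check) loss and fits the simplest possible model, a single constant, by brute force. It is not the randomized algorithm or the linear-program formulation from the abstract. The function names and the toy data are assumptions made for illustration.

```python
def pinball_loss(residual, tau):
    # Under-predictions (residual >= 0) are weighted by tau,
    # over-predictions by (1 - tau).
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def total_loss(ys, q, tau):
    # Sum of pinball losses when every response is predicted by the constant q.
    return sum(pinball_loss(y - q, tau) for y in ys)


def best_constant(ys, tau):
    # The loss is piecewise linear and convex in q, so a minimizer is
    # attained at one of the data points; try each of them.
    return min(ys, key=lambda q: total_loss(ys, q, tau))


if __name__ == "__main__":
    ys = [1, 2, 3, 4, 100]
    print(best_constant(ys, 0.5))  # the median, unaffected by the outlier
    print(best_constant(ys, 0.9))  # an upper quantile
```

With `tau = 0.5` the loss reduces to half the absolute deviation, which is why the fitted constant is the median rather than the mean.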

algorithm, artificial intelligence, optimization problem, (16 more...)

1305.0087

Country: North America > United States > California > Santa Clara County (0.28)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

arXiv.org Artificial Intelligence Jan-6-2014

Constraint Solvers for User Interface Layout

Jamil, Noreen

Constraints have played an important role in the construction of GUIs, where they are mainly used to define the layout of the widgets. Resizing behavior is very important in GUIs because areas have domain-specific parameters such as form the resizing of windows. If a linear objective function is used and the window is resized, the error is not distributed equally. To distribute the error equally, a quadratic objective function is introduced. Different algorithms are widely used for solving linear constraints and quadratic problems in a variety of scientific areas. The linear relaxation, Kaczmarz, direct and linear programming methods are common methods for solving linear constraints for GUI layout. The interior point and active set methods are the most commonly used techniques for solving quadratic programming problems. Current constraint solvers designed for GUI layout do not use interior point methods for solving a quadratic objective function subject to linear equality and inequality constraints. In this paper, performance aspects and the convergence speed of interior point and active set methods are compared, along with one of the most commonly used linear programming methods, when they are implemented for graphical user interface layout. The performance and convergence of the proposed algorithms are evaluated empirically using randomly generated UI layout specifications of various sizes. The results show that the interior point algorithms perform significantly better than the Simplex method and the QOCA solver, which uses an active set method implementation for solving quadratic optimization.

artificial intelligence, constraint, optimization problem, (16 more...)

1401.1031

Country: North America > United States (0.69)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)

Hoos, Holger, Kaminski, Roland, Lindauer, Marius, Schaub, Torsten

Solver Scheduling via Answer Set Programming

arXiv.org Artificial Intelligence Jan-6-2014

Although Boolean Constraint Technology has made tremendous progress over the last decade, the efficacy of state-of-the-art solvers is known to vary considerably across different types of problem instances and is known to depend strongly on algorithm parameters. This problem was addressed by means of a simple, yet effective approach using handmade, uniform and unordered schedules of multiple solvers in ppfolio, which showed very impressive performance in the 2011 SAT Competition. Inspired by this, we take advantage of the modeling and solving capacities of Answer Set Programming (ASP) to automatically determine more refined, that is, non-uniform and ordered solver schedules from existing benchmarking data. We begin by formulating the determination of such schedules as multi-criteria optimization problems and provide corresponding ASP encodings. The resulting encodings are easily customizable for different settings and the computation of optimum schedules can mostly be done in the blink of an eye, even when dealing with large runtime data sets stemming from many solvers on hundreds to thousands of instances. Also, the fact that our approach can be customized easily enabled us to swiftly adapt it to generate parallel schedules for multi-processor machines.

artificial intelligence, logic & formal reasoning, solver, (18 more...)

1401.1024

Country: North America > Canada (0.28)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

Maier, Marc, Marazopoulou, Katerina, Jensen, David

Reasoning about Independence in Probabilistic Models of Relational Data

arXiv.org Artificial Intelligence Jan-6-2014

We extend the theory of d-separation to cases in which data instances are not independent and identically distributed. We show that applying the rules of d-separation directly to the structure of probabilistic models of relational data inaccurately infers conditional independence. We introduce relational d-separation, a theory for deriving conditional independence facts from relational models. We provide a new representation, the abstract ground graph, that enables a sound, complete, and computationally efficient method for answering d-separation queries about relational models, and we present empirical results that demonstrate effectiveness.

artificial intelligence, bayesian inference, machine learning, (17 more...)

1302.4381

Country: North America > United States > Massachusetts (0.67)

Genre: Research Report > New Finding (0.46)

Industry: Government > Regional Government (0.45)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.70)