AITopics

1112.6411

Country: Europe (0.28)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Huang, Jian, Zhang, Cun-Hui

Estimation And Selection Via Absolute Penalized Convex Minimization And Its Multistage Adaptive Applications

arXiv.org Machine LearningDec-29-2011

The $\ell_1$-penalized method, or the Lasso, has emerged as an important tool for the analysis of large data sets. Many important results have been obtained for the Lasso in linear regression which have led to a deeper understanding of high-dimensional statistical problems. In this article, we consider a class of weighted $\ell_1$-penalized estimators for convex loss functions of a general form, including the generalized linear models. We study the estimation, prediction, selection and sparsity properties of the weighted $\ell_1$-penalized estimator in sparse, high-dimensional settings where the number of predictors $p$ can be much larger than the sample size $n$. Adaptive Lasso is considered as a special case. A multistage method is developed to apply an adaptive Lasso recursively. We provide $\ell_q$ oracle inequalities, a general selection consistency theorem, and an upper bound on the dimension of the Lasso estimator. Important models including the linear regression, logistic regression and log-linear models are used throughout to illustrate the applications of the general results.

artificial intelligence, lasso, machine learning, (17 more...)

1112.6363

Country: Europe (0.28)

Genre: Research Report (0.90)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Journal of Artificial Intelligence ResearchDec-28-2011

Dr.Fill: Crosswords and an Implemented Solver for Singly Weighted CSPs

Ginsberg, M. L.

We describe Dr.Fill, a program that solves American-style crossword puzzles. From a technical perspective, Dr.Fill works by converting crosswords to weighted CSPs, and then using a variety of novel techniques to find a solution. These techniques include generally applicable heuristics for variable and value selection, a variant of limited discrepancy search, and postprocessing and partitioning ideas. Branch and bound is not used, as it was incompatible with postprocessing and was determined experimentally to be of little practical value. Dr.Filll's performance on crosswords from the American Crossword Puzzle Tournament suggests that it ranks among the top fifty or so crossword solvers in the world.

artificial intelligence, natural language, puzzle, (17 more...)

doi: 10.1613/jair.3437

AI Access Foundation

10741

Country:

North America > United States > California > Santa Clara County > Stanford (0.14)
Europe > France (0.04)
North America > United States > New York (0.04)
(8 more...)

Genre: Research Report > Promising Solution (0.48)

Industry:

Media (1.00)
Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Rafi, Muhammad, Shaikh, M. Shahid, Farooq, Amir

Document Clustering based on Topic Maps

arXiv.org Artificial IntelligenceDec-28-2011

Importance of document clustering is now widely acknowledged by researchers for better management, smart navigation, efficient filtering, and concise summarization of large collection of documents like World Wide Web (WWW). The next challenge lies in semantically performing clustering based on the semantic contents of the document. The problem of document clustering has two main components: (1) to represent the document in such a form that inherently captures semantics of the text. This may also help to reduce dimensionality of the document, and (2) to define a similarity measure based on the semantic representation such that it assigns higher numerical values to document pairs which have higher semantic relationship. Feature space of the documents can be very challenging for document clustering. A document may contain multiple topics, it may contain a large set of class-independent general-words, and a handful class-specific core-words. With these features in mind, traditional agglomerative clustering algorithms, which are based on either Document Vector model (DVM) or Suffix Tree model (STC), are less efficient in producing results with high cluster quality. This paper introduces a new approach for document clustering based on the Topic Map representation of the documents. The document is being transformed into a compact form. A similarity measure is proposed based upon the inferred information through topic maps data and structures. The suggested method is implemented using agglomerative hierarchal clustering and tested on standard Information retrieval (IR) datasets. The comparative experiment reveals that the proposed approach is effective in improving the cluster quality.

data mining, machine learning, topic map, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.5120/1640-2204

1112.6219

Country:

Europe (0.94)
Asia > Pakistan (0.31)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

arXiv.org Artificial IntelligenceDec-27-2011

Convex Optimization without Projection Steps

Jaggi, Martin

For the general problem of minimizing a convex function over a compact convex domain, we will investigate a simple iterative approximation algorithm based on the method by Frank & Wolfe 1956, that does not need projection steps in order to stay inside the optimization domain. Instead of a projection step, the linearized problem defined by a current subgradient is solved, which gives a step direction that will naturally stay in the domain. Our framework generalizes the sparse greedy algorithm of Frank & Wolfe and its primal-dual analysis by Clarkson 2010 (and the low-rank SDP approach by Hazan 2008) to arbitrary convex domains. We give a convergence proof guaranteeing {\epsilon}-small duality gap after O(1/{\epsilon}) iterations. The method allows us to understand the sparsity of approximate solutions for any l1-regularized convex optimization problem (and for optimization over the simplex), expressed as a function of the approximation quality. We obtain matching upper and lower bounds of {\Theta}(1/{\epsilon}) for the sparsity for l1-problems. The same bounds apply to low-rank semidefinite optimization with bounded trace, showing that rank O(1/{\epsilon}) is best possible here as well. As another application, we obtain sparse matrices of O(1/{\epsilon}) non-zero entries as {\epsilon}-approximate solutions when optimizing any convex function over a class of diagonally dominant symmetric matrices. We show that our proposed first-order method also applies to nuclear norm and max-norm matrix optimization problems. For nuclear norm regularized optimization, such as matrix completion and low-rank recovery, we demonstrate the practical efficiency and scalability of our algorithm for large matrix problems, as e.g. the Netflix dataset. For general convex optimization over bounded matrix max-norm, our algorithm is the first with a convergence guarantee, to the best of our knowledge.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

1108.117

Country:

Europe (0.27)
North America > United States (0.27)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Snyder, John C., Rupp, Matthias, Hansen, Katja, Müller, Klaus-Robert, Burke, Kieron

Finding Density Functionals with Machine Learning

arXiv.org Machine LearningDec-22-2011

Institute of Pharmaceutical Sciences, ETH Zurich, 8093 Zürich, Switzerland (Dated: March 5, 2018) Machine learning is used to approximate density functionals. For the model problem of the kinetic energy of non-interacting fermions in 1d, mean absolute errors below 1 kcal/mol on test densities similar to the training set are reached with fewer than 100 training densities. A predictor identifies if a test density is within the interpolation region. Via principal component analysis, a projected functional derivative finds highly accurate self-consistent densities. Challenges for application of our method to real electronic structure problems are discussed.

approximation, artificial intelligence, machine learning, (15 more...)

doi: 10.1103/PhysRevLett.108.253002

1112.5441

Country:

Europe > Switzerland > Zürich > Zürich (0.45)
North America > United States > California > Orange County > Irvine (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Duvenaud, David, Nickisch, Hannes, Rasmussen, Carl Edward

Additive Gaussian Processes

arXiv.org Machine LearningDec-19-2011

We introduce a Gaussian process model of functions which areadditive . An additive function is one which decomposes into a sum of low-dimensional functions, each depending on only a subset of the input variables. Additive GPs generalize both Generalized Additive Models, and the standard GP models which use squared-exponential kernels. Hyperparameter learning in this model can be seen as Bayesian Hierarchical Kernel Learning (HKL). We introduce an expressive but tractable parameterization of the kernel function, which allows efficient evaluation of all input interaction terms, whose number is exponential in the input dimension. The additional structure discoverable by this model results in increased interpretability, as well as state-of-the-art predictive power in regression tasks.

artificial intelligence, kernel, machine learning, (18 more...)

1112.4394

Country:

Europe > United Kingdom > England (0.14)
North America > United States (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

arXiv.org Machine LearningDec-19-2011

A New Algorithm for Exploratory Projection Pursuit

Dayal, Mohit

In this paper, we propose a new algorithm for exploratory projection pursuit. The basis of the algorithm is the insight that previous approaches used fairly narrow definitions of interestingness / non interestingness. We argue that allowing these definitions to depend on the problem / data at hand is a more natural approach in an exploratory technique. This also allows our technique much greater applicability than the approaches extant in the literature. Complementing this insight, we propose a class of projection indices based on the spatial distribution function that can make use of such information. Finally, with the help of real datasets, we demonstrate how a range of multivariate exploratory tasks can be addressed with our algorithm. The examples further demonstrate that the proposed indices are quite capable of focussing on the interesting structure in the data, even when this structure is otherwise hard to detect or arises from very subtle patterns.

artificial intelligence, machine learning, projection, (16 more...)

1112.4321

Country: Europe > Italy (1.00)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.68)

Freer, Cameron E., Roy, Daniel M.

Computable de Finetti measures

arXiv.org Machine LearningDec-19-2011

We prove a computable version of de Finetti's theorem on exchangeable sequences of real random variables. As a consequence, exchangeable stochastic processes expressed in probabilistic functional programming languages can be automatically rewritten as procedures that do not modify non-local state. Along the way, we prove that a distribution on the unit interval is computable if and only if its moments are uniformly computable. Key words: de Finetti's theorem, exchangeability, computable probability theory, probabilistic programming languages, mutation 2010 MSC: 03D78, 60G09, 68Q10, 03F60, 68N18 1. Introduction The classical de Finetti theorem states that an exchangeable sequence of real random variables is a mixture of independent and identically distributed (i.i.d.) sequences of random variables. Moreover, there is an (almost surely unique) measure-valued random variable, called the directing random measure, conditioned on which the random sequence is i.i.d. The distribution of the directing random measure is called the de Finetti measure or the mixing measure. This paper examines the computable probability theory of exchangeable sequences of real-valued random variables. We prove a computable version of de Finetti's theorem: the distribution of an exchangeable sequence of real random variables is computable if and only if its de Finetti measure is computable. The classical proofs do not readily effectivize; instead, we show how to directly compute the de Finetti measure (as characterized by the classical theorem) in terms of a computable representation of the distribution of the exchangeable sequence. A key step in the proof is to describe the de Finetti measure in terms of the moments of a set of random variables derived from the exchangeable sequence. When the directing random measure is (almost surely) continuous, we can show that these moments are computable, which suffices to complete the proof of the main theorem in this case.

artificial intelligence, machine learning, programming language, (17 more...)

doi: 10.1016/j.apal.2011.06.011

0912.1072

Country:

Europe (0.93)
North America > United States (0.68)

Genre: Research Report (0.70)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.88)

Journal of Artificial Intelligence ResearchDec-18-2011

Theoretical and Practical Foundations of Large-Scale Agent-Based Micro-Storage in the Smart Grid

Vytelingum, P., Voice, T. D., Ramchurn, S. D., Rogers, A., Jennings, N. R.

In this paper, we present a novel decentralised management technique that allows electricity micro-storage devices, deployed within individual homes as part of a smart electricity grid, to converge to profitable and efficient behaviours. Specifically, we propose the use of software agents, residing on the users' smart meters, to automate and optimise the charging cycle of micro-storage devices in the home to minimise its costs, and we present a study of both the theoretical underpinnings and the implications of a practical solution, of using software agents for such micro-storage management. First, by formalising the strategic choice each agent makes in deciding when to charge its battery, we develop a game-theoretic framework within which we can analyse the competitive equilibria of an electricity grid populated by such agents and hence predict the best consumption profile for that population given their battery properties and individual load profiles. Our framework also allows us to compute theoretical bounds on the amount of storage that will be adopted by the population. Second, to analyse the practical implications of micro-storage deployments in the grid, we present a novel algorithm that each agent can use to optimise its battery storage profile in order to minimise its owner's costs. This algorithm uses a learning strategy that allows it to adapt as the price of electricity changes in real-time, and we show that the adoption of these strategies results in the system converging to the theoretical equilibria. Finally, we empirically evaluate the adoption of our micro-storage management technique within a complex setting, based on the UK electricity market, where agents may have widely varying load profiles, battery types, and learning rates. In this case, our approach yields savings of up to 14% in energy cost for an average consumer using a storage device with a capacity of less than 4.5 kWh and up to a 7% reduction in carbon emissions resulting from electricity generation (with only domestic consumers adopting micro-storage and, commercial and industrial consumers not changing their demand). Moreover, corroborating our theoretical bound, an equilibrium is shown to exist where no more than 48% of households would wish to own storage devices and where social welfare would also be improved (yielding overall annual savings of nearly £1.5B).

agent, electricity, storage profile, (14 more...)

doi: 10.1613/jair.3446

AI Access Foundation

10739

Country:

North America > United States > Washington > Benton County > Richland (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
(3 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Transportation > Ground > Road (1.00)
Energy > Power Industry > Utilities (0.87)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)