AITopics

#artificialintelligenceMay-1-2011, 12:00:00 GMT

(PDF) What is AIED and why does Education need it?

Challenges for Computing include Learning for Life (Taylor et al, 2008). Grand Research Challenges in Information Systems identifies the need to "provide a teacher for These are amongst the key challenges that AIED responds to. What will next generation AIED learning environments be like? GROE report (Woolf, 2010), in order to highlight the expected role of AIED research.

artificial intelligence, human computer interaction, learner, (12 more...)

#artificialintelligence

Country:

North America > United States (0.05)
Europe > United Kingdom (0.05)

Genre: Instructional Material (0.33)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence (0.53)
Information Technology > Human Computer Interaction > Interfaces (0.33)

Papadopoulos, H., Vovk, V., Gammerman, A.

Regression Conformal Prediction with Nearest Neighbours

Journal of Artificial Intelligence ResearchApr-30-2011

In this paper we apply Conformal Prediction (CP) to the k-Nearest Neighbours Regression (k-NNR) algorithm and propose ways of extending the typical nonconformity measure used for regression so far. Unlike traditional regression methods which produce point predictions, Conformal Predictors output predictive regions that satisfy a given confidence level. The regions produced by any Conformal Predictor are automatically valid, however their tightness and therefore usefulness depends on the nonconformity measure used by each CP. In effect a nonconformity measure evaluates how strange a given example is compared to a set of other examples based on some traditional machine learning algorithm. We define six novel nonconformity measures based on the k-Nearest Neighbours Regression algorithm and develop the corresponding CPs following both the original (transductive) and the inductive CP approaches. A comparison of the predictive regions produced by our measures with those of the typical regression measure suggests that a major improvement in terms of predictive region tightness is achieved by the new measures.

icp, nonconformity measure, predictive region, (13 more...)

doi: 10.1613/jair.3198

AI Access Foundation

10703

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
(7 more...)

Genre: Research Report (0.47)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Journal of Artificial Intelligence ResearchApr-30-2011

Scaling up Heuristic Planning with Relational Decision Trees

De la Rosa, T., Jimenez, S., Fuentetaja, R., Borrajo, D.

Current evaluation functions for heuristic planning are expensive to compute. In numerous planning problems these functions provide good guidance to the solution, so they are worth the expense. However, when evaluation functions are misguiding or when planning problems are large enough, lots of node evaluations must be computed, which severely limits the scalability of heuristic planners. In this paper, we present a novel solution for reducing node evaluations in heuristic planning based on machine learning. Particularly, we define the task of learning search control for heuristic planning as a relational classification task, and we use an off-the-shelf relational classification tool to address this learning task. Our relational classification task captures the preferred action to select in the different planning contexts of a specific planning domain. These planning contexts are defined by the set of helpful actions of the current state, the goals remaining to be achieved, and the static predicates of the planning task. This paper shows two methods for guiding the search of a heuristic planner with the learned classifiers. The first one consists of using the resulting classifier as an action policy. The second one consists of applying the classifier to generate lookahead states within a Best First Search algorithm. Experiments over a variety of domains reveal that our heuristic planner using the learned classifiers solves larger problems than state-of-the-art planners.

algorithm, configuration, roller, (17 more...)

doi: 10.1613/jair.3231

AI Access Foundation

10702

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > Spain > Galicia > Madrid (0.04)
North America > Canada > British Columbia (0.04)
(2 more...)

Genre: Workflow (0.67)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Murtagh, Fionn, Contreras, Pedro

Methods of Hierarchical Clustering

arXiv.org Machine LearningApr-30-2011

Agglomerative hierarchical clustering has been the dominant approach to constructing embedded classification schemes. It is our aim to direct the reader's attention to practical algorithms and methods - both efficient (from the computational and storage points of view) and effective (from the application point of view). It is often helpful to distinguish between method, involving a compactness criterion and the target structure of a 2-way tree representing the partial order on subsets of the power set; as opposed to an implementation, which relates to the detail of the algorithm used. As with many other multivariate techniques, the objects to be classified have numerical measurements on a set of variables or attributes. Hence, the analysis is carried out on the rows of an array or matrix.

artificial intelligence, machine learning, survey article, (19 more...)

1105.0121

Country:

Europe (0.93)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Overview (0.93)
Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Choi, David S., Wolfe, Patrick J., Airoldi, Edoardo M.

Stochastic blockmodels with growing number of classes

arXiv.org Machine LearningApr-30-2011

We present asymptotic and finite-sample results on the use of stochastic blockmodels for the analysis of network data. We show that the fraction of misclassified network nodes converges in probability to zero under maximum likelihood fitting when the number of classes is allowed to grow as the root of the network size and the average network degree grows at least poly-logarithmically in this size. We also establish finite-sample confidence bounds on maximum-likelihood blockmodel parameter estimates from data comprising independent Bernoulli random variates; these results hold uniformly over class assignment. We provide simulations verifying the conditions sufficient for our results, and conclude by fitting a logit parameterization of a stochastic blockmodel with covariates to a network data example comprising a collection of Facebook profiles, resulting in block estimates that reveal residual structure.

artificial intelligence, machine learning, social media, (19 more...)

doi: 10.1093/biomet/asr053

1011.4644

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Services (0.35)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Mannor, Shie, Tsitsiklis, John

Mean-Variance Optimization in Markov Decision Processes

arXiv.org Artificial IntelligenceApr-29-2011

We consider finite horizon Markov decision processes under performance measures that involve both the mean and the variance of the cumulative reward. We show that either randomized or history-based policies can improve performance. We prove that the complexity of computing a policy that maximizes the mean reward under a variance constraint is NPhard for some cases, and strongly NPhard for others. We finally offer pseudopolynomial exact and approximation algorithms. The classical theory of Markov decision processes (MDPs) deals with the maximization of the cumulative (possibly discounted) expected reward, to be denoted by W. However, a risk-averse decision maker may be interested in additional distributional properties of W. In this paper, we focus on the case where the decision maker is interested in both the mean and the variance of the cumulative reward, and we explore the associated computational issues. Risk aversion in MDPs is of course an old subject. Problems of this type can be handled by state augmentation (e.g., Bertsekas, 1995), namely, by introducing an auxiliary state variable that keeps track of the cumulative past reward. In a few special cases, e.g., with an exponential utility function, state augmentation is unnecessary, and optimal policies can be found by solving a modified Bellman equation (Chung & Sobel, 1987).

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

1104.5601

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Agarwal, Alekh, Duchi, John C.

Distributed Delayed Stochastic Optimization

arXiv.org Machine LearningApr-28-2011

We analyze the convergence of gradient-based optimization algorithms that base their updates on delayed stochastic gradient information. The main application of our results is to the development of gradient-based distributed optimization algorithms where a master node performs parameter updates while worker nodes compute stochastic gradients based on local information in parallel, which may give rise to delays due to asynchrony. We take motivation from statistical problems where the size of the data is so large that it cannot fit on one computer; with the advent of huge datasets in biology, astronomy, and the internet, such problems are now common. Our main contribution is to show that for smooth stochastic problems, the delays are asymptotically negligible and we can achieve order-optimal convergence results. In application to distributed optimization, we develop procedures that overcome communication bottlenecks and synchronization requirements. We show $n$-node architectures whose optimization error in stochastic problems---in spite of asynchronous delays---scales asymptotically as $\order(1 / \sqrt{nT})$ after $T$ iterations. This rate is known to be optimal for a distributed system with $n$ nodes even in the absence of delays. We additionally complement our theoretical results with numerical experiments on a statistical machine learning task.

algorithm, artificial intelligence, machine learning, (18 more...)

1104.5525

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

arXiv.org Machine LearningApr-28-2011

Notes on a New Philosophy of Empirical Science

Burfoot, Daniel

This book presents a methodology and philosophy of empirical science based on large scale lossless data compression. In this view a theory is scientific if it can be used to build a data compression program, and it is valuable if it can compress a standard benchmark database to a small size, taking into account the length of the compressor itself. This methodology therefore includes an Occam principle as well as a solution to the problem of demarcation. Because of the fundamental difficulty of lossless compression, this type of research must be empirical in nature: compression can only be achieved by discovering and characterizing empirical regularities in the data. Because of this, the philosophy provides a way to reformulate fields such as computer vision and computational linguistics as empirical sciences: the former by attempting to compress databases of natural images, the latter by attempting to compress large text databases. The book argues that the rigor and objectivity of the compression principle should set the stage for systematic progress in these fields. The argument is especially strong in the context of computer vision, which is plagued by chronic problems of evaluation. The book also considers the field of machine learning. Here the traditional approach requires that the models proposed to solve learning problems be extremely simple, in order to avoid overfitting. However, the world may contain intrinsically complex phenomena, which would require complex models to understand. The compression philosophy can justify complex models because of the large quantity of data being modeled (if the target database is 100 Gb, it is easy to justify a 10 Mb model). The complex models and abstractions learned on the basis of the raw data (images, language, etc) can then be reused to solve any specific learning problem, such as face recognition or machine translation.

artificial intelligence, machine learning, natural language, (25 more...)

1104.5466

Country:

Europe (0.67)
Asia (0.67)
North America > United States > New York (0.45)

Genre:

Summary/Review (1.00)
Research Report > New Finding (1.00)
Overview (1.00)
Instructional Material (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Media (1.00)
(11 more...)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)
(11 more...)

Oymak, Samet, Hassibi, Babak

Finding Dense Clusters via "Low Rank + Sparse" Decomposition

arXiv.org Machine LearningApr-27-2011

Finding "densely connected clusters" in a graph is in general an important and well studied problem in the literature \cite{Schaeffer}. It has various applications in pattern recognition, social networking and data mining \cite{Duda,Mishra}. Recently, Ames and Vavasis have suggested a novel method for finding cliques in a graph by using convex optimization over the adjacency matrix of the graph \cite{Ames, Ames2}. Also, there has been recent advances in decomposing a given matrix into its "low rank" and "sparse" components \cite{Candes, Chandra}. In this paper, inspired by these results, we view "densely connected clusters" as imperfect cliques, where imperfections correspond missing edges, which are relatively sparse. We analyze the problem in a probabilistic setting and aim to detect disjointly planted clusters. Our main result basically suggests that, one can find \emph{dense} clusters in a graph, as long as the clusters are sufficiently large. We conclude by discussing possible extensions and future research directions.

artificial intelligence, machine learning, optimization problem, (17 more...)

1104.5186

Country: North America > United States (0.46)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)