AITopics

doi: 10.1109/JSTSP.2014.2299517

1309.6707

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology > Security & Privacy (0.74)
Education > Educational Setting > Online (0.41)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Artificial IntelligenceJan-21-2014

Compositional Operators in Distributional Semantics

Kartsaklis, Dimitri

The recent developments on the syntactical and morphological analysis of natural language text constitute the first step towards a more ambitious goal, that of assigning a proper form of meaning to arbitrary text compounds. Indeed, for certain really "intelligent" applications, such as machine translation, question-answering systems, paraphrase detection, or automatic essay scoring, to name just a few, there will always exist a gap between raw linguistic information (such as part-of-speech labels, for example) and the knowledge of the real world that is needed for the completion of the task in a satisfactory way. Semantic analysis has exactly this role, aiming to close (or reduce as much as possible) this gap by linking the linguistic information with semantic representations that embody this elusive real-world knowledge. The traditional way of adding semantics to sentences is a syntax-driven compositional approach: every word in the sentence is associated with a primitive symbol or a predicate, and these are combined to larger and larger logical forms based on the syntactical rules of the grammar. At the end of the syntactical analysis, the logical representation of the whole sentence is a complex formula that can be fed to a theorem prover for further processing. Although such an approach seems intuitive, it has been shown that it is rather inefficient for any practical application (for example, Bos and Markert (2006) get very low recall scores for a textual entailment task).

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s40362-014-0017-z

1401.5327

Country:

North America > United States (0.67)
Asia (0.67)
Europe > United Kingdom > England (0.28)

Genre:

Overview (1.00)
Research Report (0.63)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine LearningJan-19-2014

Anomaly detection in reconstructed quantum states using a machine-learning technique

Hara, Satoshi, Ono, Takafumi, Okamoto, Ryo, Washio, Takashi, Takeuchi, Shigeki

The accurate detection of small deviations in given density matrices is important for quantum information processing. Here we propose a new method based on the concept of data mining. We demonstrate that the proposed method can more accurately detect small erroneous deviations in reconstructed density matrices, which contain intrinsic fluctuations due to the limited number of samples, than a naive method of checking the trace distance from the average of the given density matrices. This method has the potential to be a key tool in broad areas of physics where the detection of small deviations of quantum states reconstructed using a limited number of samples are essential.

artificial intelligence, data mining, machine learning, (18 more...)

doi: 10.1103/PhysRevA.89.022104

1401.4785

Country: Asia > Japan > Honshū (0.15)

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Garcia-Cardona, Cristina, Merkurjev, Ekaterina, Bertozzi, Andrea L., Flenner, Arjuna, Percus, Allon

Multiclass Data Segmentation using Diffuse Interface Methods on Graphs

arXiv.org Machine LearningJan-17-2014

We present two graph-based algorithms for multiclass segmentation of high-dimensional data. The algorithms use a diffuse interface model based on the Ginzburg-Landau functional, related to total variation compressed sensing and image processing. A multiclass extension is introduced using the Gibbs simplex, with the functional's double-well potential modified to handle the multiclass case. The first algorithm minimizes the functional using a convex splitting numerical scheme. The second algorithm is a uses a graph adaptation of the classical numerical Merriman-Bence-Osher (MBO) scheme, which alternates between diffusion and thresholding. We demonstrate the performance of both algorithms experimentally on synthetic data, grayscale and color images, and several benchmark data sets such as MNIST, COIL and WebKB. We also make use of fast numerical solvers for finding the eigenvectors and eigenvalues of the graph Laplacian, and take advantage of the sparsity of the matrix. Experiments indicate that the results are competitive with or better than the current state-of-the-art multiclass segmentation algorithms.

algorithm, artificial intelligence, upstream oil & gas, (18 more...)

1302.3913

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Wisconsin (0.14)
South America (0.14)
(4 more...)

Genre:

Research Report (0.64)
Personal (0.46)

Industry:

Education > Educational Setting > Higher Education (0.46)
Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.88)

Baingana, Brian, Giannakis, Georgios B.

Embedding Graphs under Centrality Constraints for Network Visualization

arXiv.org Machine LearningJan-17-2014

In this case, the vertex dissimilarity structure is preserved through pairwise distance metrics between vertices. Principal component analysis (PCA) of the graph adjacency matrix is advocated in [3], leading to a spectral embedding whose vertices correspond to entries of the leading component vectors. The structure preserving embedding algorithm [4] solves a semidefinite program with linear topology constraints so that a nearest neighbor algorithm can recover the graph edges from the embedding. Visual analytics approaches developed in [7] and [12] emphasize community structures with applications to community browsing in graphs. Concentric graph layouts developed in [39] and [30] capture notions of node hierarchy by placing the highest ranked nodes at the center of the embedding. Although the graph embedding problem has been studied for years, development of fast and optimal visualization algorithms with hierarchical constraints is challenging and existing methods typically resort to heuristic approaches. The growing interest in analysis of very large networks has prioritized the need for effectively capturing hierarchy over aesthetic appeal in visualization. For instance, a hierarchy-aware visual analysis of a global computer network is naturally more useful to security experts trying to protect the most critical nodes from a viral infection. Layouts of metro-transit networks that clearly show terminals routing the bulk of traffic convey a better picture about the most critical nodes in the event of a terrorist attack.

constraint, data mining, machine learning, (19 more...)

1401.4408

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Genre: Research Report (0.64)

Industry:

Law Enforcement & Public Safety > Terrorism (0.54)
Information Technology (0.48)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.48)

arXiv.org Artificial IntelligenceJan-17-2014

Learning AMP Chain Graphs and some Marginal Models Thereof under Faithfulness: Extended Version

Peña, Jose M.

This paper deals with chain graphs under the Andersson-Madigan-Perlman (AMP) interpretation. In particular, we present a constraint based algorithm for learning an AMP chain graph a given probability distribution is faithful to. Moreover, we show that the extension of Meek's conjecture to AMP chain graphs does not hold, which compromises the development of efficient and correct score+search learning algorithms under assumptions weaker than faithfulness. We also introduce a new family of graphical models that consists of undirected and bidirected edges. We name this new family maximal covariance-concentration graphs (MCCGs) because it includes both covariance and concentration graphs as subfamilies. However, every MCCG can be seen as the result of marginalizing out some nodes in an AMP CG. We describe global, local and pairwise Markov properties for MCCGs and prove their equivalence. We characterize when two MCCGs are Markov equivalent, and show that every Markov equivalence class of MCCGs has a distinguished member. We present a constraint based algorithm for learning a MCCG a given probability distribution is faithful to. Finally, we present a graphical criterion for reading dependencies from a MCCG of a probability distribution that satisfies the graphoid properties, weak transitivity and composition. We prove that the criterion is sound and complete in certain sense.

artificial intelligence, machine learning, node, (17 more...)

arXiv.org Artificial Intelligence

1303.0691

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.66)

Talvitie, Erik, Singh, Satinder

Learning to Make Predictions In Partially Observable Environments Without a Generative Model

When faced with the problem of learning a model of a high-dimensional environment, a common approach is to limit the model to make only a restricted set of predictions, thereby simplifying the learning problem. These partial models may be directly useful for making decisions or may be combined together to form a more complete, structured model. However, in partially observable (non-Markov) environments, standard model-learning methods learn generative models, i.e. models that provide a probability distribution over all possible futures (such as POMDPs). It is not straightforward to restrict such models to make only certain predictions, and doing so does not always simplify the learning problem. In this paper we present prediction profile models: non-generative partial models for partially observable systems that make only a given set of predictions, and are therefore far simpler than generative models in some cases. We formalize the problem of learning a prediction profile model as a transformation of the original model-learning problem, and show empirically that one can learn prediction profile models that make a small set of important predictions even in systems that are too complex for standard generative models.

generative model, history, prediction, (14 more...)

doi: 10.1613/jair.3396

1401.387

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > Pennsylvania > Lancaster County > Lancaster (0.04)

Genre: Research Report (0.69)

Industry: Education > Focused Education > Special Education (0.65)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.90)

Nonparametric Latent Tree Graphical Models: Inference, Estimation, and Structure Learning

Song, Le, Liu, Han, Parikh, Ankur, Xing, Eric

Modern data acquisition routinely produces massive amounts of high dimensional data with complex statistical dependency structures. Latent variable graphical models provide a succinct representation of such complex dependency structures by relating the observed variables to a set of latent ones. By defining a joint distribution over observed and latent variables, the marginal distribution of the observed variables can be obtained by integrating out the latent ones. This allows complex distributions over observed variables (e.g., clique models) to be expressed in terms of more tractable joint models (e.g., tree models) over the augmented variable space. Probabilistic graphical models with latent variables have been deployed successfully to a diverse range of problems such as in document analysis (Blei et al., 2002), social network modeling (Hoff et al., 2002), speech recognition (Rabiner and Juang, 1986) and bioinformatics (Clark, 1990). In this paper, we focus on latent variable models where the latent structures are trees (we call it a "latent tree" for short).

artificial intelligence, hilbert space, machine learning, (17 more...)

1401.394

Country:

North America > United States (0.93)
Europe > United Kingdom > England (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Choi, David, Wolfe, Patrick J.

Co-clustering separately exchangeable network data

This article establishes the performance of stochastic blockmodels in addressing the co-clustering problem of partitioning a binary array into subsets, assuming only that the data are generated by a nonparametric process satisfying the condition of separate exchangeability. We provide oracle inequalities with rate of convergence $\mathcal{O}_P(n^{-1/4})$ corresponding to profile likelihood maximization and mean-square error minimization, and show that the blockmodel can be interpreted in this setting as an optimal piecewise-constant approximation to the generative nonparametric model. We also show for large sample sizes that the detection of co-clusters in such data indicates with high probability the existence of co-clusters of equal size and asymptotically equivalent connectivity in the underlying generative process.

artificial intelligence, lemma 5, machine learning, (19 more...)

doi: 10.1214/13-AOS1173

1212.4093

Country: North America > United States > Pennsylvania (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Telecommunications > Networks (0.41)
Information Technology > Networks (0.41)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

György, András, Kocsis, Levente

Efficient Multi-Start Strategies for Local Search Algorithms

Local search algorithms applied to optimization problems often suffer from getting trapped in a local optimum. The common solution for this deficiency is to restart the algorithm when no progress is observed. Alternatively, one can start multiple instances of a local search algorithm, and allocate computational resources (in particular, processing time) to the instances depending on their behavior. Hence, a multi-start strategy has to decide (dynamically) when to allocate additional resources to a particular instance and when to start new instances. In this paper we propose multi-start strategies motivated by works on multi-armed bandit problems and Lipschitz optimization with an unknown constant. The strategies continuously estimate the potential performance of each algorithm instance by supposing a convergence rate of the local search algorithm up to an unknown constant, and in every phase allocate resources to those instances that could converge to the optimum for a particular range of the constant. Asymptotic bounds are given on the performance of the strategies. In particular, we prove that at most a quadratic increase in the number of times the target function is evaluated is needed to achieve the performance of a local search algorithm started from the attraction region of the optimum. Experiments are provided using SPSA (Simultaneous Perturbation Stochastic Approximation) and k-means as local search algorithms, and the results indicate that the proposed strategies work well in practice, and, in all cases studied, need only logarithmically more evaluations of the target function as opposed to the theoretically suggested quadratic increase.

algorithm, artificial intelligence, local search algorithm, (12 more...)

doi: 10.1613/jair.3313

1401.3894

Country:

Europe > Hungary (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.92)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)