AITopics

1309.6158

Country: North America > United States > New York (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.88)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

arXiv.org Artificial Intelligence, Sep-24-2013

Generating Explanations for Biomedical Queries

Erdem, Esra, Oztok, Umut

We introduce novel mathematical models and algorithms to generate (shortest or k different) explanations for biomedical queries, using answer set programming. We implement these algorithms and integrate them in BIOQUERY-ASP. We illustrate the usefulness of these methods with some complex biomedical queries related to drug discovery, over the biomedical knowledge resources PHARMGKB, DRUGBANK, BIOGRID, CTD, SIDER, DISEASE ONTOLOGY and ORPHADATA. To appear in Theory and Practice of Logic Programming (TPLP).

artificial intelligence, logic & formal reasoning, vertex, (19 more...)

doi: 10.1017/S1471068413000598

1309.6297

Country: Asia > Middle East > Republic of Türkiye (0.28)

Genre: Research Report (0.63)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

Cai, T. Tony, Zhou, Wen-Xin

A Max-Norm Constrained Minimization Approach to 1-Bit Matrix Completion

Matrix completion, which aims to recover a low-rank matrix from a subset of its entries, has been an active area of research in the last few years. It has a range of successful applications. In some real-life situations, however, the observations are highly quantized, sometimes even to a single bit, and thus the standard matrix completion techniques do not apply. In the Netflix problem, for example, the observations are movie ratings, which are quantized to the integers from 1 to 5. In more extreme cases, such as recommender systems, only a single bit of rating, standing for a "thumbs up" or "thumbs down", is recorded at each occurrence. Another application is targeted advertising, such as rating the relevance of advertisements on Hulu.

artificial intelligence, machine learning, optimization problem, (18 more...)

1309.6013

Country: North America > United States > Pennsylvania (0.28)

Genre: Research Report (1.00)

Industry:

Media > Television (0.68)
Media > Film (0.54)
Information Technology > Services (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Borgström, Johannes, Gordon, Andrew D, Greenberg, Michael, Margetson, James, Van Gael, Jurgen

Measure Transformer Semantics for Bayesian Machine Learning

arXiv.org Artificial Intelligence, Sep-23-2013

The Bayesian approach to machine learning amounts to computing posterior distributions of random variables from a probabilistic model of how the variables are related (that is, a prior distribution) and a set of observations of variables. There is a trend in machine learning towards expressing Bayesian models as probabilistic programs. As a foundation for this kind of programming, we propose a core functional calculus with primitives for sampling prior distributions and observing variables. We define measure-transformer combinators inspired by theorems in measure theory, and use these to give a rigorous semantics to our core calculus. The original features of our semantics include its support for discrete, continuous, and hybrid measures, and, in particular, for observations of zero-probability events. We compile our core language to a small imperative language that is processed by an existing inference engine for factor graphs, which are data structures that enable many efficient inference algorithms. This allows efficient approximate inference of posterior marginal distributions, treating thousands of observations per second for large instances of realistic models.
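The two primitives named in the abstract, sampling from a prior and observing a variable, can be illustrated on a finite model by exact enumeration. This is only a minimal sketch under stated assumptions: the function `posterior` and the two-coin model are hypothetical illustrations, not the paper's calculus or its measure-transformer semantics, and the sketch rejects observations of zero-probability events, which the paper's semantics is designed to support.

```python
from fractions import Fraction
from itertools import product


def posterior(prior, observe):
    """Condition a finite prior {outcome: probability} on a boolean observation."""
    # Keep only the outcomes consistent with the observation.
    consistent = {o: p for o, p in prior.items() if observe(o)}
    evidence = sum(consistent.values())
    if evidence == 0:
        # Zero-probability observations need the measure-theoretic treatment
        # described in the abstract; this toy enumeration cannot handle them.
        raise ValueError("observation has probability zero under the prior")
    # Renormalize so the remaining probabilities sum to one.
    return {o: p / evidence for o, p in consistent.items()}


# Prior: two independent fair coins (hypothetical example model).
prior = {(a, b): Fraction(1, 4) for a, b in product([True, False], repeat=2)}

# Observation: at least one coin came up heads.
post = posterior(prior, lambda outcome: outcome[0] or outcome[1])
both_heads = post[(True, True)]
```

Exact enumeration only scales to tiny discrete models; the abstract's point is that compiling to factor graphs makes approximate inference practical for large ones.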

artificial intelligence, machine learning, programming language, (15 more...)

doi: 10.2168/LMCS-9(3:11)2013

1308.0689

Country:

Europe (0.92)
North America > United States (0.92)

Genre: Research Report (0.82)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Fercoq, Olivier, Richtárik, Peter

Smooth minimization of nonsmooth functions with parallel coordinate descent methods

We study the performance of a family of randomized parallel coordinate descent methods for minimizing the sum of nonsmooth and separable convex functions. The problem class includes as a special case L1-regularized L1 regression and the minimization of the exponential loss ("AdaBoost problem"). We assume the input data defining the loss function is contained in a sparse $m\times n$ matrix $A$ with at most $\omega$ nonzeros in each row. Our methods need $O(n \beta/\tau)$ iterations to find an approximate solution with high probability, where $\tau$ is the number of processors and $\beta = 1 + (\omega-1)(\tau-1)/(n-1)$ for the fastest variant. The notation hides dependence on quantities such as the required accuracy and confidence levels and the distance of the starting iterate from an optimal point. Since $\beta/\tau$ is a decreasing function of $\tau$, the method needs fewer iterations when more processors are used. Certain variants of our algorithms perform on average only $O(\operatorname{nnz}(A)/n)$ arithmetic operations during a single iteration per processor and, because $\beta$ decreases when $\omega$ does, fewer iterations are needed for sparser problems.
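The iteration bound above depends on the processor count only through the factor $\beta/\tau$. The following sketch simply evaluates the abstract's formula to show how that factor shrinks as $\tau$ grows; the function names and the sample values of $n$ and $\omega$ are illustrative assumptions, and the constants hidden by the $O(\cdot)$ notation are ignored.

```python
def beta(omega: int, tau: int, n: int) -> float:
    """beta = 1 + (omega - 1)(tau - 1)/(n - 1), the formula quoted in the abstract."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return 1.0 + (omega - 1) * (tau - 1) / (n - 1)


def iteration_factor(omega: int, tau: int, n: int) -> float:
    """The factor beta / tau that multiplies n in the O(n * beta / tau) bound."""
    return beta(omega, tau, n) / tau


# Hypothetical problem size: n = 1000 coordinates, at most omega = 10 nonzeros per row.
n, omega = 1000, 10
factors = [iteration_factor(omega, tau, n) for tau in (1, 2, 4, 8, 16)]

# Fully dense rows (omega = n) give beta = tau, so extra processors do not
# reduce the bound at all.
dense_factor = iteration_factor(n, 8, n)
```

Writing $c = (\omega-1)/(n-1)$ gives $\beta/\tau = c + (1-c)/\tau$, which makes both effects visible: the factor decreases in $\tau$ whenever $c < 1$, and it decreases faster for sparser data (smaller $c$).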

artificial intelligence, descent method, machine learning, (19 more...)

1309.5885

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Scalable Spectral Algorithms for Community Detection in Directed Networks

Kim, Sungmin, Shi, Tao

Many real-world problems can be effectively modeled as pairwise relationships in networks where nodes represent entities of interest and links mimic the interactions or relationships between them. The study of networks, recently referred to as network science, can provide insight into their structures and properties. One particularly interesting problem in network studies is searching for important sub-networks, which are called communities, modules or groups. A community in a network is typically characterized by a group of nodes that have more links connected within the community than connected to other nodes (Fortunato, 2010). In many practical applications, the networks under study are directed in nature, such as the World Wide Web, Twitter's follower-followee network, and citation networks. Compared with in-depth studies of community structures in undirected networks (Danon et al., 2005; Fortunato, 2010; Coscia, Giannotti and Pedreschi, 2011), community detection in directed networks has not been as fruitful.

artificial intelligence, data mining, machine learning, (18 more...)

1211.6807

Country: North America > United States (0.46)

Genre: Research Report (0.81)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Mousavi, Ali, Maleki, Arian, Baraniuk, Richard G.

Asymptotic Analysis of LASSO's Solution Path with Implications for Approximate Message Passing

This paper concerns the performance of the LASSO (also known as basis pursuit denoising) for recovering sparse signals from undersampled, randomized, noisy measurements. We consider the recovery of the signal $x_o \in \mathbb{R}^N$ from $n$ random and noisy linear observations $y= Ax_o + w$, where $A$ is the measurement matrix and $w$ is the noise. The LASSO estimate of $x_o$ is given by the solution to the optimization problem $\hat{x}_{\lambda} = \arg \min_x \frac{1}{2} \|y-Ax\|_2^2 + \lambda \|x\|_1$. Despite major progress in the theoretical analysis of the LASSO solution, little is known about its behavior as a function of the regularization parameter $\lambda$. In this paper we study two questions in the asymptotic setting (i.e., where $N \rightarrow \infty$, $n \rightarrow \infty$ while the ratio $n/N$ converges to a fixed number in $(0,1)$): (i) How does the size of the active set $\|\hat{x}_\lambda\|_0/N$ behave as a function of $\lambda$, and (ii) How does the mean square error $\|\hat{x}_{\lambda} - x_o\|_2^2/N$ behave as a function of $\lambda$? We then employ these results in a new, reliable algorithm for solving LASSO based on approximate message passing (AMP).
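To make the objective and question (i) concrete, the sketch below minimizes $\frac{1}{2}\|y-Ax\|_2^2 + \lambda\|x\|_1$ with iterative soft thresholding (ISTA) and reports the active-set fraction $\|\hat{x}_\lambda\|_0/N$ for two values of $\lambda$. ISTA is a standard proximal-gradient solver swapped in here for illustration; it is not the AMP-based algorithm the abstract proposes. The function names, the identity measurement matrix, and the data are assumptions chosen so the result can be checked by hand.

```python
def soft_threshold(value, threshold):
    """Proximal operator of threshold * |.|: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ista(A, y, lam, step, iters=200):
    """Minimize 0.5 * ||y - A x||_2^2 + lam * ||x||_1 by proximal gradient steps.

    A is a list of rows. The step should not exceed 1 / ||A||_2^2; this is
    assumed by the caller and not checked here.
    """
    n_rows, n_cols = len(A), len(A[0])
    x = [0.0] * n_cols
    for _ in range(iters):
        # Gradient of the quadratic term: A^T (A x - y).
        residual = [sum(A[i][j] * x[j] for j in range(n_cols)) - y[i]
                    for i in range(n_rows)]
        grad = [sum(A[i][j] * residual[i] for i in range(n_rows))
                for j in range(n_cols)]
        # Gradient step followed by soft thresholding with level step * lam.
        x = [soft_threshold(x[j] - step * grad[j], step * lam)
             for j in range(n_cols)]
    return x


def active_fraction(x):
    """||x||_0 / N, the normalized size of the active set."""
    return sum(1 for v in x if v != 0.0) / len(x)


# Toy instance with an identity measurement matrix (hypothetical data).
A = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.5]
x_small = ista(A, y, lam=1.0, step=1.0)   # moderate regularization
x_large = ista(A, y, lam=4.0, step=1.0)   # heavy regularization
```

With an identity matrix the LASSO solution is just the soft-thresholded observation, so increasing $\lambda$ visibly empties the active set in this toy case; the paper studies how this dependence behaves for random $A$ in the asymptotic regime.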

artificial intelligence, lasso, optimization problem, (16 more...)

1309.5979

Genre: Research Report (0.64)

Technology:

Information Technology > Architecture > Distributed Systems (0.61)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Narayanan, Hariharan, Rakhlin, Alexander

Efficient Sampling from Time-Varying Log-Concave Distributions

We propose a computationally efficient random walk on a convex body which rapidly mixes and closely tracks a time-varying log-concave distribution. We develop general theoretical guarantees on the required number of steps; this number can be calculated on the fly according to the distance from and the shape of the next distribution. We then illustrate the technique on several examples. Within the context of exponential families, the proposed method produces samples from a posterior distribution which is updated as data arrive in a streaming fashion. The sampling technique can be used to track time-varying truncated distributions, as well as to obtain samples from a changing mixture model, fitted in a streaming fashion to data. In the setting of linear optimization, the proposed method has oracle complexity with best known dependence on the dimension for certain geometries. In the context of online learning and repeated games, the algorithm is an efficient method for implementing no-regret mixture forecasting strategies. Remarkably, in some of these examples, only one step of the random walk is needed to track the next distribution.

artificial intelligence, machine learning, markov chain, (19 more...)

1309.5977

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Ghadimi, Saeed, Lan, Guanghui

Stochastic First- and Zeroth-order Methods for Nonconvex Stochastic Programming

arXiv.org Machine Learning, Sep-21-2013

In this paper, we introduce a new stochastic approximation (SA) type algorithm, namely the randomized stochastic gradient (RSG) method, for solving an important class of nonlinear (possibly nonconvex) stochastic programming (SP) problems. We establish the complexity of this method for computing an approximate stationary point of a nonlinear programming problem. We also show that this method possesses a nearly optimal rate of convergence if the problem is convex. We discuss a variant of the algorithm which consists of applying a post-optimization phase to evaluate a short list of solutions generated by several independent runs of the RSG method, and show that this modification significantly improves the large-deviation properties of the algorithm. These methods are then specialized for solving a class of simulation-based optimization problems in which only stochastic zeroth-order information is available.
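The idea of a randomized stochastic gradient method with a post-optimization phase can be sketched as follows. Several details are assumptions not given in the abstract: a constant step size, a stopping index drawn uniformly at random, and a selection rule that keeps the candidate with the smallest estimated gradient norm. The function names and the toy quadratic problem are hypothetical.

```python
import random


def rsg(stochastic_grad, x0, step, max_iters, rng):
    """Run stochastic gradient steps for a random number of iterations.

    The number of steps R is drawn uniformly from {1, ..., max_iters}
    (an assumption), and the iterate after R steps is returned.
    """
    num_steps = rng.randint(1, max_iters)  # both endpoints inclusive
    x = list(x0)
    for _ in range(num_steps):
        g = stochastic_grad(x, rng)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def two_phase_rsg(stochastic_grad, x0, step, max_iters, runs, samples, rng):
    """Post-optimization phase over a short list of independent RSG runs."""
    def estimated_grad_norm_sq(x):
        # Average several stochastic gradients to estimate the true gradient.
        draws = [stochastic_grad(x, rng) for _ in range(samples)]
        mean = [sum(col) / samples for col in zip(*draws)]
        return sum(m * m for m in mean)

    candidates = [rsg(stochastic_grad, x0, step, max_iters, rng)
                  for _ in range(runs)]
    # Keep the candidate that looks closest to a stationary point.
    return min(candidates, key=estimated_grad_norm_sq)


def noisy_quadratic_grad(x, rng):
    """Gradient of 0.5 * ||x||^2 plus Gaussian noise (toy problem)."""
    return [xi + rng.gauss(0.0, 0.1) for xi in x]


rng = random.Random(0)
solution = two_phase_rsg(noisy_quadratic_grad, [5.0, -5.0], step=0.1,
                         max_iters=200, runs=5, samples=50, rng=rng)
```

A single run may stop at an unlucky early iterate; evaluating a short list of independent runs reduces the chance of returning such a point, which is the intuition behind the improved large-deviation behavior mentioned in the abstract.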

artificial intelligence, machine learning, optimization problem, (17 more...)

1309.5549

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)

Ramamurthy, Karthikeyan Natesan, Thiagarajan, Jayaraman J., Spanias, Andreas

Recovering Non-negative and Combined Sparse Representations

arXiv.org Machine Learning, Sep-20-2013

The non-negative solution to an underdetermined linear system can sometimes be uniquely recovered, even without imposing any additional sparsity constraints. In this paper, we derive conditions under which a unique non-negative solution for such a system can exist, based on the theory of polytopes. Furthermore, we develop the paradigm of combined sparse representations, where only a part of the coefficient vector is constrained to be non-negative, and the rest is unconstrained (general). We analyze the recovery of the unique, sparsest solution, for combined representations, under three different cases of coefficient support knowledge: (a) the non-zero supports of non-negative and general coefficients are known, (b) the non-zero support of general coefficients alone is known, and (c) both the non-zero supports are unknown. For case (c), we propose the combined orthogonal matching pursuit algorithm for coefficient recovery and derive the deterministic sparsity threshold under which recovery of the unique, sparsest coefficient vector is possible. We quantify the order complexity of the algorithms, and examine their performance in exact and approximate recovery of coefficients under various conditions of noise. We also obtain their empirical phase transition characteristics. We show that the basis pursuit algorithm, with partial non-negative constraints, and the proposed greedy algorithm perform better in recovering the unique sparse representation when compared to their unconstrained counterparts. Finally, we demonstrate the utility of the proposed methods in recovering images corrupted by saturation noise.

algorithm, artificial intelligence, machine learning, (19 more...)

1303.4694

Country:

North America > United States > Texas (0.14)
North America > United States > Arizona (0.14)
Europe (0.14)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)