
Time-Varying Networks: Recovering Temporally Rewiring Genetic Networks During the Life Cycle of Drosophila melanogaster

arXiv.org Machine Learning

Due to the dynamic nature of biological systems, biological networks underlying temporal processes such as the development of \textit{Drosophila melanogaster} can exhibit significant topological changes that facilitate dynamic regulatory functions. It is therefore essential to develop methodologies that capture the temporal evolution of networks, making it possible to study the driving forces underlying the dynamic rewiring of gene regulation circuitry and to predict future network structures. Using a new machine learning method called Tesla, which builds on a novel temporal logistic regression technique, we report the first successful genome-wide reverse-engineering of the latent sequence of temporally rewiring gene networks over more than 4000 genes during the life cycle of \textit{Drosophila melanogaster}, given longitudinal gene expression measurements, even when only a single snapshot of such measurements is available from each (time-specific) network. Our methods offer the first glimpse of time-specific snapshots and temporal evolution patterns of gene networks in a living organism over its full developmental course. The networks recovered at this unprecedented resolution chart the onset and duration of many gene interactions that are missed by typical static network analysis, and are suggestive of a wide array of previously unnoticed temporal behaviors of the gene network.
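The charting of onset and duration of gene interactions described above can be sketched once a time-indexed sequence of recovered networks is in hand. The `edge_lifetimes` helper below is a hypothetical illustration (not part of Tesla); it assumes each time-specific network is represented as a set of edge tuples.

```python
def edge_lifetimes(networks):
    """Given a list of edge sets indexed by time point, return for each
    edge its onset time (first time point it appears) and its duration
    (total number of time points in which it is present)."""
    lifetimes = {}
    for t, edges in enumerate(networks):
        for e in edges:
            onset, dur = lifetimes.get(e, (t, 0))
            lifetimes[e] = (onset, dur + 1)
    return lifetimes
```

A static analysis would report only the union of edges; the per-edge (onset, duration) pairs are what reveal transient interactions.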


Information, Divergence and Risk for Binary Experiments

arXiv.org Machine Learning

We unify f-divergences, Bregman divergences, surrogate loss bounds (regret bounds), proper scoring rules, matching losses, cost curves, ROC-curves and information. We do this by systematically studying integral and variational representations of these objects, and in so doing identify their primitives, all of which are related to cost-sensitive binary classification. As well as clarifying relationships between generative and discriminative views of learning, the new machinery leads to tight and more general surrogate loss bounds and generalised Pinsker inequalities relating f-divergences to variational divergence. The new viewpoint illuminates existing algorithms: it provides a new derivation of Support Vector Machines in terms of divergences and relates Maximum Mean Discrepancy to Fisher Linear Discriminants. It also suggests new techniques for estimating f-divergences.
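As a minimal concrete instance of the Pinsker-type inequalities mentioned above, the classical Pinsker inequality bounds the variational divergence $V$ by $\sqrt{2\,\mathrm{KL}}$. The sketch below (function names are illustrative, not from the paper) evaluates both divergences for finite distributions:

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for finite distributions,
    with the convention 0 * log(0/q) = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def variational(p, q):
    """Variational divergence V(p, q) = sum_i |p_i - q_i|
    (twice the total variation distance)."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))
```

With these conventions Pinsker's inequality reads $V(p,q)^2 \le 2\,\mathrm{KL}(p\|q)$, one of the relations the paper generalises.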


On the Geometry of Discrete Exponential Families with Application to Exponential Random Graph Models

arXiv.org Machine Learning

There has been an explosion of interest in statistical models for analyzing network data, and considerable interest in the class of exponential random graph (ERG) models, especially in connection with difficulties in computing maximum likelihood estimates. The issues associated with these difficulties relate to the broader structure of discrete exponential families. This paper re-examines the issues in two parts. First, we consider the closure of $k$-dimensional exponential families of distributions with discrete base measure and polyhedral convex support $\mathrm{P}$. We show that the normal fan of $\mathrm{P}$ is a geometric object that plays a fundamental role in deriving the statistical and geometric properties of the corresponding extended exponential families. We discuss its relevance to maximum likelihood estimation, from both a theoretical and a computational standpoint. Second, we apply our results to the analysis of ERG models. In particular, by means of a detailed example, we characterize some properties of ERG models, including the behavior known as degeneracy.
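The difficulty with maximum likelihood estimation in discrete exponential families is already visible in the simplest case: when the sufficient statistic lies on the boundary of the convex support, the MLE of the natural parameter does not exist. The toy sketch below (a Bernoulli family, not an ERG model) illustrates this boundary phenomenon:

```python
import math

def bernoulli_mle_natural(xs):
    """MLE of the natural parameter (the log-odds) for i.i.d. Bernoulli
    data. Returns None when the sufficient statistic (the sample mean)
    lies on the boundary of the convex support [0, 1], where the MLE
    does not exist in the ordinary family -- the same boundary issue
    that underlies ERG degeneracy in higher dimensions."""
    m = sum(xs) / len(xs)
    if m in (0.0, 1.0):
        return None
    return math.log(m / (1 - m))
```

In the extended family studied in the paper, such boundary cases are handled by passing to limits of distributions along faces of $\mathrm{P}$.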


Identifying Relevant Eigenimages - a Random Matrix Approach

arXiv.org Machine Learning

Dimensional reduction of high-dimensional data can be achieved by keeping only the relevant eigenmodes after principal component analysis. However, differentiating relevant eigenmodes from random noise eigenmodes is problematic. A new method based on random matrix theory and a statistical goodness-of-fit test is proposed in this paper. It is validated by numerical simulations and applied to real-time magnetic resonance cardiac cine images.
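A standard random-matrix baseline for this kind of eigenmode selection is the Marchenko-Pastur law: for pure-noise data, sample covariance eigenvalues concentrate below a computable upper edge. The sketch below uses that edge as a threshold; it is a simplified stand-in, not the paper's goodness-of-fit test, and the function names are illustrative.

```python
import math

def mp_upper_edge(n_samples, n_features, sigma2=1.0):
    """Marchenko-Pastur upper edge sigma^2 * (1 + sqrt(gamma))^2 with
    aspect ratio gamma = n_features / n_samples. Noise eigenvalues of
    the sample covariance fall below this edge asymptotically."""
    gamma = n_features / n_samples
    return sigma2 * (1 + math.sqrt(gamma)) ** 2

def relevant_modes(eigvals, n_samples, n_features, sigma2=1.0):
    """Keep only eigenvalues above the noise edge: candidate 'relevant'
    eigenimages in a PCA of the image stack."""
    edge = mp_upper_edge(n_samples, n_features, sigma2)
    return [lam for lam in eigvals if lam > edge]
```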


Efficient Exact Inference in Planar Ising Models

arXiv.org Machine Learning

We give polynomial-time algorithms for the exact computation of lowest-energy (ground) states, worst margin violators, log partition functions, and marginal edge probabilities in certain binary undirected graphical models. Our approach provides an interesting alternative to the well-known graph cut paradigm in that it does not impose any submodularity constraints; instead we require planarity to establish a correspondence with perfect matchings (dimer coverings) in an expanded dual graph. We implement a unified framework while delegating complex but well-understood subproblems (planar embedding, maximum-weight perfect matching) to established algorithms for which efficient implementations are freely available. Unlike graph cut methods, we can perform penalized maximum-likelihood as well as maximum-margin parameter estimation in the associated conditional random fields (CRFs), and employ marginal posterior probabilities as well as maximum a posteriori (MAP) states for prediction. Maximum-margin CRF parameter estimation on image denoising and segmentation problems shows our approach to be efficient and effective. A C++ implementation is available from http://nic.schraudolph.org/isinf/
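For intuition about what "exact computation of the log partition function" means, the brute-force reference below enumerates all spin configurations of a small binary model. This is exponential-time and serves only as a correctness check; it is emphatically not the polynomial-time planar perfect-matching algorithm the paper describes.

```python
import itertools
import math

def log_partition(n, edges):
    """Brute-force log partition function of a binary (Ising) model
    with spins s_i in {-1, +1} and energy sum_{(i,j)} w_ij * s_i * s_j.
    edges: dict mapping (i, j) pairs to coupling weights w_ij."""
    z = 0.0
    for s in itertools.product((-1, 1), repeat=n):
        e = sum(w * s[i] * s[j] for (i, j), w in edges.items())
        z += math.exp(e)
    return math.log(z)
```

For a single edge with coupling $J$ the four states give $Z = 2e^{J} + 2e^{-J} = 4\cosh J$, which the brute force reproduces; the matching-based method computes the same quantity in polynomial time for planar graphs.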


On the Distribution of the Adaptive LASSO Estimator

arXiv.org Machine Learning

We study the distribution of the adaptive LASSO estimator (Zou (2006)) in finite samples as well as in the large-sample limit. The large-sample distributions are derived both for the case where the adaptive LASSO estimator is tuned to perform conservative model selection and for the case where the tuning results in consistent model selection. We show that the finite-sample as well as the large-sample distributions are typically highly non-normal, regardless of the choice of the tuning parameter. The uniform convergence rate is also obtained, and is shown to be slower than $n^{-1/2}$ when the estimator is tuned to perform consistent model selection. In particular, these results question the statistical relevance of the `oracle' property of the adaptive LASSO estimator established in Zou (2006). Moreover, we also provide an impossibility result regarding the estimation of the distribution function of the adaptive LASSO estimator. The theoretical results, which are obtained for a regression model with orthogonal design, are complemented by a Monte Carlo study using non-orthogonal regressors.
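In the orthogonal design studied here, the adaptive LASSO has a closed form: each OLS coefficient is soft-thresholded with a data-dependent weight $1/|b|^\gamma$ (Zou (2006)). The sketch below implements that closed form; the function name and default $\gamma = 1$ are illustrative.

```python
def adaptive_lasso_orthogonal(ols, lam, gamma=1.0):
    """Adaptive LASSO in an orthogonal design: soft-threshold each OLS
    coefficient b at lam / |b|^gamma, so small coefficients are pushed
    to exactly zero while large ones are barely shrunk."""
    out = []
    for b in ols:
        if b == 0:
            out.append(0.0)
            continue
        w = 1.0 / abs(b) ** gamma
        out.append((1 if b > 0 else -1) * max(0.0, abs(b) - lam * w))
    return out
```

The hard zeroing of small coefficients is exactly the model-selection behavior whose non-uniformity (and highly non-normal distribution) the paper analyzes.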


Classification of Cell Images Using MPEG-7-influenced Descriptors and Support Vector Machines in Cell Morphology

arXiv.org Machine Learning

Counting and classifying blood cells is an important diagnostic tool in medicine. Support Vector Machines (SVMs) are increasingly popular and efficient and could replace artificial neural network systems. Here a method to classify blood cells using an SVM is proposed. A set of statistics on images is implemented in C++. The MPEG-7 descriptors Scalable Color Descriptor, Color Structure Descriptor, Color Layout Descriptor and Homogeneous Texture Descriptor are extended in size and combined with textural features corresponding to textural properties perceived visually by humans. These statistics are collected from a set of images of human blood cells. An SVM is implemented and trained to classify the cell images. The cell images come from a CellaVision DM-96 machine, which classifies cells from microscopy images. The output images and classifications of the CellaVision machine are taken as ground truth, a truth that is 90-95% correct. The problem is divided into two parts: the primary problem is to classify the same classes as the CellaVision machine, and the simplified problem is to distinguish between the five most common types of white blood cells. Encouraging results are achieved in both cases -- error rates of 10.8% and 3.1% -- considering that the SVM is misled by the errors in the ground truth. The conclusion is that further investigation of performance is worthwhile.
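The abstract does not specify which SVM implementation is used; as a stand-in sketch of the training step, the Pegasos-style sub-gradient descent below fits a linear SVM (no bias term) on feature vectors, which is one common way to train such a classifier. All names are hypothetical and not from the paper.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style stochastic sub-gradient descent for a linear SVM:
    minimize lam/2 * ||w||^2 + mean hinge loss. X is a list of feature
    lists (e.g. MPEG-7 descriptor vectors), y holds labels in {-1, +1}."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for i in rng.sample(range(len(X)), len(X)):
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            if margin < 1:  # hinge loss active: shrink and step toward y*x
                w = [(1 - eta * lam) * wj + eta * y[i] * xj
                     for wj, xj in zip(w, X[i])]
            else:           # only the regularizer contributes
                w = [(1 - eta * lam) * wj for wj in w]
    return w

def predict(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1
```

A multi-class cell classifier would combine several such binary machines (e.g. one-vs-rest), and in practice a kernelized SVM library would replace this sketch.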


Prediction with Restricted Resources and Finite Automata

arXiv.org Machine Learning

We obtain an index of the complexity of a random sequence by allowing the role of the measure in classical probability theory to be played by a function we call the generating mechanism. Typically, this generating mechanism will be a finite automaton. We generate a set of biased sequences by applying a finite-state automaton with a specified number, $m$, of states to the set of all binary sequences. Thus we can index the complexity of our random sequence by the number of states of the automaton.
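The generating mechanism can be sketched as a finite-state transducer: each input bit produces an output bit and a state change. The representation below (a Mealy-style machine) is an illustrative assumption, since the abstract does not fix a particular encoding of the $m$-state automaton.

```python
def run_automaton(transitions, outputs, start, bits):
    """Apply an m-state finite automaton to a binary sequence.
    transitions[state][bit] gives the next state and outputs[state][bit]
    the emitted bit, so the output sequence is a 'biased' transform of
    the input whose complexity is indexed by the number of states m."""
    state, out = start, []
    for b in bits:
        out.append(outputs[state][b])
        state = transitions[state][b]
    return out
```

For example, a 2-state machine whose state stores the previous input bit emits the input delayed by one step; an optimal predictor for its output need only track the automaton's state.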


Missing Data using Decision Forest and Computational Intelligence

arXiv.org Machine Learning

An autoencoder neural network is implemented to estimate the missing data. A genetic algorithm is implemented for network optimization and for estimating the missing data. Missing data is treated under a Missing At Random mechanism by implementing a maximum likelihood algorithm. The network's performance is determined by calculating the mean square error of its predictions. The network is further optimized by implementing a Decision Forest. The impact of missing data is then investigated, and decision forests are found to improve the results.
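The core estimation step can be sketched as follows: a trained autoencoder reconstructs its input, so a missing entry is estimated by choosing the value that minimizes the reconstruction's mean square error. The paper performs this search with a genetic algorithm; the sketch below substitutes a simple candidate search for clarity, and all names are illustrative.

```python
def impute_missing(record, missing_idx, reconstruct, candidates):
    """Fill record[missing_idx] with the candidate value minimizing the
    mean square error between the record and its autoencoder
    reconstruction. `reconstruct` stands in for the trained network;
    the paper optimizes this objective with a genetic algorithm."""
    def mse(x):
        r = reconstruct(x)
        return sum((a - b) ** 2 for a, b in zip(r, x)) / len(x)
    best = None
    for v in candidates:
        x = list(record)
        x[missing_idx] = v
        err = mse(x)
        if best is None or err < best[0]:
            best = (err, v)
    return best[1]
```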


An information-theoretic derivation of min-cut based clustering

arXiv.org Machine Learning

Min-cut clustering, based on minimizing one of two heuristic cost functions proposed by Shi and Malik, has spawned tremendous research, both analytic and algorithmic, in the graph partitioning and image segmentation communities over the last decade. It is, however, unclear whether these heuristics can be derived from a more general principle that facilitates generalization to new problem settings. Motivated by an existing graph partitioning framework, we derive relationships between optimizing relevance information, as defined in the Information Bottleneck method, and the regularized cut in a K-partitioned graph. For fast-mixing graphs, we show that the cost functions introduced by Shi and Malik can be well approximated as the rate of loss of predictive information about the location of random walkers on the graph. For graphs generated from a stochastic algorithm designed to model community structure, the optimal information-theoretic partition and the optimal min-cut partition are shown to be the same with high probability.
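One of the two Shi-Malik cost functions is the normalized cut, which divides the cut weight by the total edge weight attached to each side. The sketch below evaluates it for a 2-way partition of a weighted graph (a minimal illustration; the edge-dict representation is an assumption):

```python
def ncut(adj, part):
    """Normalized cut of Shi and Malik for a 2-way partition:
    cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V).
    adj: symmetric weights as a dict {(i, j): w} with each undirected
    edge listed once; part: the set of nodes on side A."""
    cut = assoc_a = assoc_b = 0.0
    for (i, j), w in adj.items():
        ia, ja = i in part, j in part
        if ia != ja:
            cut += w
        for endpoint_in_a in (ia, ja):  # degree mass of each endpoint
            if endpoint_in_a:
                assoc_a += w
            else:
                assoc_b += w
    return cut / assoc_a + cut / assoc_b
```

For two triangles joined by a single unit-weight edge, splitting along that edge gives cut 1 and association 7 on each side, i.e. a normalized cut of 2/7, matching the intuition that the bridge is the natural place to cut.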