AITopics

doi: 10.1016/j.eswa.2010.07.143

1104.3904

Country:

Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)
North America > United States > Maryland (0.04)

Genre: Research Report > New Finding (0.87)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Law Enforcement & Public Safety > Fraud (1.00)
Banking & Finance > Insurance (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

AAAI Conferences, Mar-19-2011

Automatic Seizure Detection in an In-Vivo Model of Epilepsy

Saulnier, Guillaume (McGill University) | Pineau, Joelle (McGill University)

The goal of our research is to find patterns of EEG activity that will allow us to correctly identify seizures in living rats using machine learning techniques. Features are extracted from the EEG to characterize the signal over time. We perform model selection to reduce the set of features, as the goal is to have the algorithm running on a small personal device. The chosen features are used within a supervised classifier, based on randomized forests, in order to separate the different brain states. One of the challenges of this research is to detect all seizures, while preserving a low false positive rate, and low detection latency. We present results showing we can achieve this using data from three separate animals. The long-term goal of this research is to use this seizure detection method as part of a closed-loop adaptive neuro-stimulation device to reduce the incidence and duration of seizures.
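The feature-extraction step described above can be illustrated with a small sketch. The abstract does not say which features are used, so the two computed here (mean energy and line length per window) are assumptions chosen only for illustration, as are the function name `window_features` and its parameters. The resulting feature vectors would then be passed to a supervised classifier such as a random forest, which is not shown.

```python
from typing import List, Sequence, Tuple


def window_features(signal: Sequence[float], width: int,
                    step: int) -> List[Tuple[float, float]]:
    """Return (mean energy, line length) for each sliding window.

    Both features are illustrative assumptions, not the paper's feature set.
    """
    features = []
    for start in range(0, len(signal) - width + 1, step):
        window = signal[start:start + width]
        # Mean squared amplitude of the samples in the window.
        energy = sum(v * v for v in window) / width
        # Sum of absolute differences between consecutive samples.
        line_length = sum(abs(b - a) for a, b in zip(window, window[1:]))
        features.append((energy, line_length))
    return features


# Toy signal: a quiet segment followed by a higher-amplitude segment.
eeg = [0.0, 1.0, 0.0, -1.0, 0.0, 2.0, -2.0, 2.0]
feats = window_features(eeg, width=4, step=4)
```

On this toy signal the second window has both higher energy and greater line length than the first, which is the kind of contrast a classifier could use to separate brain states.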

classifier, seizure, seizure detection, (12 more...)

AAAI Conferences

2011 AAAI Spring Symposium Series

Country: North America > Canada > Quebec > Montreal (0.14)

Industry: Health & Medicine > Therapeutic Area > Neurology > Epilepsy (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

arXiv.org Machine Learning, Mar-2-2011

Stochastic Stepwise Ensembles for Variable Selection

Xin, Lu, Zhu, Mu

The ensemble approach for statistical modelling was first made popular by such algorithms as boosting (Freund and Schapire 1996; Friedman et al. 2000), bagging (Breiman 1996), random forest (Breiman 2001), and the gradient boosting machine (Friedman 2001). They are powerful algorithms for solving prediction problems. This article is concerned with using the ensemble approach for a different problem: variable selection. We shall use the terms "prediction ensemble" and "variable-selection ensemble" to differentiate ensembles used for these different purposes.

artificial intelligence, machine learning, selection, (18 more...)

1003.593

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

arXiv.org Artificial Intelligence, Feb-18-2011

Inferring Disease and Gene Set Associations with Rank Coherence in Networks

Hwang, TaeHyun, Zhang, Wei, Xie, Maoqiang, Kuang, Rui

A computational challenge to validate the candidate disease genes identified in a high-throughput genomic study is to elucidate the associations between the set of candidate genes and disease phenotypes. The conventional gene set enrichment analysis often fails to reveal associations between disease phenotypes and the gene sets with a short list of poorly annotated genes, because the existing annotations of disease causative genes are incomplete. We propose a network-based computational approach called rcNet to discover the associations between gene sets and disease phenotypes. Assuming coherent associations between the genes ranked by their relevance to the query gene set, and the disease phenotypes ranked by their relevance to the hidden target disease phenotypes of the query gene set, we formulate a learning framework maximizing the rank coherence with respect to the known disease phenotype-gene associations. An efficient algorithm coupling ridge regression with label propagation and two variants of it are introduced to find the optimal solution of the framework. We evaluated the rcNet algorithms and existing baseline methods with both leave-one-out cross-validation and a task of predicting recently discovered disease-gene associations in OMIM. The experiments demonstrated that the rcNet algorithms achieved the best overall rankings compared to the baselines. To further validate the reproducibility of the performance, we applied the algorithms to identify the target diseases of novel candidate disease genes obtained from recent studies of GWAS, DNA copy number variation analysis, and gene expression profiling. The algorithms ranked the target disease of the candidate genes at the top of the rank list in many cases across all three case studies. The rcNet algorithms are available as a webtool for disease and gene set association analysis at http://compbio.cs.umn.edu/dgsa_rcNet.

artificial intelligence, machine learning, phenotype, (15 more...)

1102.3919

Country:

North America > United States > Minnesota (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Genetic Disease (0.93)
Health & Medicine > Therapeutic Area > Gastroenterology (0.68)
Health & Medicine > Therapeutic Area > Oncology > Leukemia (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

arXiv.org Artificial Intelligence, Feb-16-2011

Differentially Private Empirical Risk Minimization

Chaudhuri, Kamalika, Monteleoni, Claire, Sarwate, Anand D.

Privacy-preserving machine learning algorithms are crucial for the increasingly common setting in which personal data, such as medical or financial records, are analyzed. We provide general techniques to produce privacy-preserving approximations of classifiers learned via (regularized) empirical risk minimization (ERM). These algorithms are private under the $\epsilon$-differential privacy definition due to Dwork et al. (2006). First, we apply the output perturbation ideas of Dwork et al. (2006) to ERM classification. Then we propose a new method, objective perturbation, for privacy-preserving machine learning algorithm design. This method entails perturbing the objective function before optimizing over classifiers. If the loss and regularizer satisfy certain convexity and differentiability criteria, we prove theoretical results showing that our algorithms preserve privacy, and provide generalization bounds for linear and nonlinear kernels. We further present a privacy-preserving technique for tuning the parameters in general machine learning algorithms, thereby providing end-to-end privacy guarantees for the training process. We apply these results to produce privacy-preserving analogues of regularized logistic regression and support vector machines. We obtain encouraging results from evaluating their performance on real demographic and benchmark data sets. Our results show that, both theoretically and empirically, objective perturbation is superior to the previous state-of-the-art, output perturbation, in managing the inherent tradeoff between privacy and learning performance.
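For readers unfamiliar with the definition, the privacy guarantee and the two perturbation strategies named above can be written schematically as follows. The notation ($\mathcal{A}$ for the randomized algorithm, $D$ and $D'$ for data sets, $J$ for the regularized ERM objective, $b$ for a random noise vector, $n$ for the sample size) is assumed here for illustration. The noise distributions and scalings are deliberately left unspecified, and the linear form of the noise term in the last line is an assumption about one common way to perturb an objective, not a statement of the paper's exact algorithm.

```latex
% \epsilon-differential privacy: for all data sets D, D' that differ in a
% single entry and all measurable sets S of outputs,
\Pr[\mathcal{A}(D) \in S] \;\le\; e^{\epsilon}\, \Pr[\mathcal{A}(D') \in S].

% Output perturbation (schematic): solve ERM first, then add noise b.
f_{\mathrm{out}} = \Big( \arg\min_{f} J(f, D) \Big) + b

% Objective perturbation (schematic, assumed linear noise term):
% add noise to the objective, then optimize.
f_{\mathrm{obj}} = \arg\min_{f} \Big[ J(f, D) + \tfrac{1}{n}\, b^{\top} f \Big]
```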

algorithm, artificial intelligence, machine learning, (18 more...)

0912.0071

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
South America > Paraguay > Asunción > Asunción (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.56)

arXiv.org Artificial Intelligence, Feb-14-2011

Feature Selection via Sparse Approximation for Face Recognition

Liang, Yixiong, Wang, Lei, Xiang, Yao, Zou, Beiji

Inspired by biological vision systems, over-complete local features with huge cardinality have been increasingly used for face recognition over the last decades. Accordingly, feature selection has become increasingly important and plays a critical role in face data description and recognition. In this paper, we propose a trainable feature selection algorithm based on the regularized frame for face recognition. By enforcing a sparsity penalty term on the minimum squared error (MSE) criterion, we cast the feature selection problem into a combinatorial sparse approximation problem, which can be solved by greedy methods or convex relaxation methods. Moreover, based on the same frame, we propose a sparse Ho-Kashyap (HK) procedure to simultaneously obtain the optimal sparse solution and the corresponding margin vector of the MSE criterion. The proposed methods are used for selecting the most informative Gabor features of face images for recognition, and the experimental results on benchmark face databases demonstrate their effectiveness.
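One generic way to write the sparsity-penalized MSE criterion and its convex relaxation is sketched below. The symbols ($X$ for the feature matrix, $y$ for the target vector, $w$ for the weight vector, $\lambda$ for the penalty weight) are assumed notation, and the $\ell_1$ norm is shown as the usual convex surrogate for the $\ell_0$ count; the paper's exact formulation may differ.

```latex
% Combinatorial problem: penalize the number of nonzero weights.
\min_{w} \; \lVert y - X w \rVert_2^2 + \lambda \lVert w \rVert_0

% Convex relaxation: replace the \ell_0 count with the \ell_1 norm.
\min_{w} \; \lVert y - X w \rVert_2^2 + \lambda \lVert w \rVert_1
```

Features whose weight is exactly zero in the solution are the ones discarded by the selection.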

artificial intelligence, machine learning, recognition, (15 more...)

1102.2748

Country:

Asia > China (0.05)
North America > United States > Texas (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.35)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

arXiv.org Machine Learning, Jan-31-2011

Dependency detection with similarity constraints

Lahti, Leo, Myllykangas, Samuel, Knuutila, Sakari, Kaski, Samuel

Unsupervised two-view learning, or detection of dependencies between two paired data sets, is typically done by some variant of canonical correlation analysis (CCA). CCA searches for a linear projection for each view, such that the correlations between the projections are maximized. The solution is invariant to any linear transformation of either or both of the views; for tasks with small sample size such flexibility implies overfitting, which is even worse for more flexible nonparametric or kernel-based dependency discovery methods. We develop variants which reduce the degrees of freedom by assuming constraints on similarity of the projections in the two views. A particular example is provided by a cancer gene discovery application where chromosomal distance affects the dependencies between gene copy number and activity levels. Similarity constraints are shown to improve detection performance of known cancer genes.
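The standard CCA objective referred to above can be stated as follows. The notation is assumed for illustration: $\Sigma_{xx}$ and $\Sigma_{yy}$ are the covariance matrices of the two views, $\Sigma_{xy}$ is their cross-covariance, and $w_x$, $w_y$ are the projection vectors.

```latex
(w_x^{*}, w_y^{*}) = \arg\max_{w_x, w_y}
  \frac{w_x^{\top} \Sigma_{xy} w_y}
       {\sqrt{w_x^{\top} \Sigma_{xx} w_x}\; \sqrt{w_y^{\top} \Sigma_{yy} w_y}}
```

The invariance mentioned in the abstract can be read off this form: transforming one view by an invertible matrix $A$ replaces $\Sigma_{xx}$ with $A \Sigma_{xx} A^{\top}$ and $\Sigma_{xy}$ with $A \Sigma_{xy}$, and the substitution $w_x \mapsto A^{-\top} w_x$ recovers the same maximal correlation.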

artificial intelligence, correlation, machine learning, (16 more...)

doi: 10.1109/MLSP.2009.5306192

1101.5919

Country: North America > United States (0.94)

Genre: Research Report > Experimental Study (0.35)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

arXiv.org Machine Learning, Jan-18-2011

Nonparametric Independence Screening in Sparse Ultra-High Dimensional Additive Models

Fan, Jianqing, Feng, Yang, Song, Rui

A variable screening procedure via correlation learning was proposed by Fan and Lv (2008) to reduce dimensionality in sparse ultra-high dimensional models. Even when the true model is linear, the marginal regression can be highly nonlinear. To address this issue, we further extend the correlation learning to marginal nonparametric learning. Our nonparametric independence screening is called NIS, a specific member of the sure independence screening. Several closely related variable screening procedures are proposed. For nonparametric additive models, it is shown that, under some mild technical conditions, the proposed independence screening methods enjoy a sure screening property. The extent to which the dimensionality can be reduced by independence screening is also explicitly quantified. As a methodological extension, an iterative nonparametric independence screening (INIS) is also proposed to enhance the finite sample performance for fitting sparse additive models. The simulation results and a real data analysis demonstrate that the proposed procedure works well with moderate sample size and large dimension and performs better than competing methods.
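To make the starting point concrete, here is a minimal sketch of marginal correlation screening, the linear procedure the abstract attributes to Fan and Lv (2008). It is not NIS itself, which replaces the marginal correlation with a marginal nonparametric fit. The function names, the toy data, and the choice to keep a fixed number of variables are assumptions for illustration.

```python
import math
from typing import List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Sample Pearson correlation; returns 0.0 if either input is constant."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def screen_by_correlation(columns: List[Sequence[float]],
                          y: Sequence[float], keep: int) -> List[int]:
    """Return sorted indices of the `keep` columns with largest |corr| to y."""
    scores = [abs(pearson(col, y)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


# Toy data (hypothetical): three candidate predictors and one response.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
columns = [
    [2.0, 4.0, 6.0, 8.0, 10.0],   # perfectly correlated with y
    [1.0, -1.0, 1.0, -1.0, 1.0],  # alternating pattern, uncorrelated with y
    [5.0, 4.0, 3.0, 2.5, 1.0],    # strongly negatively correlated with y
]
selected = screen_by_correlation(columns, y, keep=2)
```

In this toy example the screening keeps columns 0 and 2 and drops the alternating column. Ranking by absolute correlation means a strong negative association counts as much as a positive one.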

artificial intelligence, machine learning, modeling & simulation, (20 more...)

0912.2695

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)
Information Technology > Data Science (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.69)
Information Technology > Modeling & Simulation (0.66)

arXiv.org Artificial Intelligence, Jan-12-2011

Extracting Features from Ratings: The Role of Factor Models

Selke, Joachim, Balke, Wolf-Tilo

Performing effective preference-based data retrieval requires detailed and preferentially meaningful structured information about the current user as well as the items under consideration. A common problem is that representations of items often consist only of technical attributes, which do not resemble human perception. This is particularly true for integral items such as movies or songs. It is often claimed that meaningful item features could be extracted from collaborative rating data, which is becoming available through social networking services. However, there is only anecdotal evidence supporting this claim. If it is true, the extracted information could be very valuable for preference-based data retrieval. In this paper, we propose a methodology to systematically check this common claim. We performed a preliminary investigation on a large collection of movie ratings and present initial evidence.

artificial intelligence, coordinate space, machine learning, (19 more...)

1101.2378

Country:

North America > United States (0.04)
Europe > Germany (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

arXiv.org Machine Learning, Jan-7-2011

Node harvest

Meinshausen, Nicolai

When choosing a suitable technique for regression and classification with multivariate predictor variables, one is often faced with a tradeoff between interpretability and high predictive accuracy. To give a classical example, classification and regression trees are easy to understand and interpret. Tree ensembles like Random Forests usually provide more accurate predictions. Yet tree ensembles are also more difficult to analyze than single trees and are often criticized, perhaps unfairly, as 'black box' predictors. Node harvest tries to reconcile the two aims of interpretability and predictive accuracy by combining positive aspects of trees and tree ensembles. Results are very sparse and interpretable, and predictive accuracy is extremely competitive, especially for low signal-to-noise data. The procedure is simple: an initial set of a few thousand nodes is generated randomly. If a new observation falls into just a single node, its prediction is the mean response of all training observations within this node, identical to a tree-like prediction. A new observation typically falls into several nodes, and its prediction is then the weighted average of the mean responses across all these nodes. The only role of node harvest is to 'pick' the right nodes from the initial large ensemble of nodes by choosing node weights, which in the proposed algorithm amounts to a quadratic programming problem with linear inequality constraints. The solution is sparse in the sense that only very few nodes are selected with a nonzero weight. This sparsity is not explicitly enforced. Perhaps surprisingly, it is not necessary to select a tuning parameter for optimal predictive accuracy. Node harvest can handle mixed data and missing values and is shown to be simple to interpret and competitive in predictive accuracy on a variety of data sets.
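The prediction rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the author's implementation: each node is represented as a hypothetical (membership test, mean response, weight) triple, and the weights are taken as given, since the quadratic program that selects them is not shown.

```python
from typing import Callable, List, Sequence, Tuple

# A node is (membership test, mean training response in the node, weight).
# This representation is an assumption for illustration; in the method the
# weights would come from the quadratic program, which is not implemented here.
Node = Tuple[Callable[[Sequence[float]], bool], float, float]


def node_harvest_predict(x: Sequence[float], nodes: List[Node]) -> float:
    """Weighted average of node means over the selected nodes containing x."""
    weighted_sum = 0.0
    total_weight = 0.0
    for contains, mean_response, weight in nodes:
        # Only nodes with nonzero weight that contain x contribute.
        if weight > 0 and contains(x):
            weighted_sum += weight * mean_response
            total_weight += weight
    if total_weight == 0:
        raise ValueError("x falls into no node with nonzero weight")
    return weighted_sum / total_weight


# Hypothetical nodes for a two-dimensional input.
nodes = [
    (lambda x: True, 2.0, 1.0),         # contains every observation
    (lambda x: x[0] > 0.5, 4.0, 3.0),   # defined by a single split
    (lambda x: x[1] < 0.0, 10.0, 0.0),  # zero weight, so never used
]

single = node_harvest_predict([0.0, 1.0], nodes)    # falls into one node
several = node_harvest_predict([1.0, 1.0], nodes)   # falls into two nodes
```

An observation that falls into a single selected node receives that node's mean response, as with a tree. One that falls into several receives their weighted average, here (1 × 2 + 3 × 4) / 4 = 3.5.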

artificial intelligence, machine learning, node, (18 more...)

doi: 10.1214/10-AOAS367

0910.2145

Country:

Europe (0.93)
North America > United States > California (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)