Genre
Model Selection by Loss Rank for Classification and Unsupervised Learning
Tran, Minh-Ngoc, Hutter, Marcus
Hutter (2007) recently introduced the loss rank principle (LoRP) as a generalpurpose principle for model selection. The LoRP enjoys many attractive properties and deserves further investigations. The LoRP has been well-studied for regression framework in Hutter and Tran (2010). In this paper, we study the LoRP for classification framework, and develop it further for model selection problems in unsupervised learning where the main interest is to describe the associations between input measurements, like cluster analysis or graphical modelling. Theoretical properties and simulation studies are presented.
The Loss Rank Criterion for Variable Selection in Linear Regression Analysis
Lasso and other regularization procedures are attractive methods for variable selection, subject to a proper choice of shrinkage parameter. Given a set of potential subsets produced by a regularization algorithm, a consistent model selection criterion is proposed to select the best one among this preselected set. The approach leads to a fast and efficient procedure for variable selection, especially in high-dimensional settings. Model selection consistency of the suggested criterion is proven when the number of covariates d is fixed. Simulation studies suggest that the criterion still enjoys model selection consistency when d is much larger than the sample size. The simulations also show that our approach for variable selection works surprisingly well in comparison with existing competitors. The method is also applied to a real data set.
Probabilistic Inferences in Bayesian Networks
Bayesian network is a complete model for the variables and their relationships, it can be used to answer probabilistic queries about them. A Bayesian network can thus be considered a mechanism for automatically applying Bayes' theorem to complex problems. In the application of Bayesian networks, most of the work is related to probabilistic inferences. Any variable updating in any node of Bayesian networks might result in the evidence propagation across the Bayesian networks. This paper sums up various inference techniques in Bayesian networks and provide guidance for the algorithm calculation in probabilistic inference in Bayesian networks.
The Lasso under Heteroscedasticity
Jia, Jinzhu, Rohe, Karl, Yu, Bin
Preprint 1 The Lasso under Heteroscedasticity Jinzhu Jia 1, Karl Rohe 1 and Bin Yu 1, 2 Department of Statistics 1 and Department of EECS 2 University of California, Berkeley Abstract: The performance of the Lasso is well understood under the assumptions of the standard linear model with homoscedastic noise. However, in several applications, the standard model does not describe the important features of the data. This paper examines how the Lasso performs on a nonstandard model that is motivated by medical imaging applications. Like all heteroscedas-tic models, the noise terms in this Poisson-like model are not independent of the design matrix. More specifically, this paper studies the sign consistency of the Lasso under a sparse Poisson-like model. In addition to studying sufficient conditions for the sign consistency of the Lasso estimate, this paper also gives necessary conditions for sign consistency. Both sets of conditions are comparable to results for the homoscedastic model, showing that when a measure of the signal to noise ratio is large, the Lasso performs well on both Poisson-like data and homoscedastic data. Simulations reveal that the Lasso performs equally well in terms of model selection performance on both Poisson-like data and homoscedastic data (with properly scaled noise variance), across a range of parameterizations. Taken as a whole, these results suggest that the Lasso is robust to the Poisson-like heteroscedastic noise. Key words and phrases: Lasso, Poisson-like Model, Sign Consistency, Heteroscedas-ticity 1 Introduction The Lasso (Tibshirani, 1996) is widely used in high dimensional regression for variable selection. Its model selection performance has been well studied under a standard sparse and homoskedastic regression model. Several researchers have shown that under sparsity and regularity conditions, the Lasso can select the true model asymptotically even whenp n (Donoho et al., 2006; Meinshausen arXiv:1011.1026v1 To define the Lasso estimate, suppose the observed data are independent pairs { (x i,Y i)} R p R for i 1, 2,...,n following the linear regression model Y i x T i β i, (1) where x T i is a row vector representing the predictors for thei th observation,Y i is the correspondingi th response variable, i's are independent and mean zero noise terms, andβ R p . Let Y (Y 1,...,Y n)T and ( 1, 2,..., n)T R n . The Lasso estimate (Tibshirani, 1996) is then defined as the solution to a penalized least squares problem (with regularization parameterλ): ˆ β (λ) arg min β 1 2 n ‖Y X β‖ 2 2 λ‖β‖ 1, (2) where for some vectorx R k,‖ x ‖ r ( k i 1 x i r) 1/r .
Detecting Ontological Conflicts in Protocols between Semantic Web Services
Ghosh, Priyankar, Dasgupta, Pallab
The task of verifying the compatibility between interacting web services has traditionally been limited to checking the compatibility of the interaction protocol in terms of message sequences and the type of data being exchanged. Since web services are developed largely in an uncoordinated way, different services often use independently developed ontologies for the same domain instead of adhering to a single ontology as standard. In this work we investigate the approaches that can be taken by the server to verify the possibility to reach a state with semantically inconsistent results during the execution of a protocol with a client, if the client ontology is published. Often database is used to store the actual data along with the ontologies instead of storing the actual data as a part of the ontology description. It is important to observe that at the current state of the database the semantic conflict state may not be reached even if the verification done by the server indicates the possibility of reaching a conflict state. A relational algebra based decision procedure is also developed to incorporate the current state of the client and the server databases in the overall verification procedure.
Reasoning about Cardinal Directions between Extended Objects: The Hardness Result
The cardinal direction calculus (CDC) proposed by Goyal and Egenhofer is a very expressive qualitative calculus for directional information of extended objects. Early work has shown that consistency checking of complete networks of basic CDC constraints is tractable while reasoning with the CDC in general is NP-hard. This paper shows, however, if allowing some constraints unspecified, then consistency checking of possibly incomplete networks of basic CDC constraints is already intractable. This draws a sharp boundary between the tractable and intractable subclasses of the CDC. The result is achieved by a reduction from the well-known 3-SAT problem.
Community Detection in Networks: The Leader-Follower Algorithm
Traditional spectral clustering methods cannot naturally learn the number of communities in a network and often fail to detect smaller community structure in dense networks because they are based upon external community connectivity properties such as graph cuts. We propose an algorithm for detecting community structure in networks called the leader-follower algorithm which is based upon the natural internal structure expected of communities in social networks. The algorithm uses the notion of network centrality in a novel manner to differentiate leaders (nodes which connect different communities) from loyal followers (nodes which only have neighbors within a single community). Using this approach, it is able to naturally learn the communities from the network structure and does not require the number of communities as an input, in contrast to other common methods such as spectral clustering. We prove that it will detect all of the communities exactly for any network possessing communities with the natural internal structure expected in social networks. More importantly, we demonstrate the effectiveness of the leader-follower algorithm in the context of various real networks ranging from social networks such as Facebook to biological networks such as an fMRI based human brain network. We find that the leader-follower algorithm finds the relevant community structure in these networks without knowing the number of communities beforehand. Also, because the leader-follower algorithm detects communities using their internal structure, we find that it can resolve a finer community structure in dense networks than common spectral clustering methods based on external community structure.
Significance of Classification Techniques in Prediction of Learning Disabilities
Balakrishnan, Julie M. David And Kannan
The aim of this study is to show the importance of two classification techniques, viz. decision tree and clustering, in prediction of learning disabilities (LD) of school-age children. LDs affect about 10 percent of all children enrolled in schools. The problems of children with specific learning disabilities have been a cause of concern to parents and teachers for some time. Decision trees and clustering are powerful and popular tools used for classification and prediction in Data mining. Different rules extracted from the decision tree are used for prediction of learning disabilities. Clustering is the assignment of a set of observations into subsets, called clusters, which are useful in finding the different signs and symptoms (attributes) present in the LD affected child. In this paper, J48 algorithm is used for constructing the decision tree and K-means algorithm is used for creating the clusters. By applying these classification techniques, LD in any child can be identified.
Random Graph Generator for Bipartite Networks Modeling
Chojnacki, Szymon, Kłopotek, Mieczysław
The purpose of this article is to introduce a new iterative algorithm with properties resembling real life bipartite graphs. The algorithm enables us to generate wide range of random bigraphs, which features are determined by a set of parameters.We adapt the advances of last decade in unipartite complex networks modeling to the bigraph setting. This data structure can be observed in several situations. However, only a few datasets are freely available to test the algorithms (e.g. community detection, influential nodes identification, information retrieval) which operate on such data. Therefore, artificial datasets are needed to enhance development and testing of the algorithms. We are particularly interested in applying the generator to the analysis of recommender systems. Therefore, we focus on two characteristics that, besides simple statistics, are in our opinion responsible for the performance of neighborhood based collaborative filtering algorithms. The features are node degree distribution and local clustering coeficient.
CUR from a Sparse Optimization Viewpoint
Bien, Jacob, Xu, Ya, Mahoney, Michael W.
The CUR decomposition provides an approximation of a matrix $X$ that has low reconstruction error and that is sparse in the sense that the resulting approximation lies in the span of only a few columns of $X$. In this regard, it appears to be similar to many sparse PCA methods. However, CUR takes a randomized algorithmic approach, whereas most sparse PCA methods are framed as convex optimization problems. In this paper, we try to understand CUR from a sparse optimization viewpoint. We show that CUR is implicitly optimizing a sparse regression objective and, furthermore, cannot be directly cast as a sparse PCA method. We also observe that the sparsity attained by CUR possesses an interesting structure, which leads us to formulate a sparse PCA method that achieves a CUR-like sparsity.