Limits of Preprocessing
We present a first theoretical analysis of the power of polynomial-time preprocessing for important combinatorial problems from various areas in AI. We consider problems from Constraint Satisfaction, Global Constraints, Satisfiability, Nonmonotonic and Bayesian Reasoning. We show that, subject to a complexity theoretic assumption, none of the considered problems can be reduced by polynomial-time preprocessing to a problem kernel whose size is polynomial in a structural problem parameter of the input, such as induced width or backdoor size. Our results provide a firm theoretical boundary for the performance of polynomial-time preprocessing algorithms for the considered problems.
Independent screening for single-index hazard rate models with ultra-high dimensional features
Gorst-Rasmussen, Anders, Scheike, Thomas H.
In data sets with many more features than observations, independent screening based on all univariate regression models leads to a computationally convenient variable selection method. Recent efforts have shown that in the case of generalized linear models, independent screening may suffice to capture all relevant features with high probability, even in ultra-high dimension. It is unclear whether this formal sure screening property is attainable when the response is a right-censored survival time. We propose a computationally very efficient independent screening method for survival data which can be viewed as the natural survival equivalent of correlation screening. We state conditions under which the method admits the sure screening property within a general class of single-index hazard rate models with ultra-high dimensional features. An iterative variant is also described which combines screening with penalized regression in order to handle more complex feature covariance structures. The methods are evaluated through simulation studies and through application to a real gene expression dataset.
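As a rough illustration of the independent screening idea for censored data (not the authors' exact statistic, which is a survival analogue of correlation screening), the sketch below scores each feature with a univariate Cox score test statistic and retains the highest-ranked features. All function names and the toy data are illustrative assumptions.

# A minimal sketch of univariate (sure independence) screening for
# right-censored data: each feature is scored with the Cox partial-likelihood
# score statistic at beta = 0, and the d largest scores are kept.
import numpy as np

def cox_score_stat(z, time, event):
    """Standardized Cox score at beta = 0 for a single feature z."""
    order = np.argsort(time)
    z, event = z[order], event[order]
    u, v = 0.0, 0.0
    for i in range(len(z)):
        if event[i]:
            at_risk = z[i:]              # risk set: subjects still under observation
            u += z[i] - at_risk.mean()   # score contribution at this event time
            v += at_risk.var()           # information contribution at this event time
    return abs(u) / np.sqrt(v) if v > 0 else 0.0

def screen_features(Z, time, event, d):
    """Rank features by the univariate statistic and keep the d largest."""
    scores = np.array([cox_score_stat(Z[:, j], time, event) for j in range(Z.shape[1])])
    return np.argsort(scores)[::-1][:d], scores

# Toy usage with simulated ultra-high dimensional data
rng = np.random.default_rng(0)
Z = rng.standard_normal((200, 1000))
time = rng.exponential(scale=np.exp(-Z[:, 0]))   # feature 0 is truly relevant
event = rng.uniform(size=200) < 0.8              # flag ~20% of subjects as censored (toy)
keep, _ = screen_features(Z, time, event, d=20)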
Uncertain Nearest Neighbor Classification
Angiulli, Fabrizio, Fassetti, Fabio
This work deals with the problem of classifying uncertain data. To this end, the Uncertain Nearest Neighbor (UNN) rule is introduced here, which generalizes the deterministic nearest neighbor rule to the case in which uncertain objects are available. The UNN rule relies on the concept of the nearest neighbor class, rather than that of the nearest neighbor object. The nearest neighbor class of a test object is the class that maximizes the probability of providing its nearest neighbor. Evidence is provided that the former concept is much more powerful than the latter in the presence of uncertainty, in that it correctly captures the semantics of the nearest neighbor decision rule in the uncertain scenario. An effective and efficient algorithm to perform uncertain nearest neighbor classification of a generic (un)certain test object is designed, based on properties that greatly reduce the computational cost of nearest neighbor class probability computation. Experimental results are presented, showing that the UNN rule is effective and efficient in classifying uncertain data.
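The nearest neighbor class concept can be illustrated with a brute-force Monte Carlo sketch: sample realizations of the uncertain objects and predict the class whose objects most often supply the nearest neighbor. The paper's algorithm avoids this naive sampling through pruning properties; the sketch below only demonstrates the decision rule, and all names in it are illustrative.

# Monte Carlo illustration of the nearest-neighbor-class decision rule.
import numpy as np

def unn_classify(train_samplers, train_labels, test_sampler, n_trials=1000, rng=None):
    """train_samplers: callables, each returning one realization (1-D array) of an
    uncertain training object; test_sampler: the same for the test object."""
    rng = rng or np.random.default_rng()
    wins = {c: 0 for c in set(train_labels)}
    for _ in range(n_trials):
        x = test_sampler(rng)
        pts = np.stack([s(rng) for s in train_samplers])
        nn = np.argmin(np.linalg.norm(pts - x, axis=1))   # index of nearest realization
        wins[train_labels[nn]] += 1                       # that object's class "wins" this trial
    return max(wins, key=wins.get)                        # nearest neighbor class

# Toy usage: Gaussian-uncertain objects in the plane
gauss = lambda mu, s: (lambda rng: rng.normal(mu, s))
train = [gauss([0, 0], 0.5), gauss([0, 1], 0.5), gauss([3, 3], 0.5)]
labels = ["a", "a", "b"]
print(unn_classify(train, labels, gauss([0.2, 0.4], 0.5)))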
A Knowledge Mining Model for Ranking Institutions using Rough Computing with Ordering Rules and Formal Concept analysis
Acharjya, D. P., Ezhilarasi, L.
The emergence of computers and the information technology revolution have brought tremendous changes to the real world and provide a new dimension for intelligent data analysis. Well-formed facts, delivered at the right time and in the right place, yield better knowledge. However, the challenge arises when a large volume of inconsistent data is given for decision making and knowledge extraction. To handle such imprecise data, researchers have in recent years developed several important mathematical tools, namely fuzzy sets, intuitionistic fuzzy sets, rough sets, formal concept analysis, and ordering rules. It is also observed that many information systems contain numerical attribute values, which are therefore almost similar rather than exactly similar. To handle this type of information system, in this paper we use a two-stage process consisting of a pre-process and a post-process. In the pre-process we use rough sets on an intuitionistic fuzzy approximation space with ordering rules to extract knowledge, whereas in the post-process we use formal concept analysis to explore better knowledge and the vital factors affecting decisions.
Distance Dependent Chinese Restaurant Processes
Blei, David M., Frazier, Peter I.
We develop the distance dependent Chinese restaurant process (CRP), a flexible class of distributions over partitions that allows for non-exchangeability. This class can be used to model many kinds of dependencies between data in infinite clustering models, including dependencies across time or space. We examine the properties of the distance dependent CRP, discuss its connections to Bayesian nonparametric mixture models, and derive a Gibbs sampler for both observed and mixture settings. We study its performance with three text corpora. We show that relaxing the assumption of exchangeability with distance dependent CRPs can provide a better fit to sequential data. We also show that its alternative formulation of the traditional CRP leads to a faster-mixing Gibbs sampling algorithm than the one based on the original formulation.
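The generative construction can be sketched directly: each customer links to another customer with probability proportional to a decay of their distance (or to itself with probability proportional to a concentration parameter), and clusters are the connected components of the resulting link graph. The decay function and parameter values below are illustrative choices, not the paper's experimental settings.

# Drawing a partition from a distance dependent CRP prior.
import numpy as np

def sample_ddcrp(D, alpha=1.0, decay=lambda d: np.exp(-d), rng=None):
    """D: (n, n) matrix of pairwise distances. Returns cluster labels."""
    rng = rng or np.random.default_rng()
    n = D.shape[0]
    links = np.empty(n, dtype=int)
    for i in range(n):
        w = decay(D[i])                 # affinity to every other customer
        w[i] = alpha                    # self-link starts a new table
        links[i] = rng.choice(n, p=w / w.sum())
    # Clusters are connected components of the (undirected) customer links.
    labels = np.arange(n)
    def find(x):
        while labels[x] != x:
            labels[x] = labels[labels[x]]
            x = labels[x]
        return x
    for i in range(n):
        labels[find(i)] = find(links[i])
    return np.array([find(i) for i in range(n)])

# Toy usage: sequential (time-like) data with distance = time gap
t = np.arange(10, dtype=float)
D = np.abs(t[:, None] - t[None, :])
print(sample_ddcrp(D, alpha=2.0))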
Robust graphical modeling of gene networks using classical and alternative T-distributions
Finegold, Michael, Drton, Mathias
Graphical Gaussian models have proven to be useful tools for exploring network structures based on multivariate data. Applications to studies of gene expression have generated substantial interest in these models, and resulting recent progress includes the development of fitting methodology involving penalization of the likelihood function. In this paper we advocate the use of multivariate $t$-distributions for more robust inference of graphs. In particular, we demonstrate that penalized likelihood inference combined with an application of the EM algorithm provides a computationally efficient approach to model selection in the $t$-distribution case. We consider two versions of multivariate $t$-distributions, one of which requires the use of approximation techniques. For this distribution, we describe a Markov chain Monte Carlo EM algorithm based on a Gibbs sampler as well as a simple variational approximation that makes the resulting method feasible in large problems.
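A minimal sketch of the EM idea for the classical multivariate t-distribution (the case that needs no approximation) is given below: the E-step computes the standard t-distribution weights from Mahalanobis distances, and the M-step runs the graphical lasso on the weighted scatter matrix. The degrees of freedom nu and penalty lam are treated here as fixed tuning parameters, and this is an illustrative stand-in rather than the authors' full procedure.

# Penalized EM for a sparse precision matrix under a classical multivariate t model.
import numpy as np
from sklearn.covariance import graphical_lasso

def tlasso_em(X, nu=3.0, lam=0.1, n_iter=50):
    n, p = X.shape
    mu = X.mean(axis=0)
    prec = np.linalg.inv(np.cov(X, rowvar=False) + 1e-3 * np.eye(p))
    for _ in range(n_iter):
        # E-step: weight each observation by (nu + p) / (nu + Mahalanobis^2)
        R = X - mu
        delta = np.einsum("ij,jk,ik->i", R, prec, R)
        tau = (nu + p) / (nu + delta)
        # M-step: weighted mean, then graphical lasso on the weighted scatter matrix
        mu = (tau[:, None] * X).sum(axis=0) / tau.sum()
        R = X - mu
        S = (tau[:, None] * R).T @ R / n
        cov, prec = graphical_lasso(S, alpha=lam)
    return prec                      # sparse precision matrix; nonzeros give graph edges

# Toy usage on heavy-tailed data
rng = np.random.default_rng(1)
X = rng.standard_t(df=3, size=(200, 10))
Theta = tlasso_em(X)
print((np.abs(Theta) > 1e-3).sum())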
Application of Microsimulation Towards Modelling of Behaviours in Complex Environments
Keep, Daniel (University of Wollongong) | Bunder, Rachel (University of Wollongong) | Piper, Ian (University of Wollongong) | Green, Anthony (University of Wollongong)
In this paper, we introduce new capabilities to our existing microsimulation framework, Simulacron. These new capabilities add the modelling of behaviours based on motivations and improve our existing non-deterministic movement capacity. We then discuss the application of these new features to a simple, synthetic proof-of-concept scenario involving the transit of people through a corridor and how an induced panic affects their throughput. Finally, we describe a more complex scenario, currently under development, involving the detonation of an explosive device in a major metropolitan transport hub at peak hour and the analysis of the subsequent reaction.
InSitu: An Approach for Dynamic Context Labeling Based on Product Usage and Sound Analysis
Lyardet, Fernando (Technical University of Darmstadt) | Hadjakos, Aristotelis (Technical University of Darmstadt) | Szeto, Diego Wong (Technical University of Darmstadt)
Smart environments offer a vision of unobtrusive interaction with our surroundings, interpreting and anticipating our needs. One key aspect of making environments smart is the ability to recognize the current context. However, like any human space, smart environments are subject to changes and mutations of their purpose and composition as people shape their living places according to their needs. In this paper we present an approach for recognizing context situations in smart environments that addresses this challenge. We propose a formalism for describing and sharing context states (or situations) and an architecture for gradually introducing contextual knowledge into an environment, where the current context is determined by sensing people's usage of devices and by sound analysis.
Error Identification and Correction in Human Computation: Lessons from the WPA
Grier, David Alan (George Washington University)
Human computing promises new capabilities that cannot be easily provided by computing machinery. However, humans are less disciplined than their mechanical counterparts and hence are liable to produce accidental or deliberate mistakes. As we start to develop regimes for identifying and correcting errors in human computation, we find an important model in the computing groups that operated at the start of the 20th century.