Using a Kernel Adatron for Object Classification with RCS Data
Byl, Marten F., Demers, James T., Rietman, Edward A.
Rapid identification of objects from radar cross section (RCS) signals is important for many space and military applications. This identification is a pattern recognition problem for which either neural networks or support vector machines should prove well suited for high-speed processing. Bayesian networks would also provide value but require significant preprocessing of the signals. In this paper, we describe the use of a support vector machine for object identification from synthesized RCS data. Our best results come from data fusion of X-band and S-band signals, where we obtained 99.4%, 95.3%, 100%, and 95.6% correct identification for cylinders, frusta, spheres, and polygons, respectively. We also compare our results with a Bayesian approach and show that the SVM is three orders of magnitude faster, as measured by the number of floating point operations.
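The Kernel Adatron named in the title is a simple, perceptron-style procedure for training a maximum-margin kernel classifier, which is part of why the approach is fast. The abstract gives no implementation details, so the following is only a minimal sketch of the standard Kernel Adatron update (Gaussian kernel, no bias term), with synthetic two-class data standing in for the RCS features.

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=0.5):
    """Gaussian (RBF) kernel matrix between two sets of feature vectors."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_adatron(K, y, eta=0.1, epochs=200):
    """Train Lagrange multipliers alpha with the basic Kernel Adatron update."""
    alpha = np.zeros(len(y))
    for _ in range(epochs):
        for i in range(len(y)):
            margin = y[i] * np.sum(alpha * y * K[:, i])
            alpha[i] = max(0.0, alpha[i] + eta * (1.0 - margin))
    return alpha

# Toy stand-in for RCS feature vectors of two object classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (40, 3)), rng.normal(2, 1, (40, 3))])
y = np.hstack([-np.ones(40), np.ones(40)])

K = rbf_kernel(X, X)
alpha = kernel_adatron(K, y)
pred = np.sign((alpha * y) @ K)          # decision values on the training set
print("training accuracy:", (pred == y).mean())
```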
Distilled Sensing: Adaptive Sampling for Sparse Detection and Estimation
Haupt, Jarvis, Castro, Rui, Nowak, Robert
Adaptive sampling results in dramatic improvements in the recovery of sparse signals in white Gaussian noise. A sequential adaptive sampling-and-refinement procedure called Distilled Sensing (DS) is proposed and analyzed. DS is a form of multi-stage experimental design and testing. Because of the adaptive nature of the data collection, DS can detect and localize far weaker signals than possible from non-adaptive measurements. In particular, reliable detection and localization (support estimation) using non-adaptive samples is possible only if the signal amplitudes grow logarithmically with the problem dimension. Here it is shown that using adaptive sampling, reliable detection is possible provided the amplitude exceeds a constant, and localization is possible when the amplitude exceeds any arbitrarily slowly growing function of the dimension.
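At its core, distillation is a thresholding-at-zero step: each stage re-observes the retained coordinates and keeps only those with non-negative observations, discarding roughly half of the null coordinates while most positive-amplitude signal coordinates survive. A minimal sketch of this refinement loop is below; the equal budget split across stages and the final detection threshold are illustrative assumptions, not the paper's exact allocation.

```python
import numpy as np

def distilled_sensing(x, n_stages=4, total_budget=1.0, final_thresh=3.0, rng=None):
    """Iteratively re-observe retained coordinates, keeping those with y >= 0."""
    if rng is None:
        rng = np.random.default_rng()
    n = x.size
    retained = np.arange(n)
    per_stage = total_budget / n_stages          # equal budget split (illustrative)
    for stage in range(n_stages):
        # Precision per retained coordinate grows as the retained set shrinks.
        precision = per_stage * n / retained.size
        y = x[retained] + rng.normal(0.0, 1.0 / np.sqrt(precision), retained.size)
        if stage < n_stages - 1:
            retained = retained[y >= 0]          # the distillation step
    return retained[y > final_thresh]            # final-stage support estimate

# Sparse positive signal: 20 active coordinates out of 10,000.
rng = np.random.default_rng(1)
x = np.zeros(10_000)
support = rng.choice(10_000, 20, replace=False)
x[support] = 4.0
est = distilled_sensing(x, rng=rng)
print("true positives:", np.isin(est, support).sum(), "of", est.size, "detections")
```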
Modeling Social Annotation: a Bayesian Approach
Plangprasopchok, Anon, Lerman, Kristina
Collaborative tagging systems, such as Delicious, CiteULike, and others, allow users to annotate resources, e.g., Web pages or scientific papers, with descriptive labels called tags. The social annotations contributed by thousands of users can potentially be used to infer categorical knowledge, classify documents, or recommend new relevant information. Traditional text inference methods do not make the best use of social annotation, since they do not take into account variations in individual users' perspectives and vocabulary. In previous work, we introduced a simple probabilistic model that takes the interests of individual annotators into account in order to find hidden topics of annotated resources. Unfortunately, that approach had one major shortcoming: the number of topics and interests must be specified a priori. To address this drawback, we extend the model to a fully Bayesian framework, which offers a way to automatically estimate these numbers. In particular, the model allows the number of interests and topics to change as suggested by the structure of the data. We evaluate the proposed model in detail on synthetic and real-world data by comparing its performance to Latent Dirichlet Allocation on the topic extraction task. For the latter evaluation, we apply the model to infer topics of Web resources from social annotations obtained from Delicious in order to discover new resources similar to a specified one. Our empirical results demonstrate that the proposed model is a promising method for exploiting the social knowledge contained in user-generated annotations.
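The central point is that the number of topics and interests is inferred rather than fixed a priori. The authors' interest-topic model is not reproduced here; as a loose analogy only, the sketch below contrasts a fixed-K LDA with a hierarchical Dirichlet process model (gensim's HdpModel), which likewise lets tag co-occurrence data suggest how many topics to use. The tiny tag "documents" are invented for illustration.

```python
from gensim.corpora import Dictionary
from gensim.models import LdaModel, HdpModel

# Each "document" is the bag of tags attached to one bookmarked resource.
tagged_resources = [
    ["python", "programming", "tutorial"],
    ["python", "code", "software"],
    ["recipe", "cooking", "food"],
    ["baking", "recipe", "dessert"],
    ["programming", "software", "tools"],
    ["food", "cooking", "dinner"],
]

dictionary = Dictionary(tagged_resources)
corpus = [dictionary.doc2bow(tags) for tags in tagged_resources]

# LDA needs the number of topics fixed in advance ...
lda = LdaModel(corpus, num_topics=2, id2word=dictionary, random_state=0)

# ... whereas an HDP mixture lets the data suggest how many topics to use.
hdp = HdpModel(corpus, id2word=dictionary, random_state=0)

print(lda.print_topics(num_words=3))
print(hdp.print_topics(num_topics=5, num_words=3))
```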
An axiomatic approach to the roughness measure of rough sets
In Pawlak's rough set theory, a set is approximated by a pair of lower and upper approximations. To measure the roughness of an approximation numerically, Pawlak introduced a quantitative measure based on the ratio of the cardinalities of the lower and upper approximations. Although this roughness measure is effective, it has the drawback of not being strictly monotonic with respect to the standard ordering on partitions. Recently, some improvements have been made by taking into account the granularity of partitions. In this paper, we approach the roughness measure in an axiomatic way. After axiomatically defining roughness measures and partition measures, we provide a unified construction of roughness measures, called the strong Pawlak roughness measure, and then explore the properties of this measure. We show that the improved roughness measures in the literature are special instances of our strong Pawlak roughness measure, and we introduce three more strong Pawlak roughness measures as well. The advantage of our axiomatic approach is that some properties of a roughness measure follow immediately as soon as the measure satisfies the relevant axiomatic definition.
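For concreteness, Pawlak's original measure sets the roughness of a set X to 1 minus the ratio of the cardinalities of its lower and upper approximations under the given partition. A minimal sketch (the partition and the set X are invented):

```python
def pawlak_roughness(partition, X):
    """Pawlak roughness: 1 - |lower(X)| / |upper(X)| under a partition of U."""
    X = set(X)
    blocks = [set(b) for b in partition]
    lower = set().union(*[b for b in blocks if b <= X])    # blocks inside X
    upper = set().union(*[b for b in blocks if b & X])     # blocks meeting X
    # Convention for the degenerate case of an empty upper approximation.
    return (1.0 - len(lower) / len(upper)) if upper else 0.0

# U = {1, ..., 8} partitioned by an equivalence relation into four blocks.
partition = [{1, 2}, {3, 4}, {5, 6, 7}, {8}]
X = {1, 2, 3, 5}                        # the set being approximated
print(pawlak_roughness(partition, X))   # lower = {1,2}, upper = {1,...,7} -> 1 - 2/7
```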
Combining Naive Bayes and Decision Tree for Adaptive Intrusion Detection
Farid, Dewan Md., Harbi, Nouria, Rahman, Mohammad Zahidur
In this paper, a new learning algorithm for adaptive network intrusion detection using a naive Bayesian classifier and a decision tree is presented. The algorithm performs balanced detection and keeps false positives at an acceptable level for different types of network attacks, and it eliminates redundant attributes as well as contradictory examples from the training data that make the detection model complex. The proposed algorithm also addresses some difficulties of data mining such as handling continuous attributes, dealing with missing attribute values, and reducing noise in training data. Due to the large volumes of security audit data and the complex and dynamic properties of intrusion behaviours, several data mining-based intrusion detection techniques have been applied to network-based traffic data and host-based data in recent decades. However, various issues remain to be examined in current intrusion detection systems (IDS). We tested the performance of our proposed algorithm against existing learning algorithms on the KDD99 benchmark intrusion detection dataset. The experimental results show that the proposed algorithm achieves high detection rates (DR) and significantly reduces false positives (FP) for different types of network intrusions using limited computational resources.
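The abstract does not specify exactly how the two classifiers are wired together, so the sketch below shows one plausible arrangement only: a decision tree scores and prunes redundant attributes, and a naive Bayesian classifier is trained on the attributes that remain. The synthetic data stands in for KDD99-style connection records; this is not the authors' algorithm.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for network-connection records (KDD99-like numeric features).
X, y = make_classification(n_samples=2000, n_features=30, n_informative=8,
                           n_redundant=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Step 1: a decision tree scores the attributes; low-importance ones are dropped.
tree = DecisionTreeClassifier(max_depth=8, random_state=0).fit(X_tr, y_tr)
keep = tree.feature_importances_ > 0.01          # illustrative cut-off

# Step 2: naive Bayes is trained only on the retained attributes.
nb = GaussianNB().fit(X_tr[:, keep], y_tr)
print("kept features:", int(keep.sum()), "of", X.shape[1])
print("test accuracy:", nb.score(X_te[:, keep], y_te))
```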
Distantly Labeling Data for Large Scale Cross-Document Coreference
Singh, Sameer, Wick, Michael, McCallum, Andrew
Cross-document coreference, the problem of resolving entity mentions across multi-document collections, is crucial to automated knowledge base construction and data mining tasks. However, the scarcity of large labeled data sets has hindered supervised machine learning research for this task. In this paper we develop and demonstrate an approach based on ``distantly-labeling'' a data set from which we can train a discriminative cross-document coreference model. In particular we build a dataset of more than a million people mentions extracted from 3.5 years of New York Times articles, leverage Wikipedia for distant labeling with a generative model (and measure the reliability of such labeling); then we train and evaluate a conditional random field coreference model that has factors on cross-document entities as well as mention-pairs. This coreference model obtains high accuracy in resolving mentions and entities that are not present in the training data, indicating applicability to non-Wikipedia data. Given the large amount of data, our work is also an exercise demonstrating the scalability of our approach.
Evidence Algorithm and System for Automated Deduction: A Retrospective View
Lyaletski, Alexander, Verchinine, Konstantin
A research project aimed at the development of an automated theorem proving system was started in Kiev (Ukraine) in the early 1960s. The mastermind of the project, Academician V. Glushkov, baptized it the "Evidence Algorithm", EA. The work on the project lasted, off and on, more than 40 years. In the framework of the project, the Russian and English versions of the System for Automated Deduction, SAD, were constructed. They may already be seen as powerful theorem-proving assistants.
Inaccuracy Minimization by Partitioning Fuzzy Data Sets - Validation of Analytical Methodology
Arutchelvan, G., Srivatsa, S. K., Jagannathan, R.
In the last two decades, a number of methods have been proposed for forecasting based on fuzzy time series. Most of the fuzzy time series methods have been presented for forecasting car road accidents. However, the forecasting accuracy rates of the existing methods are not good enough. In this paper, we compare our proposed new method of fuzzy time series forecasting with existing methods. Our method is based on means-based partitioning of the historical data of car road accidents. The proposed method belongs to the class of kth-order and time-variant methods. The proposed method achieves better forecasting accuracy for car road accidents than the existing methods.
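The abstract does not spell out the forecasting steps, so the following is a generic first-order, time-invariant fuzzy time series forecaster in the style of Chen's method, with equal-width intervals standing in for the paper's means-based partitions and kth-order, time-variant relationships; the accident counts are invented.

```python
import numpy as np

def fuzzy_ts_forecast(series, n_intervals=7):
    """First-order fuzzy time series forecast (Chen-style), one step ahead."""
    lo, hi = min(series), max(series)
    edges = np.linspace(lo, hi, n_intervals + 1)
    mids = (edges[:-1] + edges[1:]) / 2
    # Fuzzify: map each observation to the index of the interval containing it.
    states = np.clip(np.searchsorted(edges, series, side="right") - 1,
                     0, n_intervals - 1)
    # Fuzzy logical relationship groups: state at t -> set of states at t+1.
    groups = {}
    for a, b in zip(states[:-1], states[1:]):
        groups.setdefault(a, set()).add(b)
    current = states[-1]
    nxt = groups.get(current, {current})
    # Defuzzify: average of the midpoints of the successor intervals.
    return float(np.mean([mids[j] for j in nxt]))

accidents = [420, 435, 450, 430, 460, 475, 465, 480, 470, 495]  # invented counts
print("next-period forecast:", fuzzy_ts_forecast(accidents))
```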
A Soft Computing Model for Physicians' Decision Process
In this paper the author presents a kind of soft computing technique, mainly an application of Prof. Zadeh's fuzzy set theory [16], to a problem in medical expert systems. The chosen problem is the design of a physician's decision model which can take crisp as well as fuzzy data as input, unlike the traditional models. The author presents a mathematical model based on fuzzy set theory for physician-aided evaluation of a complete representation of the information emanating from the initial interview, including the patient's past history, present symptoms, signs observed upon physical examination, and results of clinical and diagnostic tests.
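The abstract gives no equations, so the sketch below illustrates only the general fuzzy-relational style of diagnosis such a model can build on: crisp and fuzzy findings enter as membership grades, and a max-min composition of the patient-symptom relation with a symptom-diagnosis relation yields a degree of support for each diagnosis. All membership values are invented.

```python
import numpy as np

def max_min_composition(Q, R):
    """Max-min composition of fuzzy relations: T[p, d] = max_s min(Q[p, s], R[s, d])."""
    return np.max(np.minimum(Q[:, :, None], R[None, :, :]), axis=1)

# Rows: patients, columns: symptoms (fever, cough, chest pain), values in [0, 1].
# Crisp findings enter as 0/1, fuzzy ones as intermediate membership grades.
Q = np.array([[0.9, 0.4, 1.0],
              [0.2, 0.8, 0.0]])

# Rows: symptoms, columns: candidate diagnoses; strengths are invented.
R = np.array([[0.8, 0.3],
              [0.2, 0.9],
              [0.9, 0.1]])

T = max_min_composition(Q, R)       # patient-by-diagnosis support
print(T)                            # pick the diagnosis with the largest membership
print("suggested diagnosis per patient:", T.argmax(axis=1))
```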
Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes
Petrik, Marek, Taylor, Gavin, Parr, Ron, Zilberstein, Shlomo
Approximate dynamic programming has been used successfully in a large variety of domains, but it relies on a small set of provided approximation features to calculate solutions reliably. Large and rich sets of features can cause existing algorithms to overfit because of a limited number of samples. We address this shortcoming using $L_1$ regularization in approximate linear programming. Because the proposed method can automatically select the appropriate richness of features, its performance does not degrade with an increasing number of features. These results rely on new and stronger sampling bounds for regularized approximate linear programs. We also propose a computationally efficient homotopy method. The empirical evaluation of the approach shows that the proposed method performs well on simple MDPs and standard benchmark problems.
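The central object is the approximate linear program with an L1 constraint on the feature weights. The sketch below sets up that LP for a tiny random MDP with scipy.optimize.linprog, splitting the weight vector into positive and negative parts so the L1 budget becomes a linear constraint; the MDP, features, and budget are invented, and the paper's homotopy method and sampling bounds are not reproduced.

```python
import numpy as np
from scipy.optimize import linprog

def regularized_alp(P, r, phi, gamma, c, psi):
    """L1-constrained approximate linear program for an MDP (a sketch).

    P[a] is the S x S transition matrix of action a, r[a] the reward vector,
    phi the S x k feature matrix, c state-relevance weights, psi the L1 budget.
    """
    S, k = phi.shape
    # Bellman constraints: (phi(s) - gamma * E[phi(s')])^T w >= r(s, a).
    A_rows, b_rows = [], []
    for a in range(len(P)):
        A_rows.append(-(phi - gamma * P[a] @ phi))   # rewritten in <= form
        b_rows.append(-r[a])
    A = np.vstack(A_rows)
    b = np.concatenate(b_rows)
    # Split w = u - v with u, v >= 0 so that ||w||_1 = sum(u + v) <= psi.
    A_ub = np.vstack([np.hstack([A, -A]), np.ones((1, 2 * k))])
    b_ub = np.concatenate([b, [psi]])
    obj = np.concatenate([phi.T @ c, -(phi.T @ c)])  # minimize c^T Phi w
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * (2 * k))
    u, v = res.x[:k], res.x[k:]
    return u - v

# Tiny random MDP with a constant-plus-noise feature set (all values invented).
rng = np.random.default_rng(0)
S, A_n, k, gamma = 6, 2, 4, 0.9
P = [rng.dirichlet(np.ones(S), size=S) for _ in range(A_n)]
r = [rng.uniform(0, 1, S) for _ in range(A_n)]
phi = np.hstack([np.ones((S, 1)), rng.normal(size=(S, k - 1))])
w = regularized_alp(P, r, phi, gamma, c=np.ones(S) / S, psi=12.0)
print("weights:", np.round(w, 3))
```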