AITopics

This paper introduces Streaming Half-Space-Trees (HS-Trees), a fast one-class anomaly detector for evolving data streams. It requires only normal data for training and works well when anomalous data are rare. The model features an ensemble of random HS-Trees, and the tree structure is constructed without any data. This makes the method highly efficient because it requires no model restructuring when adapting to evolving data streams. Our analysis shows that Streaming HS-Trees has constant amortised time complexity and constant memory requirement. When compared with a state-of-the-art method, our method performs favourably in terms of detection accuracy and runtime performance. Our experimental results also show that the detection performance of Streaming HS-Trees is not sensitive to its parameter settings.

data stream, dataset, streaming hs-tree, (15 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

South America > Paraguay > Asunción > Asunción (0.04)
Oceania > Australia (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence (1.00)

Angular Decomposition

Sun, Dengdi (Anhui University) | Ding, Chris H.Q. (University of Texas at Arlington) | Luo, Bin (Anhui University) | Tang, Jin (Anhui University)

Dimensionality reduction plays a vital role in pattern recognition. However, for normalized vector data, existing methods do not utilize the fact that the data is normalized. In this paper, we propose to employ an Angular Decomposition of the normalized vector data which corresponds to embedding them on a unit surface. On graph data for similarity/kernel matrices with constant diagonal elements, we propose the Angular Decomposition of the similarity matrices which corresponds to embedding objects on a unit sphere. In these angular embeddings, the Euclidean distance is equivalent to the cosine similarity. Thus data structures best described in the cosine similarity and data structures best captured by the Euclidean distance can both be effectively detected in our angular embedding. We provide the theoretical analysis, derive the computational algorithm, and evaluate the angular embedding on several datasets. Experiments on data clustering demonstrate that our method can provide a more discriminative subspace.

angular decomposition, graph data, vector data, (12 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > Texas > Tarrant County > Arlington (0.04)
North America > United States > New York (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.35)

Active Online Classification Via Information Maximization

Slonim, Noam (IBM Haifa Research Lab) | Yom-Tov, Elad (IBM Haifa Research Lab) | Crammer, Koby (The Technion)

We propose an online classification approach for co-occurrence data which is based on a simple information theoretic principle. We further show how to properly estimate the uncertainty associated with each prediction of our scheme and demonstrate how to exploit these uncertainty estimates. First, in order to abstain highly uncertain predictions. And second, within an active learning framework, in order to preserve classification accuracy while substantially reducing training set size. Our method is highly efficient in terms of run-time and memory footprint requirements. Experimental results in the domain of text classification demonstrate that the classification accuracy of our method is superior or comparable to other state-of-the-art online classification algorithms.

algorithm, prediction, true label, (15 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Asia > Middle East > Israel > Haifa District > Haifa (0.05)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.05)
North America > United States > New York (0.04)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.55)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.35)

Consistency Measures for Feature Selection: A Formal Definition, Relative Sensitivity Comparison and a Fast Algorithm

Shin, Kilho (University of Hyogo) | Fernandes, Danny (University of Hyogo) | Miyazaki, Seiya (Panasonic Corporation)

Consistency-based feature selection is an important category of feature selection research yet is defined only intuitively in the literature. First, we formally define a consistency measure, and then using this definition, evaluate 19 feature selection measures from the literature. While only 5 of these were labeledas consistency measures by their original authors, by our definition, an additional 9 measures should be classified as consistency measures. To compare these 14 consistency measures in terms of sensitivity, we introduce the concept of quasilinear compatibility order, and partially determine the order among the measures. Next, we proposea new fast algorithm for consistency-based feature selection. We ran experiments using eleven large datasets to compare the performance of our algorithm against INTERACT and LCC, the only two instances of consistency-based algorithms with potential real world application. Our algorithm shows vast improvement in time efficiency, while its performance in accuracy is comparable with that of INTERACT and LCC.

algorithm, consistency measure, dataset, (12 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
Asia > Japan > Honshū > Kansai > Hyogo Prefecture > Kobe (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Sharara, Hossam (University of Maryland, College Park) | Getoor, Lise (University of Maryland, College Park) | Norton, Myra (Community Analytics, Baltimore)

Active Surveying: A Probabilistic Approach for Identifying Key Opinion Leaders

Opinion leaders play an important role in influencing people’s beliefs, actions and behaviors. Although a number of methods have been proposed for identifying influentials using secondary sources of information, the use of primary sources, such as surveys, is still favored in many domains. In this work we present a new surveying method which combines secondary data with partial knowledge from primary sources to guide the information gathering process. We apply our proposed active surveying method to the problem of identifying key opinion leaders in the medical field, and show how we are able to accurately identify the opinion leaders while minimizing the amount of primary data required, which results in significant cost reduction in data acquisition without sacrificing its integrity.

nomination, opinion leader, respondent, (16 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > Maryland > Baltimore (0.14)
North America > United States > Maryland > Prince George's County > College Park (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)

Genre:

Questionnaire & Opinion Survey (0.47)
Research Report (0.46)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Classification of Emerging Extreme Event Tracks in Multivariate Spatio-Temporal Physical Systems Using Dynamic Network Structures: Application to Hurricane Track Prediction

Sencan, Huseyin (North Carolina State University) | Chen, Zhengzhang (North Carolina State University) | Hendrix, William (Northwestern University) | Pansombut, Tatdow (North Carolina State University) | Semazzi, Frederick (North Carolina State University) | Choudhary, Alok (North Carolina State University) | Kumar, Vipin (University of Minnesota) | Melechko, Anatoli V. (North Carolina State University) | Samatova, Nagiza F. (Oak Ridge National Laboratory)

Understanding extreme events, such as hurricanes or forest fires, is of paramount importance because of their adverse impacts on human beings. Such events often propagate in space and time. Predicting-even a few days in advance-what locations will get affected by the event tracks could benefit our society in many ways. Arguably, simulations from “first principles,” where underlying physics-based models are described by a system of equations, provide least reliable predictions for variables characterizing the dynamics of these extreme events. Data-driven model building has been recently emerging as a complementary approach that could learn the relationships between historically observed or simulated multiple, spatio-temporal ancillary variables and the dynamic behavior of extreme events of interest. While promising, the methodology for predictive learning from such complex data is still in its infancy. In this paper, we propose a dynamic networks-based methodology for in-advance prediction of the dynamic tracks of emerging extreme events. By associating a network model of the system with the known tracks, our method is capable of learning the recurrent network motifs that could be used as discriminatory signatures for the event's behavioral class. When applied to classifying the behavior of the hurricane tracks at their early formation stages in Western Africa region, our method is able to predict whether hurricane tracks will hit the land of the North Atlantic region at least 10-15 days lead lag time in advance with more than 90% accuracy using 10-fold cross-validation. To the best of our knowledge, no comparable methodology exists for solving this problem using data-driven models.

classifier, hurricane, network motif, (14 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Africa > West Africa (0.24)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Portugal > Braga > Braga (0.05)
(5 more...)

Genre:

Research Report > Experimental Study (0.70)
Research Report > New Finding (0.47)

Industry:

Energy (0.94)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.67)

Sato, Taisuke (Tokyo Institute of Technology)

A General MCMC Method for Bayesian Inference in Logic-Based Probabilistic Modeling

We propose a general MCMC method for Bayesian inference in logic-based probabilistic modeling. It covers a broad class of generative models including Bayesian networks and PCFGs. The idea is to generalize an MCMC method for PCFGs to the one for a Turing-complete probabilistic modeling language PRISM in the context of statistical abduction where parse trees are replaced with explanations. We describe how to estimate the marginal probability of data from MCMC samples and how to perform Bayesian Viterbi inference using an example of Naive Bayes model augmented with a hidden variable.

explanation, inference, prism, (13 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Discovering Deformable Motifs in Continuous Time Series Data

Saria, Suchi (Stanford University) | Duchi, Andrew (Stanford University) | Koller, Daphne (Stanford University)

Continuous time series data often comprise or contain repeated motifs — patterns that have similar shape, and yet exhibit nontrivial variability. Identifying these motifs, even in the presence of variation, is an important subtask in both unsupervised knowledge discovery and constructing useful features for discriminative tasks. This paper addresses this task using a probabilistic framework that models generation of data as switching between a random walk state and states that generate motifs. A motif is generated from a continuous shape template that can undergo non-linear transformations such as temporal warping and additive noise. We propose an unsupervised algorithm that simultaneously discovers both the set of canonical shape templates and a template-specific model of variability manifested in the data. Experimental results on three real-world data sets demonstrate that our model is able to recover templates in data where repeated instances show large variability. The recovered templates provide higher classification accuracy and coverage when compared to those from alternatives such as random projection based methods and simpler generative models that do not model variability. Moreover, in analyzing physiological signals from infants in the ICU, we discover both known signatures as well as novel physiomarkers.

initialization, motif, template, (15 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Samdani, Rajhans Yih (University of Illinois at Urbana-Champaign) | Yih, Wen-tau (Microsoft Research)

Domain Adaptation with Ensemble of Feature Groups

We present a novel approach for domain adaptation based on feature grouping and re-weighting. Our algorithm operates by creating an ensemble of multiple classifiers, where each classifier is trained on one particular feature group. Faced with the distribution change involved in domain change, different feature groups exhibit different cross-domain prediction abilities. Herein, ensemble models provide us the flexibility of tuning the weights of corresponding classifiers in order to adapt to the new domain. Our approach is supported by a solid theoretical analysis based on the expressiveness of ensemble classifiers, which allows trading-off errors across source and target domains. Moreover, experimental results on sentiment classification and spam detection show that our approach not only outperforms the baseline method, but is also superior to other state-of-the-art methods.

adaptation, classifier, feature group, (13 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > Washington > King County > Redmond (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Promising Solution (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Q-Error as a Selection Mechanism in Modular Reinforcement-Learning Systems

Ring, Mark B. (IDSIA) | Schaul, Tom (IDSIA)

This paper introduces a novel multi-modular method for reinforcement learning.A multi-modular system is one that partitions the learning task among a set of experts (modules), where each expert is incapable of solving the entire task by itself.There are many advantages to splitting up large tasks in this way, but existing methods face difficulties when choosing which module(s) should contribute to the agent's actions at any particular moment.We introduce a novel selection mechanism where every module, besides calculating a set of action values, also estimates its own error for the current input.The selection mechanism combines each module's estimate of long-term reward and self-error to produce a score by which the next module is chosen.As a result, the modules can use their resources effectively and efficiently divide up the task.The system is shown to learn complex tasks even when the individual modules use only linear function approximators.

controller, module, vector, (15 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California (0.04)
Europe > Switzerland (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)