AITopics

Multi-label learning deals with the problem where each training example is represented by a single instance while associated with a set of class labels. For an unseen example, existing approaches choose to determine the membership of each possible class label to it based on identical feature set, i.e. the very instance representation of the unseen example is employed in the discrimination processes of all labels. However, this commonly-used strategy might be suboptimal as different class labels usually carry specific characteristics of their own, and it could be beneficial to exploit different feature sets for the discrimination of different labels. Based on the above reflection, we propose a new strategy to multi-label learning by leveraging label-specific features, where a simple yet effective algorithm named LIFT is presented. Briefly, LIFT constructs features specific to each label by conducting clustering analysis on its positive and negative instances, and then performs training and testing by querying the clustering results. Extensive experiments across sixteen diversified data sets clearly validate the superiority of LIFT against other well-established multi-label learning algorithms.

algorithm, classification, multi-label learning, (11 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country: Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Diversity Regularized Machine

Yu, Yang (Nanjing University) | Li, Yu-Feng (Nanjing University) | Zhou, Zhi-Hua (Nanjing University)

Ensemble methods, which train multiple learners for a task, are among the state-of-the-art learning approaches. The diversity of the component learners has been recognized as a key to a good ensemble, and existing ensemble methods try different ways to encourage diversity, mostly by heuristics. In this paper, we propose the diversity regularized machine (DRM) in a mathematical programming framework, which efficiently generates an ensemble of diverse support vector machines (SVMs). Theoretical analysis discloses that the diversity constraint used in DRM can lead to an effective reduction on its hypothesis space complexity, implying that the diversity control in ensemble methods indeed plays a role of regularization as in popular statistical learning approaches. Experiments show that DRM can significantly improve generalization ability and is superior to some state-of-the-art SVM ensemble methods.

diversity, drm, learner, (15 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > China > Jiangsu Province > Nanjing (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)

Spyromitros-Xioufis, Eleftherios (Aristotle University of Thessaloniki) | Spiliopoulou, Myra (Otto-von-Guericke University of Magdeburg) | Tsoumakas, Grigorios (Aristotle University of Thessaloniki) | Vlahavas, Ioannis (Aristotle University of Thessaloniki)

Dealing with Concept Drift and Class Imbalance in Multi-Label Stream Classification

Data streams containing objects that are (or can be) associated with more than one label at the same time are ubiquitous. In spite of its important applications, classification of streaming multi-label data is largely unexplored. Existing approaches try to tackle the problem by transferring traditional single-label stream classification practices to the multi-label domain. Nevertheless, they fail to consider some of the unique properties of the problem such as within and between class imbalance and multiple concept drift. To deal with these challenges, this paper proposes a novel multi-label stream classification approach that employs two windows for each label, one for positive and one for negative examples. Instance-sharing is exploited for space efficiency, while a time-efficient instantiation based on the k-Nearest Neighbor algorithm is also proposed. Finally, a batch-incremental thresholding technique is proposed to further deal with the class imbalance problem. Results of an empirical comparison against two other methods on three real world datasets are in favor of the proposed approach.

classification, classifier, negative example, (15 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Europe > Germany > Saxony-Anhalt > Magdeburg (0.05)
Europe > Greece > Central Macedonia > Thessaloniki (0.04)
North America > United States > District of Columbia > Washington (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.88)

Wu, Ou (NLPR, Institute of Automation, Chinese Academy of Sciences) | Hu, Weiming (NLPR, Institute of Automation, Chinese Academy of Sciences) | Gao, Jun (NLPR, Institute of Automation, Chinese Academy of Sciences)

Learning to Rank Under Multiple Annotators

Learning to rank has received great attention in recent years as it plays a crucial role in information retrieval. The existing concept of learning to rank assumes that each training sample is associated with an instance and a reliable label. However, in practice, this assumption does not necessarily hold true. This study focuses on the learning to rank when each training instance is labeled by multiple annotators that may be unreliable. In such a scenario, no accurate labels can be obtained. This study proposes two learning approaches. One is to simply estimate the ground truth first and then to learn a ranking model with it. The second approach is a maximum likelihood learning approach which estimates the ground truth and learns the ranking model iteratively. The two approaches have been tested on both synthetic and real-world data. The results reveal that the maximum likelihood approach outperforms the first approach significantly and is comparable of achieving results with the learning model considering reliable labels. Further more, both the approaches have been applied for ranking the Web visual clutter.

algorithm, annotator, ranking model, (16 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country: Asia > Middle East > Lebanon (0.05)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.55)

Wingate, David (Massachusetts Institute of Technology) | Goodman, Noah D. (Stanford University) | Roy, Daniel M. (Massachusetts Institute of Technology) | Kaelbling, Leslie P. (Massachusetts Institute of Technology) | Tenenbaum, Joshua B. (Massachusetts Institute of Technology)

Bayesian Policy Search with Policy Priors

We consider the problem of learning to act in partially observable, continuous-state-and-action worlds where we have abstract prior knowledge about the structure of the optimal policy in the form of a distribution over policies. Using ideas from planning-as-inference reductions and Bayesian unsupervised learning, we cast Markov Chain Monte Carlo as a stochastic, hill-climbing policy search algorithm. Importantly, this algorithm's search bias is directly tied to the prior and its MCMC proposal kernels, which means we can draw on the full Bayesian toolbox to express the search bias, including nonparametric priors and structured, recursive processes like grammars over action sequences. Furthermore, we can reason about uncertainty in the search bias itself by constructing a hierarchical prior and reasoning about latent variables that determine the abstract structure of the policy. This yields an adaptive search algorithm---our algorithm learns to learn a structured policy efficiently. We show how inference over the latent variables in these policy priors enables intra- and intertask transfer of abstract knowledge. We demonstrate the flexibility of this approach by learning meta search biases, by constructing a nonparametric finite state controller to model memory, by discovering motor primitives using a simple grammar over primitive actions, and by combining all three.

algorithm, maze, motor primitive, (14 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

Local and Structural Consistency for Multi-Manifold Clustering

Wang, Yong (National University of Defense Technology) | Jiang, Yuan (Nanjing University) | Wu, Yi (National University of Defense Technology) | Zhou, Zhi-Hua (Nanjing University)

Data sets containing multi-manifold structures are ubiquitous in real-world tasks, and effective grouping of such data is an important yet challenging problem. Though there were many studies on this problem, it is not clear on how to design principled methods for the grouping of multiple hybrid manifolds. In this paper, we show that spectral methods are potentially helpful for hybridmanifold clustering when the neighborhood graph is constructed to connect the neighboring samples from the same manifold. However, traditional algorithms which identify neighbors according to Euclidean distance will easily connect samples belonging to different manifolds. To handle this drawback, we propose a new criterion, i.e., local and structural consistency criterion, which considers the neighboring information as well as the structural information implied by the samples. Based on this criterion, we develop a simple yet effective algorithm, named Local and Structural Consistency (LSC), for clustering with multiple hybrid manifolds. Experiments show that LSC achieves promising performance.

algorithm, manifold, neighbor, (13 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)

Bi-Weighting Domain Adaptation for Cross-Language Text Classification

Wan, Chang (Sun Yat-sen University) | Pan, Rong (Sun Yat-sen University) | Li, Jiefei (Sun Yat-sen University)

Text classification is widely used in many real-world applications. To obtain satisfied classification performance, most traditional data mining methods require lots of labeled data, which can be costly in terms of both time and human efforts. In reality, there are plenty of such resources in English since it has the largest population in the Internet world, which is not true in many other languages. In this paper, we present a novel transfer learning approach to tackle the cross-language text classification problems. We first align the feature spaces in both domains utilizing some on-line translation service, which makes the two feature spaces under the same coordinate. Although the feature sets in both domains are the same, the distributions of the instances in both domains are different, which violates the i.i.d. assumption in most traditional machine learning methods. For this issue, we propose an iterative feature and instance weighting (Bi-Weighting) method for domain adaptation. We empirically evaluate the effectiveness and efficiency of our approach. The experimental results show that our approach outperforms some baselines including four transfer learning algorithms.

algorithm, classification, target domain, (14 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > Oregon (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.90)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.82)

Fast Anomaly Detection for Streaming Data

Tan, Swee Chuan (SIM University) | Ting, Kai Ming (Monash University) | Liu, Tony Fei (Monash University)

This paper introduces Streaming Half-Space-Trees (HS-Trees), a fast one-class anomaly detector for evolving data streams. It requires only normal data for training and works well when anomalous data are rare. The model features an ensemble of random HS-Trees, and the tree structure is constructed without any data. This makes the method highly efficient because it requires no model restructuring when adapting to evolving data streams. Our analysis shows that Streaming HS-Trees has constant amortised time complexity and constant memory requirement. When compared with a state-of-the-art method, our method performs favourably in terms of detection accuracy and runtime performance. Our experimental results also show that the detection performance of Streaming HS-Trees is not sensitive to its parameter settings.

data stream, dataset, streaming hs-tree, (15 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

South America > Paraguay > Asunción > Asunción (0.04)
Oceania > Australia (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence (1.00)

Angular Decomposition

Sun, Dengdi (Anhui University) | Ding, Chris H.Q. (University of Texas at Arlington) | Luo, Bin (Anhui University) | Tang, Jin (Anhui University)

Dimensionality reduction plays a vital role in pattern recognition. However, for normalized vector data, existing methods do not utilize the fact that the data is normalized. In this paper, we propose to employ an Angular Decomposition of the normalized vector data which corresponds to embedding them on a unit surface. On graph data for similarity/kernel matrices with constant diagonal elements, we propose the Angular Decomposition of the similarity matrices which corresponds to embedding objects on a unit sphere. In these angular embeddings, the Euclidean distance is equivalent to the cosine similarity. Thus data structures best described in the cosine similarity and data structures best captured by the Euclidean distance can both be effectively detected in our angular embedding. We provide the theoretical analysis, derive the computational algorithm, and evaluate the angular embedding on several datasets. Experiments on data clustering demonstrate that our method can provide a more discriminative subspace.

angular decomposition, graph data, vector data, (12 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > Texas > Tarrant County > Arlington (0.04)
North America > United States > New York (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.35)

Active Online Classification Via Information Maximization

Slonim, Noam (IBM Haifa Research Lab) | Yom-Tov, Elad (IBM Haifa Research Lab) | Crammer, Koby (The Technion)

We propose an online classification approach for co-occurrence data which is based on a simple information theoretic principle. We further show how to properly estimate the uncertainty associated with each prediction of our scheme and demonstrate how to exploit these uncertainty estimates. First, in order to abstain highly uncertain predictions. And second, within an active learning framework, in order to preserve classification accuracy while substantially reducing training set size. Our method is highly efficient in terms of run-time and memory footprint requirements. Experimental results in the domain of text classification demonstrate that the classification accuracy of our method is superior or comparable to other state-of-the-art online classification algorithms.

algorithm, prediction, true label, (15 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Asia > Middle East > Israel > Haifa District > Haifa (0.05)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.05)
North America > United States > New York (0.04)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.55)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.35)