Wu, Fei (Wuhan University and Nanjing University of Posts and Telecommunications) | Jing, Xiao-Yuan (Wuhan University and Nanjing University of Posts and Telecommunications) | Shan, Shiguang (Chinese Academy of Sciences (CAS)) | Zuo, Wangmeng (Harbin Institute of Technology) | Yang, Jing-Yu (Nanjing University of Science and Technology)
With the expansion of data, increasing imbalanced data has emerged. When the imbalance ratio of data is high, most existing imbalanced learning methods decline in classification performance. To address this problem, a few highly imbalanced learning methods have been presented. However, most of them are still sensitive to the high imbalance ratio. This work aims to provide an effective solution for the highly imbalanced data classification problem. We conduct highly imbalanced learning from the perspective of feature learning. We partition the majority class into multiple blocks with each being balanced to the minority class and combine each block with the minority class to construct a balanced sample set. Multiset feature learning (MFL) is performed on these sets to learn discriminant features. We thus propose an uncorrelated cost-sensitive multiset learning (UCML) approach. UCML provides a multiple sets construction strategy, incorporates the cost-sensitive factor into MFL, and designs a weighted uncorrelated constraint to remove the correlation among multiset features. Experiments on five highly imbalanced datasets indicate that: UCML outperforms state-of-the-art imbalanced learning methods.
We address the problem of propositional logic-based abduction, i.e., the problem of searching for a best explanation for a given propositional observation according to a given propositional knowledge base. We give a general algorithm, based on the notion of projection; then we study restrictions over the representations of the knowledge base and of the query, and find new polynomial classes of abduction problems.
TL;DR: Get help understanding sets, logic, stats, and more with the Discrete Mathematics Course for $15.99, a 91% savings as of Aug. 17. Remember that joke when you were in school about math being useless? Let's face it: You were wrong. And while it's highly unlikely that you'll have to SIN COS TAN your way through life, you still need to master how numbers work, especially if you or your newly-homeschooled child is gunning for a highly technical job in STEM. If you're diving deep into the world of mathematics and computer science or are trying to find some additional learning materials for your kids, the Discrete Mathematics course is a great option.
Statistical relational models provide compact encodings of probabilistic dependencies in relational domains, but result in highly intractable graphical models. The goal of lifted inference is to carry out probabilistic inference without needing to reason about each individual separately, by instead treating exchangeable, undistinguished objects as a whole. In this paper, we study the domain recursion inference rule, which, despite its central role in early theoretical results on domain-lifted inference, has later been believed redundant. We show that this rule is more powerful than expected, and in fact significantly extends the range of models for which lifted inference runs in time polynomial in the number of individuals in the domain. This includes an open problem called S4, the symmetric transitivity model, and a first-order logic encoding of the birthday paradox.
Most machine learning tools work with a single table where each row is an instance and each column is an attribute. Each cell of the table contains an attribute value for an instance. This representation prevents one important form of learning, which is, classification based on groups of correlated records, such as multiple exams of a single patient, internet customer preferences, weather forecast or prediction of sea conditions for a given day. To some extent, relational learning methods, such as inductive logic programming, can capture this correlation through the use of intensional predicates added to the background knowledge. In this work, we propose SPPAM, an algorithm that aggregates past observations in one single record. We show that applying SPPAM to the original correlated data, before the learning task, can produce classifiers that are better than the ones trained using all records.