 Inductive Learning


Process mining classification with a weightless neural network

arXiv.org Artificial Intelligence

Using the weightless neural network architecture WiSARD, we propose a straightforward graph-to-retina codification that represents business process graph flows without kernels, and we show that WiSARD achieves superior classification performance with small training sets in the process mining context.


TREX: Tree-Ensemble Representer-Point Explanations

arXiv.org Machine Learning

How can we identify the training examples that contribute most to the prediction of a tree ensemble? In this paper, we introduce TREX, an explanation system that provides instance-attribution explanations for tree ensembles, such as random forests and gradient boosted trees. TREX builds on the representer point framework previously developed for explaining deep neural networks. Since tree ensembles are non-differentiable, we define a kernel that captures the structure of the specific tree ensemble. By using this kernel in kernel logistic regression or a support vector machine, TREX builds a surrogate model that approximates the original tree ensemble. The weights in the kernel expansion of the surrogate model are used to define the global or local importance of each training example. Our experiments show that TREX's surrogate model accurately approximates the tree ensemble; its global importance weights are more effective in dataset debugging than the previous state-of-the-art; its explanations identify the most influential samples better than alternative methods under the remove and retrain evaluation framework; it runs orders of magnitude faster than alternative methods; and its local explanations can identify and explain errors due to domain mismatch.
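To make the idea of a kernel that reflects tree-ensemble structure concrete, here is a minimal sketch. It assumes a simple leaf co-occurrence kernel (the fraction of trees in which two instances fall into the same leaf), which is one common choice and not necessarily the kernel TREX defines. The function name `leaf_kernel` and the input format (one leaf index per tree) are hypothetical.

```python
def leaf_kernel(leaves_a, leaves_b):
    """Similarity of two instances under a tree ensemble.

    Each argument lists the leaf index the instance reaches in every tree
    (one entry per tree). The kernel value is the fraction of trees in
    which both instances land in the same leaf. This is an illustrative
    assumption, not the kernel used by TREX.
    """
    if len(leaves_a) != len(leaves_b):
        raise ValueError("both instances need one leaf index per tree")
    if not leaves_a:
        raise ValueError("the ensemble must contain at least one tree")
    matches = sum(1 for a, b in zip(leaves_a, leaves_b) if a == b)
    return matches / len(leaves_a)


# Hypothetical leaf indices for two instances in a three-tree ensemble.
similarity = leaf_kernel([3, 1, 4], [3, 2, 4])
```

A kernel matrix built from such pairwise values could then be passed to a kernel method such as a support vector machine to fit a surrogate model, as the abstract describes.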


Representation Learning from Limited Educational Data with Crowdsourced Labels

arXiv.org Artificial Intelligence

Representation learning has been proven to play an important role in the unprecedented success of machine learning models in numerous tasks, such as machine translation, face recognition and recommendation. Most existing representation learning approaches require a large number of consistent and noise-free labels. However, for reasons such as budget constraints and privacy concerns, labels are very limited in many real-world scenarios. Directly applying standard representation learning approaches to small labeled data sets easily leads to over-fitting and sub-optimal solutions. Even worse, in some domains such as education, the limited labels are usually annotated by multiple workers with diverse expertise, which introduces noise and inconsistency in such crowdsourcing settings. In this paper, we propose a novel framework that aims to learn effective representations from limited data with crowdsourced labels. Specifically, we design a grouping-based deep neural network to learn embeddings from a limited number of training samples and present a Bayesian confidence estimator to capture the inconsistency among crowdsourced labels. Furthermore, to expedite the training process, we develop a hard example selection procedure that adaptively picks training examples misclassified by the model. Extensive experiments conducted on three real-world data sets demonstrate the superiority of our framework in learning representations from limited data with crowdsourced labels, compared with various state-of-the-art baselines. In addition, to give a fuller understanding of the proposed framework, we provide a comprehensive analysis of each of its main components and report the promising results it achieved in our production environment.


Enhancing Mixup-based Semi-Supervised Learning with Explicit Lipschitz Regularization

arXiv.org Machine Learning

The success of deep learning relies on the availability of large-scale annotated data sets, the acquisition of which can be costly, requiring expert domain knowledge. Semi-supervised learning (SSL) mitigates this challenge by exploiting the behavior of the neural function on large unlabeled data. The smoothness of the neural function is a commonly used assumption exploited in SSL. A successful example is the adoption of the mixup strategy in SSL, which enforces the global smoothness of the neural function by encouraging it to behave linearly when interpolating between training examples. Despite its empirical success, however, the theoretical underpinning of how mixup regularizes the neural function has not been fully understood. In this paper, we offer a theoretically substantiated proposition that mixup improves the smoothness of the neural function by bounding the Lipschitz constant of the gradient function of the neural networks. We then propose that this can be strengthened by simultaneously constraining the Lipschitz constant of the neural function itself through adversarial Lipschitz regularization, encouraging the neural function to behave linearly while also constraining the slope of this linear function. On three benchmark data sets and one real-world biomedical data set, we demonstrate that this combined regularization results in improved generalization performance of SSL when learning from a small amount of labeled data. We further demonstrate the robustness of the presented method against single-step adversarial attacks. Our code is available at https://github.com/Prasanna1991/Mixup-LR.
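The interpolation at the heart of mixup can be illustrated with a minimal sketch. It shows only the basic convex combination of two examples and their labels, not the paper's Lipschitz regularization or its SSL pipeline. The function names `mixup` and `sample_lambda` and the default `alpha=0.4` are assumptions chosen for illustration.

```python
import random


def mixup(x1, y1, x2, y2, lam):
    """Convexly combine two examples.

    x1, x2 are feature vectors and y1, y2 are one-hot (or soft) label
    vectors of matching length; lam in [0, 1] is the mixing coefficient.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


def sample_lambda(alpha=0.4, rng=random):
    """Draw the mixing coefficient from Beta(alpha, alpha).

    The value of alpha is a hypothetical default, not taken from the paper.
    """
    return rng.betavariate(alpha, alpha)


# Mix two toy examples with a fixed coefficient.
mixed_x, mixed_y = mixup([0.0, 2.0], [1.0, 0.0], [4.0, 6.0], [0.0, 1.0], 0.25)
```

Training on such mixed pairs asks the model to predict the interpolated label at the interpolated input, which is the sense in which it is encouraged to behave linearly between training examples.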


Probabilistic Label Trees for Extreme Multi-label Classification

arXiv.org Machine Learning

Extreme multi-label classification (XMLC) is a learning task of tagging instances with a small subset of relevant labels chosen from an extremely large pool of possible labels. Problems of this scale can be efficiently handled by organizing labels as a tree, as in hierarchical softmax used for multi-class problems. In this paper, we thoroughly investigate probabilistic label trees (PLTs), which can be treated as a generalization of hierarchical softmax for multi-label problems. We first introduce the PLT model and discuss training and inference procedures and their computational costs. Next, we prove the consistency of PLTs for a wide spectrum of performance metrics. To this end, we upper-bound their regret by a function of surrogate-loss regrets of node classifiers. Furthermore, we consider the problem of training PLTs in a fully online setting, without any prior knowledge of training instances, their features, or labels. In this case, both node classifiers and the tree structure are trained online. We prove a specific equivalence between the fully online algorithm and an algorithm with a tree structure given in advance. Finally, we discuss several implementations of PLTs and introduce a new one, napkinXC, which we empirically evaluate and compare with state-of-the-art algorithms.
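As a rough illustration of how a label tree yields label probabilities, the sketch below multiplies the conditional probabilities of the node classifiers along the path from a leaf (one label) up to the root. The dictionaries `parent` and `node_prob`, the node names, and the function `label_probability` are hypothetical and do not reflect the napkinXC API. In a real PLT the node probabilities would come from classifiers evaluated on the input instance.

```python
def label_probability(leaf, parent, node_prob):
    """Estimate a label's probability as a product along its tree path.

    parent maps each node to its parent (the root maps to None).
    node_prob maps each node to the probability its classifier assigns,
    conditioned on the parent node being positive.
    """
    probability = 1.0
    node = leaf
    while node is not None:
        probability *= node_prob[node]
        node = parent[node]
    return probability


# Toy tree: root -> a -> {label_1, label_2}, with made-up node outputs.
parent = {"root": None, "a": "root", "label_1": "a", "label_2": "a"}
node_prob = {"root": 1.0, "a": 0.5, "label_1": 0.5, "label_2": 0.25}

p1 = label_probability("label_1", parent, node_prob)
p2 = label_probability("label_2", parent, node_prob)
```

Because each path has length on the order of the tree depth, scoring one label touches only a few node classifiers rather than one classifier per label.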


Weakly Supervised Learning of Nuanced Frames for Analyzing Polarization in News Media

arXiv.org Artificial Intelligence

In this paper, we propose a minimally supervised approach for identifying nuanced frames in news article coverage of politically divisive topics. We propose breaking the broad policy frames of Boydstun et al. (2014) into fine-grained subframes that better capture differences in political ideology. We evaluate the proposed subframes and their embeddings, learned using minimal supervision, over three topics: immigration, gun control and abortion. We demonstrate the ability of the subframes to capture ideological differences and analyze political discourse in news media.


The 6 Biggest Pitfalls That Companies Must Avoid When Implementing AI

#artificialintelligence

The age of AI is upon us, and many companies are starting their AI journey to reap the full potential of AI in their respective industries. But some still consider AI an immature technology with plenty of ways for it to go wrong. Therefore, before starting your long AI journey, there are some pitfalls you should avoid when implementing and developing AI solutions. They are drawn from anecdotal, personal and published experience of AI projects that could have gone better. Reinventing the wheel: that is a reasonable way to describe building an AI system that has already become an industry standard.


Australia coronavirus cases 'set to be lowest in months'

BBC News

Victoria's Premier Daniel Andrews said the numbers were "cause for great optimism". His state, which has accounted for 75% of Australia's 26,900 cases and 90% of its 849 deaths, has been under lockdown since early July.


New DeepMind Approach 'Bootstraps' Self-Supervised Learning of Image Representations

#artificialintelligence

The Cambridge Dictionary defines "bootstrap" as: "to improve your situation or become more successful, without help from others or without advantages that others have." While a machine learning algorithm's strength depends heavily on the quality of data it is fed, an algorithm that can do the work required to improve itself should become even stronger. A team of researchers from DeepMind and Imperial College recently set out to prove that in the arena of computer vision. In the updated paper Bootstrap Your Own Latent – A New Approach to Self-Supervised Learning, the researchers release the source code and checkpoint for their new "BYOL" approach to self-supervised image representation learning along with new theoretical and experimental insights. In computer vision, learning good image representations is critical as it allows for efficient training on downstream tasks. Image representation learning basically leverages neural networks that have been trained to produce good representations.


Supervised Ontology and Instance Matching with MELT

arXiv.org Artificial Intelligence

Our contributions are twofold: We present an open source machine learning extension to the matching toolkit as well as two supervised learning use cases demonstrating the capabilities of the new extension.