Inductive Learning
This online retailer uses AI for product categorisation - here's how
Whilst the use of machine learning in marketing seems to be skyrocketing, there's a dearth of coverage in the media that tries to get to the bottom of this technology in terms a layman can understand. I am irrefutably a layman, and have written a little on the topic of AI for Econsultancy, often specifically about ecommerce. In this blog post, we'll be talking to the founder of LoveTheSales.com, Before we start, let me point out that this year's Festival of Marketing has a whole stage (one of 12) dedicated to AI. View the agenda and get your tickets here. Machine learning is used to classify these products, tagging them to enable the website to sort them into the right categories and to show a user products they may be interested in.
Supervised and Unsupervised Machine Learning Algorithms - Machine Learning Mastery
What is supervised machine learning and how does it relate to unsupervised machine learning? In this post you will discover supervised learning, unsupervised learning and semis-supervised learning. Supervised and Unsupervised Machine Learning Algorithms Photo by US Department of Education, some rights reserved. The majority of practical machine learning uses supervised learning. Supervised learning is where you have input variables (x) and an output variable (Y) and you use an algorithm to learn the mapping function from the input to the output.
Improved Representation Learning for Predicting Commonsense Ontologies
Li, Xiang, Vilnis, Luke, McCallum, Andrew
Recent work in learning ontologies (hierarchical and partially-ordered structures) has leveraged the intrinsic geometry of spaces of learned representations to make predictions that automatically obey complex structural constraints. We explore two extensions of one such model, the order-embedding model for hierarchical relation learning, with an aim towards improved performance on text data for commonsense knowledge representation. Our first model jointly learns ordering relations and non-hierarchical knowledge in the form of raw text. Our second extension exploits the partial order structure of the training data to find long-distance triplet constraints among embeddings which are poorly enforced by the pairwise training procedure. We find that both incorporating free text and augmented training constraints improve over the original order-embedding model and other strong baselines.
Hacking in the World of Artificial Intelligence - The Ape Machine
One of the things we haven't talked about much is the concept of human intervention when it comes to the dangers of artificial intelligence, especially in its early stages, where we are now with the technology. This may be far less of a "doomsday" or apocalyptic scenario, but the consequences could still be quite devastating on a person by person basis, or even affect larger groups depending on where a machine learning algorithm is deployed. We want to look at very specific cases where reinforcement learning is deployed with public access, much like how you can tell Google Translate that a translation is incorrect, and submit your improvements to them. Another good example would be marking an email that is not in your spam folder, but definitely belongs there, as such so the machine learning algorithm will become better over time. Exploiting these technologies can be done in a variety of ways, and while I initially thought it would take a large group to skew the learning of a machine by flooding it with many badly labeled training examples, it would not be impossible to have this done by some kind of botnet. See, most machine learning algorithms learn by training them on a so-called "labeled" data set, which is a large set of input data, and a label which is the desired perfect output of that input data.
How to squeeze the most from your training data
In many cases, the acquisition of well-labelled training data is a huge hurdle for developing accurate prediction systems with supervised learning. At Love the Sales, we had the requirement to apply classification to the textual metadata of 2 million products (mostly fashion and homewares) into 1,000 different categories โ represented in a hierarchy. In order to achieve this, we have architected a hierarchical tree of chained 2-class linear (Positive vs Negative) Support Vector Machines (LibSVM), each responsible for binary document classification of each hierarchical class. A key learning, is that the way in which these SVM's are structured can actually have a significant impact on how much training data has to be applied, for example, a naive approach would have been as follows: This approach requires that for every additional sub-category, two new SVM's be trained โ for example, the addition of a new class for'Swimwear' would require an additional SVM under Men's and Women's โ not to mention the potential complexity of adding a'Unisex' class at the top level. Overall, deep hierarchical structures can be too rigid to work with.
Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints
Zhao, Jieyu, Wang, Tianlu, Yatskar, Mark, Ordonez, Vicente, Chang, Kai-Wei
Language is increasingly being used to define rich visual recognition problems with supporting image collections sourced from the web. Structured prediction models are used in these tasks to take advantage of correlations between co-occurring labels and visual input but risk inadvertently encoding social biases found in web corpora. In this work, we study data and models associated with multilabel object classification and visual semantic role labeling. We find that (a) datasets for these tasks contain significant gender bias and (b) models trained on these datasets further amplify existing bias. For example, the activity cooking is over 33% more likely to involve females than males in a training set, and a trained model further amplifies the disparity to 68% at test time. We propose to inject corpus-level constraints for calibrating existing structured prediction models and design an algorithm based on Lagrangian relaxation for collective inference. Our method results in almost no performance loss for the underlying recognition task but decreases the magnitude of bias amplification by 47.5% and 40.5% for multilabel classification and visual semantic role labeling, respectively.
A Consistent Regularization Approach for Structured Prediction
Ciliberto, Carlo, Rudi, Alessandro, Rosasco, Lorenzo
We propose and analyze a regularization approach for structured prediction problems. We characterize a large class of loss functions that allows to naturally embed structured outputs in a linear space. We exploit this fact to design learning algorithms using a surrogate loss approach and regularization techniques. We prove universal consistency and finite sample bounds characterizing the generalization properties of the proposed methods. Experimental results are provided to demonstrate the practical usefulness of the proposed approach.
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Communities
One of the major hurdles preventing the full exploitation of information from online communities is the widespread concern regarding the quality and credibility of user-contributed content. Prior works in this domain operate on a static snapshot of the community, making strong assumptions about the structure of the data (e.g., relational tables), or consider only shallow features for text classification. To address the above limitations, we propose probabilistic graphical models that can leverage the joint interplay between multiple factors in online communities --- like user interactions, community dynamics, and textual content --- to automatically assess the credibility of user-contributed online content, and the expertise of users and their evolution with user-interpretable explanation. To this end, we devise new models based on Conditional Random Fields for different settings like incorporating partial expert knowledge for semi-supervised learning, and handling discrete labels as well as numeric ratings for fine-grained analysis. This enables applications such as extracting reliable side-effects of drugs from user-contributed posts in healthforums, and identifying credible content in news communities. Online communities are dynamic, as users join and leave, adapt to evolving trends, and mature over time. To capture this dynamics, we propose generative models based on Hidden Markov Model, Latent Dirichlet Allocation, and Brownian Motion to trace the continuous evolution of user expertise and their language model over time. This allows us to identify expert users and credible content jointly over time, improving state-of-the-art recommender systems by explicitly considering the maturity of users. This also enables applications such as identifying helpful product reviews, and detecting fake and anomalous reviews with limited information.
The Challenge of Non-Technical Loss Detection using Artificial Intelligence: A Survey
Glauner, Patrick, Meira, Jorge Augusto, Valtchev, Petko, State, Radu, Bettinger, Franck
Detection of non-technical losses (NTL) which include electricity theft, faulty meters or billing errors has attracted increasing attention from researchers in electrical engineering and computer science. NTLs cause significant harm to the economy, as in some countries they may range up to 40% of the total electricity distributed. The predominant research direction is employing artificial intelligence to predict whether a customer causes NTL. This paper first provides an overview of how NTLs are defined and their impact on economies, which include loss of revenue and profit of electricity providers and decrease of the stability and reliability of electrical power grids. It then surveys the state-of-the-art research efforts in a up-to-date and comprehensive review of algorithms, features and data sets used. It finally identifies the key scientific and engineering challenges in NTL detection and suggests how they could be addressed in the future.
An Introduction to Machine Learning Algorithms
Kernel methods are a group of machine learning algorithms used for pattern analysis, which involves organizing raw data into rankings, clusters, or classifications. These methods allow data scientists to apply their domain knowledge of a given problem by building custom kernels that incorporate the data transformations that are most likely to improve the accuracy of the overall mode The most popular application of kernels is the support vector machine (SVM), which builds a model that classifies new data as belonging to one category or another based on a set of training examples. A SVM makes these determinations by representing each example as a point in a multi-dimensional space called a hyperplane. The points are then separated into categories by maximizing the distance (called a "margin") between the different apparent groups in the data.