 abalone


Mysterious, numbered mollusk discovered on Australian beach

Popular Science

Researchers are urging beachgoers to report sightings of the endangered, tagged sea snails. The black abalone is a delicacy in many regions of the world, with diners at upscale restaurants paying as much as $40 for a 6-to-8-ounce serving. Although the sea snails are often farm-raised, they are now considered critically endangered due to high demand and black-market harvesting. But while a woman's recent abalone discovery along a beach in Australia is attracting worldwide attention, it's not due to any illegal activity or a lucrative payout.


A Novel Metric for Measuring Data Quality in Classification Applications (extended version)

Jouseau, Roxane, Salva, Sébastien, Samir, Chafik

arXiv.org Artificial Intelligence

Data quality is a key element in building and optimizing good learning models. Despite many attempts to characterize data quality, there is still a need for a rigorous formalization and an efficient measure of quality from available observations. Indeed, without a clear understanding of the training and testing processes, it is hard to evaluate the intrinsic performance of a model. Moreover, tools for measuring data quality specific to machine learning are still lacking. In this paper, we introduce and explain a novel metric for measuring data quality. The metric is based on the correlated evolution between classification performance and the deterioration of the data. The proposed method has the major advantage of being model-independent. Furthermore, we provide an interpretation of each criterion and examples of assessment levels. We confirm the utility of the proposed metric with extensive numerical experiments and detail some illustrative cases with controlled and interpretable qualities.
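The core idea, that performance degrades in step with controlled deterioration of the data, can be illustrated with a toy sketch. This is only an illustration of the general idea, not the paper's actual metric: the two-cluster dataset, the nearest-centroid classifier, and the label-flipping noise model are all stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two well-separated classes as a stand-in dataset.
X = np.vstack([rng.normal(0, 0.1, (100, 2)), rng.normal(5, 0.1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

def accuracy_at_noise(frac):
    # deteriorate the data by flipping a fraction of the labels,
    # then fit and score a nearest-centroid classifier on that data
    y_noisy = y.copy()
    idx = rng.choice(len(y), int(frac * len(y)), replace=False)
    y_noisy[idx] = 1 - y_noisy[idx]
    c0 = X[y_noisy == 0].mean(axis=0)
    c1 = X[y_noisy == 1].mean(axis=0)
    pred = (np.linalg.norm(X - c1, axis=1) < np.linalg.norm(X - c0, axis=1)).astype(int)
    return (pred == y_noisy).mean()

accs = [accuracy_at_noise(f) for f in (0.0, 0.1, 0.2, 0.3)]
```

On this clean, separable stand-in, accuracy falls in lockstep with the injected noise; how closely that correlated evolution holds is the kind of signal the proposed metric formalizes.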


Predicting the age of abalone from physical measurements Part 1 - Projects Based Learning

#artificialintelligence

Abalone is a common name for any of a group of small to very large sea snails, marine gastropod molluscs in the family Haliotidae. Other common names are ear shells, sea ears, and muttonfish or muttonshells in Australia, ormer in the UK, perlemoen in South Africa, and paua in New Zealand. The age of an abalone is determined by cutting the shell through the cone, staining it, and counting the number of rings under a microscope, a tedious and time-consuming task. Other measurements, which are easier to obtain, are used to predict the age. For each attribute, the name, type, measurement unit, and a brief description are given.
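The task is thus a regression from physical measurements to the ring count (age is conventionally taken as rings + 1.5 years). A minimal sketch with ordinary least squares, using synthetic stand-in data rather than the actual UCI Abalone file, whose columns and coefficients this does not claim to reproduce:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the abalone data: 3 physical measurements -> ring count.
n = 500
X = rng.uniform(0.1, 0.8, size=(n, 3))
rings = 3 + 15 * X[:, 0] + 5 * X[:, 2] + rng.normal(0, 0.5, n)

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, rings, rcond=None)
pred = A @ coef
rmse = np.sqrt(np.mean((pred - rings) ** 2))
```

Fitting on the real dataset works the same way once the measurement columns are loaded in place of the synthetic `X`.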


AI brings automation to seafood industry

#artificialintelligence

JCU's Phoebe Arbon was presented with a Science and Innovation Award for Young People in Agriculture, Fisheries and Forestry last night in Canberra. "I'll use the grant to develop, train and validate an AI model to identify, count and measure abalone from an image. So, very basically, the AI model will learn to predict the weight and size of abalone from images," she said. Ms Arbon said the technology already exists but needs specific instructions and adaptation to work within the abalone industry. "Currently, assessing abalone is done manually, which can cause harm to the abalone and costs each farm about $25,000 a year," she said.


OCKELM+: Kernel Extreme Learning Machine based One-class Classification using Privileged Information (or KOC+: Kernel Ridge Regression or Least Square SVM with zero bias based One-class Classification using Privileged Information)

Gautam, Chandan, Tiwari, Aruna, Tanveer, M.

arXiv.org Machine Learning

Kernel method-based one-class classifiers are mainly used for outlier or novelty detection. In this letter, the kernel ridge regression (KRR) based one-class classifier (KOC) is extended for learning using privileged information (LUPI). The LUPI-based KOC method is referred to as KOC+. Privileged information is available as features in the dataset, but only for training (not for testing). KOC+ treats the privileged information differently from normal feature information by using a so-called correction function. Privileged information helps KOC+ achieve better generalization performance, which is exhibited in this letter by testing the classifiers with and without privileged information. Existing and proposed classifiers are evaluated on datasets from the UCI machine learning repository and on the MNIST dataset. Moreover, the experimental results evince the advantage of KOC+ over KOC and support vector machine (SVM) based one-class classifiers.
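As a rough sketch of the plain KOC baseline (without privileged information, so without the correction function the paper adds): kernel ridge regression is fit to map every target-class training point to 1, and test points whose output falls far from 1 are treated as outliers. The RBF kernel and the regularization and bandwidth values here are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # squared Euclidean distances between all rows of A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def koc_fit(X, lam=0.1, gamma=1.0):
    # kernel ridge regression mapping every target-class point to 1
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), np.ones(len(X)))

def koc_score(X_train, alpha, X_new, gamma=1.0):
    # outputs near 1 -> target class; far from 1 -> outlier
    return rbf_kernel(X_new, X_train, gamma) @ alpha

X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]])
alpha = koc_fit(X)
scores = koc_score(X, alpha, np.array([[0.05, 0.05], [5.0, 5.0]]))
```

Here the in-cluster query scores close to 1 while the distant point scores near 0; a threshold on the deviation from 1 turns the score into an accept/reject decision.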


Exact Passive-Aggressive Algorithms for Learning to Rank Using Interval Labels

Manwani, Naresh, Chandra, Mohit

arXiv.org Machine Learning

In this paper, we propose exact passive-aggressive (PA) online algorithms for learning to rank. The proposed algorithms can be used even when we have interval labels instead of actual labels for examples. The proposed algorithms solve a convex optimization problem at every trial, and we find exact solutions to these optimization problems to determine the updated parameters. We propose a support class algorithm (SCA) that finds the active constraints using the KKT conditions of the optimization problems. These active constraints form a support set, which determines the set of thresholds that need to be updated. We derive update rules for PA, PA-I and PA-II. We show that the proposed algorithms maintain the ordering of the thresholds after every trial. We provide mistake bounds for the proposed algorithms in both ideal and general settings. We also show experimentally that the proposed algorithms successfully learn accurate classifiers using interval labels as well as exact labels, and that they compare favorably with other approaches.
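The passive-aggressive family underlying these methods can be sketched with the classic binary PA-I update of Crammer et al.: stay passive when the margin is satisfied, otherwise move just enough to satisfy it, capped by an aggressiveness parameter C. The paper's ranking variants generalize this to ordered thresholds and interval labels, which this sketch does not cover.

```python
import numpy as np

def pa1_update(w, x, y, C=1.0):
    # PA-I: closed-form solution of the per-trial convex problem
    loss = max(0.0, 1.0 - y * (w @ x))   # hinge loss, y in {-1, +1}
    tau = min(C, loss / (x @ x))         # step size, capped by C
    return w + tau * y * x

w = np.zeros(2)
w = pa1_update(w, np.array([1.0, 0.0]), +1)        # margin violated -> update
w_after = pa1_update(w, np.array([1.0, 0.0]), +1)  # margin satisfied -> passive
```

After the first update the example has margin 1, so the second call leaves the weights unchanged, which is exactly the passive half of the name.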


Machine Learning using ML.NET and its integration into ASP.NET Core Web application – Microsoft Faculty Connection

#artificialintelligence

My name is Zurab Murvanidze, and I am a first-year computer science student at UCL. I love learning about technology and have a deep interest in machine learning, data science, quantum computing and artificial intelligence. I like developing applications and games in my spare time, and in this article I would love to share my experience with ML.NET. This article will cover the basics of machine learning, introduce you to ML.NET, and teach you how to create and train machine learning models. It will also demonstrate how we can implement machine learning in an ASP.NET Core Web Application.


Interpreting Decision Trees and Random Forests

#artificialintelligence

The random forest has been a burgeoning machine learning technique in the last few years. It is a non-linear tree-based model that often provides accurate results. However, being mostly a black box, it is often hard to interpret and fully understand. In this blog, we will dive deep into the fundamentals of random forests to better grasp them. We start by looking at the decision tree, the building block of the random forest.
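A decision tree grows by greedily choosing splits that reduce label impurity. A minimal sketch of one Gini-based split search on a single feature, purely for illustration and not any particular library's implementation:

```python
def gini(labels):
    # Gini impurity of a binary 0/1 label set
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def best_split(xs, ys):
    # try each observed feature value as a threshold; keep the one
    # with the lowest weighted Gini impurity of the two children
    best_t, best_score = None, float("inf")
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

t, score = best_split([1, 2, 8, 9], [0, 0, 1, 1])
```

Recursing on the two children until they are pure (or a depth limit is hit) yields a full tree; a random forest then averages many such trees, each grown on a bootstrap sample with randomized feature choices.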


An efficient model-free estimation of multiclass conditional probability

Xu, Tu, Wang, Junhui

arXiv.org Machine Learning

Conventional multiclass conditional probability estimation methods, such as Fisher's discriminant analysis and logistic regression, often require restrictive distributional model assumptions. In this paper, a model-free estimation method is proposed to estimate multiclass conditional probabilities through a series of conditional quantile regression functions. Specifically, each conditional class probability is formulated as a difference of the corresponding cumulative distribution functions, where the cumulative distribution functions are converted from the estimated conditional quantile regression functions. The proposed estimation method is also efficient, as its computational cost does not increase exponentially with the number of classes. Theoretical and numerical studies demonstrate that the proposed method is highly competitive with existing alternatives, especially when the number of classes is relatively large.
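The quantile-to-probability conversion can be sketched numerically: invert a grid of estimated quantiles into a CDF, then take differences of the CDF at consecutive (ordered) class labels. The "estimated" quantiles here are a stand-in computed from an empirical sample rather than from fitted conditional quantile regressions, and the grid resolution is an illustrative choice.

```python
import numpy as np

taus = np.linspace(0.01, 0.99, 99)   # grid of quantile levels

def cdf_at(q_values, y):
    # F(y) ~ largest quantile level whose estimated quantile is still <= y
    below = taus[q_values <= y]
    return below.max() if below.size else 0.0

# Stand-in "estimated" quantile function: empirical quantiles of a sample
# whose class labels 1, 2, 3 occur with probabilities 0.2, 0.5, 0.3.
sample = np.array([1] * 20 + [2] * 50 + [3] * 30)
q = np.quantile(sample, taus)

p2 = cdf_at(q, 2) - cdf_at(q, 1)     # P(Y = 2) ~ F(2) - F(1)
```

The recovered probability is close to the true 0.5, and only one pass over the quantile grid is needed per class, which is why the cost grows linearly rather than exponentially in the number of classes.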


Active Comparison of Prediction Models

Sawade, Christoph, Landwehr, Niels, Scheffer, Tobias

Neural Information Processing Systems

We address the problem of comparing the risks of two given predictive models - for instance, a baseline model and a challenger - as confidently as possible on a fixed labeling budget. This problem occurs whenever models cannot be compared on held-out training data, possibly because the training data are unavailable or do not reflect the desired test distribution. In this case, new test instances have to be drawn and labeled at a cost. We devise an active comparison method that selects instances according to an instrumental sampling distribution. We derive the sampling distribution that maximizes the power of a statistical test applied to the observed empirical risks, and thereby minimizes the likelihood of choosing the inferior model. Empirically, we investigate model selection problems on several classification and regression tasks and study the accuracy of the resulting p-values.