Statistical Learning

Linear Machine Learning Algorithms: An Overview - KDnuggets


Linear machine learning algorithms assume a linear relationship between the features and the target variable. In this article, we'll discuss several linear algorithms and their concepts. Here's a glimpse into what you can expect to learn: You can use linear algorithms for classification and regression problems. Let's start by looking at different algorithms and what problems they solve. Linear regression is arguably one of the oldest and most popular algorithms.
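
To make the linearity assumption concrete, here is a minimal sketch of simple (one-feature) linear regression fitted with the closed-form least-squares estimates. The function name `fit_simple_linear_regression` and the toy data are illustrative assumptions and do not come from the article.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy data that lies exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_simple_linear_regression(xs, ys)
print(slope, intercept)
```

Because the toy data is perfectly linear, the fitted line recovers the generating slope and intercept. With noisy data the same formulas give the line that minimises the squared error.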

8 Ways You Can 'Level Up' Your Machine Learning Projects


Need to classify data or predict outcomes? Are you struggling with your machine learning (ML) project? There are various techniques that can improve the situation. Some of the eight methods discussed below will dramatically accelerate the ML process, and others will not only accelerate the process but also help you build better models. Not all of these techniques will be suitable for a particular project.

Artificial intelligence: a new paradigm in the swine industry - Pig Progress


Machine learning is one of the artificial intelligence approaches frequently used for modeling, prediction, and management in swine farming. Machine learning models mainly include decision trees, clustering, support vector machines, and Markov chain models, applied to disease detection, behaviour recognition for postural classification, and sound detection of animals. Researchers from North Carolina State University and Smithfield Premium Genetics demonstrated the application of machine learning algorithms to estimate body weight in growing pigs from feeding behaviour and feed intake data. Feed intake, feeder occupation time, and body weight information were collected from 655 pigs of 3 breeds (Duroc, Landrace, and Large White) from 75 to 166 days of age. Two machine learning algorithms (a long short-term memory network and a random forest) were selected to forecast the body weight of pigs under 4 scenarios. Long short-term memory was chosen for the time series data because it can learn and store long-term patterns in sequence-dependent order, and the random forest approach was used as a representative algorithm in the machine learning space.

Principled machine learning


Note: UKRI Rights Retention: For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising. Funding: DS acknowledges support from the EPSRC Programme Grant TRANSNET (EP/R035342/1) and the Leverhulme Trust (RPG-2018-092). YR acknowledges support by the EPSRC Horizon Digital Economy Research grant 'Trusted Data Driven Products': EP/T022493/1 and grant 'From Human Data to Personal Experience': EP/M02315X/1. Abstract: We introduce the underlying concepts which give rise to some of the commonly used machine learning methods, excluding deep-learning machines and neural networks. We point to their advantages, limitations and potential use in various areas of photonics. The main methods covered include parametric and non-parametric regression and classification techniques, kernel-based methods and support vector machines, decision trees, probabilistic models, Bayesian graphs, mixture models, Gaussian processes, message passing methods and visual informatics.

3D Machine Learning 201 Guide: Point Cloud Semantic Segmentation


Having the skills and the knowledge to attack every aspect of point cloud processing opens up many ideas and development doors. It is like a toolbox for 3D research creativity and development agility. And at the core, there is this incredible Artificial Intelligence space that targets 3D scene understanding. It is particularly relevant due to its importance for many applications, such as self-driving cars, autonomous robots, 3D mapping, virtual reality, and the Metaverse. And if you are an automation geek like me, it is hard to resist the temptation to have new paths to answer these challenges! This tutorial aims to give you what I consider the essential footing to do just that: the knowledge and code skills for developing 3D Point Cloud Semantic Segmentation systems. But actually, how can we apply semantic segmentation? And how challenging is 3D Machine Learning? Let me present a clear, in-depth 201 hands-on course focused on 3D Machine Learning.

Digital medicine and the curse of dimensionality - npj Digital Medicine


Digital health data are multimodal and high-dimensional. A patient’s health state can be characterized by a multitude of signals including medical imaging, clinical variables, genome sequencing, conversations between clinicians and patients, and continuous signals from wearables, among others. This high volume, personalized data stream aggregated over patients’ lives has spurred interest in developing new artificial intelligence (AI) models for higher-precision diagnosis, prognosis, and tracking. While the promise of these algorithms is undeniable, their dissemination and adoption have been slow, owing partially to unpredictable AI model performance once deployed in the real world. We posit that one of the rate-limiting factors in developing algorithms that generalize to real-world scenarios is the very attribute that makes the data exciting—their high-dimensional nature. This paper considers how the large number of features in vast digital health data can challenge the development of robust AI models—a phenomenon known as “the curse of dimensionality” in statistical learning theory. We provide an overview of the curse of dimensionality in the context of digital health, demonstrate how it can negatively impact out-of-sample performance, and highlight important considerations for researchers and algorithm designers.
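
One commonly cited manifestation of the curse of dimensionality is distance concentration: as the number of features grows, the nearest and farthest neighbours of a point become almost equally far away. The sketch below illustrates this with uniformly random synthetic points. It is an assumption-laden toy, not the paper's analysis, and the helper `relative_contrast` is a hypothetical name.

```python
import math
import random


def relative_contrast(n_points, n_dims, rng):
    """(farthest - nearest) / nearest distance from a random query to random points."""
    query = [rng.random() for _ in range(n_dims)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        distances.append(math.dist(query, point))
    nearest, farthest = min(distances), max(distances)
    return (farthest - nearest) / nearest


rng = random.Random(42)  # fixed seed so the run is repeatable
contrast_by_dim = {d: relative_contrast(200, d, rng) for d in (2, 20, 200)}
for dims, contrast in contrast_by_dim.items():
    print(f"{dims:>4} dimensions: relative contrast = {contrast:.2f}")
```

The relative contrast shrinks as dimensions are added, which is one reason neighbour-based reasoning on a fixed sample becomes less informative in high-dimensional feature spaces.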

OmniXAI: A Library for Explainable AI


Machine learning models are frequently seen as black boxes that are impossible to decipher, because the learner is trained to respond to "yes" and "no" type questions without explaining how the answer was obtained. An explanation of how an answer was reached is critical in many applications for ensuring confidence and transparency. Explainable AI refers to strategies and procedures in the use of artificial intelligence (AI) technology that allow human specialists to understand the solution's findings. This article focuses on explaining a machine learning model using OmniXAI.

Does gridsearch on random forest make sense?


You are right that randomness will play a role in the results (as with many other algorithms, including MCMC samplers for Bayesian models, XGBoost, LightGBM, neural networks etc.). The obvious way to minimize randomness in the results of any hyper-parameter optimization method for RF (whether it's random grid-search, grid search or some Bayesian hyperparameter optimization method) is to increase the number of trees, which reduces the randomness in the model behavior, albeit at the cost of increased training time. Alternatively, you can construct a surrogate model on top of the results that applies an appropriate amount of smoothing/regularization to account for the fact that the signal of where the best model lies in the hyperparameter landscape is noisy.
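
The effect of the number of trees on seed-to-seed variability can be sketched without any library. The example below is a simplified stand-in, not a real random forest: it bags one-split regression stumps on bootstrap samples of synthetic data and measures how much the ensemble prediction at one query point varies across random seeds. All names (`fit_stump`, `bagged_prediction`, `seed_spread`) and the data are assumptions made for illustration.

```python
import random
import statistics


def fit_stump(points):
    """Fit a one-split regression tree; return (threshold, left_mean, right_mean)."""
    points = sorted(points)
    ys = [y for _, y in points]
    best = None
    for i in range(1, len(points)):
        if points[i][0] == points[i - 1][0]:
            continue  # cannot split between identical x values
        left, right = ys[:i], ys[i:]
        left_mean, right_mean = statistics.fmean(left), statistics.fmean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            threshold = (points[i - 1][0] + points[i][0]) / 2
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # every x in the bootstrap sample was identical
        overall = statistics.fmean(ys)
        return float("inf"), overall, overall
    return best[1], best[2], best[3]


def bagged_prediction(data, n_trees, seed, x_query):
    """Average the predictions of n_trees stumps, each fitted to a bootstrap sample."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in data]
        threshold, left_mean, right_mean = fit_stump(sample)
        predictions.append(left_mean if x_query <= threshold else right_mean)
    return statistics.fmean(predictions)


def seed_spread(data, n_trees, seeds, x_query):
    """Standard deviation of the ensemble prediction across random seeds."""
    return statistics.pstdev([bagged_prediction(data, n_trees, s, x_query) for s in seeds])


# Synthetic data: a noisy step function (purely illustrative).
data_rng = random.Random(0)
data = []
for _ in range(30):
    x = data_rng.random()
    y = (1.0 if x > 0.5 else 0.0) + data_rng.gauss(0.0, 0.3)
    data.append((x, y))

seeds = range(12)
spread_few = seed_spread(data, 3, seeds, 0.5)
spread_many = seed_spread(data, 60, seeds, 0.5)
print(f"spread with 3 trees:  {spread_few:.3f}")
print(f"spread with 60 trees: {spread_many:.3f}")
```

With more trees, the ensemble average depends less on any single bootstrap draw, so scores computed during a hyperparameter search are less sensitive to the seed. The cost is training time that grows roughly in proportion to the number of trees.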

How to Build an Online Machine Learning App With Python


Machine learning is rapidly becoming as ubiquitous as data itself. Quite literally wherever there is an abundance of data, machine learning is somehow intertwined. After all, what utility would data have if we were not able to use it to predict something about the future? Luckily, there is a plethora of toolkits and frameworks that have made it rather simple to deploy ML in Python. Specifically, scikit-learn (sklearn) has done a terrifically effective job at making ML accessible to developers.