Machine Learning


BetaBoosting

#artificialintelligence

At this point, we all know of XGBoost due to the massive success it has had in numerous Data Science competitions held on platforms like Kaggle. Along with its success, we have seen several variations such as CatBoost and LightGBM. All of these implementations are based on the Gradient Boosting algorithm developed by Friedman¹, which involves iteratively building an ensemble of weak learners (usually decision trees), where each subsequent learner is trained on the previous learner's errors. The general pseudo-code given in Elements of Statistical Learning² describes this core loop, but it is not complete: a key mechanism that allows boosting to work is a shrinkage parameter, commonly called the 'learning rate', which penalizes each learner at each boosting round.
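To make the shrinkage step concrete, here is a minimal sketch of gradient boosting for squared-error regression. The one-split decision stump and all names here are illustrative choices, not taken from the article or from any of the libraries above:

```python
# Minimal gradient boosting sketch for squared-error regression.
# The weak learner is a one-split decision stump on 1-D inputs.

def fit_stump(x, residuals):
    """Find the single split on x that best fits the residuals (least squares)."""
    best = None
    for threshold in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    _, t, lm, rm = best
    return lambda xi: lm if xi <= t else rm

def gradient_boost(x, y, n_rounds=50, learning_rate=0.1):
    f0 = sum(y) / len(y)            # F_0: constant model (mean of y)
    learners = []
    preds = [f0] * len(y)
    for _ in range(n_rounds):
        # For squared error, the negative gradient is just the residual.
        residuals = [yi - p for yi, p in zip(y, preds)]
        stump = fit_stump(x, residuals)
        learners.append(stump)
        # Shrinkage: each learner's contribution is damped by the learning rate,
        # so no single round can dominate the ensemble.
        preds = [p + learning_rate * stump(xi) for p, xi in zip(preds, x)]
    return lambda xi: f0 + learning_rate * sum(s(xi) for s in learners)

x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.2, 1.1, 3.0, 3.2, 3.1]
model = gradient_boost(x, y)
```

With `learning_rate=0.1`, each round only corrects a tenth of the remaining error, which is exactly why more boosting rounds are needed but overfitting is slower.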


A Road Map for Deep Learning

#artificialintelligence

Deep learning is a form of machine learning that allows a computer to learn from experience and understand the world through a hierarchy of concepts, with each concept defined in terms of simpler ones. This approach avoids the need for humans to specify all the knowledge the computer needs. The hierarchy of concepts lets the computer learn complicated concepts by building them on top of each other through a deep setup with many layers. The first thing to learn in deep learning is the applied math that forms its fundamental building blocks. Linear algebra is a branch of mathematics that is widely used throughout engineering.


AI, machine learning may be key to further insurance digital transformation

#artificialintelligence

The pandemic has pushed IT departments to adapt quickly to various challenges. A new report, IT's Changing Mandate in an Age of Disruption, suggests that to continue digital transformation and increase adaptability for the future, some IT improvements must be made. For the insurance industry, artificial intelligence and machine learning may be key, according to the report, which was conducted by the Economist Intelligence Unit with support from Appian, an enterprise software company. The report draws on two surveys, conducted in May and June of this year, with responses from 1,002 IT and senior business executives across nine countries and six sectors, including financial services and insurance. Forty-one percent of respondents from the insurance industry said expanding the use of AI and machine learning is the most impactful way technology can help organizations respond to potential changes.


How biological detective work can reveal who engineered a virus – Vox

#artificialintelligence

The most successful entrants in the competition could predict, using machine-learning algorithms, which lab produced a certain genetic sequence …


5 Minutes With Tableau's Francois Zimmermann – Manufacturing Global

#artificialintelligence

There are also ethical challenges with the use of machine learning models as these may perpetuate bias and discrimination.


Wrangle Timeseries Data in TensorFlow

#artificialintelligence

Timeseries data is a sequence of values in which each value depends on the values that came before it. When building a machine learning model, predicting the next step in the future from recent steps in history is usually fairly accurate. To enhance learning in timeseries models, the data needs to be handled in batches of sequences. There is also a risk of data leakage: since the model needs to forecast the future, shuffling the data and splitting it randomly into training and testing sets may leak future answers to the model during the learning phase, which leads to misleading results during testing.
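The two ideas above, windowing a series into input/target samples and splitting chronologically to avoid leakage, can be sketched in plain Python (an illustrative sketch, not the article's TensorFlow pipeline; all names are made up here):

```python
# Turn a series into (window, next-value) samples, then split chronologically.

def make_windows(series, window_size):
    """Each sample: `window_size` past values as input, the next value as target."""
    xs, ys = [], []
    for i in range(len(series) - window_size):
        xs.append(series[i:i + window_size])
        ys.append(series[i + window_size])
    return xs, ys

series = list(range(10))            # toy series: 0, 1, ..., 9
x, y = make_windows(series, window_size=3)

# Chronological split: train only on earlier windows, test on later ones.
# A random shuffle here would put "future" values into the training set.
split = int(0.8 * len(x))
x_train, y_train = x[:split], y[:split]
x_test, y_test = x[split:], y[split:]
```

In TensorFlow itself, the same windowing-and-batching pattern is what the `tf.data` pipeline is typically used for; the key point is that the train/test boundary stays a point in time, not a random partition.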


5 Unsexy Truths About Working in Machine Learning

#artificialintelligence

I work in Machine Learning. To readers/viewers of my work, this won't come as a surprise. To people who don't know me as well, feel free to check out my LinkedIn/articles/videos for a better understanding of my skills and experience. My specialty is statistical analysis. I've worked in Road Safety, Health System Analysis, Big Data Analysis for a bank, disease detection, and biometric recreation, and I currently work in Supply Chain Analysis.


Can It Really Do That? -- Introducing the Edge X AI Camera

#artificialintelligence

The MXC Foundation has made a remarkable entry into the nascent multi-billion dollar AI smart device market. With the exponential growth of its network across the globe, the Foundation is thrilled to introduce more aspects to its network usage, allowing its mining community to utilize the data republic and see the network in action. The proprietary MXProtocol, together with scalable and secure aspects of device provisioning that connect with sensor technology, has proven successful and brings us a step closer to realizing truly smart cities. One such use case, which the MXC Foundation recently tested in a controlled environment, was the Edge X AI Camera. Read on to find out more about all the great functionalities packed into one small device.


5 Concrete Benefits of Bayesian Statistics

#artificialintelligence

Many of us (myself included) have felt discouraged from using Bayesian statistics for analysis. Supposedly, Bayesian statistics has a bad reputation: it is difficult and heavily dependent on math. Also, because of its relevance to many fields, Data Science included, writers and professionals want to get a head start by publishing articles on how the formula works. I believe data professionals, academics, existing books, and online courses are responsible for creating the negative stereotype that Bayesian methods are hard work. We can all agree that not everyone is attracted to mathematical formulas.
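Since the snippet alludes to articles on "how the formula works": Bayes' theorem itself takes only a few lines. The diagnostic-test numbers below are purely illustrative, not from the article:

```python
# Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E), with a classic
# diagnostic-test example (all probabilities are illustrative).
p_disease = 0.01                  # prior: 1% of the population has the disease
p_pos_given_disease = 0.99        # sensitivity of the test
p_pos_given_healthy = 0.05        # false-positive rate

# Total probability of a positive test, summed over both hypotheses.
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Posterior: probability of disease given a positive test.
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
```

Despite the 99% sensitivity, the posterior works out to only about 17%, because the prior is so low; this counterintuitive update is exactly the kind of result the articles in question try to explain.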


GPT-3 Scared You? Meet Wu Dao 2.0: A Monster of 1.75 Trillion Parameters

#artificialintelligence

Jack Clark, OpenAI's policy director, calls this trend of copying GPT-3 "model diffusion." Yet, among all the copies, Wu Dao 2.0 holds the record of being the largest of all, with a striking 1.75 trillion parameters (10x GPT-3). Coco Feng reported for the South China Morning Post that Wu Dao 2.0 was trained on 4.9TB of high-quality text and image data, which makes GPT-3's training dataset (570GB) pale in comparison. It's worth noting, though, that OpenAI researchers curated 45TB of data to extract those clean 570GB. Wu Dao 2.0 can learn from both text and images and tackle tasks that involve both types of data (something GPT-3 can't do).