
Self-Supervised Learning and Its Applications


In the past decade, research and development in AI have skyrocketed, especially after the results of the ImageNet competition in 2012. The focus was largely on supervised learning methods, which require huge amounts of labeled data to train systems for specific use cases. In this article, we will explore self-supervised learning (SSL), a hot research topic in the machine learning community. SSL is an evolving machine learning technique poised to solve the challenges posed by over-dependence on labeled data. For many years, building intelligent systems using machine learning methods has depended largely on good-quality labeled data, and consequently the cost of high-quality annotated data is a major bottleneck in the overall training process.
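The core idea behind SSL is that the supervisory signal comes from the data itself rather than from human annotation. As an illustrative sketch (our own toy example, not a method prescribed by the article), a classic pretext task generates "free" labels from unlabeled images by rotating each one and asking a model to predict the rotation:

```python
import numpy as np

def make_rotation_pretext(images, rng):
    """Build (input, label) pairs from unlabeled images: rotate each image
    by a random multiple of 90 degrees and use the rotation index (0-3)
    as a label -- no human annotation required."""
    xs, ys = [], []
    for img in images:
        k = rng.integers(0, 4)           # 0, 90, 180, or 270 degrees
        xs.append(np.rot90(img, k))
        ys.append(k)
    return np.stack(xs), np.array(ys)

rng = np.random.default_rng(0)
unlabeled = rng.random((8, 32, 32))      # 8 unlabeled 32x32 "images"
x, y = make_rotation_pretext(unlabeled, rng)
print(x.shape, y.shape)                  # (8, 32, 32) (8,)
```

A network pretrained to solve such a task learns useful representations that can then be fine-tuned with far fewer labeled examples.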

What is Training Data and Why Is It Important for AI and Computer Vision? Find Out Here.


Simply put, training data is the dataset used to train a machine learning model. Its purpose is to provide the model with examples of how it should behave in different situations; without training data, it would be very difficult for machines to learn to perform specific tasks. In this article, we will discuss why training data is important for AI and computer vision (in short, because it allows machines to learn from examples) and provide some tips on where to find high-quality training datasets.

Complex Technology vs AI: What's the Difference?


Often, artificial intelligence (AI) is used broadly to describe all types of systems that seem to make decisions we do not quite understand. But while many reasonably complex systems make decisions like this, that does not immediately make them "intelligent." For example, I might not understand how my "smart" oven thermometer seems to know when my roast beef will be perfectly done, or how my garden light knows when to turn on, but the engineers who put together the (not-too-complex) mathematical equations do. Many other systems look intelligent at first glance, but they were simply constructed by smart people. We should not label these as "intelligent," because that suggests they are making their own decisions rather than following a human-designed path. A better way to distinguish (artificially) intelligent systems from those that merely follow human-made rules is to look for the person who can explain a system's inner workings (i.e., the person ultimately responsible for what the system does).

Lazy learning


Lazy learning refers to machine learning methods in which generalization of the training data is delayed until a query is made to the system. This type of learning is also known as instance-based learning. Lazy classifiers are very useful when working with large datasets that have few attributes. Learning systems perform computation at two different times: training time and consultation time. Training time comes before consultation time; a lazy learner does almost no work during training and defers the real computation until a query arrives.
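As a concrete illustration (our own minimal sketch, not from the article), k-nearest neighbors is the canonical lazy learner: `fit` merely stores the training instances, and all distance computation happens at consultation time:

```python
import math
from collections import Counter

class LazyKNN:
    """A lazy (instance-based) classifier: 'training' just stores the
    data; all real computation is deferred to consultation time."""
    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):                 # training time: trivial work
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, query):            # consultation time: the real work
        dists = sorted(
            (math.dist(query, x), label) for x, label in zip(self.X, self.y)
        )
        votes = Counter(label for _, label in dists[:self.k])
        return votes.most_common(1)[0][0]

clf = LazyKNN(k=3).fit([[0, 0], [0, 1], [5, 5], [6, 5]],
                       ["a", "a", "b", "b"])
print(clf.predict([5, 6]))               # "b"
```

This also shows the trade-off the excerpt alludes to: training is essentially free, but every query scans the whole dataset, which is why lazy methods suit datasets with few attributes.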

5 Useful Machine Learning Repositories on Github


The GitHub repository above provides an organized list of machine learning libraries, frameworks, and tools. This repository contains implementations of the most widely used machine learning algorithms in Python, explained along with the mathematics and logic behind them. Each algorithm is also presented through Jupyter notebook's interactive environment. The code is not only run on a training set for data analysis, but the underlying mathematics is explained as well, which makes it one of the best resources for strengthening one's basics. For supervised learning, it covers regression and classification techniques, explaining the mathematics behind linear regression and logistic regression, providing the code, and running it in a Jupyter notebook.
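To give a flavor of the material such repositories pair with the math, here is a minimal least-squares linear regression sketch (an illustrative example of our own, not code from the repository), using the closed form w = (XᵀX)⁻¹Xᵀy solved via a stable least-squares routine:

```python
import numpy as np

def fit_linear_regression(X, y):
    """Ordinary least squares: solve for [intercept, slope, ...] that
    minimize the squared error between X @ w and y."""
    X = np.column_stack([np.ones(len(X)), X])   # prepend intercept column
    w, *_ = np.linalg.lstsq(X, y, rcond=None)   # stable least-squares solve
    return w

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0                               # exact line y = 2x + 1
intercept, slope = fit_linear_regression(x, y)
print(round(intercept, 6), round(slope, 6))     # 1.0 2.0
```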

Choose the best data source for your Amazon SageMaker training job


Amazon SageMaker is a managed service that makes it easy to build, train, and deploy machine learning (ML) models. Data scientists use SageMaker training jobs to easily train ML models; you don’t have to worry about managing compute resources, and you pay only for the actual training time. Data ingestion is an integral part of […]

Top resources to learn decision trees in 2022


Decision trees (DTs) are a supervised learning method used to build a model that predicts the value of a target variable by learning simple decision rules from the data features. DTs are used for both classification and regression and are simple to understand and interpret. Below, we have listed the top online courses, YouTube videos, and guides for enthusiasts to master decision trees. The course by CodeAcademy focuses on teaching developers how to build and use decision trees and random forests, looking at two splitting criteria in detail: Gini impurity and information gain.
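The two splitting criteria mentioned above can be sketched in a few lines (an illustrative implementation of our own, not code from the course): Gini impurity is 1 − Σpᵢ², and information gain is the entropy reduction a split achieves.

```python
from math import log2

def gini(labels):
    """Gini impurity: 1 - sum(p_i^2) over class proportions."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def entropy(labels):
    """Shannon entropy in bits: -sum(p_i * log2(p_i))."""
    n = len(labels)
    return -sum((labels.count(c) / n) * log2(labels.count(c) / n)
                for c in set(labels))

def information_gain(parent, left, right):
    """Entropy reduction achieved by splitting `parent` into two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

parent = ["yes"] * 4 + ["no"] * 4
left, right = ["yes"] * 4, ["no"] * 4           # a perfect split
print(gini(parent))                              # 0.5
print(information_gain(parent, left, right))     # 1.0
```

A tree-building algorithm greedily picks, at each node, the feature split that minimizes impurity (or maximizes information gain) in the resulting children.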

Leveraging machine learning to find security vulnerabilities


GitHub code scanning now uses machine learning (ML) to alert developers to potential security vulnerabilities in their code. If you want to set up your repositories to surface more alerts using our new ML technology, get started here. Code security vulnerabilities can allow malicious actors to manipulate software into behaving in unintended and harmful ways. The best way to prevent such attacks is to detect and fix vulnerable code before it can be exploited. GitHub's code scanning capabilities leverage the CodeQL analysis engine to find security vulnerabilities in source code and surface alerts in pull requests – before the vulnerable code gets merged and released.

General Cyclical Training of Neural Networks

This paper describes the principle of "General Cyclical Training" in machine learning, where training starts and ends with "easy training" and the "hard training" happens during the middle epochs. We propose several manifestations for training neural networks, including algorithmic examples (via hyper-parameters and loss functions), data-based examples, and model-based examples. Specifically, we introduce several novel techniques: cyclical weight decay, cyclical batch size, cyclical focal loss, cyclical softmax temperature, cyclical data augmentation, cyclical gradient clipping, and cyclical semi-supervised learning. In addition, we demonstrate that cyclical weight decay, cyclical softmax temperature, and cyclical gradient clipping (as three examples of this principle) are beneficial in the test accuracy performance of a trained model. Furthermore, we discuss model-based examples (such as pretraining and knowledge distillation) from the perspective of general cyclical training and recommend some changes to the typical training methodology. In summary, this paper defines the general cyclical training concept and discusses several specific ways in which this concept can be applied to training neural networks. In the spirit of reproducibility, the code used in our experiments is available at \url{}.
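The paper's exact parameterizations are not reproduced here, but the "easy start and end, hard middle" principle can be sketched as a triangular schedule. In the sketch below (our own assumption: for weight decay, higher values correspond to "harder" training), the hyper-parameter ramps from a low value up to a peak at the training midpoint and back down:

```python
def cyclical_value(epoch, total_epochs, low, high):
    """Triangular schedule: ramp from `low` to `high` at the midpoint,
    then back down -- easy training at the start and end, hard in the middle."""
    half = total_epochs / 2
    frac = 1.0 - abs(epoch - half) / half   # 0 at the ends, 1 at the midpoint
    return low + (high - low) * frac

# e.g. cyclical weight decay over 100 epochs, between 1e-4 and 1e-2
schedule = [cyclical_value(e, 100, 1e-4, 1e-2) for e in range(101)]
print(round(schedule[0], 6), round(schedule[50], 6), round(schedule[100], 6))
```

The same shape could drive any of the paper's cyclical quantities (batch size, softmax temperature, gradient clipping threshold), with the direction of "easy" versus "hard" chosen per hyper-parameter.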

Confident AI

In this paper, we propose "Confident AI" as a means of designing Artificial Intelligence (AI) and Machine Learning (ML) systems with both algorithm and user confidence in model predictions and reported results. The four basic tenets of Confident AI are Repeatability, Believability, Sufficiency, and Adaptability. Each of the tenets is used to explore fundamental issues in current AI/ML systems, and together they provide an overall approach to Confident AI.