
Testing in production: using JSON Schema for 3rd party API response validation


Submotion offers a central place to manage SaaS accounts and subscriptions. This is made possible by connecting to a bunch of third-party APIs. It's an amazing fact that so many companies expose APIs that allow you to integrate with their service, to mutual benefit. Using such APIs is fairly trivial, and even though things like OAuth implementations quite often diverge from the standard, an integration is usually not too hard to set up. However, testing such integrations is painful.
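To make the idea in the title concrete, here is a minimal sketch of checking a third-party API response against a JSON Schema. It assumes a hypothetical subscription payload (`id`, `plan`, `seats`) and a hand-rolled `validate` helper that covers only a small subset of JSON Schema keywords (`type`, `required`, `properties`, `items`). None of these names come from the original post; a real integration would more likely rely on a full validator such as the `jsonschema` package.

```python
import json

# Maps JSON Schema type names to Python types (small subset, for illustration).
_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "null": type(None),
}


def validate(instance, schema, path="$"):
    """Return a list of error strings; an empty list means the instance conforms."""
    errors = []
    expected = schema.get("type")
    if expected is not None:
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_as_number = isinstance(instance, bool) and expected in ("number", "integer")
        if not isinstance(instance, _TYPES[expected]) or bool_as_number:
            errors.append(f"{path}: expected {expected}")
            return errors
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required property '{key}'")
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(validate(instance[key], subschema, f"{path}.{key}"))
    if isinstance(instance, list) and "items" in schema:
        for i, item in enumerate(instance):
            errors.extend(validate(item, schema["items"], f"{path}[{i}]"))
    return errors


# Hypothetical schema describing what we expect a provider to return.
SUBSCRIPTION_SCHEMA = {
    "type": "object",
    "required": ["id", "plan", "seats"],
    "properties": {
        "id": {"type": "string"},
        "plan": {"type": "string"},
        "seats": {"type": "integer"},
    },
}

good = json.loads('{"id": "sub_1", "plan": "team", "seats": 5}')
drifted = json.loads('{"id": "sub_1", "seats": "5"}')  # field dropped, type changed

good_errors = validate(good, SUBSCRIPTION_SCHEMA)
drift_errors = validate(drifted, SUBSCRIPTION_SCHEMA)
for message in drift_errors:
    print(message)
```

Returning a list of errors rather than raising on the first one lets production code log every deviation in a response and keep serving the request, which suits the "testing in production" approach.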

IAM Access Analyzer Update – Policy Validation


AWS Identity and Access Management (IAM) is an important and fundamental part of AWS. You can create IAM policies and service control policies (SCPs) that define the desired level of access to specific AWS services and resources, and then attach the policies to IAM principals (users and roles), groups of users, or to AWS resources. With the fine-grained control that you get with IAM comes the responsibility to use it properly, almost always seeking to establish least privilege access. The IAM tutorials will help you to learn more, and the IAM Access Analyzer will help you to identify resources that are shared with an external entity. We recently launched an update to IAM Access Analyzer that allows you to Validate Access to Your S3 Buckets Before Deploying Permissions Changes.

Proper Model Selection through Cross Validation


So, what is cross validation? Recall my post about model selection, where we saw that it may be necessary to split data into three different portions: one for training, one for validation (to choose among models), and a last one to measure the true accuracy of the chosen model. This procedure is one viable way to choose the best among several models. Cross validation (CV) is not too different from this idea, but deals with the model training/validation in quite a smart way. For CV we use a larger combined training and validation data set, followed by a testing dataset.
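As a rough illustration of that setup, the sketch below runs k-fold cross validation on a combined training/validation set to choose between two candidate models, then scores the winner once on a separate test set. The synthetic data, the two candidate models (`fit_line`, `fit_constant`) and the helper names are all assumptions made for this example, not taken from the post.

```python
import random
import statistics


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Least-squares slope and intercept for 1-D data."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def fit_constant(xs, ys):
    """Baseline model: always predict the mean of the training targets."""
    return 0.0, statistics.fmean(ys)


def mse(model, xs, ys):
    slope, intercept = model
    return statistics.fmean([(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)])


def cv_mse(fit, xs, ys, k=5):
    """Average validation error over k folds; each fold is held out exactly once."""
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(mse(model, [xs[i] for i in held_out], [ys[i] for i in held_out]))
    return statistics.fmean(scores)


# Synthetic data: y = 2x + 1 plus Gaussian noise.
rng = random.Random(1)
xs = [float(x) for x in range(25)]
ys = [2 * x + 1 + rng.gauss(0, 1) for x in xs]

# Reserve a random test set of 5 points; use the other 20 for training/validation.
order = list(range(25))
rng.shuffle(order)
cv_idx, test_idx = order[:20], order[20:]
cv_x, cv_y = [xs[i] for i in cv_idx], [ys[i] for i in cv_idx]
test_x, test_y = [xs[i] for i in test_idx], [ys[i] for i in test_idx]

candidates = {"line": fit_line, "constant": fit_constant}
cv_scores = {name: cv_mse(fit, cv_x, cv_y) for name, fit in candidates.items()}
best = min(cv_scores, key=cv_scores.get)

# Refit the chosen model on all CV data and score it once on the untouched test set.
test_error = mse(candidates[best](cv_x, cv_y), test_x, test_y)
print(best, cv_scores, test_error)
```

Every point in the combined set serves for both training and validation across the folds, which is why CV can afford a larger combined set than a fixed three-way split.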

Hyperparameter Tuning to Reduce Overfitting -- LightGBM


Easy access to an enormous amount of data and high computing power has made it possible to design complex machine learning algorithms. As the model complexity increases, the amount of data required to train it also increases. Data is not the only factor in the performance of a model. Complex models have many hyperparameters that need to be correctly adjusted or tuned in order to make the most out of them. For instance, the performance of XGBoost and LightGBM depends heavily on hyperparameter tuning.

An Algorithmic Framework for Computing Validation Performance Bounds by Using Suboptimal Models

Practical model building processes are often time-consuming because many different models must be trained and validated. In this paper, we introduce a novel algorithm that can be used for computing the lower and the upper bounds of model validation errors without actually training the model itself. A key idea behind our algorithm is using side information available from a suboptimal model. If a reasonably good suboptimal model is available, our algorithm can compute lower and upper bounds of many useful quantities for making inferences on the unknown target model. We demonstrate the advantage of our algorithm in the context of model selection for regularized learning problems.