Collaborating Authors

Machine Learning Datasets in R (10 datasets you can use right now) - Machine Learning Mastery


You need standard datasets to practice machine learning. In this short post you will discover how you can load standard classification and regression datasets in R. This post will show you 3 R libraries that you can use to load standard datasets and 10 specific datasets that you can use for machine learning in R. It is invaluable to load standard datasets in R so that you can test, practice and experiment with machine learning techniques and improve your skill with the platform. There are hundreds of standard test datasets that you can use to practice and get better at machine learning. Most of them are hosted for free on the UCI Machine Learning Repository.

Machine Learning Datasets: 250+ ML Repository Of Speech Datasets


While open or public datasets are convenient, we offer an extensive catalog of 250 'off-the-shelf' licensable datasets spanning 80 languages and multiple dialects for a variety of common AI use cases. We are excited to announce 30 new datasets for 2020 that deliver immediate value to our customers. Among our offerings, you will find datasets for speech recognition and training datasets for machine learning algorithms, all created with the most advanced available data science. Whether you are working on a text-to-speech system, a voice recognition system or another solution that relies on natural language, high-quality licensed speech and language datasets allow you to go to market faster and reach more potential customers.

Train a model on fashion dataset


Fashion MNIST is a direct drop-in replacement for the original MNIST dataset. The dataset is made up of 60,000 training examples and 10,000 testing examples, where each example is a 28×28 grayscale image of an article of clothing. The Fashion MNIST dataset is more difficult than the original MNIST, and thus serves as a more complete benchmarking tool. The model being trained is a CNN with three convolutional layers followed by two dense layers. The job will run for 30 epochs, with a batch size of 128.
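To make the size of that training job concrete, here is a minimal arithmetic sketch in plain Python. The dataset size, image size, epoch count, batch size and number of convolutional layers come from the description above. The 3×3 kernel, stride of 1 and absence of padding are assumptions made only for illustration, since the excerpt does not state them, and the helper names are hypothetical.

```python
import math

# Values taken from the description above.
TRAIN_EXAMPLES = 60_000
IMAGE_SIDE = 28
EPOCHS = 30
BATCH_SIZE = 128
NUM_CONV_LAYERS = 3

# Assumption (not stated in the source): 3x3 kernels, stride 1, no padding.
KERNEL_SIZE = 3


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of batches needed to see every example once (last batch may be partial)."""
    return math.ceil(num_examples / batch_size)


def last_batch_size(num_examples: int, batch_size: int) -> int:
    """Size of the final batch in an epoch."""
    remainder = num_examples % batch_size
    return remainder if remainder else batch_size


def conv_output_side(input_side: int, kernel_size: int, num_layers: int) -> int:
    """Side length of the feature map after stacked unpadded, stride-1 convolutions."""
    side = input_side
    for _ in range(num_layers):
        side = side - kernel_size + 1
    return side


steps = steps_per_epoch(TRAIN_EXAMPLES, BATCH_SIZE)
total_steps = steps * EPOCHS
final_side = conv_output_side(IMAGE_SIDE, KERNEL_SIZE, NUM_CONV_LAYERS)

print(f"steps per epoch: {steps}")
print(f"last batch size: {last_batch_size(TRAIN_EXAMPLES, BATCH_SIZE)}")
print(f"total optimizer steps: {total_steps}")
print(f"feature map side after {NUM_CONV_LAYERS} conv layers: {final_side}")
```

With padded convolutions or pooling layers the feature map size would differ, so the last figure only holds under the assumptions named above.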

10 Best Legal Datasets for Machine Learning - Lionbridge AI


AI technology is making headlines in a wide range of industries, including financial services and medicine, but legal AI may not immediately come to mind for many. However, AI is already transforming the legal sector in many ways, primarily because it is streamlining traditionally cumbersome processes and allowing professionals to focus on higher-level tasks. For those interested in developing legal machine learning applications, we at Lionbridge AI have scoured the web to put together a collection of the best publicly available legal datasets.

Open Datasets for Machine Learning - Lionbridge AI


Datasets are an integral part of machine learning. Without high-quality training datasets, machine learning algorithms would have no way of knowing how to conduct sentiment analysis, categorize products or understand foreign languages. This spreadsheet contains the ultimate list of open datasets for machine learning. Organized by industry and use case, it covers a diverse range of 300 datasets for training machine learning models.