Datasets for Machine Learning

This article introduces 10 open datasets for linear regression tasks, including medical, real estate, and stock exchange data.
The MNIST dataset is considered one of the benchmark datasets for machine learning. Many of the datasets on this list were inspired by MNIST or created as drop-in replacements for the original.
We've compiled a list of Chinese datasets that cover a wide range of use cases, from optical character recognition (OCR) to sentiment analysis.
Introducing 13 free Japanese language text datasets for machine learning, natural language processing, sentiment analysis, and more.
This guide introduces the top 10 Reddit datasets for machine learning, with data taken from "the front page of the Internet".
From sentiment analysis models to content moderation models and other NLP use cases, Twitter data can be used to train various machine learning algorithms.
