
At Lionbridge, we know that high-quality training data can be difficult to find. To help students, data scientists, and development teams get the data they need, we’ve posted a large number of dataset collections on our blog. Here, you can find all of those datasets in one convenient place and search for the data you need by use case or data type. This list is updated regularly, so you always have a current, curated dataset library to draw on.
The datasets are grouped by use case, and the categories are listed below. Some datasets appear more than once because they belong to multiple categories.
Audio Datasets
Computer Vision Dataset Library
Data Analytics
Fintech and Financial Services Data
Language Dataset Library
NLP Datasets
Social Media Datasets
Miscellaneous Datasets
We will keep adding new curated lists of the best datasets for each category and use case. Subscribe to our newsletter to be notified of future updates and keep up with the latest in machine learning.
Lionbridge Data Annotation Services
Still can’t find the data you need for your project? Get in touch to learn more about our services. With over 20 years of experience in translation, linguistics, and AI training data, Lionbridge is trusted by governments and large tech companies worldwide. We are a leader in NLP data outsourcing, image annotation, and more.