10 Free Marketing & Advertising Datasets for Machine Learning

Article by Rei Morikawa | April 01, 2019

Machine learning plays a big role in marketing automation, meaning any software that makes marketers’ lives easier by automating tasks like email, online advertising, customer support, and analytics. Many marketing automation tools are now driven by machine learning, including sentiment analysis tools and customer support chatbots. Marketers can also use marketing automation software to analyze their monthly leads and conversions, or to recommend similar products to repeat customers.

Download these free datasets to kickstart your marketing automation initiatives and machine learning projects.

 

Marketing & Advertising Datasets for Machine Learning

  • Largest Food & Packaged Goods Companies 2016: This dataset lists the largest publicly held US food, beverage, personal care, pharmaceutical and tobacco companies for 2016. Last year, it was used for the Texas Beer Project.
  • Women’s Shoe Prices: A list of 10,000 women’s shoes and the various prices at which they are sold. You can use this dataset to correlate specific shoe product features with price changes and cross-reference the data with Men’s Shoe Prices.
  • Yelp Open Dataset: The Yelp dataset is a subset of the company’s businesses, reviews and user data. This dataset is available as JSON files and is intended to teach students about machine learning and natural language processing.
  • Impulsive Buying: Survey data of shoppers who were asked the same questions in order to understand the tendency for impulse shopping, as well as the store environment, products, and promotions that influenced them. Data includes the shoppers’ age, income source, and how many days per month they shopped.
  • Labeled Faces in the Wild: Database of 13,000 face photographs labeled with the person’s name. This dataset was designed for face recognition but it can be useful for marketers too.
  • ADS-16: Computational advertising dataset that includes 300 real ads rated by 120 unacquainted individuals.
  • Classified Ads for Cars: This data was scraped from several websites in the Czech Republic and Germany over a period of more than a year. It includes ads for used cars starting in 2015.
  • Sales Conversion Optimization: The data comes from an anonymous organization’s social media ad campaign. You can cluster customer data for campaign marketing.
  • Google Analytics Sample: This dataset contains Google Analytics 360 data from the Google Merchandise Store. It includes information about site traffic source, content, and transaction data.
  • Customer Support on Twitter: Large, modern corpus of tweets and replies to aid innovation in natural language understanding and conversational models, and to study modern customer support practices and impact.
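
As a starting point for working with a JSON-based dataset such as the Yelp Open Dataset, here is a minimal Python sketch that reads review records and averages the star rating per business. It assumes the file holds one JSON object per line and that each record has `business_id` and `stars` fields; these field names and the sample records are illustrative assumptions, so check the dataset's own documentation for the actual schema.

```python
import io
import json
from collections import defaultdict


def average_stars(lines):
    """Return {business_id: mean star rating} from JSON-lines review records.

    Assumes each non-blank line is a JSON object with "business_id" and
    "stars" keys (hypothetical field names used for illustration).
    """
    totals = defaultdict(lambda: [0.0, 0])  # business_id -> [sum of stars, count]
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        entry = totals[record["business_id"]]
        entry[0] += record["stars"]
        entry[1] += 1
    return {biz: total / count for biz, (total, count) in totals.items()}


# Hypothetical sample records standing in for a downloaded file.
SAMPLE = io.StringIO(
    '{"business_id": "b1", "stars": 4, "text": "Great coffee"}\n'
    '{"business_id": "b1", "stars": 2, "text": "Slow service"}\n'
    '{"business_id": "b2", "stars": 5, "text": "Loved it"}\n'
)

averages = average_stars(SAMPLE)
print(averages)
```

To run this on a real file, pass an open file handle instead of `SAMPLE`, for example `with open(path, encoding="utf-8") as f: average_stars(f)`. Reading line by line keeps memory use low for large files.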

Still can’t find what you need? Lionbridge AI provides custom datasets for a wide variety of machine learning projects. Our crowd of 500,000+ certified language specialists can create AI training datasets that ensure that your brand and voice are not lost when you add machine learning tools to your marketing strategy. Contact us now to get started.

The Author
Rei Morikawa

Rei writes content for Lionbridge’s website, blog articles, and social media. Rei was born and raised in Tokyo and also studied abroad in the US. Rei is a huge people person who is passionate about long-distance running, traveling, and discovering new music on Spotify.
