Success Stories

Discover how we helped our clients to build industry-leading machine learning models

  • “Lionbridge initially stood out because of their capabilities in the regional language space, but we were also impressed by their flexible approach to data annotation. It was clear from early on that they were committed to providing us with a high level of support and eager to align with our project requirements.”

    Deb Goswami, PhD

    Data Science Lead, Traveloka

  • “We used several internal tools to assess the quality of what Lionbridge provided, and the results were very good. We have confidence that we could potentially use the data for critical deep-learning projects.”

    Zachary Yocum

    Senior Linguistic Data Engineer, Basis Technology

  • “Many underestimate the technical infrastructure and operational excellence needed to get such high quality training data. Services like Lionbridge AI are great for engineering teams who need data fast and at scale.”

    Dr. Rasmus Rothe, PhD (Computer Vision)

    Founder, Merantix

Improving machine learning solutions across the globe

From secure image annotation to linguistic component development, our community of contributors can build you a solid ground truth in any language and location.

Liam

150 ads reviewed

Derifa

45 recordings collected

Mwamba

25 locations verified

Olivia

300 texts translated

Lucas

150 entities linked

Maryam

350 images segmented

Luis

200 intents created

Anna

100 text strings classified
  • 1,000,000
    registered contributors

  • 5,000
    locations

  • 300
    languages

Providing AI training data to leading global technology companies

  • NTT
  • Traveloka
  • Expedia
  • LINE
  • CrowdWorks
  • Basis Technology

Case Studies

Text Classification for Traveloka

We labeled over 200,000 search queries for Traveloka, one of Southeast Asia’s largest travel companies. Our data was used to build a multi-platform search engine that compiles results from across 76 unique product combinations.

Speech Data Collection for NICT

We created a dataset of 300,000 speech samples for the National Institute of Information and Communications Technology (NICT) using our annotation platform. The data is being used to develop their automated translation app.

Text Classification for Zaizen

Lionbridge collected over 5,000 Q&A text samples for Zaizen, a company developing conversational AI systems. This data was used to train a personalized artificial intelligence capable of maintaining everyday conversations with users.

Sentiment Analysis

We annotated the sentiment of over 20,000 text records for one of the world’s largest technology companies. Thanks to our language expertise, our client was able to expand their sentiment analysis services into 14 languages.

Image Annotation

Using a specialist team of annotators and the Lionbridge AI data annotation platform, we labeled images with over 40,000 keypoints for a government agency. This dataset helped our client to build video tracking models for sports analysis.

Proofing Tool Development

We develop and check the quality of textual proofing tool components in partnership with a large technology company. Our services have helped the client to improve their proofing tool’s accuracy rate in 16 languages.

Social Media Ad Review

We review over 1 million ads per month for one of the world’s biggest social networking platforms. Employing 4,000 experienced evaluators in 10 geographic markets, we’re helping to maximize our client’s campaign performance and boost their ad relevance across the globe.

Text Data Collection

For this machine learning project, we collected and annotated over 30,000 conversations from specific scenarios in English and French. Our client used this data to refine several natural language processing models, and has expanded the program into more languages.

Speech Data Collection

Our crowd created, collected, and tested speech data in 30 languages for our client, including studio-quality recordings and hours of speech samples in each of these languages. This enabled our client to develop and improve their speech recognition algorithms.

Why Lionbridge?

Quality

Through rigorous contributor selection and quality assurance, we ensure that all of our data provides a solid ground truth.

Expertise

Our project managers and contributors bring 20 years of company experience to each and every project.

Customization

Our project managers and contributors bring 20 years of company experience to each and every project.

1,000,000+ Contributors

300+ Languages

20+ Years of experience

Read more about our machine learning projects

  • LINGUISTICS

    ASR System Development

    For a multinational telecom company, our experts created multilingual language models, collected audio data, and conducted in-market testing for ASR software in 10 languages.

  • DATA CREATION

    Sentiment Analysis Data Collection

    We created over 10,000 unique sentences in 13 languages for a major multinational technology firm. Our crowd then tagged each sentence with a category and sentiment.

  • DATA CREATION

    OCR System Training

    We collected samples of handwritten Japanese characters by native speakers from our community. The data was used to train an OCR engine to read handwritten documents.

  • DATA ANNOTATION

    Entity Annotation

    Our Japanese community reviewed thousands of texts taken from articles and newspapers for an AI solutions provider. Each text was then sorted into one of five different categories.

  • DATA VALIDATION

    Machine Translation Retraining

    Using a team of native speakers in the Japanese to Chinese language pair, we helped fine-tune machine translation output for one of Japan’s largest telecommunications firms.

  • DATA CREATION

    Data Collection & Annotation

    Our team collected over 5,000 unique conversations from thousands of contributors in 16 target languages. This data was annotated with speaker information and sentiment tags.

  • DATA CREATION

    Content Summarization

    We built a customized team and designed a solution that could select the three most informative comments across hundreds of forum posts for a social news and discussion website.

  • DATA ANNOTATION

    Sentiment Analysis

    We assembled a team of Arabic language specialists to annotate the sentiment of social media posts, classifying each piece of content as positive, negative, or neutral.

  • DATA VALIDATION

    Data Validation

    For a multinational mass media company that required around-the-clock support, we sourced a team of international contributors to enrich image and text data in English and German.

  • DATA CREATION

    Audio Dataset Creation

    For this client, our team of project managers created a richly detailed dataset of Japanese voice recordings that was transcribed by our crowd of native speakers.

  • DATA CREATION

    Translation Corpus Licensing

    We licensed over 200,000 Japanese to English text segments to a mobile messaging company. They used the data to build deep learning models for machine translation.

  • LINGUISTICS

    Chatbot Training Data

    Our team of linguists helped train a chatbot to recognize and respond to a variety of native and non-native sentences for a leading virtual assistant software company.

  • DATA ANNOTATION

    Ad Relevance Evaluation

    For an ongoing project for a leading technology company, thousands of evaluators across 10 global markets rate the relevancy of ads displayed during an online search.

  • DATA VALIDATION

    Audio Speech Analysis

    Our crowd evaluated hundreds of machine-generated speech samples across multiple languages. They analyzed pronunciation and flagged errors to determine overall naturalness.

  • DATA VALIDATION

    Machine Translation Quality Evaluation

    Our community of language specialists analyzed and rated a large volume of Chinese to English machine translations for a global ecommerce company.

  • DATA VALIDATION

    Content Moderation

    We assign content moderation tasks to hundreds of moderators in over 40 markets for a major media company. Our moderators rate videos for relevance and flag inappropriate content.

  • DATA VALIDATION

    Geo-Local Data Evaluation

    For a leading navigation app, we’ve hired thousands of in-country contributors to complete millions of search relevance and data verification tasks across 40+ global markets.

  • DATA CREATION

    Chatbot Training

    Using a preferred pool of our English-speaking contributors, we created 10 unique queries for each of our customer’s intents. These encompassed both formal and casual language.

  • DATA ANNOTATION

    Text Categorization

    Using a custom, multi-tier taxonomy we developed with our client, a team of our contributors classified a dataset of thousands of companies according to their range of services.

  • DATA ANNOTATION

    Search Relevance Evaluation

    To optimize global search results for a leading search engine, we’ve hired 100,000+ local contributors in over 100 markets to evaluate search queries for accuracy and relevancy.

  • DATA VALIDATION

    Speech Recognition Training

    Our team helped build ASR software for a global software company by validating transcriptions, checking pronunciation, and creating new speech data.

  • DATA ANNOTATION

    Data Collection & Classification

    For an AI solutions provider, our team sourced and classified a large amount of text data from various social media sources into 29 categories, ranging from automotive to fine art.

  • DATA ANNOTATION

    Entity Recognition

    For one of the world’s top tech companies, our contributors identified and labeled entities contained in thousands of voice commands to train a market-leading virtual assistant.

  • DATA ANNOTATION

    Data Annotation

    For a long-term project, our team of experts completed thousands of data enrichment tasks per week on CRM data provided by a multinational software corporation.