What is Image Transcription?

Image transcription is the task of processing and digitizing text that is pictured in an image, such as a photograph of a receipt or handwriting. Accurate image transcription is essential for building training datasets for machine learning models based on image recognition, such as optical character recognition (OCR) models.

The term image transcription is also used to refer to image captioning (the process of writing descriptions of the content of an image) and image annotation (the process of associating images with identifier labels). Image annotation is the first step towards building training datasets for computer vision models used in applications such as autonomous vehicles and medical imaging. Lionbridge’s image transcription services include image annotation, image captioning, and OCR transcription services. We’ll help you build machine learning models that can recognize and describe images with high accuracy.

Why choose Lionbridge’s Image Transcription Services?


Lionbridge offers image transcription services in 300 languages. Our competitive pricing is based on the volume of transcription services that you request. We also offer additional options such as faster turnaround times, multilingual transcription services, and support for a wide range of file types and image annotation types, including bounding boxes, 3D cuboids, polygons, lines and splines, and semantic segmentation.

  • 500,000+ Contributors
  • 300+ Languages
  • 20+ Years of experience


Lionbridge has access to 500,000 qualified contributors around the globe, so we can quickly provide image, audio, and phonetic transcription in large volumes.


Lionbridge’s quality assurance system includes a rigorous review process to ensure that we provide accurate, high-quality transcription services.


With 20 years of experience in the translation and localization industry, Lionbridge excels at language-related tasks.


Lionbridge’s Image Transcription Services

Image Annotation

Lionbridge provides a range of image annotation services to match your needs: bounding boxes, 3D cuboids, lines and splines, semantic segmentation, pixel-precise segmentation, polygons, image classification, and more.
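
To make two of these annotation types concrete, the sketch below shows one way a bounding box and a polygon might be stored as records. The class names, field names, and pixel-coordinate convention are illustrative assumptions, not Lionbridge’s delivery format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BoundingBox:
    """Axis-aligned box in pixel coordinates (hypothetical schema)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        # Clamp to zero so a malformed box never reports a negative area.
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


@dataclass
class Polygon:
    """Closed polygon given as ordered (x, y) vertices (hypothetical schema)."""
    label: str
    points: List[Tuple[float, float]]

    def area(self) -> float:
        # Shoelace formula over consecutive vertex pairs.
        total = 0.0
        n = len(self.points)
        for i in range(n):
            x1, y1 = self.points[i]
            x2, y2 = self.points[(i + 1) % n]
            total += x1 * y2 - x2 * y1
        return abs(total) / 2.0


# Example labels and coordinates are made up for illustration.
box = BoundingBox("car", 10, 20, 110, 70)
triangle = Polygon("sign", [(0, 0), (4, 0), (0, 3)])
print(box.area(), triangle.area())
```

A bounding box is cheap to draw and store, while a polygon follows the object outline more closely, which is one reason a project may mix annotation types.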

Image Captioning

Lionbridge provides training data to build both image-based and language-based image caption generators. Image captioning is useful for describing images to people who are blind or have low vision and who rely on sound and text to understand a scene. In web development, providing a text description for images that appear on the page makes the content more accessible.

Optical Character Recognition (OCR) Transcription

Optical character recognition (OCR) is the technology that converts images to text so that computers can extract text data from large files. Lionbridge offers OCR transcription for invoices, receipts, business cards, menus, forms, and more.
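
As a rough illustration of what OCR transcription data can look like, the sketch below pairs each transcribed string with the image region it came from and serializes the result as JSON. The record layout, field names, and the file name `receipt_001.jpg` are hypothetical and are not taken from Lionbridge’s platform.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class TextRegion:
    """One transcribed span of text and where it appears in the image."""
    text: str
    box: List[int]  # [x_min, y_min, x_max, y_max] in pixels (assumed convention)


def to_training_record(image_file: str, regions: List[TextRegion]) -> str:
    """Serialize one image's transcription as a JSON string."""
    payload = {
        "image": image_file,
        "regions": [asdict(region) for region in regions],
    }
    # ensure_ascii=False keeps non-Latin scripts readable in the output.
    return json.dumps(payload, ensure_ascii=False)


# Made-up example values for a receipt image.
record = to_training_record(
    "receipt_001.jpg",
    [
        TextRegion("TOTAL", [12, 340, 88, 362]),
        TextRegion("12.50", [210, 340, 268, 362]),
    ],
)
print(record)
```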

How do Lionbridge’s Image Transcription Services work?


1. Project set-up

Our team will work with you to develop a custom solution based on your project objectives and timeline.


2. Production

Our crowd of multilingual experts gets to work creating, annotating, or validating your data.


3. Delivery

Our project management team checks, packages, and formats the data before sending it to you for final approval.


Success Stories


Lionbridge transcribed hundreds of centuries-old handwritten documents to help a non-profit organization build and train an optical character recognition model.


Lionbridge’s contributors annotated 17 visible body parts in 1,000 photos of people playing various sports, to help an early-stage venture fund train their computer vision model to analyze video frames. The image annotation was done by plotting about 15 anatomical key points per photo.


For a leading full-stack AI company, our multilingual experts drew bounding boxes around Japanese text in 1,000 images, then sorted the images into cascading categories.


Image Transcription Pricing

How much does image transcription cost?
The Lionbridge platform streamlines much of the image transcription process, allowing us to offer the most cost-effective solution in the industry. Contact us to get a free estimate for your project.

  • Account Manager
  • Project Management
  • 24/7 Support
  • API
  • NDA
  • Volume pricing
  • Custom reporting
  • Enterprise-grade SLAs
  • Custom invoicing
  • Consulting services
Get in touch with our team today

Multilingual Image Transcription Services

Lionbridge provides custom image transcription services in 300 languages. Some of our most popular languages include:

  • Chinese image transcription services
  • Dutch image transcription services
  • French image transcription services
  • German image transcription services
  • Italian image transcription services
  • Japanese image transcription services
  • Portuguese image transcription services
  • Spanish image transcription services

Learn more about Image Data Services

Wondering which image annotation types best suit your project? In this article, we introduce five types of image annotation and some of their applications.
Power your computer vision models with high-quality image data, meticulously tagged by our expert annotators.
Optical character recognition (OCR) is the technology that converts images to text and enables computers to extract text data from image files.