The Chinese Speech Recognition Industry: A Voice-Activated Future

Article by Meiryum Ali | May 31, 2019

Stand in any crowded place in a major Chinese city, and you’ll notice something unusual. Whereas people in most places will be texting on their phones, many people in China will be sending voice notes for all sorts of things. The Chinese speech recognition industry is booming, and for good reason.

This technology isn’t just for casual consumption. Chinese call centers use voice synthesis technology to generate automated replies, court systems rely on the software to transcribe lengthy proceedings, and Didi, a popular Chinese ride-sharing app, uses it to broadcast orders to drivers.

“Speech recognition and the understanding of language is core to the future of search and information”

Ben Gomes, Head of Google Search

Voice technology, voice recognition and human speech AI are miles ahead in East Asian countries compared with the West. The $55-billion voice recognition industry has been forecast to grow at an annual rate of 17% from 2018 to 2025. The speech and voice recognition market is dominated by artificial-intelligence-based software, which held a 71% market share in 2017 and is expected to grow at an astronomical rate of almost 30% in the same period.


How does voice recognition work?

Voice recognition software works by analyzing the sounds you make. It filters what is said, digitizes it into a format it can “read”, and then analyzes it for meaning. Based on algorithms and previous input, it can then make a highly accurate educated guess as to what the speaker is saying.
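To make the filter, digitize and analyze steps concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular product works: the 16 kHz sample rate, the pre-emphasis filter, the frame sizes, the energy threshold and every function name are choices made for this example. A synthetic tone stands in for a microphone, and the "analysis" stops at spotting which slices of audio contain sound; a real recognizer would go on to feed richer features into acoustic and language models.

```python
import math

SAMPLE_RATE = 16_000  # samples per second (assumed; a common rate for speech)


def synthetic_utterance(duration_s=0.5, freq_hz=220.0):
    """Stand-in for a microphone: half silence, then a steady tone."""
    n = int(SAMPLE_RATE * duration_s)
    half = n // 2
    tone = [0.5 * math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE)
            for t in range(n - half)]
    return [0.0] * half + tone


def pre_emphasis(signal, coeff=0.97):
    """Filter step: y[n] = x[n] - coeff * x[n-1], which boosts high frequencies."""
    if not signal:
        return []
    return [signal[0]] + [signal[i] - coeff * signal[i - 1]
                          for i in range(1, len(signal))]


def digitize(signal, bits=16):
    """Digitize step: clip to [-1.0, 1.0] and quantize to signed integers."""
    max_int = 2 ** (bits - 1) - 1
    return [round(max(-1.0, min(1.0, s)) * max_int) for s in signal]


def frame_energies(samples, frame_len=400, hop=160):
    """Analysis step: mean squared energy of overlapping 25 ms frames."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def detect_sound(energies, ratio=0.1):
    """Mark frames whose energy exceeds a fraction of the loudest frame."""
    peak = max(energies, default=0)
    return [peak > 0 and e > ratio * peak for e in energies]


raw = synthetic_utterance()
filtered = pre_emphasis(raw)
digital = digitize(filtered)
energies = frame_energies(digital)
active = detect_sound(energies)

print(f"{len(energies)} frames, {sum(active)} flagged as containing sound")
```

The frames flagged here are the ones a full system would pass on for the harder work of mapping sounds to words.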

One of the challenges that makes voice recognition in China so interesting is that Chinese natural language processing is particularly complex, with 130 spoken dialects and 30 written languages. This has led to a rise in demand for cutting-edge tech solutions and vast amounts of processed data.


Rapid growth of Chinese speech recognition due to AI market boom

Chinese companies have increased investments in AI technology to capitalize on the economic potential of what has been referred to as the fourth industrial revolution. In fact, China’s government has set itself a target to become a global AI innovation center and build an AI industry worth US$150 billion by 2030.

Within the past few years, hundreds of Chinese companies, from early-stage startups to established players like Tencent and Baidu, have joined in on voice recognition, flooding into market segments such as intelligent in-vehicle systems, smart home, and wearable devices. The Chinese smart home market alone was valued at just over US$7 billion in 2018 and is likely to grow five-fold over the next six years, with smart speakers and AI voice assistants becoming an integral part of that.


Source: iProspect


iFlytek: China’s national champion in voice recognition

It’s hard to talk about Chinese speech recognition without mentioning iFlytek, an AI company that controls over 70% of the Chinese voice recognition market. Over 500 million people depend on iFlytek Input to send voice messages, which are transformed into clear text for the recipient. Its AI-enabled tech, including voice-activated smart speakers, earphones and home companion robots, makes it one of the world’s most valuable companies.

The fact that iFlytek’s voice recognition technology is everywhere in China is exactly what makes the technology smarter every day. Founded in 1999, iFlytek started applying natural language processing to its software in 2010, when it developed China’s first voice input product, the second of its kind after Google’s. The system boasts an accuracy level of more than 98 percent and supports 22 different Chinese dialects.


Why is Chinese speech recognition so far ahead of the United States?

There are a few main reasons why AI market growth in China is so skewed:

  • Lack of foreign competition: Neither Amazon Echo nor Google Home has penetrated China. US tech companies also face extremely strict regulations. Among US voice assistants, only Apple’s Siri, on the iPhone, supports Mandarin. This leaves room for local players to capitalize on China’s giant consumer market.
  • An abundance of data: China’s sheer market size generates a treasure trove of audio training data to fuel AI development. Tencent’s WeChat platform alone has over one billion monthly active users. Take mobile payments spending: China outstrips the US by a ratio of 50 to 1. Chinese e-commerce purchases are almost double US totals. All this rich data is used to make Chinese companies’ AI work better.
  • Heavy government funding: The Chinese government also has the advantage of a uniquely close collaboration with tech companies like Tencent and Alibaba, meaning technology companies can take advantage of the funding and data that the government has to offer. In return, the government can rely on the ability of tech companies to quickly turn data into AI products.


Once disregarded as a market of copycats looking to other countries for inspiration, China’s AI innovation has grown tremendously within the last couple of years. Today, China has the world’s most valuable companies not only in speech recognition, but also in computer vision, machine translation and other machine learning sectors.

Driven by heavy funding from the government, access to an abundance of data, smart infrastructure, and leading AI research, China is quickly catching up to the US, and it isn’t slowing down anytime soon.

