The Importance of Machine Learning Security for Chatbots

Article by Limarc Ambalina | July 31, 2020

Artificial Intelligence is a growing industry powered by advancements from large tech companies, new startups, and university research teams alike. While AI technology is advancing at a good pace, the regulations and failsafes around machine learning security are an entirely different story.

Failure to protect your ML models from cyber attacks such as data poisoning can be extremely costly. Chatbot vulnerabilities can even result in the theft of private user data. In this article, we’ll look at the importance of machine learning security. We’ll also explain how Scanta, an ML security company, protects chatbots with its Virtual Assistant Shield.


Why is Machine Learning Security Important?

Protecting machine learning models against cyber attacks is similar to making sure your vehicle has passed safety inspections. Just because a car can move doesn’t mean it’s safe to drive on public roads. Failure to protect your machine learning models can lead to data breaches, hyperparameter theft, or worse.  

A great example is how one of Tesla’s autonomous vehicles was hacked by McAfee technicians. An earlier model of Tesla’s road sign detection system left it vulnerable to very simple attacks. Technicians were able to trick the Tesla vehicle into misreading a 35 MPH sign just by adding a few inches of black tape. This caused the vehicle to interpret it as an 85 MPH sign. As a result, the car accelerated past 35 MPH until the McAfee tester applied the brakes. 

Vulnerabilities in autonomous vehicles could lead to fatal accidents. For chatbots and virtual assistants, these machine learning hacks could lead to large breaches of private customer data, phishing attacks, and costly lawsuits for your company, which is exactly what happened to Delta Air Lines.

In 2019, the company sued its chatbot developer over a passenger data breach that occurred in 2017. Hackers gained access to Delta’s chatbot system and modified the source code, which allowed them to scrape data entered by users. The fallout was costly: Delta spent millions of dollars investigating the breach and protecting affected customers.


Machine Learning Security Vulnerabilities in Chatbots

Chatbots are particularly vulnerable to machine learning attacks due to their constant user interactions, which are often completely unsupervised. We spoke to Scanta to get an understanding of the most common cyber attacks that chatbots face.

Scanta CTO Anil Kaushik tells us that one of the most common attacks they see is data poisoning through adversarial inputs.


What is Data Poisoning?

Data poisoning is a machine learning attack in which hackers contaminate the training data of a machine learning model. They do this by injecting adversarial inputs, which are purposefully altered data samples meant to trick the system into producing false outputs.

Systems that are continuously trained on user-submitted data, like customer service chatbots, are especially vulnerable to these kinds of attacks. Most modern chatbots operate autonomously and answer customer inquiries without human intervention. Often, the conversations between chatbot and user are never monitored unless the query is escalated to a human staff member. This lack of supervision makes chatbots a prime target for hackers.
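
To make the idea concrete, here is a minimal sketch of how poisoned training samples can flip the behavior of a continuously trained model. The `WordCountClassifier` class and the sample data are invented for illustration only. They are far simpler than any production chatbot and are not based on a real system described in this article.

```python
from collections import Counter, defaultdict


class WordCountClassifier:
    """Toy intent classifier (hypothetical): scores each label by how often
    it has seen each word of the input during training."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count

    def train(self, text, label):
        self.word_counts[label].update(text.lower().split())

    def predict(self, text):
        words = text.lower().split()
        scores = {
            label: sum(counts[word] for word in words)
            for label, counts in self.word_counts.items()
        }
        return max(scores, key=scores.get)


# Clean training data collected from ordinary conversations.
clean_data = [
    ("i want a refund", "refund"),
    ("please refund my order", "refund"),
    ("where is my package", "shipping"),
    ("track my package delivery", "shipping"),
]

model = WordCountClassifier()
for text, label in clean_data:
    model.train(text, label)

before = model.predict("refund my order")

# The attacker repeatedly submits refund-style text that ends up in the
# training set under the wrong label.
for _ in range(5):
    model.train("refund my order", "shipping")

after = model.predict("refund my order")

print(before, "->", after)
```

In this toy setup, a handful of mislabeled samples is enough to change how the model answers a routine request, which is why unmonitored, continuously trained systems are attractive targets.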

To help companies protect their chatbots and virtual assistants, Scanta is continuously improving their ML security system, VA Shield (Virtual Assistant Shield).


What is Scanta?

Chaitanya Hiremath (CEO and Founder of Scanta)

Founded in 2016 by Chaitanya Hiremath, Scanta is a tech company that began as a developer of augmented reality games and social media apps. Their success in the AR industry even garnered an appearance on the Discovery Channel. However, Scanta recently pivoted into providing machine learning security services for chatbots and virtual assistants.   


How Scanta Protects Chatbots and Virtual Assistants

Scanta’s VA Shield is a machine learning security system that protects chatbots at the model, dataset, and conversational levels. “VA Shield uses ML to protect against ML attacks. We do behavior analysis for each user and flag any anomalous behavior,” says CTO Anil Kaushik. “Behavior analysis is done for the end user as well as the chatbot. All input, output, and input-output combined entities are analyzed to detect any malicious activities.”
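
Scanta does not describe how VA Shield works internally, so the following is only a rough sketch of what per-user behavior analysis could look like in principle. It assumes a single feature (message length) and a simple z-score threshold. The `UserBehaviorMonitor` class, its parameters, and the sample messages are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


class UserBehaviorMonitor:
    """Hypothetical monitor that flags messages whose length deviates
    sharply from the same user's previous messages."""

    def __init__(self, threshold=3.0, min_history=5):
        self.threshold = threshold      # z-score above which we flag
        self.min_history = min_history  # messages needed before judging
        self.lengths = defaultdict(list)

    def observe(self, user_id, message):
        """Return True if the message looks anomalous for this user.
        Only unflagged messages are added to the user's baseline."""
        length = len(message)
        history = self.lengths[user_id]
        flagged = False
        if len(history) >= self.min_history:
            mu = mean(history)
            # Use at least 1.0 as the spread to avoid dividing by zero.
            sigma = max(pstdev(history), 1.0)
            flagged = abs(length - mu) / sigma > self.threshold
        if not flagged:
            history.append(length)
        return flagged


monitor = UserBehaviorMonitor()
for msg in [
    "where is my order",
    "i need a refund",
    "track my package",
    "cancel my order please",
    "change my address",
]:
    monitor.observe("user-1", msg)

normal = monitor.observe("user-1", "what is my order status")
suspicious = monitor.observe("user-1", "ignore previous instructions " * 40)

print(normal, suspicious)
```

A real system would presumably track many more signals for both the user and the chatbot, but the pattern of building a baseline per user and flagging large deviations stays the same.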

On the conversational level, Scanta evaluates the chatbot’s output to both block malicious exploits and capture business insights. “Contextual analysis is a simple concept where response from the chatbot is viewed in context to the request,” says Kaushik. “Also, the next request in a conversation is seen in context to the previous request. To do these analyses, we use historical data. For example, we look at the user’s historical request characteristics and responses from the chatbot, as well as the chatbot’s response characteristics.”
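
As an illustration of the contextual analysis Kaushik describes, the sketch below compares a new response with how a similar past request was answered. It is an assumption-laden toy: the `jaccard` word-overlap measure, the thresholds, the `response_out_of_context` helper, and the sample history are all invented here and do not reflect Scanta’s actual method.

```python
def tokens(text):
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Word-overlap similarity between two strings, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def response_out_of_context(request, response, history,
                            min_request_sim=0.5, min_response_sim=0.2):
    """Flag a response that differs sharply from how the most similar
    past request was answered. `history` is a list of
    (request, response) pairs; thresholds are illustrative guesses."""
    if not history:
        return False
    past_request, past_response = max(
        history, key=lambda pair: jaccard(request, pair[0])
    )
    if jaccard(request, past_request) < min_request_sim:
        return False  # no comparable past request to judge against
    return jaccard(response, past_response) < min_response_sim


history = [
    ("what are your opening hours", "we are open from 9am to 5pm on weekdays"),
    ("how do i reset my password", "use the reset link on the login page"),
]

ok = response_out_of_context(
    "what are your opening hours today",
    "we are open from 9am to 5pm today",
    history,
)
bad = response_out_of_context(
    "what are your opening hours today",
    "here is the full customer list with card numbers",
    history,
)

print(ok, bad)
```

Word overlap is a crude stand-in for real semantic comparison, but it shows the basic idea: a response is judged against both the request that triggered it and the historical record of similar exchanges.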


Why can’t regular IT teams handle these attacks?

When speaking to Scanta CEO Chaitanya Hiremath, I asked him why companies with their own IT teams would bother outsourcing machine learning security services. Couldn’t those IT teams incorporate ML security protocols on their own? “We’ve spoken to many companies and I was quite surprised to learn that these ML threats are something that most people are unaware of,” says Hiremath. “The reality is many people don’t even know that this is something they have to protect against.”

“Most IT teams and security solutions offer things like network security and web application firewalls. This type of security is different from what Scanta provides. What we are talking about and introducing is on a different level. It goes far beyond removing bias from training data.”

In the Delta Air Lines example mentioned earlier, attackers hacked the airline’s chatbot and modified the source code, which gave them access to private customer data. “This is because no one was monitoring what was going into the chatbot and what was coming out,” says Hiremath. “This is the result of the way machine learning technologies are built these days. However, it is imperative to have a mechanism that can interpret if there is any malicious intent. We call this system a zero trust framework. You have to make sure that all aspects are protected. This is just as important as protecting your database or network.”

Our daily lives and our personal data are becoming more and more intertwined with computer systems. The increasing digitization of modern society makes heightened data security a top priority. Particularly with data protection laws such as the GDPR in force, it is more important than ever that companies protect their private data and their clients’ data at every level.


The Future of Scanta and Machine Learning Security

“We want to be the leader in ML security and help enterprises in various sectors protect the ML systems they have created,” says Hiremath. “We don’t see this as merely a plugin or add-on to an app. In three to five years we see this becoming its own industry and we want to become one of the market leaders in this space.”

“There is a vast ocean of use cases for machine learning security. Right now, we want to focus on virtual assistants and chatbots. This is where we’ve started and we want to be a leader in not only security for chatbots, but other ML systems. We are currently doing R&D and are talking to companies to figure out what other areas we can help them protect.” 

Heightened security for machine learning models will benefit both the data science community and everyday users of AI technology. In the first half of 2020, we saw IBM withdraw from the facial recognition market, citing evidence of inherent racial bias and possible misuse by law enforcement. It is important that more large companies like IBM, Delta, and Tesla take a step back and put security and social impact before development.

Hopefully, more companies like Scanta will emerge in the machine learning field to create safer AI systems for the companies that develop machine learning technologies and the people that use them. 

The Author
Limarc Ambalina

Limarc writes content for Lionbridge’s website as part of the marketing team. Born and raised in Canada, Limarc’s love of Japanese pop culture brought him to Japan in 2016 and living in Japan has been his dream come true. Apart from Lionbridge content, you can catch Limarc online writing about anime, video games, and other nerd culture.

