AI Security Risks

This is a guide on AI Security Risks.


Today we are going to talk about some of the more common AI security risks, including attacks on AI models, data security risks, code maintainability, and supply chain complexity. These security risks are why we are starting to see more emphasis on creating secure AI models and implementing privacy-preserving AI systems. These are systems designed, created, tested, and procured with security and privacy in mind.


They are specifically tailored to reduce the likelihood of such a risk turning into an actual attack. In terms of data security, we are going to have some vulnerabilities in our AI pipeline. Any kind of software is going to contain some sort of vulnerability, and AI-based software is no different. This means that all of the pipeline operations around our AI model, such as the collection, storage, and usage of our data, are going to be subject to various vulnerability risks, and this includes our production data. Compounding this risk is the fact that AI models and their associated data often make use of cloud-based services, which bring their own set of risks and complications when it comes to keeping your data, and your usage of those services, secure. The net effect is that when it comes to data security, we have a very wide attack surface that we need to protect and try to reduce.
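One basic control for that pipeline attack surface is keeping data encrypted whenever it sits at rest or is handed to a cloud service. The sketch below is a minimal illustration, assuming the third-party cryptography package is installed; the record and the key handling are placeholders, and in a real pipeline the key would live in a key-management service.

```python
# Minimal sketch: encrypt a record before it ever reaches cloud storage.
# Assumes the third-party "cryptography" package is installed; the record
# and key management here are simplified placeholders.
from cryptography.fernet import Fernet

# In a real pipeline the key would come from a key-management service,
# never be generated ad hoc and stored next to the data.
key = Fernet.generate_key()
fernet = Fernet(key)

record = b"user_id=123,age=47,label=anomalous"   # stand-in training record
ciphertext = fernet.encrypt(record)              # this is what gets uploaded

# The pipeline decrypts only where the data is actually needed.
assert fernet.decrypt(ciphertext) == record
```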


There are also attacks that target the AI model itself, and these attacks often aim to compromise some combination of the model's integrity, its reliability, and its security. A common attack vector for AI models is the inputs that get fed into the model. Malicious actors will craft inputs that deliberately mislead or confuse an AI model in order to get it to produce inconsistent or erroneous results. They can also use malformed inputs, in the same spirit as SQL injection attacks, to try to exploit software vulnerabilities within the AI model itself. So, let us take a look at some specific attack types that represent risks to our AI model.
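Before we do, here is a minimal sketch of what a misleading input can look like. The tiny linear "model", its weights, and the feature values are all made up for illustration; the point is only that a small, deliberate nudge to each input feature can flip the decision.

```python
import numpy as np

# Toy linear "model": a positive score means the input is classified as benign.
w = np.array([1.0, -2.0, 0.5])   # made-up weights
b = 0.1

def predict(x):
    return "benign" if w @ x + b > 0 else "malicious"

x = np.array([0.9, 0.2, 0.4])          # an ordinary-looking input
print(predict(x))                       # benign (score = 0.8)

# The attacker nudges each feature slightly in the direction that lowers the
# score; for a linear model that direction is just the sign of the weights.
epsilon = 0.4
x_adv = x - epsilon * np.sign(w)
print(predict(x_adv))                   # malicious (score = -0.6)
```

Real models are far more complex, but gradient-based adversarial examples follow essentially this pattern at scale.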


The first is data poisoning. A data poisoning attack is when a malicious actor manipulates or changes the training data in some way to alter the behavior of the model. An example of this would be a malicious actor altering the training data of an anomaly detection system to reduce that system's detection accuracy, so that a piece of malicious software can bypass it. Where data poisoning tries to fundamentally alter the behavior of the AI model through its training data, input manipulation is performed against a production AI model: malicious actors feed erroneous or malformed inputs into the model to get it to act incorrectly or inconsistently. This is another significant risk to the AI model itself, because we do not want our model behaving in ways we have not tested or validated once it is in production.
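To make data poisoning concrete, the sketch below flips a fraction of the training labels for a toy detector and compares accuracy before and after. It assumes scikit-learn is available, and the synthetic data, poisoning rate, and model choice are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for anomaly detection data: label 1 = malicious, 0 = benign.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clean_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("clean accuracy:   ", clean_model.score(X_test, y_test))

# Poisoning: the attacker flips the labels on 30% of the training rows,
# teaching the detector that certain malicious patterns are "benign".
rng = np.random.default_rng(0)
poisoned = y_train.copy()
flip = rng.choice(len(poisoned), size=int(0.3 * len(poisoned)), replace=False)
poisoned[flip] = 1 - poisoned[flip]

poisoned_model = LogisticRegression(max_iter=1000).fit(X_train, poisoned)
print("poisoned accuracy:", poisoned_model.score(X_test, y_test))
```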


Another attack type is model inversion. A model inversion attack is where a malicious actor tries to reverse engineer the outputs of an AI model in order to extract personally identifiable information about a subject from those outputs. Effectively, an attacker trains a new AI model using the output from your AI model as the input to their own model. In this way, they try to train their model to predict the inputs that produce a given output for your model, which can lead to a compromise of your data privacy. Along the same lines, we also have the attack type of membership inference. Membership inference is where a malicious actor tries to determine whether a given person's information was used to train a model, based on related information they can extract from the AI model itself.
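A minimal sketch of one well-known membership inference approach is shown below: an overfit model tends to be much more confident on records it was trained on, so an attacker with query access simply thresholds the model's confidence. It assumes scikit-learn; the data, model, and threshold are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=1)
X_member, X_nonmember, y_member, _ = train_test_split(X, y, random_state=1)

# A deliberately overfit model is far more confident on records it has seen.
model = RandomForestClassifier(n_estimators=50, random_state=1).fit(X_member, y_member)

def top_confidence(rows):
    return model.predict_proba(rows).max(axis=1)

# Attacker's rule: records scoring above the threshold are guessed to have
# been part of the training set (i.e., "members").
threshold = 0.95
print("members flagged:    ", np.mean(top_confidence(X_member) >= threshold))
print("non-members flagged:", np.mean(top_confidence(X_nonmember) >= threshold))
```

The gap between those two rates is exactly the information that leaks.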


Beyond specific attacks, there are other AI security risks that we need to look out for. The first is AI code reuse. A huge number of the AI projects available today all rely on the same small group of publicly available libraries, which means that a huge number of AI models are all sourcing from the same code base. As such, if there are any security or privacy problems with that shared code base, those problems are going to extend to every AI model making use of it. So, if a given model's creator does not do their due diligence to ensure that the code libraries they are making use of are secure and free from critical vulnerabilities, then the AI model itself will be subject to those vulnerabilities and represent a security risk in your environment. The complexity of the supply chain surrounding AI models also represents a security risk. The supply chain for an AI model typically draws on a wide variety of sources for all of the different components that go into creating it, and increasing supply chain complexity increases the opportunity for malicious actors to act at some point along the chain and inject a piece of malicious software or hardware into your AI model. These supply chain attacks are particularly difficult to defend against because so much of the supply chain ends up outside of your direct control, and you have to rely on each third-party vendor's security. This is where good public auditing comes into play, along with doing your due diligence in investigating the security track record and supporting evidence from all of the third-party vendors involved in every aspect of your supply chain.
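Part of that due diligence can be automated. The sketch below verifies that a downloaded artifact, such as pretrained model weights or a vendored library, matches a published SHA-256 checksum before it is ever loaded; the file name and expected hash are hypothetical placeholders.

```python
import hashlib

# Hypothetical values: in practice the expected checksum comes from the
# vendor's signed release manifest, not from a hard-coded string.
ARTIFACT = "downloaded_model_weights.bin"
EXPECTED_SHA256 = "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08"

def sha256_of(path, chunk_size=1 << 20):
    """Stream the file so large model artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

if sha256_of(ARTIFACT) != EXPECTED_SHA256:
    raise RuntimeError(f"{ARTIFACT} does not match the published checksum; refusing to load it")
```

Package managers offer the same idea natively; pip's hash-checking mode (pip install --require-hashes -r requirements.txt), for instance, rejects any dependency whose download does not match a pinned hash.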


On the software development side, we also have a risk around AI code maintainability. The inner workings of an AI model can quickly become very complex, to the point that it can be difficult even for the people who designed the model to explain what is happening inside its decision-making process. As such, going forward in the code's life cycle, as new developers cycle in and older developers cycle out, it can be difficult to perform updates or to understand at all how the AI model is coming to its decisions. This represents a risk because the more difficult a codebase is to update, the more likely it is to become out of date and subject to new vulnerabilities.
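One modest way to keep a model's decision making legible to future maintainers is to record which inputs actually drive its predictions. The sketch below uses permutation importance from scikit-learn on a toy model; the data and model are illustrative, and the resulting ranking would be kept alongside the model's documentation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=8, random_state=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2)

model = GradientBoostingClassifier(random_state=2).fit(X_train, y_train)

# Shuffle one feature at a time and measure how much the held-out score drops;
# a large drop means the model leans heavily on that feature.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=2)
for i in result.importances_mean.argsort()[::-1]:
    print(f"feature {i}: importance {result.importances_mean[i]:.3f}")
```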


So, as we have seen, there are many AI security risks that we need to account for, spanning data protection, attacks on the model itself, shared code bases, supply chain complexity, and code maintainability.