Hardened Cloud Security

This is a guide on hardened cloud security.

Hardened Hosts and Guests

Now, when you install an operating system, by default there will be a lot of services running. There will be a lot of features installed, and a lot of ports left open. It depends on your requirements whether you want these services to keep running and these ports to stay open. In a normal scenario, a user would just leave them as is, so there would be no change to the operating system configuration. But when you're talking about hardening an operating system, you would want to stop the unnecessary services. You would want to patch your operating system. You would want to take away administrative rights from the users.

You would want to close the open ports that are not required. Now in this video, we are showing the hardening scripts for Linux and Windows. Note that these scripts run in the on-premises environment in the same way as they run within cloud instances, which could be Linux or Windows. These scripts do not have a different format for the cloud environment. An administrator can simply execute a script to harden an operating system, be it Windows or Linux. It is important to understand that the scripts can be run from within any operating system instance, and they do the hardening based on whatever has been added to them. For instance, if you decide to shut down or disable a few services, you can add that to the scripts that we are showing.

And when you run that script, it will disable or shut down the set of services that you listed in it. Again, just to reiterate, this script can be run on your local system as well as within cloud instances, which could be running either Windows or Linux. To start with the Linux operating system, to be able to log on,

[Video description begins] A Linux login screen appears. It displays a username and a "Password:" field. The Password field contains a text box. The Cancel and Unlock buttons display underneath the Password field. The "Log in as another user" option displays underneath the two buttons. [Video description ends]

we need to enter the password and click unlock. Now, see there is already a script that is open.

[Video description begins] A terminal window titled: root@localhost: ~ displays. The header contains the following options, namely: Applications, Places, Terminal. The menu bar contains the following options, namely: File, Edit, View, Search, Terminal, Help. The windows pane displays various lines of commands. [Video description ends]

Now, in this script, notice that there are a lot of functions that are mentioned for this particular server. So there is a web server and a mail server. There is a control panel, which is known as cPanel. There is a web server, VoIP server, mail server. Now, do we want to keep all these ports open? Do we want to have all these features installed in this particular server? So let's scroll down a little bit. So we have these standard ports that we require for our web servers and mail servers. So we have the TCP and UDP ports open for them. When we scroll down a little bit, we have certain TCP ports mentioned. Then we come to the password section. We need to safeguard the password.

So we also ensure that the password is encrypted. Moving down, we need to install the useful packages. Now, by default, an operating system may not be updated. So you can update the operating system using the yum -y update command. This command will bring your installed packages up to date. It will also install the required dependencies for these packages. So we need to do that, and this script will do it automatically. And then when we are talking about the password policies, we will have something like the number of days after which the password expires. And moving down, we have this section on the cron jobs. Do we want the administrator to run the cron jobs? So now, let's switch over to Windows, and now we have this script in Windows

[Video description begins] He closes the terminal window. [Video description ends]

that we are going to use to disable certain unnecessary services.

[Video description begins] A Microsoft Word document displays on the screen. The top ribbon displays the following tabs, namely: Home, Insert, Design, Layout, References, Mailings, Review, View. Currently, the "Home" tab is selected. It contains the following options, including: the formatting toolbar which facilitates change of the font-style of the text, add bullet points or numbered lists, format text indention, add or modify headings in the text and many more. The content pane displays various lines of text. [Video description ends]

We'll also change certain registry settings. So when you look at it, there is Microsoft Windows Explorer AutoPlay, which is not disabled. So we are going to go ahead and disable this using this script. Now, there are a lot of registry keys that will get modified by this script. Then let's move down, and there are a lot of other protocols that need to be disabled, so we have TLS 1.0, we have TLS 1.1. And then we also need to disable the RC4 cipher. And then we are talking about patching particular vulnerabilities in Windows, which can lead to a denial-of-service attack. And then further moving down, we are also applying strong cryptography for .NET version 4.0. So these are some of the things that you can do through a script.
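As a rough illustration of the kind of registry changes just described, here is a minimal Python sketch that only builds `reg add` command strings and never executes them. The transcript does not show the script's actual keys, so the specific paths and values below (the SCHANNEL protocol and cipher keys, `NoDriveTypeAutoRun`, and `SchUseStrongCrypto`) are assumptions based on commonly documented Windows settings, and the function names are hypothetical.

```python
from dataclasses import dataclass

# Assumed base key for SCHANNEL protocol and cipher settings.
SCHANNEL = r"HKLM\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL"


@dataclass(frozen=True)
class RegistrySetting:
    key: str
    name: str
    value: int  # written as a REG_DWORD


def _path(*parts):
    """Join registry path components with backslashes."""
    return "\\".join(parts)


def windows_hardening_settings():
    """Return the registry values a hardening script might set (assumed list)."""
    settings = []
    # Disable the legacy TLS versions for both client and server roles.
    for protocol in ("TLS 1.0", "TLS 1.1"):
        for role in ("Client", "Server"):
            settings.append(
                RegistrySetting(_path(SCHANNEL, "Protocols", protocol, role), "Enabled", 0)
            )
    # Disable the RC4 cipher at each key length.
    for cipher in ("RC4 128/128", "RC4 64/128", "RC4 56/128", "RC4 40/128"):
        settings.append(RegistrySetting(_path(SCHANNEL, "Ciphers", cipher), "Enabled", 0))
    # Turn off AutoRun/AutoPlay for all drive types (0xFF).
    settings.append(
        RegistrySetting(
            r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\Explorer",
            "NoDriveTypeAutoRun",
            255,
        )
    )
    # Ask .NET Framework 4.x to use strong cryptography.
    settings.append(
        RegistrySetting(
            r"HKLM\SOFTWARE\Microsoft\.NETFramework\v4.0.30319", "SchUseStrongCrypto", 1
        )
    )
    return settings


def to_reg_command(setting):
    """Render one setting as a reg.exe command line (not executed here)."""
    return (
        f'reg add "{setting.key}" /v {setting.name} '
        f"/t REG_DWORD /d {setting.value} /f"
    )


if __name__ == "__main__":
    for s in windows_hardening_settings():
        print(to_reg_command(s))
```

Printing the commands first, instead of applying them, lets an administrator review the list against their own requirements before anything changes on the machine.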

And you need not do all this manually, because if you do it manually, you may miss certain steps on one system when there is more than one system on the network. So therefore, it is always good to create a script or use an automated tool that can help you close certain vulnerabilities. And of course, an automated tool would also detect the vulnerabilities and help you close them. But a script is one way, which is free. You can create your own scripts and close the vulnerabilities by running them. Just to recap, we saw how to harden a Linux host and also a Windows host using two different scripts. One was designed for Linux, and the other one was designed for Windows.
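For the Linux side, the same idea can be sketched as a dry-run plan generator. This is only an illustration under stated assumptions: the transcript mentions `yum -y update`, unnecessary services, open ports, and password expiry, but not the exact commands in the script, so the use of `systemctl`, `firewall-cmd`, and `chage` here is an assumption, and the function names and sample values are hypothetical.

```python
import shlex


def linux_hardening_plan(services_to_disable, ports_to_close, password_max_days, users):
    """Build (but do not run) the commands a hardening script might execute.

    ports_to_close is a list of (port, protocol) pairs such as (25, "tcp").
    """
    plan = [["yum", "-y", "update"]]  # bring installed packages up to date
    for service in services_to_disable:
        # Stop the service now and prevent it from starting at boot.
        plan.append(["systemctl", "disable", "--now", service])
    for port, proto in ports_to_close:
        # Assumes firewalld manages the host firewall.
        plan.append(["firewall-cmd", "--permanent", f"--remove-port={port}/{proto}"])
    if ports_to_close:
        plan.append(["firewall-cmd", "--reload"])
    for user in users:
        # Maximum number of days before the password must be changed.
        plan.append(["chage", "-M", str(password_max_days), user])
    return plan


def render(plan):
    """Turn each command list into a shell-safe string for review."""
    return [shlex.join(cmd) for cmd in plan]


if __name__ == "__main__":
    for line in render(linux_hardening_plan(["cups"], [(25, "tcp")], 90, ["alice"])):
        print(line)
```

Because the plan is just data, the same list can be reviewed, stored in version control, and then applied identically to every system, which is the consistency argument made above.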


Physical Security

When you talk about physical security, it is the responsibility of the public cloud service provider. Now if you're running a private cloud, then the physical as well as the logical security becomes your responsibility. But that is not the case in the public cloud. So no matter what cloud service model you're using, be it platform as a service, software as a service, or infrastructure as a service, the physical security will remain the responsibility of the cloud service provider. Now threats to the cloud system can be physical in nature, and they can be environmental as well. And be it physical or environmental, both of them can destroy the physical systems present in the data center.

Now just like logical security, physical security has to be driven by security policies and processes. You need to have security policies in place. They must be detailed enough to guide the processes that determine how physical security is handled. So you cannot say that your physical infrastructure, be it on-premises or in the cloud, is secure if you do not have security policies that cover it. You need security policies and processes that help you understand how to apply security to the physical infrastructure. Then you're talking about nondescript buildings, which means that these buildings do not have any kind of signage.

Now if you talk about an AWS data center, it will not have Amazon written on it. It will be a building without any kind of sign that refers to the company's name. This is done for security purposes: because the building remains unnamed, the risk of physical theft is reduced. Then we move on to screening employees. The cloud is used by many consumers who place private and confidential information on these storage systems. For the cloud service provider to ensure that information remains confidential and private to the respective customers, the provider needs to screen each and every employee during the hiring process.

This means they have to do a background check, and they have to check these people's criminal records. All this needs to be checked when these employees are being onboarded. The reason I'm emphasizing this is that these employees are going to be managing the physical systems, and they will have access to the data residing on these systems. Therefore, it is critical to do a background check on these employees before allowing them inside the data center. Now suppose you have done the background check and the screening, and you found that the employees are a perfect fit for the organization and do not have any kind of criminal record.

That does not mean that everybody gets into the data center, so you have to give selective access only to the required individuals. Now when you're specifically talking about the data center, it holds critical systems. So every individual who needs to get into the data center must cross multiple access points. For instance, there should be one access point at the main gate. Then before you get into the main data center where the servers are housed, there should be another access point. The multiple access points are there to ensure that even if somebody gets through one access point, he or she should not be able to go through the second one.

Then you also need to ensure that your data center has uninterrupted power, which means each server and storage box must be equipped with redundant power supplies. The data center should also be equipped with uninterruptible power supplies. That means that you have the main power supply, and then you have the UPS, or uninterruptible power supply, running in the back end. And there could also be a further level of redundancy, built by having a different power source that is activated when both the primary supply and the UPS go down. So you need to have 100% uninterrupted power in the data center. You also need to ensure that you have alarms and CCTV cameras.

Now these are detective physical controls, which means they can detect any kind of unauthorized movement that takes place within a given premises. Along with them, you can also have motion sensors installed, which can detect any kind of movement taking place in a given area. These security controls are quite handy at odd hours. For example, if movement is detected in the data center in the middle of the night, and you know for a fact that no one should be there at that time, you have a reason to suspect a physical breach.
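The after-hours example above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any real sensor product: the event format (a timestamp plus a zone name) and the staffed hours of 07:00 to 20:00 are assumptions chosen for the example.

```python
from datetime import datetime, time


def is_after_hours(timestamp, opens=time(7, 0), closes=time(20, 0)):
    """Return True when the timestamp falls outside the assumed staffed hours."""
    return not (opens <= timestamp.time() < closes)


def suspicious_motion(events, opens=time(7, 0), closes=time(20, 0)):
    """Keep only motion events recorded when nobody should be present.

    Each event is a (timestamp, zone) pair from a hypothetical motion sensor.
    """
    return [event for event in events if is_after_hours(event[0], opens, closes)]


if __name__ == "__main__":
    events = [
        (datetime(2024, 1, 10, 10, 0), "server hall A"),
        (datetime(2024, 1, 11, 2, 30), "server hall A"),
    ]
    for when, zone in suspicious_motion(events):
        print(f"Possible physical breach: motion in {zone} at {when}")
```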

Then another thing that you must do to ensure physical security is install air and particle filtering. Now why would you need to do that? Because this point is related to environmental factors. The data center can be prone to dust and moisture, and therefore it is essential to install air and particle filtering. You also need to ensure that there are enough fire extinguishers installed in the data center to protect against any fire that breaks out. Just to recap, we looked at different methods for ensuring physical and environmental security in the data center.


Data Confidentiality

Confidentiality, or data confidentiality, is about protecting information from any kind of unauthorized access. What it means is that if you have some data, no unauthorized person should be able to access it. In other words, sensitive or confidential information must not be disclosed to any unauthorized entity. This entity could be a user, a process, or even a device.

In most organizations, access control is practiced to provide the appropriate level of access to users on a given data set. Now, for example, you have access to a particular folder on the network. If somebody who is not authorized attempts to access that folder and succeeds, the confidentiality of that folder and the data inside it is breached. We don't want that to happen. So how do we protect data confidentiality? One of the main methods of maintaining data confidentiality is encryption. It keeps the data unreadable to anyone who gains unauthorized access to it.

You may also need to classify the data based on its sensitivity, that is, on the level of confidentiality that the data has. For instance, the data can be marked as public, as private, or as confidential. There are going to be differences between the types of data you have. Accordingly, you can apply encryption or access control based on the type of data. So let's assume you have classified the data appropriately, but the server hosting the data has been compromised.

In this scenario, a hacker who gets into the server will easily walk away with the data. You cannot prevent a hacker from taking the information after the server has been compromised. But what you can do is still maintain the data confidentiality, and that is done by applying strong encryption to that data set. Now there are two states of data: data at rest and data in motion. At any given point in time, data will either be in storage or in motion, which means it is in transmission. It is being transmitted from one endpoint to another, be it within the network or outside the network.

So now let's first look at data at rest, which means the data is in storage. Data at rest can be short-term data or long-term data. Now, in the cloud environment, specifically when you are in a multi-tenant environment, you must protect your data. And how do you do that? Again, you have to apply encryption, because there is a high probability of data leakage. There could be various ways in which data is leaked. It could be one of the internal employees working with the cloud service provider. It could be that the server itself is breached, or it could be that the application that uses the data set is breached. So there are various reasons why your data can be in danger. Therefore, you should always apply encryption when the data is at rest. Now what happens when your encrypted data is taken away by somebody?

Now this somebody, who is an unauthorized user, will need to make some sense out of that data. To be able to do that, the person needs to decrypt the data, for which this person needs the decryption key. Otherwise, the person has to attempt some other method of breaking the encryption, which, if you have applied strong encryption, is practically impossible. Now let's talk about data in motion. Data in motion is data that is being transmitted from one endpoint to another. Both endpoints could be within the network, or they could be on the Internet as well. Depending on where you are transmitting your data, whether it is within the network or outside the network, you need to apply the appropriate level of encryption.
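To make the data-at-rest point concrete (a stolen encrypted blob is useless without the key), here is a deliberately simplified, standard-library-only Python sketch. It is a toy construction for illustration, not a recommendation: it derives a keystream with SHAKE-256 and authenticates with HMAC-SHA-256, whereas a real deployment would normally use a vetted authenticated cipher such as AES-GCM from a maintained library or the cloud provider's managed encryption. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16  # random per-message value so equal plaintexts differ
TAG_LEN = 32    # HMAC-SHA-256 output size


def _subkeys(key):
    """Derive separate encryption and authentication keys from one secret."""
    enc_key = hashlib.sha256(b"enc" + key).digest()
    mac_key = hashlib.sha256(b"mac" + key).digest()
    return enc_key, mac_key


def _xor_keystream(enc_key, nonce, data):
    """XOR data with a keystream derived from the key and nonce (toy cipher)."""
    stream = hashlib.shake_256(enc_key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_at_rest(key, plaintext):
    """Return nonce + ciphertext + tag for storage."""
    enc_key, mac_key = _subkeys(key)
    nonce = secrets.token_bytes(NONCE_LEN)
    ciphertext = _xor_keystream(enc_key, nonce, plaintext)
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def decrypt_at_rest(key, blob):
    """Verify the tag, then recover the plaintext; fail without the right key."""
    enc_key, mac_key = _subkeys(key)
    nonce = blob[:NONCE_LEN]
    ciphertext = blob[NONCE_LEN:-TAG_LEN]
    tag = blob[-TAG_LEN:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered data")
    return _xor_keystream(enc_key, nonce, ciphertext)


if __name__ == "__main__":
    key = secrets.token_bytes(32)
    stored = encrypt_at_rest(key, b"customer record 42")
    print(decrypt_at_rest(key, stored))
```

Someone who copies the stored blob but does not hold the key gets only the `ValueError`, which mirrors the scenario described above.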

For example, if you have a web server that holds the data, you could use TLS encryption, which stands for Transport Layer Security. Now if the data is being transmitted from one endpoint to another over the Internet, you could always use a VPN, which can use a strong encryption protocol known as IPsec, or IP security. So this is the way you can protect data in motion. Just to recap, we talked about data confidentiality, and we also talked about data at rest and data in motion and the methods that can be used to protect the data.
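On the client side, protecting data in motion with TLS can be sketched with Python's standard `ssl` module. This is a minimal example under one assumption that is not in the transcript: that TLS 1.2 is the lowest version you are willing to accept. The function name is hypothetical.

```python
import ssl


def strict_tls_context():
    """Create a client TLS context that verifies certificates and hostnames.

    ssl.create_default_context() already enables certificate and hostname
    verification; the minimum version is raised explicitly as an assumption.
    """
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse TLS 1.0 and 1.1
    return context


if __name__ == "__main__":
    ctx = strict_tls_context()
    print(ctx.minimum_version, ctx.verify_mode, ctx.check_hostname)
```

A context like this can be passed to standard-library clients, for example as the `context` argument of `urllib.request.urlopen`, so that connections using older protocol versions or invalid certificates are rejected.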


Vulnerability Analysis

In this particular video, we are going to show you how to conduct a vulnerability scan. Now, vulnerability scanning is an automated method, so you will need to use a tool to perform the scan. In this scenario, we are going to be using BeyondTrust Network Security Scanner on Windows Server 2016. The intent here is to show the vulnerability scanning of a web application that is public and Internet facing, and you can use these tools to scan such an application. Even though there are tools available within AWS, our intent is to show the use of more than one tool. Therefore, we have decided to use BeyondTrust, which is one of the most renowned tools in the industry. It is widely used with public-facing web applications and provides better results than most of the tools available.

While BeyondTrust is a commercial tool, we are also going to be using an open source tool, OWASP ZAP, which is again one of the most renowned tools in the industry. It is more critical to scan public-facing web applications than internal ones, which cannot be exploited until someone gets into the network. Therefore, it is good to scan public-facing web applications using external tools to build confidence that there are no vulnerabilities.

[Video description begins] The screen displays a BeyondTrust Network Security Scanner Community tool. This tool shows three tabs, Audit, Remediate, and Report. Currently, the Audit tab is active. This tab has three sections, Actions, Scan Jobs, and Scanned IPs. The Actions section shows a Targets sub-section, which displays a Target Type menu, and three field bars to enter IP address, Filename, and Job Name. Also, the Target section has a Scan button. The Scan Jobs section shows an Active sub-section, which displays a table with five columns, Job Name, Status, Start Time, Data Source, and Scan Engine. Also, the Scan Jobs section shows three buttons, Pause, Abort, and Refresh. The Scanned IPs section shows various scan-related results, such as Credentials, OS Detection, Domain name and so on. In the menubar, there are five menus, File, Edit, View, Tools, and Help. [Video description ends]

This is where we have installed this tool. And now we are going to go ahead and scan the skillsoft.com website. Once we do that, it is going to provide us with the results: the IP address that is being scanned and the operating system that it has detected. It is also going to provide us with the domain name that it detected during the scan. And it will also list the ports by state, such as closed ports, filtered ports, and open TCP ports. Now, when you talk about the filtered ports, those are the ports that are behind the firewall.
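The difference between open, closed, and filtered ports can be illustrated with a small TCP connect probe in Python. This is a simplified sketch and not how BeyondTrust itself works; treating a timeout as "filtered" is an assumption that matches the firewall explanation above. Only probe hosts that you are authorized to scan.

```python
import socket


def probe_port(host, port, timeout=1.0):
    """Classify a TCP port as 'open', 'closed', or 'filtered'.

    open:     the connection is accepted
    closed:   the host actively refuses the connection
    filtered: nothing answers before the timeout, as when a firewall
              silently drops the packets (an assumption of this sketch)
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        return "closed"
    except socket.timeout:
        return "filtered"
    finally:
        sock.close()


if __name__ == "__main__":
    # 127.0.0.1 is used so the example only touches the local machine.
    for port in (80, 443):
        print(port, probe_port("127.0.0.1", port))
```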

[Video description begins] He scrolls down the Scanned IPs section to show the scan-related results. [Video description ends]

So let's just scroll down now. It has done several audits on the web services, and it has scanned through port 80 and port 443. Let's continue to scroll down. If there is any kind of issue that is detected, it will let you know, but look at it: it has detected a lot of information. It has detected the IP address, and it has detected the trace route. When we go back up, it has also found one particular issue, which is the SSL Certificate Domain Name Mismatch. These kinds of issues will be caught when you do a vulnerability scan. So now you have to go back and fix the issue that you caught during the vulnerability scan. There might also be more issues that are of very low importance or that are just false positives, but you have to evaluate each and every issue that has been detected by the automated tool.

You have to evaluate every one of them because an issue that you happen to overlook might turn out to be of high criticality. So you have to be very sure about the issues that are caught in the vulnerability scan. So now, let's switch over to ZAP, which is a tool from OWASP.

[Video description begins] The ZAP tool screen displays. This tool shows various options, such as a field bar to enter the URL to attack, a Use traditional spider check box, a Use ajax spider check box along with a menu, an Attack button and so on. In the bottom section, this tool shows various tabs, such as History, Search, Alerts and so on. Currently, the History tab is active, and it shows a table with various columns, such as Id, Timestamp, Code, Method, URL and so on. [Video description ends]

So, in the URL to attack field, we just type in the IP address or the URL that we want to scan. So in this case, we have entered the URL and we click the Attack button.

[Video description begins] As he clicks the Attack button, a Spider tab opens in the bottom section. This tab shows a table with four columns, Processed, Method, URL, and Flags. Above the table, this tab shows various kinds of scan-related information, such as Current Scans, URLs Found, Nodes Added and so on. [Video description ends]

Now, once we do that, notice it shows the progress: it is now using the traditional spider to discover the content. So it is doing a deep scan of the website that we entered as the URL, and it will find all the issues that it can. We'll just let the tool run for a while, and once it finishes the scan, we will go back and check the results. When we start evaluating the results, we'll look at the ones that are of key importance. For instance, if some items are flagged, you will have to go back and see whether they are false positives or accurately reported errors. Now, if you look just above the table with the URL column, there is a URLs Found field, which shows the number of URLs it has been able to find under the skillsoft.com website.

It will continue to run until it has scanned all the URLs that it has found. These are going to be something like skillsoft.com/industries or skillsoft.com/wp/json. The scan will continue to run as long as it keeps finding URLs or sub-URLs within the given website. And finally, once it is done, it will give you the exact results of what kind of vulnerabilities it has been able to locate. Whenever you're doing a vulnerability scan, it is also good practice to use more than one tool. That is necessary because one tool may skip over some vulnerabilities that you can detect with the second or third vulnerability scanning tool that you are using.

Now once the scan is complete, you should go back and carefully analyze the results. You may have to close out the vulnerabilities, or they may just be false positives, in which case no further action is required. Again, I'm emphasizing the fact that you have to evaluate every single result that is found in the vulnerability scan. Now, just to recap, we used two different tools in this video, and we saw how you can find vulnerabilities by running these tools.
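The advice to use more than one tool and then evaluate every finding can be sketched as a small merge step in Python. Neither BeyondTrust nor ZAP is called here; the `Finding` structure, the severity labels, and the sample findings (apart from the certificate mismatch named above) are hypothetical, and real tools export richer reports that would first need to be mapped into this shape.

```python
from dataclasses import dataclass

# Lower number means more urgent; the labels are assumptions for this sketch.
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2, "info": 3}


@dataclass(frozen=True)
class Finding:
    tool: str
    target: str
    title: str
    severity: str


def merge_findings(*reports):
    """Combine reports from several scanners into one review list.

    Findings with the same target and title are merged, the most severe
    rating wins, and every entry starts as 'needs review' so that nothing
    is dismissed as a false positive without being looked at.
    """
    merged = {}
    for report in reports:
        for finding in report:
            key = (finding.target, finding.title.lower())
            entry = merged.setdefault(
                key,
                {
                    "target": finding.target,
                    "title": finding.title,
                    "severity": finding.severity,
                    "tools": set(),
                    "status": "needs review",
                },
            )
            entry["tools"].add(finding.tool)
            if SEVERITY_ORDER[finding.severity] < SEVERITY_ORDER[entry["severity"]]:
                entry["severity"] = finding.severity
    return sorted(
        merged.values(), key=lambda e: (SEVERITY_ORDER[e["severity"]], e["title"])
    )


if __name__ == "__main__":
    scanner_a = [Finding("scanner-a", "example.com", "SSL Certificate Domain Name Mismatch", "medium")]
    scanner_b = [
        Finding("scanner-b", "example.com", "SSL certificate domain name mismatch", "high"),
        Finding("scanner-b", "example.com", "Missing security header", "low"),
    ]
    for entry in merge_findings(scanner_a, scanner_b):
        print(entry["severity"], entry["title"], sorted(entry["tools"]), entry["status"])
```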


Security Decoys and Techniques

Decoy technology is used to detect unauthorized access to a system or a set of files that reside within a system. Decoy technology is basically meant to fool a hacker or attacker and divert their attention to a system that is completely bogus, a decoy. So why would you want to do that? That is the main question when you're setting up a network: why would you want a decoy system within that network or sitting on the edge of that network? You want to divert the attacker's attention from the real systems to a system that holds bogus files, fake files, and no real data of any kind. These kinds of systems are called honeypots.

Now, when you are putting up a single system, it is called a honeypot. When you have a set of these systems, it is called a honeynet. If you want to divert the attackers' attention from the real network to a fake system, a bogus system, then you have to set up a honeypot. If your network is large enough and there are hundreds of servers running, you may also end up setting up a honeynet, which will comprise multiple honeypots, and this honeynet will act like a completely independent network. But of course, there will not be any real data or real applications holding data within that honeynet. Now, when you have a honeypot, there is going to be a certain set of data files, and there are going to be certain applications.

Now, none of these are supposed to be accessed by anyone, and if someone does access them, you know that it is unauthorized access. The files within that honeypot are kept for a reason: you want to detect, deflect, or counteract any unauthorized attempt to gain access to the files or the information on this system. A honeypot can play various roles. It is mainly used to detect and prevent an attack. It is also used for gathering information about a hacker's movements or the methods used to conduct an attack on that honeypot. Basically, a honeypot looks like a real system on the network, or it might be sitting on the edge of the network. It might be in the demilitarized zone. It all depends on what kind of honeypot you use and how you configure it.

But whatever honeypot you configure, be it the external or the internal, what you want to see is if somebody connects to this system and tries to access the data or the applications within this particular system. If anybody does, you know it is an unauthorized access. So from your end, what you have to do is you have to leave that system as is on the network or outside the network, depending on what kind of honeypot you have configured. It could be internal, to detect internal employees' actions, or it could be outside the network to detect any kind of attack happening from the public network, which is the Internet. So once you set up the honeypot, you're not supposed to be doing anything except you have to continuously monitor this particular system. Remember, no one is supposed to be accessing this honeypot.

Now, if you find any internal or external entity connecting to a honeypot, you can conclude that this is unauthorized access. Therefore, a honeypot is only useful when it is compromised. If nobody accesses a honeypot, it is of no use. It is just a fake system, a bogus system, lying on the network and waiting for somebody to make a connection to it. It becomes useful only when somebody connects to the system, accesses it, and tries to manipulate or copy the data within it. That is where the honeypot actually serves its purpose.
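Because any connection to a honeypot counts as unauthorized, the monitoring logic can be very simple. The Python sketch below assumes connection records are available as (source, destination) address pairs, for example from firewall or flow logs; that input format, the addresses, and the internal network range are all assumptions made for illustration.

```python
import ipaddress


def honeypot_alerts(connections, honeypot_addresses, internal_network="10.0.0.0/8"):
    """Flag every connection whose destination is a honeypot.

    connections is an iterable of (source_ip, destination_ip) strings.
    Each alert records whether the source looks internal or external,
    which corresponds to the internal and external honeypot roles.
    """
    network = ipaddress.ip_network(internal_network)
    alerts = []
    for source, destination in connections:
        if destination in honeypot_addresses:
            is_internal = ipaddress.ip_address(source) in network
            alerts.append(
                {
                    "source": source,
                    "honeypot": destination,
                    "origin": "internal" if is_internal else "external",
                }
            )
    return alerts


if __name__ == "__main__":
    observed = [
        ("10.0.0.5", "10.0.0.99"),     # employee workstation touching the decoy
        ("203.0.113.7", "10.0.0.99"),  # address from a documentation range
        ("10.0.0.5", "10.0.0.10"),     # ordinary traffic to a real server
    ]
    for alert in honeypot_alerts(observed, {"10.0.0.99"}):
        print(alert)
```

There is no allowlist in this sketch on purpose: since no legitimate user has a reason to touch the decoy, every match is worth investigating.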

Now, one caution that you have to take with a honeypot is that if you put in too much data and too many applications, an expert hacker will probably realize that it is a honeypot. So it depends on your configuration: you have to set it up in a way that anybody who tries to access it does not simply recognize it as a honeypot. Be very cautious while configuring it so that you do not make it look like an obvious honeypot system. Now, just to recap, we talked about honeypots, and we also talked about internal and external honeypots. And we talked about being very cautious while setting up the honeypot, because we do not want to alert the attacker that it is a honeypot.


Secure Query Execution

So in a typical scenario, when you send a query to an application that has encrypted data in the back end, the moment that query is executed and data is returned, the data is no longer encrypted. In fact, the query itself is often not encrypted either. So what you need to do is use a method that protects your query, and that is what you call a secure query. You need to ensure that even the cloud server does not know what is inside the data. When the data is being queried, it is possible that the cloud server or an unauthorized user can gain access to the data. You want to protect the data confidentiality while the data is being accessed and returned to you through that query.

So in that entire scenario, you need to ensure your query is not being sent in clear text. If it is in clear text, anybody can interpret that query and understand what data you are requesting. You want to protect that. So how do you do that? You basically use a method to encrypt the query itself and send it to the cloud server to fetch the data. Now, what happens when you send an encrypted query? It works in the same way as a normal query, except that the operators you use within that query are performed on ciphertext, which means that they work on encrypted values. As a result, the server or anybody else will never come to know what data you are requesting.

You can also use modified operators in the encrypted queries. When you use an encrypted query, you keep the content stored in the back end of the application encrypted. This means that the data is not decrypted when you run the query. The query searches the encrypted data without decrypting it first and then brings the data back in encrypted form. So let's now look at a scenario. Let's assume a user has encrypted data in the cloud and needs to get that data. If you send a normal query, which is a clear-text query, the data will first be decrypted and then sent back to you. However, it is quite possible that the cloud server then knows what is being sent to the user. This means the confidentiality of the data might be compromised.

You want to protect that. You want to protect the confidentiality of the data. To be able to do that, you need to send a query that is encrypted. When you send an encrypted query, the query is processed on the cloud server and the response returned to you is also in encrypted form. The data is never decrypted on the server just because you executed a query to fetch it. So you send an encrypted query, and the response you get back is also encrypted, so that only the authorized client, which is you in this case, can understand it. This protects the confidentiality of the query and its results from the cloud server.

So whenever you want to ensure that data is not tampered with or read when you are sending a query, you use secure queries to ensure data confidentiality. In this scenario, the response that you get back from the server is also encrypted. And in this entire scenario, the cloud server will never gain knowledge of the data, the data patterns, the query itself, or the query results. Just to recap, you can use secure query execution to ensure that data is not tampered with or read while it is being requested from the cloud server through the query.
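One simple way to picture a secure query is keyword search over encrypted records using keyed tokens. The Python sketch below is a heavily simplified illustration and not the scheme of any particular product: the client sends an HMAC token instead of the clear-text keyword, and the server matches tokens and returns encrypted blobs without ever decrypting them. The record encryption is a toy SHAKE-256 keystream, all names are hypothetical, and unlike the stronger guarantee described above, this simple scheme still lets the server see which stored records share a token. Practical systems rely on dedicated searchable or homomorphic encryption schemes.

```python
import hashlib
import hmac
import secrets


def search_token(index_key, keyword):
    """Client side: turn a keyword into an opaque token the server can match."""
    return hmac.new(index_key, keyword.lower().encode(), hashlib.sha256).hexdigest()


def _xor(data_key, nonce, data):
    """Toy keystream cipher used only to keep this sketch self-contained."""
    stream = hashlib.shake_256(data_key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def seal(data_key, plaintext):
    """Client side: encrypt a record before uploading it."""
    nonce = secrets.token_bytes(16)
    return nonce + _xor(data_key, nonce, plaintext.encode())


def unseal(data_key, blob):
    """Client side: decrypt a record returned by the server."""
    return _xor(data_key, blob[:16], blob[16:]).decode()


class CloudServer:
    """Server side: stores and matches only tokens and ciphertext."""

    def __init__(self):
        self._index = {}

    def store(self, token, blob):
        self._index.setdefault(token, []).append(blob)

    def query(self, token):
        # The server compares tokens; it never sees keywords or plaintext.
        return list(self._index.get(token, []))


if __name__ == "__main__":
    index_key, data_key = secrets.token_bytes(32), secrets.token_bytes(32)
    server = CloudServer()
    for name, region in [("Alice", "east"), ("Bob", "west"), ("Carol", "east")]:
        server.store(search_token(index_key, region), seal(data_key, name))
    encrypted_results = server.query(search_token(index_key, "east"))
    print(sorted(unseal(data_key, blob) for blob in encrypted_results))
```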


Privacy and Information Systems

When you talk about information systems, these are the systems that you use to draw decisions out of raw data. The data in itself is pretty much raw, so you need to do some analysis to get something out of it. Many organizations have several information systems that contain data. For example, your organization may have a database server that contains terabytes of customer data. However, that data does not really help your organization because it is in raw form. You first have to decide how you want to use this raw data and how you want to convert it into useful information. For example, you may want to find out about specific customers in a given region of the United States of America. So you have to do that bit of analysis.

Now think about doing this without that analysis: you would have to scan through millions of records to find that particular information about the customers from a specific region. Information systems ease this process for you by filtering out the information that you want. When you have the data, it is raw. You may have terabytes of records, but none of it is of use if you do not know what you want from this data. When you filter the data using an information system, you get exactly the information that you wanted. Let me give an example. Like I said, your database server has terabytes of customer data. You simply run a query and filter the data to find out how many customers from the east region of the United States have bought your product.
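A filter like the one described here might look as follows in Python with the built-in `sqlite3` module. This is a small sketch under assumptions: the `purchases` table, its columns, and the sample rows are made up for illustration, and a real customer database would be far larger.

```python
import sqlite3

# In-memory database standing in for the customer database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer TEXT, region TEXT, product TEXT)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        ("Alice", "east", "widget"),
        ("Bob", "west", "widget"),
        ("Carol", "east", "gadget"),
        ("Alice", "east", "gadget"),  # a second purchase by the same customer
        ("Dave", "south", "widget"),
    ],
)


def customers_in_region(connection, region):
    """Count distinct customers who bought something in the given region."""
    row = connection.execute(
        "SELECT COUNT(DISTINCT customer) FROM purchases WHERE region = ?",
        (region,),
    ).fetchone()
    return row[0]


east_count = customers_in_region(conn, "east")
print(east_count)
```

The raw rows are the data; the single number returned by the query is the information you were looking for.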

This is the information that you were looking for. So it depends on what you are looking for in the given data set. Accordingly, you can search, filter, and query your data. Now let's talk about privacy. Privacy is all about keeping data private to a specific person. If you have some information that is confidential and that you do not want to reveal to other people, that is your private data, and you want to maintain its privacy. So data privacy is all about protecting the information of one or more individuals. When it comes to you as an individual, you want to protect the privacy of your own private data.

An organization, on the other hand, wants to protect the privacy of its employee and customer information. Now how do information systems and privacy relate to each other? Information systems are the ones that maintain the customer information for an organization. Therefore, it becomes the responsibility of the organization to protect the data after it has been collected through its information systems, which could be any online system, for example a simple web application on the Internet that acts as an agent for collecting information from customers. In any case, it does not matter through which medium you collect that information. It needs to be protected. It needs to be kept private.

When you collect information from a customer, it could include various components such as name, mobile number, social security number, and so on. It is the organization's responsibility to maintain the data and its privacy, which can be lost if somebody breaks into the organization's systems and accesses the customers' personal information. It does not matter what kind of information we have, whether personal, private, or sensitive. Once it is stored in an information system, say a database, it must be processed properly and protected from falling into the wrong hands. Whether the information is stored on an on-premise system or in the cloud, it must be protected at all times. The organizations holding the data must ensure its privacy at all times.

Several countries have various data privacy acts. In a generic sense, these privacy acts define guidelines on how to maintain customer information, how to keep it private, and how to protect it from any kind of data breach. Let's now talk about some examples of personal data. First, you have the name, which contains your first name, middle name, and surname. Then you have identity information, which can easily identify you. This could be your social security number, driving license, or a biometric identity such as a fingerprint. Then you have demographic information, which can include your religion. Then you have occupational information, which can contain details such as your industry, your organization's name, and the job role that you're performing.

Then we also have contact information, which can contain your residential and office address, mobile number, phone number, or email address. Then you have healthcare information, which can contain your health details such as your blood group or your medical history. Then you have financial information, which can contain your bank account information, debit card and credit card information, or historical information about your online purchases. Then finally, you have online information, which could be the IP address of your system, cookies, or login credentials. Just to recap, we talked about information systems and how data is processed to convert it into useful information. And then we also talked about the privacy of data and why it is important to maintain the privacy of any given data set that you may be holding.


Data and Media Sanitization

When you talk about the on-premise data center, there are various options that you can use to securely wipe data from a hard drive. Not all of these options are available when your data is in the cloud environment. When dealing with data on an on-premise system, you can do various things. You can physically destroy the media, you can wipe the hard drive, or you can use methods like degaussing. With physical destruction, it is nearly impossible for anybody to put the broken or shredded media back together. Let's say you have a DVD and you have shredded it into thousands of pieces. It is going to be practically impossible for someone to pick up those pieces, put them together, and retrieve the data.

You can also use degaussing, which applies a strong magnetic field to the media and destroys whatever content is stored on it. You have to be careful when you're dealing with SSDs, which are solid-state drives. Degaussing does not work on SSDs because they do not store data magnetically. This method only works on magnetic media such as a conventional hard drive. When you're dealing with on-premise data, you are in control of the hardware that stores the data. Therefore, you can do whatever you want with it. You can burn the hard drive and destroy the data. You can smash the hard drive, break it, cut it into pieces, or you can even shred it.

However, this is not the case when you are dealing with the cloud environment. You own the data, but you do not own the hardware. Therefore, methods like physical destruction and degaussing are not available to you when your data is in the cloud. For the cloud environment you have other methods: cryptographic erasure and data overwriting. Let's look at each one of them. Cryptographic erasure is also known as crypto-shredding. In this process, you encrypt the data using an encryption algorithm. The encryption involves a key, and this is the key that you would need to decrypt the data.

You can further encrypt the key using another encryption algorithm. Once the key itself is encrypted, you delete the encrypted key from your system. Without the key, the encrypted data that remains in the cloud can no longer be read. You can delete the key on the same system where you encrypted the data. However, it is always better to take the encrypted key to another location, another server, and delete it from there. If somebody then attempts to recover the key from the system where the data is encrypted, they are not going to find it there. So it is always better to move the key to another system and delete it from there, so that it can't be recovered from the same system where the data was encrypted.
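The core of crypto-shredding can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a production design: it uses a one-time pad (a random key as long as the data, combined with XOR) as a stand-in for a real cipher such as AES, it skips the step of encrypting the key a second time, and the helper names `encrypt`, `decrypt`, and `shred_key` are hypothetical. In practice the key would live in a key management system, and zeroing a Python buffer does not guarantee that no other copy of the key exists in memory or in backups.

```python
import secrets


def encrypt(data):
    """Encrypt with a fresh random key as long as the data (one-time pad)."""
    key = bytearray(secrets.token_bytes(len(data)))
    ciphertext = bytes(d ^ k for d, k in zip(data, key))
    return ciphertext, key


def decrypt(ciphertext, key):
    """XOR the ciphertext with the key to get the original bytes back."""
    return bytes(c ^ k for c, k in zip(ciphertext, key))


def shred_key(key):
    """Destroy the key by overwriting it in place with zeros."""
    for i in range(len(key)):
        key[i] = 0


record = b"name=Alice;ssn=000-00-0000"  # made-up sample record
ciphertext, key = encrypt(record)       # only the ciphertext goes to the cloud

recovered_before = decrypt(ciphertext, key)  # works while the key exists
shred_key(key)                               # crypto-shredding step
recovered_after = decrypt(ciphertext, key)   # no longer yields the record

print(recovered_before == record, recovered_after == record)
```

The ciphertext is never touched during the erasure. Destroying the key is what makes the data unreadable, which is why this method works even when you have no access to the underlying hardware.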

The second method that we are going to talk about is data overwriting. In this method, you overwrite the data with random data, and then overwrite the random data with zeros and ones. Once that happens, it becomes nearly impossible for someone to go back, undelete, and recover the data from your hard drive. Recovery becomes pretty much impossible if you have overwritten the data multiple times with random data. Therefore, you should use this method when you need to ensure that the data cannot be recovered. Just to recap, we talked about two different methods that we can use in the cloud environment: cryptographic erasure and data overwriting. Just remember that when we are moving out of the cloud, we have to apply one of these methods to securely wipe the data from the storage it was kept on.
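A minimal sketch of data overwriting for a single file, in Python, could look like this. It rests on assumptions: the function name `overwrite_file`, the default of three random passes, and the choice of an all-zero final pass are illustrative, not a standard. Also keep in mind that on SSDs, copy-on-write file systems, and virtualized cloud storage, writing to the same file does not necessarily overwrite the same physical blocks, so this shows the idea rather than a guarantee.

```python
import os
import secrets
import tempfile


def overwrite_file(path, passes=3, chunk_size=64 * 1024):
    """Overwrite a file in place: `passes` random passes, then one zero pass."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for pass_no in range(passes + 1):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                # Random bytes for the first passes, zeros for the last one.
                block = secrets.token_bytes(n) if pass_no < passes else bytes(n)
                f.write(block)
                remaining -= n
            # Push each pass out of Python and OS buffers to the device.
            f.flush()
            os.fsync(f.fileno())


# Demo on a temporary file with made-up content.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"customer records " * 100)

overwrite_file(demo_path)
with open(demo_path, "rb") as f:
    wiped = f.read()
os.remove(demo_path)  # delete the file only after it has been overwritten

print(len(wiped), wiped == bytes(len(wiped)))
```

The file is removed only after the overwrite passes finish, so what remains on disk after deletion is the overwritten content rather than the original records.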