Server Storage

This is a guide on server storage.

Hard Disk Drive (HDD) Storage

A hard disk drive, or HDD, is a mechanical storage device. It contains a number of hard disk platters that spin. This is the older style of hard disk storage technology that uses magnetism to store data on those spinning metal platters. The drives themselves are sealed units that contain multiple disk platters.
 
And each of these platters has a read and write head on an actuator arm that goes in and out over the disk as it spins to read and write data. So these can be direct attached storage, otherwise called DAS, D-A-S, which means that the HDD is directly inside the server or you can use hard disk drives that are accessible over a network.
 
Now there are a number of form factors to be aware of with hard disk drives. The first is the Large Form Factor, which is 3.5 inches. Now this measurement actually refers to the diameter of the disk platters themselves.
 
And Large Form Factor hard disk drives are normally what you would find in server equipment. Now, in smaller devices, such as with laptops, you'll find Small Form Factor or 2.5 inch type of diameter disk devices. The other thing to think about with hard disk drives are the various characteristics which can, in the end, translate to performance.
 
The first characteristic is capacity, such as whether we're talking about a 500 GB drive or even an 8 TB drive. Modern server storage usually consists of a collection of direct attached storage locally in the server for physical servers, as well as storage accessed over the network. And it usually consists of multiple disks working together to increase the amount of available capacity.
 
You can also group physical disks together to form a logical disk for use with various RAID levels, and you might do that for performance reasons, for fault tolerance, or both. Just a bunch of disks, or JBOD, refers to a collection of disks that you group together so that you have more space for your disk volume.
 
But you could also get a little more sophisticated and configure RAID levels, which we'll talk about later. Another characteristic of a hard disk drive is the number of rotations per minute, or RPM. So in a laptop, for example, you might be looking at 5,400 RPMs, whereas for a server based hard disk drive you might be looking at 15,000 RPMs.
 
The more RPMs, the quicker data is accessible from those rotating disk platters. But because hard disk drives are mechanical and have moving parts, they will wear out over time. It's just a matter of when. I've had many clients in the past that are surprised that hard disk drives fail out of the blue without any kind of warning.
 
Sometimes you might get a warning. You might hear a clicking sound from a hard disk drive if you're near it. But we always need to make sure we are prepared by having continuous backups of relevant data. Hard disk drives are not designed to last forever. In the cloud, we can also work with a storage account.
 
In this example we have a screenshot of a Microsoft Azure cloud storage account. [Video description begins] A screenshot labeled Create a storage account displays on-screen. There are six tabs, from left to right: Basics, Advanced, Networking, Data protection, Tags, Review + create. The screenshot is currently at the Basics tab. Storage account name is set to storacctwebapp172. Region is set to (US) East US. Premium account type is set to Block blobs. Redundancy is set to Zone-redundant storage (ZRS). [Video description ends] Now a cloud storage account doesn't get directly attached to a virtual machine. However, a virtual machine can use an Azure cloud storage account, for example to store files, perhaps through custom code running within that virtual machine that talks to the storage account.
 
In this case, the Performance of the storage account is set to Premium, which gives the lowest disk latency, in other words the best throughput, but you get that at a cost. The other consideration for cloud storage is replication. [Video description begins] A screenshot of a map displays on-screen. Replication is set to Read-access geo-redundant storage (RA-GRS). Last failover time is not set. Under Storage endpoints, there is a link labeled View all. [Video description ends]
 
In this example we have replication between two endpoints, one on the West Coast of the United States and one on the East Coast. Now, the idea is that we want to be resilient to failure in one region so that we have an up-to-date replicated copy of data in the storage account in an alternate region. In this particular example, we're now looking at Amazon Web Services or AWS.
 
Specifically, we're looking at what's called an EBS volume. This is an actual disk resource created in the AWS cloud that can be attached directly to a virtual machine. And so here the Volume Type can be set to the old school magnetic type disk, in other words, hard disk drive. But instead here what's been selected is a Volume Type called Provisioned IOPS SSD, solid state drive.
 
Now, the idea with Provisioned IOPS is not only can you specify the Size down below in gigabytes, but you can also specify the IOPS value where there is a Minimum and a Maximum IOPS specified. Now IOPS is very important when it comes to the throughput of a disk subsystem. IOPS stands for Input/Output Operations Per Second.
 
And essentially the more IOPS, the better the disk throughput. But as usual, a higher IOPS value and using SSDs mean you're going to pay more than if you were simply using a magnetic hard disk drive type of solution. A great thing about creating a disk volume like this in the cloud is that you can attach and detach it to and from virtual machines as required.
 
Now in this case, if you do attach this volume to a cloud virtual machine, it shows up as just another disk device in the operating system as if it were a physically attached disk drive in the server. In other words, direct attached storage.
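Before moving on, it's worth making the IOPS figure concrete, because IOPS alone doesn't tell you megabytes per second. Here is a rough back-of-the-envelope sketch, where the 16 KB average I/O size is just an assumed value and real workloads vary:

    # Rough throughput estimate from an IOPS figure (illustrative values only)
    iops = 3000              # provisioned IOPS on the volume
    avg_io_size_kb = 16      # assumed average I/O size; depends entirely on the workload
    throughput_mb_s = iops * avg_io_size_kb / 1024
    print(f"Approximate throughput: {throughput_mb_s:.0f} MB/s")   # roughly 47 MB/s

The same IOPS value yields very different throughput depending on whether the workload does many small random I/Os or fewer large sequential ones.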
 

Solid State Drive (SSD) Storage

Let's take a few minutes to talk about the difference between solid state drives or SSDs and hard disk drives. We want to make sure we understand the difference because an important part of the selection of storage devices for a server would be whether or not you're going to use SSDs or HDDs.
 
So solid state drives are called solid state because, like any purely electronic component, they do not have moving parts. Now think of a traditional hard disk drive. A traditional hard disk drive, or HDD, has a number of metal platters that spin and an actuator arm with read and write heads that move in and out over the spinning platters.
 
So there are a lot of mechanical motors that move these things around, but with solid state you don't have any of those moving parts, and anything that has a lot of moving mechanical parts means more wear and tear over time. On the other hand, solid state drives are a little bit more expensive than hard disk drives. That can change over time, but that's the state right now.
 
So what are the benefits of using solid state drives? Well, one of them is that they have lower power consumption than an HDD. Why would that be? An SSD doesn't have to drive motors that move things, so less power is consumed. The other benefit, as a result of that, is that there's less heat generated.
 
That also means less fan noise. Fans are there for disk drives and other devices that generate heat, to pull the heat away from them and exhaust it out the back of a server, or ultimately the back of an equipment rack.
 
But with an SSD, because you have less power draw and less heat generation, you get less fan noise; some SSDs don't even need a fan at all. It also means quicker access time: instead of waiting for a physical hard disk platter to spin, and for the actuator arm to move the read and write heads in and out over the correct cylinder on that platter to read or write data, solid state drives use flash memory, and that results in quicker access time.
 
It also means that file fragmentation isn't the performance problem it is on a hard disk drive, where a large file might be broken into smaller pieces, each stored on different parts of the disk. So as a result of that, it also lends itself to quicker access time with SSDs.
 
Now, this is true not only on servers, but even on a laptop. If you currently have a laptop with a hard disk drive and the boot time for your operating system when you power it up is, let's say, 30 seconds, then after replacing it with an SSD you might find that booting up your machine takes four seconds.
 
That's how dramatic the difference can be when you use SSDs versus HDDs. So here we have an example of purchasing a 1 TB hard disk drive. Now, interestingly, this is a 7200 RPM drive, so revolutions per minute, or rotations per minute. Often the higher performing disks in servers are in the 10,000 to 15,000 RPM range. [Video description begins] The drive is WD Blue 1TB PC Hard Drive. [Video description ends] But the listing here also shows that it's a SATA 6 Gb/s type of drive.
 
It's got a 64 MB Cache for frequently accessed files. And it's 3.5 inches. That's large form factor, which often you would see in standard desktops or server equipment. But what we're interested in here as well, is the price. So the price for this 1 TB HDD is $48.99.
 
Now let's move on and take a look at a listing for a 1 TB solid state drive, the same capacity. [Video description begins] The drive is Silicon Power 1TB SSD 3D NAND TLC A58 Performance Boost SATA III 2.5'' 7mm (0.28'') Internal Solid State Drive SU001TBSS3A58A25CA. [Video description ends] Now this one is also a SATA type of interface, but it's only a 2.5 inch form factor, so small form factor, Internal Solid State Drive.
 
Now it's the same size or capacity as the hard disk drive, but instead of being approximately $49, this one is $112. So when you start talking about disk arrays for storage, you're talking about a multitude of drives, whether they're HDDs or SSDs. And so the price differential can add up very quickly when you start dealing with storage arrays that contain many terabytes or even petabytes of available storage.
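Using the two listing prices above, a quick calculation shows how that per-drive gap scales once you need an array's worth of capacity; the 100 TB figure is just an illustrative array size:

    hdd_price_per_tb = 48.99    # 1 TB WD Blue HDD listing
    ssd_price_per_tb = 112.00   # 1 TB Silicon Power SSD listing
    array_capacity_tb = 100     # hypothetical storage array capacity

    print(f"HDD-based array: ${hdd_price_per_tb * array_capacity_tb:,.2f}")   # $4,899.00
    print(f"SSD-based array: ${ssd_price_per_tb * array_capacity_tb:,.2f}")   # $11,200.00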
 
There is another option. It's called a hybrid drive, and as you might guess it combines the benefits of a hard disk drive as well as the benefits of solid state drives. What does that mean? Well, hard disk drives generally have a large capacity compared to SSDs, a larger capacity per disk.
 
And so we get the benefit of a large storage device with HDD. However, it also combines some SSD storage which is used only for frequently accessed files. So those frequently accessed files then are accessible as quickly as possible through SSD storage. So less frequently accessed files will be stored with HDD.
 
Now you can combine this in a single drive, but you can even configure storage tiers at the enterprise level in your storage arrays in the same way so that SSDs are used for frequently accessed files, but long-term storage, or cold storage, as it is sometimes called, even archiving would be done on the cheaper and slower HDD type of disk devices.
 
Now it's not just for data files, it can also be for operating system files. So your computer, whether it's a server, laptop, or desktop, would have the potential for booting and launching apps much quicker using a hybrid drive, even though you don't directly install anything in that space yourself. The drive handles that on its own.
 
Then you can have a dual drive system where you have at least one SSD and at least one HDD in the same machine, that is, in the same server. What you would do in this particular case to optimize things is use the SSD as a boot or OS drive. You might also decide that it's more important to have quick read or write times for certain data, so you might use the SSD for that purpose instead.
 
But generally you might use the SSD to speed up boot and OS operations, and use the hard disk drive for data file storage, because it's got an enormous capacity and it's cheaper gigabyte for gigabyte than solid state.
 

Storage Cables and Interfaces

Over the years, some disk interface standards have stood the test of time and have evolved, while others have been replaced completely by newer technologies. An example of an older standard that is no longer being used often is the old IDE hard disk interface standard. IDE stands for Integrated Drive Electronics, and this was a big one back in the 80s and the 90s.
 
So if you were working with servers at that time you would definitely be working with IDE disks. However, now we're going to be talking about some of the newer, more common standards. The Small Computer Systems Interface, or SCSI is an old standard that does stem from the 1970s, but it's one of those ones that stuck around. It kept evolving, and there are many branches off of the original SCSI standard.
 
So the original SCSI standard used parallel bit transmission, and this means when transmitting data bits down a data cable to and from a storage device, what would happen is we would have multiple wires in the cable and different bits would be transmitted down the different wires. Now newer SCSI standards would use serial type of transmissions where we send each bit one after another down a single wire.
 
Now an example of a newer SCSI standard that does this is Serial Attached SCSI or SAS, which we'll talk about shortly. You might think that parallel bit transmission should always be faster, but it's not, because the problem is timing: making sure that the bits sent down the separate wires in a parallel transmission all arrive at the other end at the same time.
 
But with serial transmission there is no such worry. We just blast the bits down the wire, so to speak, at high speeds. Fibre Channel is a storage interface that's used with enterprise class storage area networks or SANs. Now you can have an iSCSI SAN which uses standard network equipment like Ethernet switches and network interface cards and cables.
 
But a Fibre Channel SAN is designed for nothing but transmitting disk IO commands over a dedicated high speed network. And that's where the Fibre Channel SAN comes in. So in order to participate in this, servers have to have special hardware to interface with this storage fabric. And that interface is called a host bus adapter or an HBA.
 
So it's a card that has to be installed in the server if it's not built into the server motherboard. Now the HBAs allow connectivity to Fibre Channel switches, and in turn the Fibre Channel switches allow connectivity to network storage arrays, and ultimately that is how servers will see the network storage, although to the server OS, it just looks like more local mass storage.
 
Next is Serial ATA; ATA stands for Advanced Technology Attachment. The Serial ATA interface is one of those standards that uses serial bit transmission. Serial ATA allows transfer speeds from 3 up to 6 Gbps. Notice the lowercase b, for bits; that's why we're saying gigabits and not gigabytes, and there is a big difference between the two, of course. And then we've got external SATA, otherwise called eSATA.
 
Now the SATA standard itself has been around since about the year 2000, and eSATA followed shortly thereafter. So it's similar to SATA except that the interface connector is external to the device. So, for example, some laptops have a built-in eSATA port that allows you to plug in storage that way.
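Coming back to the 6 Gb/s figure for a moment, a quick conversion shows the difference between gigabits and gigabytes in practice; the second line assumes SATA's 8b/10b line encoding, which is why the commonly quoted usable figure for SATA III is roughly 600 MB/s:

    line_rate_gbps = 6.0                       # SATA III signaling rate, in gigabits per second
    raw_mb_per_s = line_rate_gbps * 1000 / 8   # 750 MB/s if every bit carried data
    usable_mb_per_s = raw_mb_per_s * 8 / 10    # ~600 MB/s after 8b/10b encoding overhead
    print(raw_mb_per_s, usable_mb_per_s)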
 
Then we have the Peripheral Component Interconnect Express or PCIe Interface. PCIe is one of those standards where it's so common it's built into many motherboards. There are slots on motherboards to allow for expansion cards to be plugged in. If not, then you have an add-on card you can add to a motherboard to allow for PCIe cards to be plugged in.
 
So often PCIe is used to connect mass storage devices to allow them to make a connection to the server motherboard. And then of course we have Universal Serial Bus or USB. Most people are aware of the USB standard. It's incredibly versatile in terms of what it lets you plug in to a computer, including at the server level.
 
So as the name implies, USB uses serial bit transmission. It's very convenient to connect things externally just by plugging them in; you don't have to open the case. Some devices if they have a small power draw, will actually draw that power through the USB cable itself. But in other cases such as with larger hard drives, you might have an external power source requirement.
 
Now you can have USB external storage enclosures that house disks that actually use SATA connections on the inside. USB 4 is one of the newer USB standards, and it supports transmission rates up to 40 Gbps. And finally we have Serial Attached SCSI or SAS. So again this one uses serial bit transmission.
 
It is one of the newer SCSI standards, even though SCSI itself has been around since the 70s. It's hot-pluggable, which means that we can swap out devices while server operating systems are running, and so SAS is commonly used in server environments.
 

Configuring HDD and SSD Storage in the Cloud

Managing server storage is a crucial skill for a service technician, whether you're dealing with it physically, on-premises with physical servers and drive interfaces and storage arrays, or whether you're doing it in the cloud, which is what we're going to do here. The first thing I'll do here in Amazon Web Services is deploy a Windows virtual machine.
 
And when you deploy a virtual machine, or in AWS parlance, when you launch an instance you get to determine beyond the operating system disk, if any additional data disks will be created. And I will create one during launching of that instance. But then we'll create another separate EBS volume disk after the fact and then attach it to our Windows VM.
 
So let's get started here. I'm going to search for ec2 in the search bar at the top because that's where I go to work with things like virtual machines. And in the Instances view I'm going to click Launch instances. And in this particular case I'm interested in Windows Server 2019 and I know if I just scroll down a little bit, I'll come across the Microsoft Windows Server 2019 Base AMI, the Amazon Machine Image.
 
So I'm going to click Select for that. It's going to make a suggestion for the Instance Type. So 1 vCPU, 1 gig of Memory. That's not going to be quite enough for what I want to do, so I'm just going to go ahead and bump that up, let's say to 2 vCPUs and 4 gigs of Memory. [Video description begins] The host chooses t2.medium. [Video description ends]
 
So the next thing I have to do is deploy this virtual machine instance into a VPC and a Subnet [Video description begins] Network reads: vpc-27ea875a (default) and Subnet reads: subnet-0e20db7b25d9f5e39. [Video description ends] and determine if I want to assign a Public IP, which I will for testing purposes here, so I can get to that VM over the Internet. That's really all I'm going to do at this point. I'll click Next for Storage. So we have our Root disk volume about 30 gigabytes.
 
General purpose SSD. And that's fine. That will show up as a storage device within the virtual machine operating system. [Video description begins] Device reads: /dev/sda1, Snapshot reads: snap-09b8797765fb66586, IOPS reads 100/3000, Throughput (MB/s) reads N/A, Delete on Termination is selected, Encryption is set to Not Encrypt. [Video description ends] I'm going to click Add New Volume. And let's say that we're going to set it to be 30 gigabytes in Size and I'm going to use Provisioned IOPS SSD (io1) where I can specify the IOPS value.
 
A higher IOPS value gives you better read and write performance because IOPS is a reflection of the disk subsystem throughput. And I'm not going to encrypt that disk. So the point here is that we've added a data disk while we are launching the instance. So I'm going to go ahead and click Review and Launch, then I'll click Launch and I'm going to use an existing WindowsKeyPair and acknowledge that I've downloaded the private key part, because I need that to decrypt the admin password and I'll launch the instance.
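As an aside, the same launch, including the extra data disk, could be scripted with the boto3 SDK. This is only a sketch: the AMI ID is a placeholder, and only the storage-related parameters from the walkthrough are shown:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",     # placeholder Windows Server 2019 Base AMI ID
        InstanceType="t2.medium",            # 2 vCPUs, 4 GiB of memory
        KeyName="WindowsKeyPair",
        MinCount=1,
        MaxCount=1,
        BlockDeviceMappings=[
            {"DeviceName": "/dev/sda1",      # root/OS volume
             "Ebs": {"VolumeSize": 30, "VolumeType": "gp2"}},
            {"DeviceName": "xvdb",           # data disk added at launch
             "Ebs": {"VolumeSize": 30, "VolumeType": "io1", "Iops": 1500}},
        ],
    )
    print(response["Instances"][0]["InstanceId"])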
 
OK, so in the Instances view, our newly created instance is in the midst of Initializing. So while that's happening I'm going to scroll down on the left, because I want to create another disk. I'm going to go to Elastic Block Store or EBS Volumes, where any existing disk volumes will be shown. For example, the second one in the list here is 30 gigabytes in Size, 1500 IOPS.
 
That's the data disk we just created as we were launching the instance, and maybe I'll click on the pencil in the Name column and call this DataDisk1. And I'm going to create a new volume, so I'll click Create Volume. I can specify the Volume Type. If I need the utmost in performance, Provisioned IOPS SSD for io2 is what I really want and I specify the Size and the IOPS value.
 
But of course I'm going to be paying more than if, for example, I had selected Magnetic (standard); magnetic spinning disk platters as per hard disk drives. So I'm going to go ahead and go with Magnetic just to be different here, because I don't need the utmost in performance for this one. I'll have 30 gig set as my Size. And essentially, that's all I'm going to do. I'm not going to encrypt it or do anything like that.
 
I'll add a Name tag and I will set this to DataDisk2. We could also do that back in the column within the view, as I did with the first data disk. [Video description begins] The host clicks on the Create Volume button and then on Close. [Video description ends] OK, so now we've got DataDisk1, DataDisk2. If I were to select DataDisk1, and if I were to go down and look down under the Description, we see that the State is set to in-use. It's been attached to a virtual machine. So this is already attached to a VM. If I go to the Actions menu, notice I don't have the Attach Volume option.
 
I do have the Detach Volume, but not Attach. If I were to go and select DataDisk2 and then go down and take a look, well, the State here is showing now as available for DataDisk2. It's not attached. So therefore I could go to the Actions menu and choose Attach Volume [Video description begins] A pop-up dialog appears that contains three fields: Volume, Instance, and Device. [Video description ends] and I'm going to click in the Instance field and choose my running instance which is shown here with the Instance ID, and I'll click Attach.
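For reference, creating that second volume, tagging it, and attaching it could also be done with boto3; again a rough sketch, with a placeholder Availability Zone and instance ID:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Create a Magnetic (standard) volume and name it DataDisk2
    volume = ec2.create_volume(
        AvailabilityZone="us-east-1a",       # must match the instance's Availability Zone
        Size=30,
        VolumeType="standard",               # magnetic, spinning-platter storage
        TagSpecifications=[{"ResourceType": "volume",
                            "Tags": [{"Key": "Name", "Value": "DataDisk2"}]}],
    )

    # Attach it once its State becomes "available"
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])
    ec2.attach_volume(VolumeId=volume["VolumeId"],
                      InstanceId="i-0123456789abcdef0",   # placeholder instance ID
                      Device="xvdc")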
 
So let's go back then and look at this from the perspective of the virtual machine instance. If I go to the Instances view and let's just give this a Name. So I'll just click in the column here under Name and let's call this WinSrv2019. Now if I select that instance, down below I can view the Details related to it, including by going to Storage.
 
Now when I go down under Storage, this is where I'm going to get a list of the block devices that are associated with this virtual machine instance. Currently, all I really see are the two 30 gig volumes, the OS and the data disk that I created when I launched the instance. But what about the one we just attached?
 
It's just timing, nothing more; so let's just click refresh to make sure that we can then go back and select the instance, check Storage to see if it's up-to-date. And now it is. So now we've got our other volume that was attached here. We have three in total. So the next thing I want to do is I just want to RDP, remote desktop protocol, into this instance so we can see how those show up within the virtual machine guest operating system itself.
 
So having that VM selected, I'm going to go to the Actions menu, Security, Get Windows password and I'm going to have to Browse to my key pair file for the WindowsKeyPair, which I specified when I launched this instance. So I have that file with the private key and so then I can Decrypt the Password. So there's the Administrator name and there's the Password I'm going to need to RDP into this virtual machine.
 
At the same time if I select that virtual machine back here, it also has a Public IP. So I can connect to its Public IP using an RDP client from on-premises, which I will do right now. OK, so now that I'm in that server virtual machine, I'm going to go to the start menu and I'm going to search for disk. I'm going to choose the Create and format hard disk partitions option. There are many ways that you can work with disk subsystem. But the point is simply this: we've got our operating system Disk drive (C:) here, 30 gig, and we've got our two other 30 gig drives.
 
So whether you're on-premises with physical local storage or SAN storage, or in our case cloud-based virtual machines with attached disks, it's all the same at the operating system level. The disks will appear as they normally would otherwise, and here they're showing as Offline. So of course you would right click, bring the disk Online. You would right click, Initialize the disk as either MBR or GPT. MBR is limited to four partitions, four primary partitions, each being no larger than 2 TB. So if you want to go beyond that you can use GPT, but I'll just leave that.
 
So you would then allocate that space by creating, for example, a new disk volume and formatting it accordingly, such as with the NTFS file system format, [Video description begins] A dialog appears labeled New Simple Volume Wizard. The host accepts all the defaults in the sections: Specify Volume Size, Assign Drive Letter or Path, Format Partition, Completing the New Simple Volume Wizard. [Video description ends] and then all of a sudden it is a usable disk volume. So it's business as usual once you've dealt with the cloud level, attaching of disk devices to virtual machine instances.
 

Configuring Cloud Storage Tiers

In this demonstration, the focus is going to be on storage tiers. When we talk about storage tiers what we are talking about are different levels or different types of storage. Essentially, you can think of it as being in a hierarchy. What we really mean is that we want frequently accessed data to be on the fastest possible storage, and you might say that that would be Tier 1, or hot storage, hot access tier.
 
But data that's used less frequently can be stored physically on slower moving disks, like hard disk drives as opposed to solid state drives because they're cheaper. And then long-term archiving might be an even slower storage media. So this is where storage tiers come in. And there are tools that will let you do this completely and entirely on-premises.
 
Often it's called HSM, which stands for hierarchical storage management. And in the cloud we can also do it using command line tools and GUI tools. So here in Microsoft Azure we're going to do it within an Azure storage account. First we're going to create a storage account, so I'll click Create a resource here in the portal and I'll type in storage account, instead of going through and selecting from the list. I'll click on Storage account and then I'll click Create.
 
So I'm going to deploy this Storage account into a Resource group [Video description begins] The host is at the Basics tab of the Create a storage account page. There is a template to be filled in with the Project details. Subscription is set to Pay-As-You-Go. In the Resource group field, the host chooses Rg1 from the dropdown menu. [Video description ends] and I'm going to give this Storage account a unique name, [Video description begins] The host types storacctyhz172abc in the Storage account name field. [Video description ends] and once I've done that, I'm going to specify the Region where I want the Storage account deployed. Ideally, that will be deployed where it will be most frequently accessed, but you could of course enable Geo-redundant storage where it replicates the contents of the storage account to a secondary region.
 
And that's really for high availability purposes. [Video description begins] Region is set to (US) East US. Performance is set to Standard: Recommended for most scenarios (general-purpose v2 account). Redundancy is set to Geo-redundant storage (GRS). [Video description ends] So I'm just going to go ahead and Review + create. There's really nothing else that I want to do here within the storage account other than create it.
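As a side note, the same storage account could be created programmatically with the Azure Python management SDK instead of the portal. This is only a rough sketch, assuming the azure-mgmt-storage and azure-identity packages are installed and using a placeholder subscription ID:

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.storage import StorageManagementClient

    # Placeholder subscription ID; use your own
    client = StorageManagementClient(DefaultAzureCredential(),
                                     "00000000-0000-0000-0000-000000000000")

    # Standard general-purpose v2 account with geo-redundant (GRS) replication
    poller = client.storage_accounts.begin_create(
        resource_group_name="Rg1",
        account_name="storacctyhz172abc",
        parameters={
            "location": "eastus",
            "kind": "StorageV2",
            "sku": {"name": "Standard_GRS"},
        },
    )
    print(poller.result().provisioning_state)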
 
So I'll click the Create button to make that happen. And after a moment the deployment is complete, so I'm going to click Go to resource to go into the storage account. And the first thing I really want to do is I want to make sure that I get some content in here. So I'm going to go to Containers on the left. I'm going to add a Container.
 
The container here will be called projects and I'll click Create. And then I'll open the projects folder and I'll click Upload and I'll get some sample files up here. So I'll go ahead and click Upload to get them into the cloud. So I've got a number of sample files then here in the projects folder. Now what we're looking at here is the default Access tier, which is set to Hot (Inferred).
 
Now I could select one item here in the list and then I could choose Change tier just like I could click on it and open it up [Video description begins] The host clicks on CardData.db. [Video description ends] and choose Change tier. If I want to select multiple items, notice Change tier is unavailable.
 
So I have to do it on a one by one basis after the fact. When I click on it and go into Change tier, this is where I can select the Cool Access tier for less frequently accessed data. It means less charges in the end because it's on slower storage media than the Hot (Inferred) Access tier. And of course we could also enable long-term archiving. However, I'm just going to go to the Cool Access tier for this one. I'll click Save. I'll close out of this with the X and it's reflected now in the Access tier column.
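That same tier change can also be made with the azure-storage-blob Python package. A minimal sketch, assuming you've copied a connection string from the storage account's Access keys blade:

    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string("<connection string from Access keys>")
    blob = service.get_blob_client(container="projects", blob="CardData.db")

    # Move this one blob from the Hot to the Cool access tier
    blob.set_standard_blob_tier("Cool")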
 
But when you actually initially upload content, you can also determine the access tier. If I click Upload and let's say I select a sample file, so I've got an on-premises sample file I've selected but this time I'm going to open up Advanced. And this is where, under Advanced, it determined that the default Access tier when we upload content into the storage account was going to be Hot.
 
Of course, I could have changed that at this point and determined, for instance, that I want this to be treated as an archived item. OK, so I'm going to go down and I'm going to choose Upload and once I've done that, the new file I've uploaded, which is index.html, is now showing as an Access tier of Archive.
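Uploading straight into the Archive tier can be scripted the same way; another sketch, assuming the same connection string and a local index.html file:

    from azure.storage.blob import BlobServiceClient, StandardBlobTier

    service = BlobServiceClient.from_connection_string("<connection string from Access keys>")
    container = service.get_container_client("projects")

    with open("index.html", "rb") as data:
        container.upload_blob(
            name="index.html",
            data=data,
            overwrite=True,
            standard_blob_tier=StandardBlobTier.ARCHIVE,   # lands directly in the Archive tier
        )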
 
So therefore if I were to click directly on that file, notice I can't Download it. That's interesting. If I were to click Edit, it says I'm not authorized to perform this operation. Well, wait a second. [Video description begins] The host clicks on the storage account name on the breadcrumb trail. [Video description ends] Because if I go to another file, let's say, something that's not in the Archive tier; how about Project_A.txt, which is showing here in the Hot access tier.
 
If I click on it, well, the Download button is available. If I click on Edit, because it's a known file type in the browser txt, it opens up. How come I can edit that one and not my index.html? Well, it's because the index.html file is archived. And archived files need to be essentially hydrated or brought back to another access tier before they can be accessed. So if I were to select the index.html file and click Change tier, currently it's Archive. This is where we might be able to select, for example, the Cool Access tier; use a standard rehydrating policy and Save.
 
Now I might have to click refresh and it might even take a little while until we have the option of going in and accessing index.html from the Cool Access tier. We can always check our little bell notification icon in the upper right and here it does state that it did successfully update the access tier for our index.html. Also, if I were to click directly on index.html, I have a message about this blob, this binary large object, that's currently being rehydrated and it can't be downloaded yet. So it's really just a matter of waiting and that's the thing about archiving.
 
When you put things in the Archive tier, and this is the case with most systems, whether on-prem, or in the cloud; you can't expect to have it back necessarily one moment later. Sometimes it takes even hours, depending on how you've archived something before it becomes available again, before it becomes rehydrated and the data is actually there. OK, so the archive data will come back. It just might take a bit of time.
 
The next thing I'd like to do is to go into the properties of the storage account itself. Because in the storage account properties on the left, if I go down under the Data management section, there's something called Lifecycle management where I can click Add a rule. I'm going to call this Rule1 and it will apply to all of the blobs in the storage account, although I could have it applied to only a subset by adding a filter. Down below what I'm going to do is click Next and now what I can do is say OK, if I've got a blob that was modified more than a certain number of days ago, let's say 90 days.
 
So after 90 days of a file being modified, what should we do? Well, we can Delete the blob. We could also Move it to archive storage or Move it to cool storage. Let's say in our case we want to Move it to cool storage. OK, so that's all I'm going to do, although I could add other conditions and I'll click Add, and it's done.
 
We've got our lifecycle rule added. So the point is that we can automate the movement of data between access tiers based on rules, and in our case it was based on the number of days that have passed since that blob or that file was modified.
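Behind the portal wizard, that lifecycle rule is just a JSON policy document. Here is a sketch of what Rule1 might look like, expressed as a Python dict in the shape Azure lifecycle management expects; it could be applied through the portal, the Azure CLI, or the management SDK:

    # Move block blobs to the Cool tier 90 days after they were last modified
    lifecycle_policy = {
        "rules": [
            {
                "enabled": True,
                "name": "Rule1",
                "type": "Lifecycle",
                "definition": {
                    "filters": {"blobTypes": ["blockBlob"]},   # no prefix filter, so it applies to all blobs
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {"daysAfterModificationGreaterThan": 90}
                        }
                    },
                },
            }
        ]
    }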
 

How Network Attached Storage (NAS) Is Used

Before we jump into the details of network attached storage, let's first talk about network storage and file servers. With network storage what we're really talking about is centralized storage that servers connect to over the network, which also means we have centralized backup potential and centralized security such as permissions for files as opposed to setting it up per server.
 
The other thing to think about is that we also have a number of different types of items we can store centrally over the network, like files, which are often called binary large objects or BLOBs, databases, and cluster or load-balanced app files. All of this can be stored centrally and is accessible over the network by servers that have permission to do so.
 
Now that's not to say that servers won't have any local or direct attached storage because they could have that in addition. Maybe they boot up off a local disk, but then the storage for files and databases and whatnot is accessed over a network.
 
The other thing to think about with network storage, just as you would with local direct attached storage in a server, is whether you're going to encrypt data at rest and which tool you're going to use to do that, be it hardware or software-based. And there are times that you absolutely must encrypt sensitive data, once you've identified the sensitive data to remain compliant with laws or regulations.
 
You should also be monitoring data storage patterns over time to maximize your cost and storage efficiency. For example, if you realize that you have a large number of frequently accessed files, but your current high speed storage capacity cannot accommodate that volume of frequently accessed files, maybe you need to invest in more high speed storage, maybe solid state drives.
 
That way you get the performance you need for the files that are accessed frequently. On the file server side of things, file servers can use a number of different file sharing protocols to allow access over the network. A file server could be configured with Network File System or NFS shares, which traditionally stem from the Unix world.
 
Or it might support Server Message Block or SMB. This is normally what's used by Windows file and print sharing. You could even use other protocols to transfer files over the network, like the Secure Shell or SSH file transfer protocol, or secure FTP otherwise called SFTP. Now let's get into network attached storage, or NAS. What we're talking about is having centralized network storage, which we've already talked about.
 
But essentially what we're talking about here is a NAS device that you plug into your network. A NAS device is really, you could think of it as being a watered-down file server with an embedded operating system. So normally you manage the NAS device by connecting to a web page.
 
It's got a web interface ideally accessible over HTTPS and not HTTP. Now, a traditional file server, so not a NAS device but a traditional file server is really more powerful and flexible than a NAS device because it has more potential capabilities. But a NAS device is designed to be a file-sharing device and that's pretty much it.
 
So the NAS device plugs directly into your network. Normally it'll have a network connector such as an RJ45 connector where you can plug in your cable to connect it to your network. NAS devices, because they act kind of like file servers, support standard file sharing protocols that we've already talked about, like NFS and SMB. The NAS can also serve as a centralized backup location.
 
And the idea is that when we look at the layout of a NAS environment, the NAS appliance just connects to an Ethernet switch for example as anything else on the network would. So you would have file servers connected to your network, laptops, maybe even smartphones through a wireless connection. At the end of the day, the NAS appliance serves up files.
 
Now while it can be configured as its own watered-down file server, some NAS appliances can also work as an extension to storage for an existing file server. So you would use an existing file server, whether it's Linux or Windows-based and set permissions that are applied to NAS storage.
 
So it would look like you're accessing storage through the file server where the file server is really transparently accessing data on the NAS appliance. Here's an example screenshot of what it might look like to configure various protocols on a NAS appliance through its web interface.
 
Here the file service for Microsoft Networking is being enabled, and we can put in a description and the name of a Workgroup, or we might link it to an existing Active Directory domain for access to the files on the NAS appliance. [Video description begins] A screenshot displays on-screen with three tabs: Microsoft Networking, Apple Networking, NFS Service. Server description (Optional) is set to HQNAS1. Workgroup is set to WORKGROUP1. Standalone server is selected. The options AD domain member (To enable Domain Security, please click here.) and LDAP domain authentication (To enable Domain Security, please click here.) are not selected. [Video description ends] Now, a NAS, a network-attached storage device and a storage area network, or SAN, are not the same thing.
 
We'll talk about SANs in a bit more detail soon, but for now a NAS solution provides access to network storage using common file sharing protocols like SMB or NFS. A SAN does not do that. NAS solutions run on an existing TCP/IP network. Again, a SAN does not. A SAN has its own special connectivity equipment.
 
So with a SAN we have a dedicated network for storage traffic, doesn't use file sharing protocols, but it does instead allow block-level access to storage. Block-level means that servers think that they are talking to a local mass storage device and they're just simply issuing block-level disk commands to read or write to and from that storage.
 

How Storage Area Networks (SANs) Are Used

A storage area network or a SAN provides centralized storage that is accessible over the network by servers. So ideally we would have a dedicated network for this transmission between the servers and the storage devices. The other thing to think about is that this is block-level access to that storage.
 
It's not accessing that storage over the network using file-sharing protocols like NFS or SMB. That's what a network attached storage appliance or a NAS appliance does. This is different. This is low-level block-level commands to read or write to and from storage devices over a network. The first aspect related to a SAN is an Internet Small Computer Systems Interface or iSCSI SAN.
 
Some people call this the poor man's SAN, but it is effective. Now, some people will call it that because an iSCSI SAN uses standard network hardware, that means standard network cards you already have in your servers. It also means Ethernet network switches, existing network cabling, all of that stuff. You can use it for an iSCSI SAN.
 
So what happens is that SCSI disk commands get embedded within IP packets. Of course, those IP packets are transmitted over an IP network and that allows the communication between servers and storage. Technically, the storage side could simply be high capacity disks available in a server on the network that's configured as an iSCSI target. So it doesn't have to be an enclosure like a disk array, although it certainly can be.
 
Now if you're going to use iSCSI, you should have a dedicated VLAN for iSCSI traffic only. And there are a few reasons for this, one of which is security. You want to make sure that disk IO commands and related traffic are kept separate from your normal TCP/IP network. But at the same time you also want to make sure it performs well.
 
You don't want it affected by network congestion with people checking email and so on. And so that's why it should be on its own isolated VLAN. If we were to plot this on a diagram, it would look like this. So we've got at the top iSCSI initiators, whether they're Windows or Linux-based.
 
Now an iSCSI initiator makes the connection to storage over an iSCSI network. And the target, of course is the iSCSI target, where the storage actually exists. So an iSCSI initiator can be software within an operating system. You can also actually have iSCSI initiators embedded in firmware, so either on a motherboard or normally as an add-on expansion card. But either way, the iSCSI initiator makes a connection to the iSCSI target over the IP network using an IP address or using a name that gets resolved to an IP address through DNS and usually that connection happens over TCP port 3260.
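As a quick sanity check from an initiator, you can verify that a target is reachable on TCP port 3260 before configuring anything; a small Python sketch, with a placeholder target address:

    import socket

    def iscsi_target_reachable(host, port=3260, timeout=3.0):
        """Return True if a TCP connection to the iSCSI target port succeeds."""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    print(iscsi_target_reachable("192.0.2.10"))   # placeholder target IP address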
 
And it uses standard network equipment such as an Ethernet switch. And we already said that this really should be configured as an isolated VLAN, a separate VLAN for iSCSI traffic only. So to the servers, the iSCSI initiators, once they've made a connection to the iSCSI target storage, that storage appears as if it were local to the server. So it's business as usual when it comes to initializing that disk device and partitioning and formatting and so on.
 
Now, a Fibre Channel or FC SAN is used more in larger enterprises where you have specialized equipment. So in our diagram on the left, we have Windows and Linux servers that are installed with Host Bus Adapters, otherwise called HBAs. These are specialized cards that must be plugged into the server in a slot, such as a PCIe slot, and that Fibre Channel SAN card, or HBA, is what allows connectivity to Fibre Channel switches; not standard network Ethernet switches.
 
These are specialized for storage area networking. And so in our diagram we have two lines stemming from each Host Bus Adapter where one connection goes to one Fibre Channel switch, the second host adapter connection goes to a different Fibre Channel switch and this is for redundancy in case of a failure of a Fibre Channel switch. The Fibre channel switches in the end connect to a Fibre Channel disk array that is configured with what are called LUNs.
 
Now let's talk about logical unit numbers or LUNs. This is a number that gets assigned to a logical storage volume. So a single physical storage device can have multiple partitions if it's a large capacity disk, and each of those can have its own unique LUN. So servers are connecting then to LUNs over the network. You can also enable LUN masking and this is normally done on the server side at the Host Bus Adapter or the HBA level. The idea is that LUN masking limits which hosts can access which LUNs out on the storage area network.
 
Now let's not forget about security. There are some steps that can be taken to harden network storage, the first of which is to always make sure you apply hardware and software updates. So if you're using iSCSI initiators in the OS, then make sure the OS is up-to-date. If it's firmware, make sure the firmware is up-to-date, and so on. Perform periodic access reviews to make sure that only the required permissions are in place.
 
We want to make sure we are adhering to the principle of least privilege. We don't want to have extra permissions granted beyond what is really needed. Use multi-factor authentication or MFA to control user access to file systems. Encrypt data at rest when it comes to sensitive data. Enable LUN masking to control access to specific logical unit numbers. And if you're using iSCSI, enable password authentication.
 

Deploying a Windows iSCSI SAN

iSCSI is a storage area network solution that uses standard network equipment to allow connectivity over a standard TCP/IP network to transfer SCSI disk commands. So it allows servers to use network storage without the added expense of specialized hardware and a dedicated Fibre Channel network for a SAN.
 
So an iSCSI SAN just happens over a regular TCP/IP network; ideally a network segment kept separate from normal TCP/IP traffic for performance and security reasons. So here we're going to be setting up iSCSI in a Windows virtual machine running in the cloud. But whether it's running in the cloud, or it's an on-premises physical or virtual server, it makes no difference at all.
 
What matters is we're going to be configuring an iSCSI target, which hosts the storage over the network, and then we'll be connecting to it from an iSCSI initiator, which is the consumer of that storage; you can think of it as the client, although in our case it's really a server, and it will use that network storage over the network, but to that server OS it will appear as if it were local storage.
 
So let's get started here in AWS. I'm going to go to the EC2 management console where I've already got a server running that I've named iSCSITarget, at least that's how I tagged it here in the cloud. Now what I want to do is take a look at the configuration inside of that iSCSITarget virtual machine. So I've remoted into that AWS virtual machine, and I'm going to start by going into the Start menu and looking at the disk layout.
 
So I'll search for disk and I'm going to choose Create and format hard disk partitions. The reason is that I've got a 30 gig NTFS formatted file system here, and locally on this host it's called drive (D:), and I've labeled it for clarity for this demo as For_iSCSI_Initiators; remember, the initiator is what makes the connection to the storage. So I'm going to share out this space over the network.
 
It's only 30 gig, but the example is still going to be valid. [Video description begins] The host closes Disk Management. [Video description ends] And so what I need to do then is make sure that I configure this as an iSCSI target. For that I'm going to go to the Start menu and go into Server Manager to make sure that I install the appropriate components to allow that to happen. So here in the Server Manager under the Dashboard, I'm going to click Add roles and features.
 
And what I want to do is continue on through the wizard to the point where I get to Select server roles. [Video description begins] The previous steps are: Before You Begin, Installation Type, and Server Selection. [Video description ends] I'm interested in File and Storage Services. I'm going to open that up and I'm going to open up the File and iSCSI Services because what I want to do here is install the iSCSI Target Server. When I select it, it asks if I want to include any additional management tools. I'll say Add features, I'll click Next and I'm just going to continue accepting the defaults to continue through that particular wizard.
 
So we are installing the iSCSI target software. It's not been configured, but at least we're getting it installed. So once that's installed, I'm going to click Close and then I'm going to click File and Storage Services on the left, and then iSCSI. Because what I want to do is I want to configure an iSCSI virtual disk and then ultimately an iSCSI target.
 
So I'm going to go to TASKS here on the right, under iSCSI VIRTUAL DISKS and I'm going to choose New iSCSI Virtual Disk. [Video description begins] The steps for a new iSCSI virtual disk are: iSCSI Virtual Disk Location, iSCSI Virtual Disk Name, iSCSI Virtual Disk Size, iSCSI Target, Target Name and Access, Access Servers, Enable authentication and the rest is not visible, Confirmation, Results. [Video description ends]
 
OK, so now what I have to do is choose the Free Space I want to make available out over the network for other servers to consume. Here that's going to be local drive D: on this host, the iSCSITarget, which is about 30 gig. It's going to create an iSCSI virtual disk in a path specified down below. Although I could type a custom path, I'm OK with the default selection. So for the name I'm going to call it, iSCSIDisk1 and I see the full Path shown there.
 
So then I'm going to go ahead and click Next. So I'm going to specify 30 gig as the Size [Video description begins] The full path reads: D:\iSCSIVirtualDisks\iSCSIDisk1.vhdx. [Video description ends] and I'm going to assign this to what's called an iSCSI target. The nomenclature here is unfortunate because we are configuring an iSCSI target. What Microsoft really means here is who should have access to this iSCSI disk space you're making available over the network.
 
And when I click Next I'm going to call this, let's say, Target1Access. I'm going to click Next and I'm going to click Add and this is where I can specify, for example, IP Addresses of hosts that should be allowed to make a connection to this iSCSI target. [Video description begins] The host fills in the Value field: 172.31.2.4. [Video description ends] So once I've filled in that item, I can click Add and I can keep clicking Add to add which other servers on the network should be allowed to access this space.
 
I'm going to go ahead and click Next. I could also enable CHAP authentication. I'm not going to do that in this particular case, I'll click Next and then I'll click Create. So what we're doing then is completing the configuration for the iSCSI target on our server. So the next thing I'm going to do is just notice down below that for the Target Status it says Not Connected.
 
Now let's look at this from the iSCSI initiator perspective. I'm going to use the actual same virtual machine as the initiator, but it doesn't make a difference when you're testing this out. Normally it would be a different server over the network, whether it's physical or virtual. So from the Start menu I'm going to go under Windows Administrative Tools where I can then go into the iSCSI Initiator software and it says the iSCSI service isn't running. Well, let's say Yes to start it and to keep it running.
 
Now, the iSCSI Initiator can also come in the form of firmware. It doesn't necessarily have to be software like we're doing here, such as the case where you might have a server that needs to perform a boot over the network using iSCSI. However, in this case what I need to do is specify the Target IP address of the iSCSI host where the storage is available. I'm going to go ahead and enter that in and I'll choose Quick Connect. [Video description begins] The host types 172.31.2.4. [Video description ends]
 
And it says I'm connected to that target. Excellent, so I'll click Done and OK. So now, on this server, meaning the iSCSI initiator that connected to the network storage, you would manage that storage as you would normally manage any storage. Let me show you what I mean. I'm going to go into the Start menu and search for disk to go into the Disk Management tool.
 
Because what's going to happen is that when you connect to iSCSI storage over the network, it simply shows up as another disk device. And so from here it is business as usual. You right click on the disk, you bring it Online, you Initialize the disk. You go ahead and you right click on it and you make some kind of a disk volume out of it so that it's actually usable. So to the local operating system this Disk, in this case number 4 in our example, looks just like a regular local disk, although really it was accessed over the network via iSCSI.
 
Now again, we're using the same server as both the iSCSI target and the initiator, but beyond that the steps don't change in terms of how it's configured. If I were to go back up to the top where we have iSCSI VIRTUAL DISKS on the right, click on TASKS and click Refresh down below, we would now see that the Target Status is connected and we can see the Initiator's IP address. So that's how easy it is to really set up iSCSI at a basic level.
 

Redundant Array of Inexpensive Disks (RAID) Levels

One very important aspect of planning storage subsystems for servers is dealing with redundant array of inexpensive disks, otherwise called RAID. Sometimes you'll see the definition listed as redundant array of independent disks. It really makes no difference as long as you know what it is and when you should use a specific RAID level given a particular scenario.
 
So the first thing about RAID is that all it is, is a group of physical disks that work together. Now they work together to either improve read and write performance, reading and writing to and from disk, or to provide high availability in case of a disk failure.
 
So you can either go with hardware RAID that means you have a hardware RAID controller. You have RAID firmware that can either be built into the server motherboard which is common in servers, or it could be an add-on expansion card. When you have a hardware RAID controller, when you start up the physical server, the RAID controller normally injects itself during the power-on self-test or the POST.
 
So you might press a certain keystroke on the keyboard in order to enter the configuration of the RAID controller at that time. And we'll talk about some of the configurations you might be interested in enabling. Software RAID is the other option where you configure all of the RAID settings within the operating system.
 
Hardware solutions are always preferred because they are more specialized. They are generally considered to be much more stable and perform much better than their software counterparts because that software is running within a multi-purpose operating system that is doing thousands of other things. And so the likelihood of it causing a problem or failing is greater than with dedicated firmware.
 
So at the end of the day with RAID, we've got a bunch of disks working together, and the operating system will see what it thinks is, for example, one logical disk when in reality, under the hood, it could be a bunch of disks working together. So let's start looking at RAID level 0, which is called disk striping.
 
What happens here is data gets broken down into smaller chunks or "Stripes", and each of those "Stripes" gets written to a separate disk in the array. In our example, we have three physical disks in our RAID 0 array. But to the OS it will look like one logical disk. Now you don't have to have three disks, although you do need to have at least two.
 
So the purpose here is to improve disk IO performance. You've got three disks doing the work instead of one. If one disk fails, none of the data in the RAID 0 striped array is available at all. RAID 1 is called disk mirroring and it requires at least two disks.
 
You can have more than two, but a minimum of two. So what happens is that data gets written to a disk partition on one disk and is also written to a disk partition on a second disk. So you always have an up-to-date copy of information written to a RAID 1 disk stored on the second disk. However, it also means that you're only ever using 50% of your disk space.
 
So if you bought two 1 TB drives, to the operating system you don't have 2 TB of disk space; if you enable RAID 1, you have 1 TB of disk space. So this means we can tolerate a disk failure. If one of the disks in the mirrored array fails, then you've got the other one that can be used which has an up-to-date copy of everything.
 
But don't think that this replaces traditional backups. It does not. It can supplement them, but it shouldn't replace them. So this is common in servers, certainly for the operating system or boot disk, to have RAID 1 enabled. RAID 5 is called disk striping with distributed parity, and it requires at least three disks. Parity is error recovery information, or error rebuilding information. So in this diagram we've got data stripes.
 
Now we know about RAID 0 striping, and the data stripes here are notated on each disk in the diagram; there are three disks, marked with D's: D1 is data stripe 1, D2 is data stripe 2, and so on. Now the related parity information, which is tied to each data stripe, is stored on a different disk than its data.
 
For example, notice that data stripe 1, D1, is on the first disk in the array, but parity 1 labeled here as P1 is not on the same disk. That's by design. It's designed to be stored on a separate disk from the data. Because if the data is unavailable, we want the parity to be there and available to rebuild that missing data. [Video description begins] The first disk contains D1, P3. The second disk contains D2, P1. The third disk contains D3, P2. [Video description ends]
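Parity here is just an exclusive OR (XOR) across the data stripes, which is what makes rebuilding possible. Here is a tiny sketch of the idea for a single stripe spread across the three disks in the diagram:

    # RAID 5-style parity for one stripe: P1 = D1 XOR D2
    d1 = bytes([0b10110010, 0b01100101])        # data stripe on disk 1
    d2 = bytes([0b11011000, 0b00101110])        # data stripe on disk 2
    p1 = bytes(a ^ b for a, b in zip(d1, d2))   # parity, stored on a different disk

    # If the disk holding D2 fails, its stripe is rebuilt from D1 and the parity
    rebuilt_d2 = bytes(a ^ p for a, p in zip(d1, p1))
    assert rebuilt_d2 == d2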