Security Architecture and Design Concepts

This is a guide on security architecture and design concepts.


The Three-Tier Design Model

Now, one thing you must do before you actually start implementing and deploying to the cloud, or using the cloud for solutions, whether it be system solutions, application development, research, or big data analysis, is plan. You need some type of life cycle, methodology, and design before you just jump in and start putting data in the cloud, building out virtual private clouds, or developing applications, containers, or microservices.

So a common three-tier design model to be aware of, since the vast majority of traffic is going to be HTTP and HTTPS, is the web tier, application tier, database tier model. The web tier most likely would be in a public subnet, acting as a front end to the application. The application could be Apache web servers, IIS, Microsoft SharePoint, or a wide variety of middleware or business-tier software. But the web tier will accept the user request.

Often, in the web tier or the front-end public subnet, you'll also have a network address translation (NAT) gateway and possibly a bastion, or jump host, which you're actually going to use for secure shell or RDP access to instances and VMs in the application and database tiers. In the application tier, you'll have your business logic middleware, whatever is running between the front-end web tier and the database tier. And remember, from a security standpoint, when you develop your security firewalls, you want the application tier to only allow in traffic from the web tier.

And then from the application tier, you're only going to open up a hole, or whitelist traffic, back to the database tier. So for example, you may only allow traffic out of the application tier that is destined for SQL or NoSQL, or whatever ports and services your databases are running on, for example, Oracle. The database tier, obviously, is your back-end Oracle, SQL, NoSQL, or object-based storage. It's going to process requests from the application tier, and it may also be managed by the bastion host or jump host in the web tier. So the three-tier model is a very common design.
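
As a rough illustration of that tier-to-tier whitelisting, here is a minimal sketch using boto3; the security group IDs, ports, and region are hypothetical placeholders, and your actual rules will depend on your own middleware and database engines.

```python
import boto3

# Hypothetical security group IDs for each tier (placeholders, not real values)
WEB_SG = "sg-0aaa1111bbbb2222c"   # web tier
APP_SG = "sg-0ccc3333dddd4444e"   # application tier
DB_SG  = "sg-0eee5555ffff6666a"   # database tier

ec2 = boto3.client("ec2", region_name="us-east-1")

# Application tier: only accept traffic that originates from the web tier
ec2.authorize_security_group_ingress(
    GroupId=APP_SG,
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 8080,           # example middleware port
        "ToPort": 8080,
        "UserIdGroupPairs": [{"GroupId": WEB_SG}],
    }],
)

# Database tier: only accept SQL traffic that originates from the application tier
ec2.authorize_security_group_ingress(
    GroupId=DB_SG,
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 1521,           # example: Oracle listener port
        "ToPort": 1521,
        "UserIdGroupPairs": [{"GroupId": APP_SG}],
    }],
)
```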

A lot of organizations expand the three-tier model into a four-tier model to get more control and to get more granular visibility and security at the middle layers. So let's take a look at that. Let's expand this out a little bit. Notice that we have our users out there on the Internet. They're going to be coming through our Internet gateway. Now, obviously, they think they're connecting to a web server, okay. For example, they may be thinking that they're connecting to the front end SharePoint 2013 web server, okay?

But they're not. They're actually connecting to the elastic load balancer. And as we've talked about in the previous course, course one, the elastic load balancer will be an SSL/TLS listener. It may also have your certificate services on it. It will run your web application firewall, your WAF. It may be doing health checks of the instances that are being autoscaled in target groups. It'll have flow logs and other visibility tools running on it. So the elastic load balancer, which, by the way, is running in the infrastructure of Amazon, or Google, or Microsoft, or IBM, or Oracle, is highly available, redundant, highly durable, and multi-featured.
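
To make that listener idea concrete, here is a minimal sketch, assuming hypothetical load balancer, certificate, and target group ARNs, that creates an HTTPS (TLS) listener that terminates encryption at the load balancer and forwards requests to a target group of autoscaled instances.

```python
import boto3

elbv2 = boto3.client("elbv2", region_name="us-east-1")

# Hypothetical ARNs (placeholders) for an existing load balancer,
# ACM certificate, and target group of autoscaled web instances.
LB_ARN   = "arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/demo/abc123"
CERT_ARN = "arn:aws:acm:us-east-1:123456789012:certificate/11111111-2222-3333-4444-555555555555"
TG_ARN   = "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web/def456"

# TLS terminates at the load balancer; requests are forwarded to the target group.
elbv2.create_listener(
    LoadBalancerArn=LB_ARN,
    Protocol="HTTPS",
    Port=443,
    Certificates=[{"CertificateArn": CERT_ARN}],
    DefaultActions=[{"Type": "forward", "TargetGroupArn": TG_ARN}],
)
```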

And then, of course, it's going to connect back to the public subnet with elastic IP addresses. Those are going to be elastic, public, routable IP addresses, possibly. Notice that in the public subnets, we may have our front end, we may not, okay? In this case, we have a public subnet and we have our reverse proxy or our NAT gateway. We may have our jump host or bastion host, and then we're connecting from the public subnet, which is really what we call a public access zone or a DMZ.

And we have our front-end SharePoint servers in private subnets. So the point of this is, your front-end web services or your front-end application services may be in the public subnet, exposed that way. Or they may be in a private subnet, only allowing traffic in through a reverse proxy, or from the elastic load balancer through the public subnet, using firewalls in front of that front end.

Notice that we have a front-end SharePoint server. We have a back-end SharePoint server that does batch processing, and then we have a database zone, which is our SQL Server. We have a primary replica in availability zone one, and a secondary replica in availability zone two, and then our fourth tier is actually our directory service, okay? So we have our Active Directory, our domain controllers.

And notice that we have high availability within one region across two availability zones. So think of those availability zones as one or more data centers. And there are some other services that may be used here; for example, from an Amazon standpoint, there's ElastiCache, where you've actually got elastic caching of information from the front-end SharePoint servers.

You may also be doing autoscaling, autoscaling front-end instances, autoscaling batch processing instances, or autoscaling the databases on an as-needed basis. So let's say, for example, that this is a retailer and it's during the holidays; you may want to have the ability to autoscale up between the months of October, November, and December.
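
Here is a minimal sketch of that seasonal scale-up, assuming a hypothetical Auto Scaling group name and illustrative capacities; it uses scheduled scaling actions to raise capacity on October 1st and lower it again in early January.

```python
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

# Hypothetical Auto Scaling group name and capacities (placeholders).
ASG_NAME = "retail-web-asg"

# Scale the web tier up for the holiday season (cron fields: min hour day month weekday).
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName=ASG_NAME,
    ScheduledActionName="holiday-scale-up",
    Recurrence="0 0 1 10 *",   # midnight UTC on October 1st, every year
    MinSize=4,
    MaxSize=20,
    DesiredCapacity=8,
)

# Scale back down after the season ends.
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName=ASG_NAME,
    ScheduledActionName="holiday-scale-down",
    Recurrence="0 0 2 1 *",    # midnight UTC on January 2nd, every year
    MinSize=2,
    MaxSize=6,
    DesiredCapacity=2,
)
```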

So just remember, before you dive in, from a security standpoint, you want to make sure that your developers, your system admins, your DevOps people, are making sure that they have a good design, a three-tier design, a four-tier design. Because your security strategy, your security implementation, your security planning and design, has to map and has to be tightly coupled with the actual application or system design.


The Shared Responsibility Model

In this short video, I want to remind you of the importance of understanding the shared responsibility between your organization, your data center, your assets, your server farms, your development code, and your provider. So again, this is where the different types of service models come in, IaaS, PaaS, SaaS, and others, and knowing exactly what your responsibilities are.

And I would say the main thing to remember here, we're going to talk about this later on in this training, but it's over here with the customer IAM and the AWS IAM. In this example, we're looking at Amazon Web Services shared responsibility. They have their own identity and access management.

And I can tell you from the research I've done in the data centers, they're going to have at least two levels of multi-factor authentication. They have very strict least privilege principles. Let's say you go on YouTube and you watch different seminars or webinars from, let's say, re:Invent; most of those presenters will never set foot in an AWS data center. So the same type of strict security and governance they have there, you need to consider also for your assets.

And so the customer will be responsible for identity and access management in a pure IaaS environment. Now, once you start to get into platform as a service, where you have more managed services, then obviously the cloud provider will start providing more of the security. They're the ones that are actually updating and upgrading your code or your operating systems and doing the patching.

With a managed service, they may be automatically doing AES-256 encryption. So it really depends upon the service you're using. Maybe they'll protect the network traffic for you if you're deploying some type of templatized solution to spin up your virtual private clouds or your applications. So the key thing here is that before you start to move data or code up to the cloud provider, understand the shared responsibility between the two partners, as well as your service-level agreements. And what is your risk tolerance? What is your risk treatment or risk appetite?
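
As a concrete illustration of a control that often sits on the customer's side of that line, here is a minimal sketch, assuming a hypothetical bucket name, that asks a managed object store to apply AES-256 server-side encryption by default; the provider then handles the encryption itself.

```python
import boto3

s3 = boto3.client("s3", region_name="us-east-1")

# Hypothetical bucket name (placeholder).
BUCKET = "example-shared-responsibility-demo"

# Ask the provider to apply AES-256 server-side encryption to every new object
# by default; key management for SSE-S3 stays on the provider's side of the model.
s3.put_bucket_encryption(
    Bucket=BUCKET,
    ServerSideEncryptionConfiguration={
        "Rules": [
            {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
        ]
    },
)
```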


Performing Cost/Benefit Analysis

Now, obviously, for most organizations, moving your data, applications, or development up to the cloud is going to be a positive cost/benefit analysis in almost every situation. However, we have to remember that some of the managed services, especially security services, can be quite expensive, especially for the small to medium-sized business.

So you want to think about some of the techniques to perform cost/benefit analysis. Remember, cloud computing uses on-demand pricing. So it's important to first calculate the cost of maintaining your IT infrastructure in-house or on-premises, and then compare that to what it will cost you to do cloud computing at the service provider. Amortization is important. You need to understand the contribution of IT infrastructure cost to the monthly cost structure in an organization.

So we need to calculate amortization for our servers, our firewalls, our sensors, our other appliances, so we can get a fair attribution of cost for different resources. Typically, we do a monthly depreciation or amortization cost. But some organizations, like the ones I've worked with, will actually do annual amortization. You can also work with your comptroller, your accounting department, and your service desk to get that information. Servers are typically mounted in racks and most servers have the same configuration.

So this will make it easy to compute the cost for your servers. And we're talking web servers, directory servers, FTP servers, productivity software servers, cyber mining servers, perhaps, okay? So what is the cost/benefit of spinning up and scaling up servers in the cloud versus your own on-premise solution? Obviously, the network infrastructure, the cost of the network interface cards, the switches, the cables, the ports, maintenance. This is a real big plus, IaaS provides all that infrastructure for you.

Also, it's very likely that the provider's IaaS will give you much higher bandwidth. We're starting to see 100 gigabit Ethernet being used and deployed at cloud providers. Obviously, power and cooling, that's a huge expense for the on-premises infrastructure. Software and licensing costs, you have to look at the cost/benefit analysis there. That's probably the least advantageous reason to go up to the cloud: software costs and software licensing. You're still going to be paying those licensing costs and fees even if you're using a cloud provider. And then ongoing support and maintenance. This is really where IaaS and PaaS really come in handy.
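
To make that comparison concrete, here is a minimal sketch of a monthly on-premises versus cloud cost calculation; every number in it is an illustrative placeholder, not real pricing, and the cost categories simply mirror the ones discussed above (amortized servers and network gear, power and cooling, licensing, support, and facilities).

```python
# A minimal sketch of an on-premises vs cloud cost comparison.
# Every figure below is an illustrative placeholder, not real pricing.

def monthly_amortization(purchase_price: float, salvage_value: float,
                         useful_life_months: int) -> float:
    """Straight-line monthly amortization of a capital asset."""
    return (purchase_price - salvage_value) / useful_life_months

def on_prem_monthly_cost() -> float:
    servers   = 20 * monthly_amortization(8_000, 500, 48)   # 20 rack servers over 4 years
    network   = monthly_amortization(40_000, 2_000, 60)     # switches, cabling, ports
    power     = 3_500                                        # power and cooling
    licensing = 4_000                                        # OS and application licenses
    support   = 6_000                                        # staff time, maintenance contracts
    facility  = 2_500                                        # rack space, real estate share
    return servers + network + power + licensing + support + facility

def cloud_monthly_cost() -> float:
    compute   = 9_000    # on-demand or reserved instances
    storage   = 1_200    # object and block storage
    egress    = 800      # data transfer out
    licensing = 4_000    # licensing usually follows you to the cloud
    managed   = 2_000    # managed security and monitoring services
    return compute + storage + egress + licensing + managed

if __name__ == "__main__":
    on_prem, cloud = on_prem_monthly_cost(), cloud_monthly_cost()
    print(f"On-premises: ${on_prem:,.0f}/month, Cloud: ${cloud:,.0f}/month")
    print("Cloud favorable" if cloud < on_prem else "On-premises favorable")
```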

So the bottom line is, before you start moving the assets, before you start moving data and code up to the cloud, you must perform a cost/benefit analysis. Here's another example of some ways to approach this. Up at the top we have our base cost estimation, server cost, software cost, network, maintenance, power, real estate, facility, cooling. Then we're going to do a cost analysis of our data. So we have a time analysis, we have demand analysis, as far as data in storage, and data in transit, and data in usage. Time analysis basically takes into consideration the amount of data being processed by your organization for all of the combined operations.

So you're going to determine the equivalence of computational ability in-house or on-premises versus the cloud instance. And so the final result will reveal a comparison of the computational time between the two. So you can determine if you want to shift to cloud computing based on computational time, or perhaps you'll only do this for certain data or certain assets.

Demand analysis basically looks at the important problem of provisioning servers based on the demand from the end user or the client. Typically, organizations will provision servers based on the maximum amount of demand on a daily basis. However, if the average on-site, or the average on-premise demand, is less than one-third of the peak demand, then it's probably going to be much more efficient to move those computing resources up to the cloud.
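
Here is a small sketch of that demand-analysis rule of thumb, using hypothetical hourly request rates; it just checks whether average demand falls below one third of peak demand.

```python
# A small sketch of the demand-analysis rule of thumb described above:
# if average demand is less than one third of peak demand, on-demand cloud
# capacity is probably more efficient than provisioning for the peak on-premises.

def prefers_cloud(hourly_demand: list[float]) -> bool:
    """Return True when average demand falls below one third of peak demand."""
    peak = max(hourly_demand)
    average = sum(hourly_demand) / len(hourly_demand)
    return average < peak / 3

# Hypothetical request rates over a day (placeholders, requests per second).
demand = [30, 25, 20, 20, 25, 40, 60, 90, 150, 500, 480, 200,
          120, 100, 90, 80, 70, 60, 50, 45, 40, 35, 30, 30]

print("Shift to cloud?", prefers_cloud(demand))
```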

And then project specific cost analysis. This is where your project managers, your PMPs, your PRINCE2, your agile project managers can come in and help you do a cost/benefit analysis on a project-by-project or program-by-program basis.


Common Development Lifecycles

As a security practitioner, it's also important to be aware of the different DevOps life cycles that are being used in your organization, the different architectures and life cycles. You yourself, as a security engineer, or a member of a steering committee or some type of security team, will also be involved in some type of life cycle, for example, the NIST systems development life cycle. Realize these can be for applications, they can be for systems, or they can be for your security initiative.

But basically, you have the initiation phase, where you do information gathering, reconnaissance, and planning. That usually comes down from governance, from a steering committee, from executive management. Then you'll determine if you're going to acquire the system or the software or develop it yourself. That's acquisition/development, which leads to implementation and assessment, then ongoing operations and maintenance, which involves continual improvement.

For example, ITIL version three, ITIL version four, and then finally sunset, or disposal of the application or the system. And that can be an upgrade, it can be a replacement, one of several different solutions. This is also a good life cycle for your security policies and your security initiatives. Here's a more elaborate example of applying this, including multiple phases, going beyond just five phases, where phase one and phase two would be the initiation phase.

You've also got design, construct, test, product release, and post implementation, phase seven. This is a software development life cycle. It also includes training, documentation, a change control process, design review, prototyping, code review, testing, and a wide variety of other components.

[Video description begins] A diagram displays. The software development lifecycle consists of the following seven phases: Formation, Requirement/planning, Design, Construct, Test, Product release, and Post implementation. Phase 1 includes Project initiation, which is a Rough Order of Magnitude (ROM) project development estimate. Phase 2 includes Requirements definition, which involves the following activities: Project management plan (charter), Functional requirements, Technical requirements, Requirements review and approval, Statement of work, and Change of scope document. Phase 3 includes Design, which involves the following activities: Internal/external, Design review, and Detailed project development. Phase 4 includes Construct, which involves Prototype and Code review. Phase 5 includes Test, which involves System test and Test summary. Phase 6 includes Product release, which involves Operational acceptance and Acceptance document. Phase 7 includes Enhancement/maintenance, which involves Project implementation notice. The Training process spans from phase 4 to phase 7. The Documentation process spans from phase 3 to phase 6. The Change control process spans from phase 3 to phase 6. After phase 6, a change control log displays. [Video description ends]

Historically, organizations have used waterfall software development, basically a sequential approach typically measured in years. The output of each phase is the input to the next phase. It offers little flexibility and adaptability. It's very predictable, and testing is done at the end. So the chances are you're going to have more problems, issues, and challenges, since the testing doesn't happen until after the implementation phase.

[Video description begins] The Waterfall Software Development lifecycle displays. It consists of the following phases: Requirements, Planning, Design, Implement, Test, and Turnover. [Video description ends]

Spiral is also a method that's very popular. This is a model that combines elements of both design and prototyping in stages. Basically, combining the advantages of a top-down and bottom-up concept. When originally envisioned, the iterations were typically six months to two years long. The advantages are that the estimates get more realistic as work progresses. You typically find important issues earlier, you're more able to cope with inevitable changes that software development entails. And engineers, who are often restless with a waterfall design process, can actually get their hands in and start working on a project or a program earlier.

[Video description begins] A diagram displays. It has four quadrants. The first quadrant shows Cumulative cost, Objective identification, and Review. The second quadrant shows Alternate evaluation with an arrow pointing towards it, labelled as progress. The third quadrant shows Product development. The fourth quadrant shows Next phase planning and Release. A spiral is drawn in the center spanning across the quadrants. [Video description ends]

Agile software development is very popular; it's an evolutionary approach that's measured in weeks. It involves the collaboration of cross-functional teams. It's very flexible and adaptable, not predictable, and testing is done during the development process. Agile relies on a very high level of customer involvement throughout the entire project, but especially during reviews. The customer gets a strong sense of ownership by working extensively and directly with the project team throughout.

Continuous integration's also popular. It's a development technique that forces developers to integrate code into a shared repository several times a day. That repository could be an Amazon S3 bucket or a Google Cloud storage bucket. Each check-in is then verified by an automated build, allowing teams to discover problems early. And the ultimate goal of continuous integration is to detect and locate bugs and security flaws quickly.

[Video description begins] Continuous Integration is abbreviated to CI. [Video description ends]

So to wrap up this brief lesson, remember, as a security practitioner, it's important to understand the different life cycles and architectures that are used for DevOps in your organization, as you may be responsible for implementing security in every phase of those life cycles.


Basics of Risk Management

As a security professional dealing with on-premises, private cloud, or public cloud solutions, you may be involved in risk management, risk analysis, and risk assessment. Early on in the life cycle, you'll identify and classify information and assets, sometimes because you have government contracts, you're a government agency, or you're using particular access control architectures, like mandatory access control or attribute-based access control.

So in the early stage of the information security lifecycle, you'll be involved with identification, classification, and assessment or valuation of your assets, both tangible and intangible. In a cloud security environment, that especially means data and code that are going to be stored and developed in the cloud. Physical assets and facilities will be involved in on-premise or private cloud solutions, or in converting your hardware data center to a virtual data center.

Obviously, you'll be assessing data assets and their valuation. Intangible assets include intellectual property, trademarks, copyrights, formulas, and the like. And of course, assess the value of your human resources and their available skill sets.

[Video description begins] Intellectual property is abbreviated to: IP. [Video description ends]

Many security practitioners will use qualitative analysis, where you basically have a scale of 1 to 5 or 1 to 10 of likelihood, and a scale of 1 to 5 or 1 to 10 of impact. Obviously, you're going to base this on a particular asset or a risk scenario over a certain time frame. The result, for example, if the likelihood is likely and the impact is critical, in the heat map, you'll have a result of High. That'll help you determine the amount of resources to allocate to protecting or providing countermeasures for that particular threat.

[Video description begins] A table for classic qualitative analysis displays. The impact displays vertically and the likelihood displays horizontally. Impact has five categories starting from left to right: 1-Negligible, 2-Minor, 3-Moderate, 4-Critical, and 5-Disastrous. Likelihood has five categories starting from bottom to top: 1-Improbable, 2-Seldom, 3-Occasional, 4-Likely, and 5-Frequent. If likelihood is frequent and impact is negligible or minor, the result is medium. If likelihood is frequent and impact is moderate, critical, or disastrous, the result is high. If likelihood is likely and impact is negligible, minor, or moderate, the result is medium. If likelihood is likely and impact is critical or disastrous, the result is high. If likelihood is occasional and impact is negligible, the result is low. If likelihood is occasional and impact is minor, moderate, or critical, the result is medium. If likelihood is occasional and impact is disastrous, the result is high. If likelihood is seldom and impact is negligible or minor, the result is low. If likelihood is seldom and impact is moderate, critical, or disastrous, the result is medium. If likelihood is improbable and impact is negligible, minor, or moderate, the result is low. If likelihood is improbable and impact is critical or disastrous, the result is medium. [Video description ends]
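
Here is a minimal sketch that encodes that heat map as a lookup table, following the ratings described above; the function and the scenario values are just illustrative.

```python
# A small sketch that encodes the 5 x 5 heat map described above.
# Likelihood and impact are both rated 1 to 5; the matrix yields Low, Medium, or High.

RISK_MATRIX = {
    1: ["Low",    "Low",    "Low",    "Medium", "Medium"],  # Improbable
    2: ["Low",    "Low",    "Medium", "Medium", "Medium"],  # Seldom
    3: ["Low",    "Medium", "Medium", "Medium", "High"],    # Occasional
    4: ["Medium", "Medium", "Medium", "High",   "High"],    # Likely
    5: ["Medium", "Medium", "High",   "High",   "High"],    # Frequent
}

def qualitative_risk(likelihood: int, impact: int) -> str:
    """Look up the heat-map rating for a 1-5 likelihood and 1-5 impact."""
    return RISK_MATRIX[likelihood][impact - 1]

# Example from the narration: likely (4) likelihood and critical (4) impact -> High.
print(qualitative_risk(4, 4))
```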

A semi-quantitative analysis will actually add value, okay, or assign actual numbers to those qualitative values. So for example, for impact, a Major rating of 3 would correspond to greater than or equal to $1 million. This would vary based on the size and scope of your organization. The same thing with likelihood, some type of semi-quantitative value for your subjective numbers.

For example, likely would have a value of 4, and that would mean once in the last year. Then you could take the risk of an event or a scenario and give it a value of 12. And that could have more meaning when it comes to making decisions to implement technical controls, administrative controls, and physical controls.

[Video description begins] The impact and likelihood parameters for classic semi-quantitative analysis display. Impact has the following five parameters: Insignificant is equal to 1 or no impact, Minor is equal to 2 or less than 1 million dollars, Major is equal to 3 or greater than or equal to 1 million dollars, Material is equal to 4 or greater than or equal to 100 million dollars, and Catastrophic is equal to 5 or complete. Likelihood has the following five parameters: Rare is equal to 1 or almost never, Unlikely is equal to 2 or not in 5 years, Moderate is equal to 3 or once in last 5 years but not in last year, Likely is equal to 4 or once in last year, and Frequent is equal to 5 or several times a year. Risk of event is equal to 4 round bracket open material impact round bracket close multiplied by 3 round bracket open moderate likelihood round bracket close is equal to 12. [Video description ends]
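
Here is a small sketch of that semi-quantitative scoring, reproducing the worked example from the table (material impact of 4 times moderate likelihood of 3 equals a risk score of 12); the label-to-number mappings follow the parameters described above.

```python
# A minimal sketch of the semi-quantitative scoring described above, using the
# example from the table: material impact (4) times moderate likelihood (3) = 12.

IMPACT = {"Insignificant": 1, "Minor": 2, "Major": 3, "Material": 4, "Catastrophic": 5}
LIKELIHOOD = {"Rare": 1, "Unlikely": 2, "Moderate": 3, "Likely": 4, "Frequent": 5}

def risk_of_event(impact_label: str, likelihood_label: str) -> int:
    """Risk score = impact value * likelihood value."""
    return IMPACT[impact_label] * LIKELIHOOD[likelihood_label]

print(risk_of_event("Material", "Moderate"))   # 4 * 3 = 12
```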

Classic quantitative analysis is often used, where ALE is the annualized loss expectancy, AV is the asset value, EF is the exposure factor, and SLE is the single loss expectancy. In other words, the single loss expectancy is the asset value times the exposure factor. ARO is the annualized rate of occurrence. So what we have here are the same elements we had in the qualitative analysis. We have impact, which is the single loss expectancy.

We have likelihood or probability, which is the annualized rate of occurrence. And again, it's on an annual basis. The main difference here is that you're going to have hard numbers, or a lesser degree of estimation or guessing. A short worked example follows the formula below.

[Video description begins] A formula displays. ALE is equal to round bracket open AV multiplied by EF round bracket close multiplied by ARO. If SLE is equal to AV multiplied by EF, then ALE is equal to SLE multiplied by ARO. [Video description ends]
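
Here is a quick worked example of those formulas, using hypothetical figures: a $200,000 asset, a 25 percent exposure factor, and an event expected twice a year.

```python
# A quick worked example of the ALE formula above, with hypothetical numbers:
# a $200,000 asset, 25% exposure factor, and an event expected twice a year.

def single_loss_expectancy(asset_value: float, exposure_factor: float) -> float:
    """SLE = AV * EF."""
    return asset_value * exposure_factor

def annualized_loss_expectancy(sle: float, aro: float) -> float:
    """ALE = SLE * ARO."""
    return sle * aro

sle = single_loss_expectancy(asset_value=200_000, exposure_factor=0.25)   # $50,000
ale = annualized_loss_expectancy(sle, aro=2)                              # $100,000
print(f"SLE = ${sle:,.0f}, ALE = ${ale:,.0f}")
```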

Another popular solution is factor analysis of information risk, where the two main elements of risk are loss event frequency, or likelihood, and loss magnitude, or impact.

[Video description begins] A flowchart for Open Factor Analysis of Information Risk (FAIR) Quantitative Methods displays. [Video description ends]

Those get further broken down, or decomposed, into primary loss and secondary loss. And for event frequency, the threat event frequency and vulnerability factors. The goal here is to have quantitative numbers that can be used and combined with Monte Carlo simulations, PERT functions, and other mathematical algorithms.

[Video description begins] Secondary Loss is broken down into Secondary Loss Event Frequency and Secondary Loss Magnitude. [Video description ends]
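
Here is a minimal Monte Carlo sketch in the spirit of Open FAIR; it uses triangular distributions as a rough stand-in for PERT functions, and all of the frequency and magnitude parameters are hypothetical placeholders for a single risk scenario.

```python
# A minimal Monte Carlo sketch in the spirit of Open FAIR, using triangular
# distributions as a rough stand-in for PERT functions. All parameters are
# hypothetical placeholders for a single risk scenario.

import random
import statistics

def simulate_annual_loss(iterations: int = 10_000) -> list[float]:
    losses = []
    for _ in range(iterations):
        # Loss event frequency: events per year (low, high, most likely).
        lef = random.triangular(0.1, 4.0, 1.0)
        # Loss magnitude per event in dollars (low, high, most likely).
        magnitude = random.triangular(5_000, 500_000, 50_000)
        losses.append(lef * magnitude)
    return losses

results = simulate_annual_loss()
print(f"Mean annual loss:   ${statistics.mean(results):,.0f}")
print(f"Median annual loss: ${statistics.median(results):,.0f}")
print(f"95th percentile:    ${sorted(results)[int(0.95 * len(results))]:,.0f}")
```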

Other risk analysis methods may include OCTAVE, Operationally Critical Threat, Asset and Vulnerability Evaluation.

[Video description begins] Operationally Critical Threat, Asset, and Vulnerability Evaluation is abbreviated to: OCTAVE. [Video description ends]

Building asset-based threat profiles. Bayesian analysis and Monte Carlo. For example, the techniques used in Open FAIR. The Delphi method. Event tree and fault tree analyses. And trend analysis. The bottom line? As a security practitioner dealing with cloud computing, make sure you understand what risk analysis and risk management techniques are being used, as well as your role in those processes.


Deployment and Migration Strategies

When you're looking at a technology workload migration, there are different ways of migrating that workload out of your current environment, for example, from a data center to a cloud provider. We have the six Rs. The six Rs are remove, retain, re-platform, rehost, repurchase, and refactor. These are all potential migration types, and they're important to consider. These six types were selected based on innovation.

So the higher-up options, typically, are the latest technology solutions, for example, repurchasing and refactoring. Now, remove basically means removing the service: you're just turning off the workload and doing a decommission, or removing the deprecated technology from your service catalog. Retain is when you retain some of the server aspects or some of the IT services because they can't go to the cloud, and you're going to deploy a hybrid cloud solution.

Re-platform is often more desirable than retaining. If you have legacy applications that you can't migrate to an IaaS cloud platform, you could run those on a modern cloud server using emulators, or you could consider using containers and orchestration, like Docker or Kubernetes. Rehosting is the fourth level; this is a lift-and-shift migration, and it's very popular. It involves moving your existing physical and virtual servers into a compatible Amazon, Google, Microsoft, IBM, or Oracle IaaS solution.

All of the providers will offer some type of tool, ways to templatize your existing data center, or migration accelerators. Repurchase and refactor can be very valuable. Repurchasing might involve changing the licensing model, changing the billing model, or going to an entirely new product or vendor. When you refactor an application, you basically redesign the solution to take advantage of the latest platform as a service and software as a service based technologies.

This is one of the most modern and influential technology solutions for migration. There are two common application upgrade methods: the in-place upgrade involves performing updates and upgrades on existing VM instances, whereas the replacement upgrade involves provisioning new VM instances and then redirecting traffic to the new resources.
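
Here is a minimal sketch of the replacement-upgrade idea, assuming hypothetical listener and target group ARNs; once the newly provisioned instances pass health checks, the load balancer listener is simply pointed at the new target group.

```python
import boto3

elbv2 = boto3.client("elbv2", region_name="us-east-1")

# Hypothetical ARNs (placeholders): an existing HTTPS listener and a target group
# containing the newly provisioned, already-upgraded instances.
LISTENER_ARN = "arn:aws:elasticloadbalancing:us-east-1:123456789012:listener/app/demo/abc123/def456"
NEW_TG_ARN   = "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web-v2/0123456789abcdef"

# Replacement upgrade: once the new instances pass health checks, point the
# listener's default action at the new target group; the old fleet can then be retired.
elbv2.modify_listener(
    ListenerArn=LISTENER_ARN,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": NEW_TG_ARN}],
)
```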


The CSA Cloud Data Lifecycle

One thing you might want to explore is the Cloud Data Lifecycle, which comes from the Cloud Security Alliance, or the CSA. We'll take a look at them in a second; they're at cloudsecurityalliance.org. The cloud data life cycle involves creating, then storing, then using, sharing, archiving, and then destroying. It's a six-phase lifecycle, and I highly recommend that you go check out cloudsecurityalliance.org.

You can actually look into membership, and they actually have some certifications. As a matter of fact, they show the (ISC)² Certified Cloud Security Professional, the CCSP. It is the domains of the CCSP that this training is based on. Now, this training is going to be much, much less in-depth, but I am still using those six domains as my template. You can also go up here to education.

You can look at webinars, for example. They have different research groups, a wide variety of different working groups, and different categories, from artificial intelligence, to cloud vulnerabilities, to DevSecOps, to IoT (the Internet of Things), security as a service, top threats, and virtualization. You can also go check out some of their downloads, and I recommend the CSA Security Guidance version 4.0. They also have events and different chapters throughout the globe.


Storage Management Lifecycle Basics

Now, in the previous lesson, we talked about the cloud data life cycle, or the cloud data security life cycle, and realize that that life cycle may also play into your storage management strategy. And as we saw earlier, we have object storage in the cloud and we have block storage in the cloud. So let's talk about the object storage tiering strategy. For example, the most readily available, the most necessary and mission-critical, the most highly accessed data may be stored, let's say, in a regular Amazon S3 bucket.

There you have eleven 9s of durability, four 9s of availability, and low-cost throughput. But other data that is maybe three months old, or six or nine months old, you can move, using an automated workflow, to S3-RRS, where you have reduced redundancy over only two locations. You still get four 9s of durability, but it's cheaper than standard S3.

And then data that needs infrequent access can be moved to S3-IA, where you might have two retrievals or less per month, or less than 20 times per year. Now, that's not a hard limit, but they're going to charge you if you retrieve more than that. It's kind of like setting up a savings account where they give you five free withdrawals a month, that type of thing. There's also lower throughput on S3 Infrequent Access.

And then archive data can go to Glacier, where you can have data archiving and storage for as little as $0.004 per gigabyte per month. That also has retrieval policies, and the good news is all of this can be subject to server-side encryption using AES-256. A sketch of this kind of tiering policy follows the description below.

[Video description begins] Glacier has eleven 9s of durability and provides data archiving with flexible access options. [Video description ends]
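
Here is a minimal sketch of that kind of automated tiering policy, assuming a hypothetical bucket name; it transitions objects to Infrequent Access after 90 days and to Glacier after a year, then expires them after roughly seven years.

```python
import boto3

s3 = boto3.client("s3", region_name="us-east-1")

# Hypothetical bucket name (placeholder).
BUCKET = "example-tiered-data"

# Automated tiering: move objects to Infrequent Access after 90 days and to
# Glacier after 365 days, then expire them after about 7 years.
s3.put_bucket_lifecycle_configuration(
    Bucket=BUCKET,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-and-archive",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},   # apply to every object in the bucket
                "Transitions": [
                    {"Days": 90,  "StorageClass": "STANDARD_IA"},
                    {"Days": 365, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 2555},
            }
        ]
    },
)
```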

Now, the same thing applies to block storage tiering. You may have the fastest SSD available to you for your frequently accessed, high-IOPS, mission-critical data. Then a second tier, which is not the fastest SSD, but maybe a type two. And then a third tier, which could still be a fast SSD, but would be the slowest of the three types of solid state drives.

And then your archive tier, similar to, let's say, Amazon Glacier, could be hard disk drives or other media, like tape or optical disks. So realize that as a security practitioner, you may be involved in applying the cloud data security life cycle, and you may also be involved in storage management and different storage tiers for block and object storage.