A Testing Strategy for AWS Cloud Native Applications

Outlining testing strategies fit for your AWS cloud-based applications

Josh: Good morning and welcome to our webinar on a Testing Strategy for AWS Cloud Native Applications. My name is Josh and I will be your host today. You know your business can benefit from the Cloud Native Technology, but what does this mean for your testing strategy? In this webinar, we will share a testing strategy approach and methodology we developed and implemented at many organizations using AWS Cloud-Native Stack as well as the infrastructure factory model. We’ll also explain how you can build your Cloud-Native Stack highly secure and completely error-free. Today’s presenters are Anjali Vijayakumar from AWS and Siva Anna from Apexon. Anjali is a partner solutions architect at Amazon web services. In her role, she advises customers and partners on how to architect reliable and scalable solutions in the cloud. Anjali has a master’s in computer engineering from the University of Texas in Dallas. Siva has more than 20 years of experience developing and managing QA strategies for fortune 500 companies. In the nine years with Apexon, Mr. Anna has been instrumental in leading strategic enterprise engagements which have resulted in significant value for clients such as Kaiser Permanente, The Body Shop and minted.com. Before we begin, let me review some housekeeping items. First, this webcast is being recorded and will be distributed via email to you allowing you to share it with your internal teams or watch it again later. Second, your line is currently muted and third please feel free to submit any questions during the call by utilizing the chat function on the bottom of your screen. We will try to answer as many questions towards the end of the presentation that we have time. If we do not get to your question, we will reply to it via email to make sure we get you the adequate answer. We will do our best to keep this webinar to the 45 minute time allotment. At this time, I’d like to turn the presentation over to Siva. 
Siva: Thanks a lot, Josh for the introduction and thanks Anjali for being part of this session with me. Hello everyone, let’s get started. Today’s session we’ll start with highlighting the key aspects of the Cloud Native Applications. We’ll do a quick overview because that’s going to be the baseline score for our discussions today, so we covered many of the Cloud Native components, so that’s what I would like to start with. You can also see that the Cloud Native components cover the entire spectrum of application in terms of the architecture and the different solutions. For many of the Cloud-Native components does come from AWS offerings also, so you will see the reference of how does the AWS or components of the solutions fit into the Cloud-Native application or solution? Then we will deep dive into the testing requirements, that’s the core of this particular presentation. We’ll talk about the approaches and the specific area that you need to be focused based on your application design. Again, the testing strategy that we shared today would be a generic one, but that needs to be customized based on what specific applications are there in terms of the design and the requirements that you might have for your need. Towards the end, we’ll walk through a case study, what we discussed today, you will see how we successfully implemented and what’s the outcome that we have achieved in the particular case study that we’ll be working towards the end of the session. You will see how we use both functional and nonfunctional aspects to ensure that we are able to build a robust application and how that testing has been implemented successfully. You will see that as part of the case study. Moving on to the next slide, just to like I said to give the specific the Cloud Native definition, so this is what is outlined in the Cloud-Native community forum. It’s a very simple statement, it’s basically empowering organizations to build and run scalable applications. 
This has become as a mantra for all the enterprises that are starting to migrate to the cloud platform. Just to give some background, the CNCF was founded in 2015 with the mission for many, the technology leaders became part of the CNCF foundation and have been contributing to the successful implementation of many of the solutions that are available as the blueprint from the CNCF foundation. AWS joined CNCF community in 2017, it’s been two years. They have been continuously participating and providing solutions to make the adoption to make the CNCF forum a successful one. We kind of little bit took that definition and added a few more keywords because based on our experience working with many of the customers, we have seen a few more attributes also become as a need. We added a couple of keywords. It’s basically how you build a robust application but that requires a very strong testing approach as well. That’s what you will see in the subsequent slide. It’s basically, what you’ll see in this additional statement there, aligned to the business needs that we have seen with many of the customers, they look out for a secured way of testing the application as well as make sure that they’re able to distribute the application to meet the demands from the customer base. Moving along, one more slide before we go deep dive into the actual- the testing approach. Like I mentioned before, AWS became a critical partner for the Cloud Native community and providing solutions in the pathway. To give an example, AWS and jointly created a solution so you will see in AWS portal and there is lot of– Solutions like this has made adoption of the Cloud platform much more easier. Apexon has been working on many AWS Cloud implementations. We work with many of the customers to provide the solutions. With this experience, we have designed, developed and implemented many of the testing approach. 
That’s what I’m going to talk about, how we have used and how we have successfully tuned our testing approach that you will see the real benefit of these solutions implementation. Let’s get to the actual next slide where I talk about the different components, the Cloud Native components and how the testing fits into the landscape here. As you can see in this slide, these are all the various components. You find the ways to design and develop the Cloud Native application. The application may or may not have all the components that are listed here but these are all the different components that you will end up choosing to build your application. Keep in mind this is the set-up component keep evolving. This is going through a continuous change because there is so much that is happening day-by-day. I think you need to keep watching what are the components available, what solutions available. For example, the set of solutions that are available for- just to take an example of databases, there are 20 or 25 solutions available. You need to constantly watch out what is available, what would meet your needs or business requirements. You can select those components to build your application design. These components are stacked horizontally. You can see those components, for example App Definition and Development is one product component that’s the first layer. The Orchestration and the Management is the second layer. You will see these components are horizontally stacked against the activities that you will end up doing as part of the development life cycle. In the next slide, what we have done is to transform that into the various Cloud Native component stack, how we kind of put it into five vertical tracks. These are all the tracks that you will end up seeing in your application for your need. Our testing approach is kind of more tied towards these five vertical tracks. That is going to be the– That I will be covering, right? 
These components are divided into five tracks. To give an example, any application development starts with design principles, here the 12-factor App. You want to make sure that an application designed and developed for a Cloud Native platform adheres to the 12-factor App guidelines. To give a simple example, one of the principles is stateless processes: the processes that you build for your Cloud Native platform should be stateless, meaning they should not share information between processes. That’s one of the guidelines. The testing approach that you will see in the subsequent slides covers how to validate that these policies are enforced correctly, and how your microservice architecture is tested: how contract testing happens and how data privacy is validated. That is the core, and I would say a big differentiating factor, when a monolithic application goes through migration to a Cloud Native application. That’s the big delta that you will see, and we’ll come back to it in the next few slides. Moving on to this one. This is, I think, the first slide I talked about. The previous slide showed five tracks: the 12-factor App, microservices, DevOps, Everything-as-a-Service, and the distributed deployment that you will have for Cloud Native applications. We divided these into two categories. The first is what we call the pre-deployment phase, which has the 12-factor verification and the microservice architecture verification. We bundled them as part of pre-deployment. That includes all the activities that you do before the deployment phase: design, development, and review.
All the activities that you do before the application deployment are what we call pre-deployment. From our experience going through multiple implementations and verifying them, we defined two models. The first, which you see here, is the static security factory model. Here we perform the static security services listed on the slide, whether it’s vulnerability, compliance, code vulnerability, or making sure that patch upgrades happen for the various components that you use as part of the Cloud Native application. Those are the things that you want tested successfully as part of this phase. Most of the time, many of these activities become part of the lifecycle. It’s not that you perform them only explicitly, but it is important that you follow the model to ensure that these aspects are validated. The static factory model is primarily used to perform footprinting of the network, the application, or the code, for example checking for SQL injection. The same goes for application certificates: whether they are still effective, or whether there are any expired certificates that you want brought to notice. All of that is part of static security testing. This takes place in the pre-deployment phase and runs against the code that is being deployed. Most of the time there are tools and solutions available to run against, let’s say, GitHub. They scan through the code and perform these checks automatically. The next part is what we call either post-deployment or dynamic security. This is where the application functionality is tested more from the end-user perspective.
You make sure that both the functional and the non-functional aspects are tested successfully here. Like I said, functional and non-functional testing is one branch, but we also run the acceptance tests that come from the users’ use cases. All of that becomes part of the dynamic model. We also verify any change that happens in the code as part of the deployment, especially a deployment across multiple locations, to make sure that the distributed deployment is successfully validated. The idea of the Dynamic Security & Testing Model is to classify the entire production team in terms of the visibility from the end-user perspective. Are they working correctly or not? That’s the primary objective of the dynamic security model. Data processing and session management is another key aspect that we usually cover as part of dynamic testing. I can cite one of our customer scenarios where there was a data privacy issue. We saw it under a specific load condition: data went from one session to another session. In that instance we combined functional testing with load testing to verify how the system behaves when a specific load is generated, especially data privacy, caching, cache refresh, and race conditions. All of that gets tested, and these issues may or may not be there in your monolithic applications. That’s one of the key things we have seen, and we have enhanced our testing model and framework to incorporate it. This is the combined view of the static and the dynamic security models. You’ve seen the pre- and post-deployment phases for a typical Cloud Native application lifecycle. Just to summarize, the static security model is designed to pick the appropriate tool from third-party toolsets. Like I said, there are a bunch of tools available.
AWS provides quite a few components and solutions for any Cloud Native application design. We strongly recommend looking at the AWS toolset so that you can leverage these solutions for your application. This is primarily to identify the scope and the tools involved, and to find out whether there are gaps that a tool does not support today. It might be on the roadmap; identify the gaps and decide whether you want to extend that tool. That’s another activity that you will do. Also identify the vulnerabilities reported in the tool’s report, and see how to mitigate them and what solutions are available to address them. You need to put this in your approach. One key thing, especially for the dynamic security model: we have built an automation framework that validates both the functional and the non-functional aspects. We call it the Cyber-Secured QMetry Automation Framework. It supports many of the tests I talked about here, like caching conditions, session management, and data privacy. We have a mechanism to generate those specific conditions and validate them through automated functions. The QMetry Automation Framework supports the testing needs that we see in a typical Cloud Native application. Moving along, this is a simplified view of the typical testing life cycle that you will see in any application, from a pure SDLC perspective. It covers the set of testing activities that you will perform at any given stage of application development. It highlights what kind of functional testing and what kind of non-functional testing you do at which stage of your application. All of that is covered in this simplified diagram.
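Illustration (not from the webinar slides): the customer scenario described earlier, where data moved from one session to another only under load, can be sketched as a functional assertion that runs while concurrent load is being generated. This is a minimal sketch and not the QMetry Automation Framework. `SessionStore`, `run_isolation_check`, and the session and user naming are hypothetical, and an in-memory store stands in for the real application.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SessionStore:
    """Hypothetical in-memory stand-in for an application's session layer."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def write(self, session_id, value):
        with self._lock:
            self._data[session_id] = value

    def read(self, session_id):
        with self._lock:
            return self._data.get(session_id)


def run_isolation_check(store, sessions=50, rounds=200, workers=16):
    """Generate concurrent load and return every read that did not match
    what the same session had just written."""
    leaks = []
    leaks_lock = threading.Lock()

    def exercise(session_index):
        session_id = f"session-{session_index}"
        for round_no in range(rounds):
            expected = f"user-{session_index}:{round_no}"
            store.write(session_id, expected)
            seen = store.read(session_id)
            # Functional check under load: a session only ever sees its own data.
            if seen != expected:
                with leaks_lock:
                    leaks.append((session_id, expected, seen))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all workers and re-raises any worker exception.
        list(pool.map(exercise, range(sessions)))
    return leaks


if __name__ == "__main__":
    found = run_isolation_check(SessionStore())
    print(f"cross-session leaks detected: {len(found)}")
```

Against a real service, the `write` and `read` calls would be replaced by requests to the application, and the same check would run while a separate load generator drives traffic.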
With the help of the static security factory model and the dynamic security model, it is possible for the delivery team to identify resources: what skill set you need, and what kind of training you may need for your team members to achieve the right level of testing. Also as part of this, you will see the risks and the proposed mitigation plan. Like I said, the automation framework that we use for this kind of engagement comes with a lot of pre-built accelerators. There could be a situation with a specific condition: it could be a race condition, or a cache refresh happening under a specific load condition or time condition. To reproduce it, we mimic the load and also make sure that the component has been running for a certain time. We have a mechanism to inject those specific conditions so that we can mimic those production situations. That’s what we cover as part of our framework. I think I will hand it over to my partner in this session, Anjali. She will cover a few slides on what AWS offers in this particular space. Anjali?
Anjali: Yes.
Siva: Thank you.
Anjali: Can you hear me okay?
Siva: Yes, Anjali, I can hear you.
Anjali: Thank you, Siva. That was a lot of great information. Hi, everybody. This is Anjali, a partner solutions architect at AWS. I will speak to each of the bullet points on the slide here and touch upon some core AWS services that fit well with Apexon’s Cloud Native Stack. To give you an idea of the breadth of the AWS platform, specifically with programming support, we have a lot of service offerings that automatically deploy and manage applications and systems. We have a rich set of capabilities for application-centric development. We support Java, Node, Python, Ruby, PHP, .NET, and Docker workloads on AWS Elastic Beanstalk and Amazon EC2 Container Service.
Talking about elasticity in the cloud: before, customers used to over-provision to ensure they had enough capacity to handle their business operations at the peak level of activity. Now they can provision the amount of resources that they actually need, knowing that they can instantly scale up or down along with the needs of their business. This also reduces cost and improves the customer’s ability to meet users’ demands. Next, talking about user authentication, notifications, and queueing systems, we have a service called Amazon Cognito, which lets you add user sign-up, sign-in, and access control to your web and mobile apps. For notifications, we have Amazon Simple Notification Service (SNS), a highly durable, secure, fully managed pub/sub messaging service. We also have Amazon Simple Queue Service (SQS), a fully managed message queueing service. Both Amazon SNS and Amazon SQS enable you to decouple microservices, distributed systems, and serverless applications. Next, touching on storage and networking in AWS, Amazon Simple Storage Service (S3) is an object storage service that offers industry-leading scalability, security, and performance. There’s Amazon S3 Glacier for archival. For networking, there’s Amazon Virtual Private Cloud (VPC), with which you can provision a logically isolated section of the AWS Cloud and launch AWS resources in a virtual network that you define. We also have a complete set of tools to automate code builds and deployments: AWS CloudFormation, AWS CodePipeline, and AWS CodeDeploy. For API management, there’s Amazon API Gateway. Josh, if we can go to the next slide, thank you. In this slide we’ll talk about how we can apply the 12-factor App methodology to serverless applications. With reference to the 12-factor App methodology, let’s quickly go through these best practices.
For codebase: for a single serverless application, your code should be stored in a single repository, from which a single deployable artifact is generated and deployed. This single codebase should also represent the code used in all of your application environments, like development, staging, production, et cetera. For dependencies: code that needs to be used by multiple functions should be packaged into its own library and included in your Lambda deployment package. For config: both Lambda and Amazon API Gateway allow you to set configuration information using the environment in which each service runs. For backing services: because Lambda doesn’t allow you to run another service as part of your function execution, this factor is basically the default model for Lambda. Typically, you reference any database or data store as an external resource via an HTTP endpoint or DNS name, and these connection strings are ideally passed in via the configuration information. For build, release, run: AWS CodeDeploy, AWS CodeCommit, and AWS CodePipeline can be used for this factor. For port binding: this factor does not apply to Lambda, as execution environments do not expose any direct networking to your functions. Instead of a port, Lambda functions are invoked via one or more triggering services or the AWS APIs for Lambda. For concurrency: Lambda was built with massive concurrency and scale in mind. Lambda automatically scales to meet the demand of invocations that you send to your functions. For disposability: shutdown doesn’t apply to Lambda because Lambda is intrinsically event-driven, so invocations are tied directly to incoming events or triggers.
For dev/prod parity: products within the AWS serverless platform do not charge for idle time, which greatly reduces the cost of running multiple environments. You can also use the AWS Serverless Application Model, or AWS SAM, to manage the configuration of your separate environments. AWS SAM allows you to model your serverless applications in a greatly simplified AWS CloudFormation syntax. With SAM, you can use CloudFormation capabilities such as parameters and mappings to build dynamic CloudFormation templates. For logs: API Gateway provides two different methods for getting log information, execution logs and access logs, and both are made available to you in Amazon CloudWatch Logs.
Siva: Okay. Thanks a lot, Anjali, for the detailed walkthrough of the AWS offerings for the Cloud Native application stack. Continuing from the previous two slides, where Anjali talked about the AWS offerings, what you will see here is the set of components available for the various tracks. You will see the different solutions available for each track, starting from the application and database technologies; then you will see, for the CA Technologies, what is available from the AWS perspective; and you will also see the infrastructure and deployment aspects. This is a somewhat loaded slide. You will see a lot of solutions, and I just want to highlight some, if not all, of the solutions that come from AWS. I think Anjali touched upon some of the log components and the database components. We continue to see more and more components from AWS filling this space across these different activities. The static and dynamic models and the testing model that you saw in the previous slides support all these components.
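Illustration (not from the webinar slides): the config and backing-services factors from Anjali’s walkthrough come down to reading connection details from the function’s environment instead of hard-coding them. The sketch below assumes a Python function on Lambda. The variable names `DB_ENDPOINT` and `STAGE` and the helper `load_config` are hypothetical, not AWS-defined names.

```python
import os


def load_config(env=None):
    """Read settings from environment variables (12-factor 'config').

    DB_ENDPOINT and STAGE are illustrative names chosen for this sketch.
    """
    env = os.environ if env is None else env
    endpoint = env.get("DB_ENDPOINT")
    if not endpoint:
        # Fail fast so a misconfigured environment is caught on first invocation.
        raise RuntimeError("DB_ENDPOINT is not set")
    return {"db_endpoint": endpoint, "stage": env.get("STAGE", "development")}


def handler(event, context):
    """Lambda-style entry point: the same code runs in every environment."""
    config = load_config()
    # The backing service is referenced only through its configured endpoint.
    return {"stage": config["stage"], "db_endpoint": config["db_endpoint"]}
```

Passing the environment in as an argument keeps the function easy to test: a pre-deployment check can call `load_config` with a plain dictionary for each environment and confirm that nothing required is missing.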
Irrespective of which components you choose for your application development, the framework supports coming up with a testing approach for those components. Again, I want to repeat that this space is filling up with components day by day, so you will see new components popping up, with more and more coming from AWS, to support the Cloud Native stack. Continuing on, I have a couple of slides on the approach we take for both the static security and the dynamic security models. The next two slides cover these two models. In the case of the static model, we perform a set of activities, right from validating the readiness of the infrastructure, through setting it up, to activating the controls or quality gates and then monitoring and enforcing those controls during the life cycle of the application. That is what we do as part of the static security model definition. As we go through this set of activities, we identify what is available today and what we need to adopt from, for example, the AWS platform, and we put the necessary controls in place, for example the controls that you want for the 12-factor Apps. Again, depending on your application design requirements, we pick and choose the testing approach that you want to have. All of that happens as part of the approach defined here. You will see an output here in terms of certain artifacts that serve as a reference document. We continue to maintain it going forward, because it is a living document: any time a policy is enhanced, the document is updated, and it is used for both the monitoring and the controlling aspects.
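Illustration (not from the webinar slides): the idea of activating a control as a quality gate can be sketched with one of the static checks mentioned earlier, flagging expired certificates. This is a minimal sketch under stated assumptions. `Certificate`, `certificate_gate`, and the 30-day warning window are hypothetical, and a real pipeline would read expiry dates from the deployed certificates instead of hand-built records.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Certificate:
    """Hypothetical record of a certificate found during a static scan."""
    name: str
    not_after: date


def certificate_gate(certs, today, warn_days=30):
    """Return (passed, findings); the gate fails if any certificate has expired.

    Certificates expiring within warn_days are reported but do not fail the gate.
    """
    findings = []
    passed = True
    for cert in certs:
        if cert.not_after < today:
            findings.append(f"{cert.name}: expired on {cert.not_after.isoformat()}")
            passed = False
        elif cert.not_after - today <= timedelta(days=warn_days):
            findings.append(f"{cert.name}: expires on {cert.not_after.isoformat()}")
    return passed, findings
```

Separating the hard failure (expired) from the warning (expiring soon) mirrors the monitor-versus-enforce distinction in the static model: warnings feed the living reference document, while failures stop the deployment.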
On the dynamic side, we go through pretty much the same set of activities, with a little variation in terms of what testing we do and what infrastructure monitoring we do as part of these systems. Especially for the dynamic part, a lot of log analysis happens. Whether it is application execution logs or test execution logs, we want to use the tools that are available to automatically validate anything that can be inferred from those logs. In this approach, one of the outputs that we produce as part of every engagement is a risk indicator. It’s a low, medium, or high risk indicator that we produce to constantly evaluate the state of the application, focused mainly on the security aspect. Just to give you a flavor of the different testing that we do for typical Cloud Native applications: we look at it from four angles, in terms of whether it is business-facing, a product, the criticality, or customer-facing. For example, I talked about the auto-scaling and data swap validation. This is something we see in many enterprises when they move from monolithic to cloud applications: these problems tend not to be taken care of during the design phase, so we cover them as part of our testing specific to Cloud Native applications. The second part is contract testing. There is big adoption of the microservices architecture in Cloud Native applications, so contract testing is one of the key things, and our automation framework has a mechanism to support it. One last slide before I complete my session, and then we can go into the Q&A session. This is the case study that I talked about in the beginning. It is from one of our current customer engagements.
This customer is in the healthcare industry; they are a platform provider. They provide a platform for pharmacy benefit management. This particular implementation, or customer environment, requires processing a high volume of data and transactions. With the customer being in the healthcare industry and acting as a platform, there is going to be a high volume of data transactions. The requirement here was basically about reduction: obviously, cost was one of the key factors for us while coming up with the solution, and we also wanted a solution that would reduce maintenance sufficiently and make application deployment easier. In the final solution that you see here, hopefully you can read some of the components. To give an example, the database used before the migration was SQL Server. We chose the replacement based on the data growth that was identified: the requirement was to support six to eight times today’s data volume within the next three months. Based on that, our engineering team went with the Amazon Aurora database, and that’s what we used for this engagement. For the logging mechanism, which Anjali also highlighted, we used Amazon CloudWatch for the log analysis. We heavily used our framework, with both static and dynamic testing, and we made both part of the CI/CD process. As you can see, there was a huge cost saving, highlighted here, both from moving from the local monolithic application to AWS cloud applications and from adopting some of the other components.
There was a huge cost saving that we were able to realize in close to nine months of the engagement. This is an ongoing engagement that we are currently working on, but we were able to see some of the benefits within those nine months. We are almost done. Before I recap what we have discussed, we do have a lot of events coming up; you can visit our website to see some of the upcoming events. You are more than welcome to sign up. We also have a DTV channel, where you will see a lot of videos being posted. To summarize: what is the Cloud Native Stack that we talked about, and what do we bring to the table in terms of the testing framework? The Cyber-Secured QMetry Automation Framework is something that we use, and it is what we recommend for any AWS Cloud Native development; it supports both the functional and the non-functional aspects. This is the last slide before I hand it over to Josh. You will continue to see this in terms of the dynamic and static security models; it is definitely something we have seen. We have been working in this space for the last three years or so, and we have evolved our framework to support the demands and needs that we have seen across many of our customers. If anyone is looking for a quick assessment or validation, please reach out to Apexon. We’ll be more than happy to have a quick discovery call to share what we’ve talked about here and what we do as an assessment exercise, and to exchange ideas to help you overcome the challenges that you might have. With that, I will hand it over to Josh.
Josh: Great, thanks, Siva. At this time, we’d like to get your questions answered. As a reminder, you can ask a question at any time using the chat feature at the bottom of your screen. The first question is for Siva. It looks like we’re getting a couple, and I want to get this one out the door real quick.
Do static and dynamic models work in a cloud-agnostic, microservice-based setup?
Siva: Yes, the answer is yes. The framework fully supports any kind of cloud that you might end up using. In the customer implementation we used AWS, but our framework supports other cloud platforms as well.
Josh: Okay, great, thank you. Looks like we’re getting a couple more questions. Someone’s asking: on average, how many vulnerabilities and exploits do you come across for a single app, in a single-server type of setup?
Siva: It depends on various factors. Especially at the beginning of the application life cycle, we do see many vulnerabilities, particularly if teams are not following the 12-factor App policies and other guidelines. The good thing is that the framework and the tools that are available tend to catch them very early in the cycle, so we are able to come up with a fix. Version compatibility in particular is one of the things to watch, because components come out with new versions very often, and patches are being released. So you want to make sure that you are up to date with the latest patches that are available.
Josh: Okay, great. We have another question coming in. Do you have automated scripts for the non-functional maturity model and the static security factory model as well?
Siva: Let me answer the question. In most cases, yes. Like I mentioned, the framework is able to support both the static and the dynamic parts. There are certain situations where automated solutions may not be applicable. But the answer is yes: I would say a good 80% to 90% of the testing that we do today in our engagements is fully automated.
Josh: Excellent. All right. It looks like, unfortunately, we’re running short on time. Like I mentioned earlier, if you want to submit your questions, please go ahead and do so.
Continue to do so and we will respond to them via email. I believe that’s all the time we have for today. As Siva mentioned, if you’re interested in a discovery call with us, please submit that request via info@apexon.com. You can also reach us at 408-727-1100. I want to thank both Anjali from AWS and Siva from Apexon for taking the time today. It was a really interesting discussion, and we look forward to hearing from you if you have any questions. Otherwise, enjoy the rest of your day. Thank you very much.