#CloudGuruChallenge Improve Application Performance using Amazon ElastiCache
My Take on the latest #CloudGuruChallenge
A Cloud Guru is one of the leaders when it comes to cloud education. Not only are their classes and hands-on labs top-notch, but they also have an incredible community. Last year, A Cloud Guru started a monthly community project called the #CloudGuruChallenge. Each month, an A Cloud Guru instructor introduces a project that solves a specific problem one may encounter on the job. These projects introduce community members to new technologies and give them an opportunity to work on a problem outside of their comfort zone. This month's challenge can be found here.
This month's challenge is to improve an application's performance using a Redis cluster from Amazon ElastiCache. The application is written in Python and uses Flask as its web framework. Amazon RDS is used to provision a PostgreSQL database. The source code for the project can be found here.
How I Solved The Core Challenge
My GitHub repo for this project can be found here.
Before I explain my solution, I must admit that it has room for improvement. If I had more time, I would do more to secure my RDS database. As of now, I have hardcoded the username and password into the Terraform script for simplicity's sake. I am also new to Terraform, and I am 100% positive that I made some mistakes. But I also know a whole lot more about Terraform than I did a few days ago.
Most of my experience with Python comes from working on data science projects in Jupyter notebooks. Since I was unfamiliar with all of the Python packages used in this project, I spent some time reading up on Flask and on setting up Python environments using virtualenv. For the technical aspects of utilizing Redis in my project, I learned everything I needed from this ACG hands-on lab. This lab gives you most of what you need to solve this project.
I first solved this challenge by using the console to provision my RDS, ElastiCache, and EC2 instances. I logged into my EC2 instance, set up the needed dependencies, downloaded my source code from GitHub, and got my initial page load times. I measured the load time by running the curl command against localhost five times in a row.
As the image shows, the response time for hitting the specified route and querying the database five times in a row was about 25 seconds.
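For anyone who would rather script this measurement than repeat curl by hand, here is a minimal Python sketch. It is not the command used above: the URL (Flask's default port 5000 and the root route are assumed) and the helper names fetch_page, time_requests, and summarize are hypothetical.

```python
import time
import urllib.request
from typing import Callable, List, Tuple

# Assumption: the Flask app listens on its default port and serves the root route.
APP_URL = "http://localhost:5000/"


def fetch_page(url: str = APP_URL) -> bytes:
    """Request the page once and return the response body."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def time_requests(fetch: Callable[[], object], runs: int = 5) -> List[float]:
    """Call fetch() `runs` times in a row and return each call's duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fetch()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Tuple[float, float]:
    """Return (total seconds, average seconds per request)."""
    total = sum(durations)
    return total, total / len(durations)
```

Running summarize(time_requests(fetch_page)) on the instance would give the total and the average for five consecutive requests, which is the same pair of numbers compared before and after adding the cache.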
The next step was to refactor the application to use Redis as a cache and see how this would affect its performance. I refactored the fetch() function of my app.py file to first check the Redis cache to see if the database response is saved there. If not, it connects to the database, runs the query, and then saves the results to the Redis cache. To make the comparison easier, I wrote two files: app.py, which doesn't utilize Redis, and redis-app.py, which does. The code to utilize Redis is below.
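A minimal sketch of the cache-aside flow just described follows. It is an illustration rather than the exact contents of redis-app.py: the key name, the 60-second expiry, the JSON serialization, and the DictCache stand-in are all assumptions. Only the get and setex calls mirror methods that exist on the redis-py client.

```python
import json
from typing import Any, Callable

CACHE_KEY = "slow_query_result"  # hypothetical key name
CACHE_TTL_SECONDS = 60           # hypothetical expiry for cached results


def fetch(cache: Any, query_database: Callable[[], Any]) -> Any:
    """Cache-aside read: check the cache first, fall back to the database on a miss."""
    cached = cache.get(CACHE_KEY)
    if cached is not None:
        # Cache hit: skip the database entirely.
        return json.loads(cached)
    # Cache miss: run the slow query, then store the result for later requests.
    result = query_database()
    cache.setex(CACHE_KEY, CACHE_TTL_SECONDS, json.dumps(result))
    return result


class DictCache:
    """In-memory stand-in exposing the two client methods used above (get, setex)."""

    def __init__(self) -> None:
        self._data = {}

    def get(self, key: str):
        return self._data.get(key)

    def setex(self, key: str, ttl_seconds: int, value: str) -> None:
        # The expiry is ignored here; a real Redis server would enforce it.
        self._data[key] = value
```

With a real cluster, the cache argument would be a redis.Redis client pointed at the ElastiCache endpoint, and query_database would wrap the PostgreSQL query. Passing both in as arguments keeps the sketch runnable without either service.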
How would our application perform once we implemented our Redis cache? Incredibly fast!
Of course, our first query took significantly longer than the rest since it had to be retrieved from our database and then stored in our Redis cache. All subsequent queries were retrieved much more quickly. As the above image shows, the five requests took only about 5 seconds in total.
By using our Redis cache, we were able to bring our page load time down from an average of 5 seconds to less than 1 second.
Taking It To The Next Level
After completing this challenge, it almost felt too easy. I provisioned everything in the AWS console and ran all of my commands in the command line of my EC2 instance. I wanted to try implementing the infrastructure of this project as IaC, or Infrastructure as Code. Since I have a lot of experience with AWS CloudFormation and I had recently begun to learn Terraform, I wanted to try implementing the infrastructure for this project in Terraform.
Honestly, before this project I hadn't done more than deploy an EC2 instance using Terraform. So I spent quite a bit of time researching how to implement the necessary infrastructure in Terraform. Using Terraform, I deployed a new VPC with all of the necessary public and private subnets and security groups for my RDS database, EC2 instance, and ElastiCache cluster. Using EC2 user data, I was able to download all of the necessary dependencies and source code to my instance. All I needed to do was start the application and get my load times. My results were the same as above.
Why I Took the Challenge
I am currently looking for a job in tech. I work in education and want to make the transition to a Cloud Developer role. This seemed like a fun project that would help me broaden my skill set and would be good to add to my portfolio.
Challenges I Faced
The first challenge I faced in this project was setting up my Python environment and dependencies. Since most of my Python experience is from Jupyter notebooks, I had to learn how to provision a virtual environment and properly install dependencies. After much research and practice, I now feel confident setting up a Python project.
My next challenge was setting up nginx. I don't have any experience with nginx or with using it as a proxy. A quick Google search showed what is needed to set up nginx on an EC2 instance. However, I ran into some trouble with setting up the proxy. Applying the Pareto principle, I decided that the nginx proxy wasn't necessary to solve this project and that I could easily get the results from the command line. I decided to spend my time studying Terraform and implementing Redis instead. This was a good opportunity to make a mental note to read up on nginx in the future.
My third challenge was using Terraform to provision my infrastructure. I took my first deep dive into the Terraform documentation. After a few hours and a dozen open Chrome tabs, I finally came up with my Terraform file. As a self-professed "serverless enthusiast", it has been a while since I had to think about VPCs and subnets. This project helped me refresh on these topics.
Before this challenge, I knew about Redis only from a conceptual standpoint. Now I understand how to provision and use a Redis cluster to improve an application's performance. My biggest takeaway is how great Terraform is. I love using CloudFormation, AWS SAM, and the Serverless Framework to provision infrastructure, but I am very excited to add Terraform to this list.