#CloudGuruChallenge - Improve application performance using Amazon ElastiCache

The purpose of this challenge is to implement a Redis cluster using Amazon ElastiCache to cache database queries in a simple Python application. More details on the challenge here.

I decided to deploy all the resources as a CloudFormation stack, with an EC2 instance bootstrapped through user data, so that the application could be replicated easily. When writing a CloudFormation (CF) template, you need a clear picture of every resource the application uses, as well as its prerequisites: Availability Zones, networking, security groups, ACLs, and so on. You also need to think about parameters such as the database account and password, and the environment variables for the EC2 instance.

I wrote a simple CF template that creates the following resources in a stack:

  • Amazon S3 bucket to store the application code retrieved from the GitHub repository. Pulling the code directly from the GitHub repo may be a simpler and better approach
  S3Bucket: #S3 Bucket
    DeletionPolicy: Retain
    Type: AWS::S3::Bucket
    Properties:
      BucketName: "s3bucketname" #bucket names must be lowercase; matches the ARNs in the role policy
      Tags:
        - Key: key1
          Value: value1
  • EC2 instance profile associated with a role that has read and write permissions on the S3 bucket, so that the EC2 instance can pull the code from S3
  EC2Role: #IAM Role for EC2 instance profile (S3 bucket access)
    Type: AWS::IAM::Role
    DependsOn: S3Bucket
    Properties:
      Tags:
        - Key: key1
          Value: value1
      Description: "description"
      AssumeRolePolicyDocument: 
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - ec2.amazonaws.com
            Action:
              - 'sts:AssumeRole'
      Path: /
      Policies:
        - PolicyName: "policyname"
          PolicyDocument:
            Version: "2012-10-17"
            Statement: 
              - Effect: Allow
                Action: 's3:*'
                Resource: 
                  - arn:aws:s3:::s3bucketname
                  - arn:aws:s3:::s3bucketname/*
  EC2InstanceProfile: #EC2 Instance profile
    Type: AWS::IAM::InstanceProfile
    DependsOn: EC2Role
    Properties:
      Path: /
      Roles:
        - !Ref EC2Role
  • Security Group for EC2 instance
  EC2SG: #SecurityGroup for EC2 Instance
    Type: AWS::EC2::SecurityGroup
    Properties: 
      GroupDescription: "SecurityGroup for EC2 Instance - Allow all"
      GroupName: "ec2securitygroupname"
      SecurityGroupIngress:
        - 
          IpProtocol: tcp
          FromPort: 0
          ToPort: 65535
          CidrIp: 0.0.0.0/0
      Tags: 
        - Key: key1
          Value: value1
  • Security groups for the RDS database and the ElastiCache cluster, each referencing the EC2 instance SG so that only traffic from that EC2 instance is allowed
  DatabaseSG: #SecurityGroup for database
    Type: AWS::RDS::DBSecurityGroup
    DependsOn: EC2SG
    Properties:
      GroupDescription: "description"
      DBSecurityGroupIngress: 
        - EC2SecurityGroupName: !Ref EC2SG #referencing EC2 instance SG as allowed inbound traffic
      Tags: 
        - Key: key1
          Value: value1
  ElastiCacheSG: #SecurityGroup for ElastiCache
    Type: AWS::EC2::SecurityGroup
    DependsOn: EC2SG
    Properties: 
      GroupDescription: "description"
      GroupName: "securitygroupname"
      SecurityGroupIngress: 
        - IpProtocol: tcp
          FromPort: 0
          ToPort: 65535
          SourceSecurityGroupName: !Ref EC2SG #referencing EC2 instance SG as allowed inbound traffic
      Tags: 
        - Key: key1
          Value: value1
  • RDS instance with PostgreSQL as engine
  DBInstance: #RDS database Instance
    Type: AWS::RDS::DBInstance
    DependsOn: 
      - DatabaseSG
    Properties:
      Engine: "postgres"
      DBInstanceIdentifier: "identifiername"
      DBInstanceClass: db.t2.micro
      AllocatedStorage: 20
      DBName: "dbname"
      MasterUsername: !Ref DatabaseAccount #referencing the master user account parameter
      MasterUserPassword: !Ref DatabasePassword #referencing the master password parameter
      StorageType: gp2
      MaxAllocatedStorage: 20
      DBSecurityGroups: 
        - !Ref DatabaseSG
      Tags:
        - Key: key1
          Value: value1
  • ElastiCache for Redis cluster for caching
  ElastiCache: # ElastiCache for Redis Cluster
    Type: AWS::ElastiCache::CacheCluster
    DependsOn: ElastiCacheSG
    Properties:
      ClusterName: "redisclustername"
      CacheNodeType: cache.t2.micro
      Engine: redis
      VpcSecurityGroupIds:
        - !GetAtt ElastiCacheSG.GroupId #referencing the ElastiCache SG created above
      NumCacheNodes: 1
      Tags:
        - Key: key1
          Value: value1
  • EC2 instance with bootstrap code to install the prerequisites and run the Python application
  EC2: #EC2 Instance for app
    Type: AWS::EC2::Instance
    DependsOn: 
      - EC2SG
      - EC2InstanceProfile
      - ElastiCache
      - DBInstance 
    Properties: 
      ImageId: ami-0aeeebd8d2ab47354
      InstanceType: t2.micro
      KeyName: keyname #specify an existing key name
      SecurityGroups:
        - !Ref EC2SG
      IamInstanceProfile: !Ref EC2InstanceProfile
      UserData: 
        !Base64 |
        #!/bin/bash
        sudo mkdir /home/app
        sudo aws s3 cp s3://hpf-acg-elasticache/app/ /home/app --recursive
        sudo yum -y update
        sudo yum -y install python3
        sudo yum -y install postgresql
        sudo cp /home/app/config/.pgpass /home/ec2-user/.pgpass
        sudo chmod 0600 /home/ec2-user/.pgpass
        sudo chown ec2-user:ec2-user /home/ec2-user/.pgpass
        export PGPASSFILE='/home/ec2-user/.pgpass'
        export REDIS_URL=redis://redisendpoint_URL:6379
        psql -h rdsendpointURL -U postgres -f /home/app/install.sql databasename
        cd /home/app
        python3 -m venv /home/app
        source /home/app/bin/activate
        python3 -m pip install --upgrade pip
        pip install -r requirements.txt 
        python3 /home/app/app.py
      Tags:
        - Key: key1
          Value: value1
        - Key: "Name"
          Value: "ec2instancename"
  • As parameters, I provided the RDS database account username and password. The Redis endpoint URL is passed to the app as an environment variable (REDIS_URL in the user data above)
Parameters:
  DatabaseAccount: #DB master account
    Description : "The database admin account. Default is postgres"
    Type : String
    Default: postgres
    MinLength : 1
    MaxLength : 41
    AllowedPattern : ^[a-zA-Z0-9]*$
  DatabasePassword: #DB master account password
    NoEcho: True
    Description : "The database admin account password"
    Type : String
    MinLength : 1
    MaxLength : 41
    AllowedPattern : ^[a-zA-Z0-9]*$
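
To make the parameter constraints concrete, here is a small Python sketch that mimics the MinLength, MaxLength and AllowedPattern checks declared for DatabasePassword. The function name is_valid_parameter is hypothetical, and this is only a local approximation of the behaviour, not CloudFormation's own validation code.

```python
import re

# Constraints copied from the Parameters section of the template.
MIN_LENGTH = 1
MAX_LENGTH = 41
ALLOWED_PATTERN = r"^[a-zA-Z0-9]*$"


def is_valid_parameter(value: str) -> bool:
    """Return True if value satisfies the length and pattern constraints."""
    if not MIN_LENGTH <= len(value) <= MAX_LENGTH:
        return False
    # fullmatch requires the whole value to be alphanumeric.
    return re.fullmatch(ALLOWED_PATTERN, value) is not None


if __name__ == "__main__":
    for candidate in ["postgres", "", "p@ssword", "a" * 42]:
        print(repr(candidate), is_valid_parameter(candidate))
```

Note that this pattern rejects special characters, so a password supplied to the stack has to be purely alphanumeric.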

Figure: CloudFormation template diagram (CF.jpg)

To access the app through the public IP of the EC2 instance, add the commands below to the user data. They install nginx and copy the /app/config/nginx-app.conf file to the /etc/nginx/conf.d/ folder of your EC2 instance.

        sudo amazon-linux-extras install -y nginx1
        sudo chmod -R 755 /home/app
        sudo chown -R ec2-user:nginx /home/app
        sudo cp /etc/nginx/nginx.conf /etc/nginx/nginx.conf-orig
        sudo cp /home/app/config/nginx-app.conf /etc/nginx/conf.d/nginx-app.conf
        sudo systemctl start nginx
        sudo systemctl enable nginx

This CloudFormation template is neither perfect nor complete: many more parameters could be added, as well as mappings.
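
For completeness, a stack built from a template like this one could also be created from Python with boto3 instead of the console. The sketch below is an assumption about how that might look, not how I deployed it: the helper names build_parameters and create_stack, the stack name and the template path are all hypothetical. Only the two parameter keys come from the template.

```python
def build_parameters(db_account: str, db_password: str) -> list:
    """Map values onto the two parameters declared in the template."""
    return [
        {"ParameterKey": "DatabaseAccount", "ParameterValue": db_account},
        {"ParameterKey": "DatabasePassword", "ParameterValue": db_password},
    ]


def create_stack(stack_name: str, template_path: str, parameters: list) -> None:
    """Create the stack and block until CloudFormation reports completion."""
    import boto3  # imported lazily so build_parameters works without boto3

    with open(template_path) as handle:
        template_body = handle.read()

    client = boto3.client("cloudformation")
    client.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=parameters,
        # Needed because the template creates an IAM role and instance profile.
        Capabilities=["CAPABILITY_IAM"],
    )
    client.get_waiter("stack_create_complete").wait(StackName=stack_name)


if __name__ == "__main__":
    # Hypothetical values; running create_stack needs AWS credentials.
    params = build_parameters("postgres", "changeme123")
    print(params)
    # create_stack("acg-elasticache-demo", "template.yaml", params)
```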

Now that the app is running, you can see that the elapsed time is always above 5 seconds: 5.09140s, 5.08107s, 5.06325s...

Let's add the caching layer with the following logic:

  • Check the Redis cache before querying the database.
  • If a cache miss occurs, query the database and update the cache with the results.

I updated the app code by first isolating the RDS query in a separate function that takes the SQL statement as a parameter:

def query(sql):
    # connect to the database listed in database.ini
    conn = connect()
    if conn is None:
        return None
    try:
        cur = conn.cursor()
        cur.execute(sql)
        # fetch one row
        retval = cur.fetchone()
        cur.close()
        return retval
    finally:
        # close the db connection even if the query raises
        conn.close()
        print("PostgreSQL connection is now closed")

Next step is the Redis cache initialization:

import json
import os

import redis

# Read the Redis connection URL from the REDIS_URL environment variable.
REDIS_URL = os.environ.get('REDIS_URL')

# Initialize the cache
cache = redis.Redis.from_url(REDIS_URL)

# Time to live for cached data - 10 seconds for this example
TTL = 10
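
One thing to watch for here: os.environ.get('REDIS_URL') returns None when the variable is not set, for example when the app is started from a shell where it was never exported. A minimal sketch of a guard that fails with a clear message follows. The helper load_redis_url is hypothetical and not part of the original app; the accepted schemes are the ones redis-py's from_url understands.

```python
import os


def load_redis_url(environ=os.environ) -> str:
    """Return REDIS_URL from the environment or raise a descriptive error."""
    url = environ.get("REDIS_URL")
    if not url:
        raise RuntimeError(
            "REDIS_URL is not set; expected something like redis://<endpoint>:6379"
        )
    if not url.startswith(("redis://", "rediss://", "unix://")):
        raise RuntimeError(f"REDIS_URL has an unsupported scheme: {url!r}")
    return url


if __name__ == "__main__":
    # Pass a plain dict instead of os.environ to try it out.
    print(load_redis_url({"REDIS_URL": "redis://localhost:6379"}))
```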

Finally, I added a fetch function that checks the Redis cache first and updates Redis on a cache miss:

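def fetch(sql):
    # check the Redis cache first, using the SQL text as the key
    result = cache.get(sql)
    if result is not None:
        return json.loads(result)
    # cache miss: query the database and store the result in Redis for TTL seconds
    result = query(sql)
    cache.setex(sql, TTL, json.dumps(result))
    return result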
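
One detail worth knowing about fetch: the value is stored as JSON, so a cache hit does not return exactly the same Python object as a cache miss. The sketch below illustrates this with plain json calls. The sample row is made up, and default=str is just one possible way to handle types such as Decimal or date that json.dumps rejects by default; the original app does not use it.

```python
import datetime
import decimal
import json

# A made-up row, shaped like what cursor.fetchone() could return.
row = (1, "widget", decimal.Decimal("9.99"), datetime.date(2021, 6, 1))

# json.dumps(row) alone would raise TypeError because of Decimal and date;
# default=str converts unsupported values to strings instead.
payload = json.dumps(row, default=str)

# After the round trip the tuple has become a list and the special
# types have become strings.
restored = json.loads(payload)

if __name__ == "__main__":
    print(payload)
    print(restored)
```

If the calling code relies on tuples or on numeric and date types, it has to convert the cached value back itself.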

Let's run the app:

  • The first request also takes more than 5 seconds to complete.
  • From the fourth request onward, the elapsed time drops to 0.00150 seconds or lower.
  • After waiting 10 seconds (the value of the cache TTL), the elapsed time is back to 5+ seconds.
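
The miss, hit and expiry pattern in the list above can be reproduced locally without AWS. The sketch below is a simulation under stated assumptions: FakeCache is a hypothetical in-memory stand-in for the two Redis calls used by fetch, the slow query is a sleep of 0.2 seconds instead of about 5 seconds, and the TTL is 0.5 seconds instead of 10.

```python
import json
import time


class FakeCache:
    """In-memory stand-in for the get/setex calls that fetch() uses."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # entry expired, behave like a miss
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        # A fractional TTL is fine for this fake; real Redis expects whole seconds.
        self._store[key] = (value, time.monotonic() + ttl_seconds)


SLOW_QUERY_SECONDS = 0.2  # stands in for the ~5 s database call
TTL = 0.5                 # stands in for the 10 s TTL
cache = FakeCache()


def query(sql):
    time.sleep(SLOW_QUERY_SECONDS)  # simulate the slow database
    return ["row for " + sql]


def fetch(sql):
    result = cache.get(sql)
    if result is not None:
        return json.loads(result)
    result = query(sql)
    cache.setex(sql, TTL, json.dumps(result))
    return result


def timed_fetch(sql):
    """Return how long a single fetch() call takes, in seconds."""
    start = time.perf_counter()
    fetch(sql)
    return time.perf_counter() - start


if __name__ == "__main__":
    sql = "SELECT 1"
    print(f"miss:         {timed_fetch(sql):.5f}s")
    print(f"hit:          {timed_fetch(sql):.5f}s")
    time.sleep(TTL)
    print(f"after expiry: {timed_fetch(sql):.5f}s")
```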

Figure: Before Redis (before redis.png)

Figure: After Redis (after redis.png, after redis2.jpg)

You can find more information on improving application performance with ElastiCache in the AWS documentation.

This challenge was a lot of fun to work on. I had a good time practicing CF templates to deploy the app. Next, I plan to build on what I learned by implementing ElastiCache as a session store.