Serverless Django: Exploring the State of the Art on AWS

As web developers, we have barely scratched the surface of what’s possible with Amazon Web Services (AWS) Lambda and its cousins at Google and Azure. As deployment options for web apps have multiplied in the past few years, AWS Lambda continues to stand out as a powerful, simple way to run a web app with minimal maintenance and cost — provided traffic stays low.

If you’re new to AWS Lambda — but not new to web development — it can be helpful to think of a Lambda function as the combination of:

  • a packaged server program, 
  • environment settings,
  • a message queue, 
  • an auto-scaling controller, 
  • and a firewall.

I’m impressed by the people who decided to combine all of these ideas as a single unit, price it by the millisecond, then give away a reasonable number of compute-seconds for free. It’s an interesting 21st century invention. 
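
In code, that whole bundle is driven through one tiny contract: AWS invokes a function you designate, passing it the triggering event and a context object. A minimal sketch of a handler (not this project's code; that comes in section 3) looks like this:

# AWS calls this function once per event. 'event' carries the trigger
# payload (for a web app, the HTTP request) and 'context' carries runtime
# metadata such as the remaining execution time.
def lambda_handler(event, context):
    return {'message': 'Hello from Lambda'}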

Now that you understand the basics of Lambda, I’ll explain a solution to host a Wagtail app using Lambda at its core. Wagtail is a framework used for building Django apps focused on content management. I’ll also explore how to deploy and manage the app using Terraform, an infrastructure configuration management tool. AWS Lambda alone doesn’t expose a website or provide a database; you’ll need Terraform (or a similar tool) to deploy the rest and connect the pieces together in a repeatable and testable way.

There is already a demo of deploying Wagtail to AWS Lambda, but that tutorial creates a non-production-ready site. I’d like to achieve a more flexible and secure deployment using Terraform instead of Zappa. By the end, we’ll be able to create a website that runs locally, deploy it as a Lambda function behind AWS API Gateway, and keep its state in AWS RDS Aurora and S3.

We’ll divide this process into nine sections:

  1. Create a Wagtail Site Locally
  2. Create the Production Django Settings
  3. Create the Lambda Entry Point Module
  4. Generate the Zip File Using Docker
  5. Run Terraform
  6. Publish the Static Resources
  7. Set Up the Site
  8. Evaluation
  9. Conclusion

All the code created for this blog is publicly available on GitHub at:

https://github.com/hathawsh/wagtail_lambda_demo/tree/master/mysite

I recommend either cloning that repository or starting from scratch and following the next steps in detail until you start the Terraform deployment. Note: The Terraform steps are too complex to include here in full detail.

1. Create a Wagtail Site Locally

Using Ubuntu 20.04 or similar, create a folder called wagtail_lambda_demo. Create and activate a Python virtual environment (VE) inside it, like this: 

mkdir wagtail_lambda_demo
cd wagtail_lambda_demo
python3 -m venv venv
. venv/bin/activate

Install the wagtail library in that environment using pip install.

pip install wagtail

Wagtail isn’t a complete content management system (CMS) on its own; it only becomes interesting once you have added your own code. For this blog, I followed Wagtail’s tutorial, “Your first Wagtail site” — starting after the pip install wagtail command — resulting in a simple blog site hosted locally. You can follow the Wagtail tutorial without worrying about the security of local passwords and secrets because the local site and the Lambda site will use different databases. I recommend working through the whole tutorial; it teaches a lot about both Django and Wagtail that will help later in this process.

 

[Screenshot: a simple post I created in my privately hosted blog]

An interesting feature of Wagtail sites is the bird icon in the bottom right corner. It opens a menu for editing the page, as well as other administrative functions.

 

[Screenshot: the bird menu]

The bird menu is a clue to Wagtail’s purpose: while Django allows arbitrary database models, Wagtail focuses on editable, hierarchical, publishable web pages. There’s a lot of demand for web apps that start with that narrow focus, but there’s also high demand for apps with a different focus, so it’s great that Django and Wagtail are kept distinct.

Before moving on to step 2, ensure your app works locally.

2. Create the Production Django Settings

Replace mysite/settings/production.py with the following Python code:

from .base import *
import os
import urllib.parse

DEBUG = False

SECRET_KEY = os.environ['DJANGO_SECRET_KEY']

DATABASES = {
    'default': {
        'ENGINE': os.environ['DJANGO_DB_ENGINE'],
        'NAME': os.environ['DJANGO_DB_NAME'],
        'USER': os.environ['DJANGO_DB_USER'],
        'PASSWORD': os.environ['DJANGO_DB_PASSWORD'],
        'HOST': os.environ['DJANGO_DB_HOST'],
        'PORT': os.environ['DJANGO_DB_PORT'],
    }
}

ALLOWED_HOSTS = []
for spec in os.environ['ALLOWED_HOSTS'].split():
    if '://' in spec:
        host = urllib.parse.urlsplit(spec).hostname
        ALLOWED_HOSTS.append(host)
    else:
        ALLOWED_HOSTS.append(spec)

STATIC_URL = os.environ['STATIC_URL']

# The static context processor provides STATIC_URL to templates
TEMPLATES[0]['OPTIONS']['context_processors'].append(
    'django.template.context_processors.static')

DEFAULT_FROM_EMAIL = os.environ['DEFAULT_FROM_EMAIL']
EMAIL_BACKEND = 'django.core.mail.backends.smtp.EmailBackend'
EMAIL_HOST = os.environ['EMAIL_HOST']
EMAIL_HOST_USER = os.environ.get('EMAIL_HOST_USER', '')
EMAIL_HOST_PASSWORD = os.environ.get('EMAIL_HOST_PASSWORD', '')
EMAIL_PORT = int(os.environ.get('EMAIL_PORT', 587))
EMAIL_USE_TLS = True

As shown above, the development and production settings differ in the following ways:

  • Production settings come from environment variables, following the 12-factor app methodology;
  • The SECRET_KEY is different;
  • The production site uses a different database;
  • The development site allows an arbitrary Host header, but the production site locks it down for security (see the parsing example after this list);
  • In production, a separate service hosts the static assets for speed and cost savings;
  • The development site doesn’t send emails, but the production site does.
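
On the fourth point, the ALLOWED_HOSTS loop in production.py accepts either bare hostnames or full URLs, which makes it convenient to reuse a URL-valued environment variable. Here’s a standalone mirror of that loop you can experiment with locally:

import urllib.parse

def parse_allowed_hosts(value):
    """Mirror of the ALLOWED_HOSTS loop in production.py."""
    hosts = []
    for spec in value.split():
        if '://' in spec:
            # A full URL: keep only the hostname part.
            hosts.append(urllib.parse.urlsplit(spec).hostname)
        else:
            hosts.append(spec)
    return hosts

print(parse_allowed_hosts('example.com https://www.example.com'))
# -> ['example.com', 'www.example.com']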

The majority of the environment variable values will be provided by settings on the Lambda function, while a few will be provided by a secret stored in AWS Secrets Manager.
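
For concreteness, here is one way those values might be split. The variable names come from production.py above, but the specific values are placeholders, and the exact split is decided by the Terraform code in section 5:

# Hypothetical: non-sensitive values set directly on the Lambda function.
lambda_environment = {
    'DJANGO_SETTINGS_MODULE': 'mysite.settings.production',
    'DJANGO_DB_ENGINE': 'django.db.backends.postgresql',
    'DJANGO_DB_NAME': 'appdb',
    'DJANGO_DB_USER': 'appuser',
    'DJANGO_DB_HOST': 'wagtaildemo.cluster-example.us-east-1.rds.amazonaws.com',
    'DJANGO_DB_PORT': '5432',
    'ALLOWED_HOSTS': 'https://example.execute-api.us-east-1.amazonaws.com',
    'STATIC_URL': 'https://example.cloudfront.net/s/',
    'DEFAULT_FROM_EMAIL': 'wagtaildemo@example.com',
    'EMAIL_HOST': 'email-smtp.us-east-1.amazonaws.com',
}

# Hypothetical: sensitive values stored in Secrets Manager as a flat
# JSON object of strings (read by install_secrets() in section 3).
env_secret = {
    'DJANGO_SECRET_KEY': '(generated)',
    'DJANGO_DB_PASSWORD': '(generated)',
}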

3. Create the Lambda Entry Point Module

Create a Python file called lambda_function.py in the mysite folder with the following content:

 

# The lambda_venv_path module is generated by `lambda.dockerfile`.
import lambda_venv_path  # noqa


def hello(event, context):
    """Entry point for minimal testing"""
    if event.get('install_secrets'):
        install_secrets()

    return {
        'message': 'Hello from the Wagtail Lambda Demo',
        'event': event,
        'context': repr(context),
    }


def install_secrets():
    """Add the secrets from the secret named by ENV_SECRET_ID to os.environ"""
    import os

    secret_id = os.environ.get('ENV_SECRET_ID')
    if not secret_id:
        return

    import boto3
    import json

    session = boto3.session.Session()
    client = session.client('secretsmanager')
    response = client.get_secret_value(SecretId=secret_id)
    overlay = json.loads(response['SecretString'])
    os.environ.update(overlay)


def manage(event, context):
    """Entry point for running a management command. Supported formats:

    - "migrate"
    - ["migrate"]
    - {"command": ["migrate"]}
    """
    if isinstance(event, dict):
        command = event['command']
    else:
        command = event
    if isinstance(command, str):
        command = command.split()

    install_secrets()
    from django.core.wsgi import get_wsgi_application
    get_wsgi_application()  # Initialize Django
    from django.core import management
    return management.call_command(*command)


_real_handler = None


def lambda_handler(event, context):
    """Entry point for web requests"""
    global _real_handler

    if _real_handler is None:
        install_secrets()

        from apig_wsgi import make_lambda_handler
        from django.core.wsgi import get_wsgi_application
        application = get_wsgi_application()
        _real_handler = make_lambda_handler(application)

    return _real_handler(event, context)

This module provides three entry points for three different Lambda functions. Note: don’t try to run this module locally; it’s designed to run only in AWS Lambda. Here’s what the module makes available to AWS Lambda:

  • The lambda_venv_path is a module that will be generated automatically. It needs to be imported first because it alters sys.path to make other libraries importable.
  • The hello() function is a Lambda entry point used for minimal testing. It’s useful for verifying that AWS can successfully import the module and optionally read the environment secrets.
  • The install_secrets() function:
    • gets the name of an AWS secret from the environment,
    • reads the value, 
    • and imports the value into the environment. 
      • The secret value includes the Django secret key and the database password.
  • The manage() function provides access to Django management commands. Once everything is set up, you’ll be able to invoke management commands using the Test tab of the AWS Lambda console.
  • The lambda_handler() function is a Lambda entry point that adapts events from AWS API Gateway to the Django WSGI interface. It creates the handler once per process and stores it in a module global called _real_handler.

At this stage, the Python code is ready for packaging, and you can continue on to step 4.

4. Generate the Zip File Using Docker

Next you’ll need to generate a zip file for AWS Lambda. Docker is a great way to produce that zip file because Docker lets you build in an environment that’s very similar to the Lambda environment. The app won’t use Docker in production; Docker is only used for building the zip file.

Create a file called lambda.dockerfile next to lambda_function.py. (The dockerfile copies a requirements.txt; if you don’t have one yet, you can generate it from your VE with pip freeze > requirements.txt.)

FROM amazonlinux:2.0.20210219.0 AS build-stage

RUN yum upgrade -y
RUN yum install -y gcc gcc-c++ make freetype-devel yum-utils findutils openssl-devel git zip

ARG PYTHON_VERSION_WITH_DOT=3.8
ARG PYTHON_VERSION_WITHOUT_DOT=38

RUN amazon-linux-extras install -y python${PYTHON_VERSION_WITH_DOT} && \
        yum install -y python${PYTHON_VERSION_WITHOUT_DOT}-devel

ARG INSTBASE=/var/task

WORKDIR ${INSTBASE}
RUN python${PYTHON_VERSION_WITH_DOT} -m venv venv

COPY requirements.txt lambda_function.py ./

RUN venv/bin/pip install \
        -r requirements.txt \
        psycopg2-binary \
        apig-wsgi

# Create lambda_venv_path.py
RUN INSTBASE=${INSTBASE} venv/bin/python -c \
    'import os; import sys; instbase = os.environ["INSTBASE"]; print("import sys; sys.path[:0] = %s" % [p for p in sys.path if p.startswith(instbase)])' \
    > ${INSTBASE}/lambda_venv_path.py

COPY blog blog
COPY home home
COPY search search
COPY mysite mysite
COPY static static

# Remove artifacts that won't be used.
# If lib64 is a symlink, remove it.
RUN rm -rf venv/bin venv/share venv/include && \
        (if test -h venv/lib64 ; then rm -f venv/lib64 ; fi)

RUN zip -r9q /tmp/lambda.zip *

# Generate a filesystem image with just the zip file as the output.
# See: https://docs.docker.com/engine/reference/commandline/build/#custom-build-outputs
FROM scratch AS export-stage
COPY --from=build-stage /tmp/lambda.zip /

This dockerfile expresses a lot in only a few lines. Most of it is straightforward if you’re familiar with dockerfiles. Here are a few notes on what’s going on: 

  • The amazonlinux base image, pinned to a specific release, mirrors the environment AWS Lambda runs in.
  • The amazon-linux-extras and yum commands install Python 3.8, the preferred version of Python on AWS Lambda at the time of writing. When AWS makes a new version of Python available in Lambda, you’ll need to update the version here too.
  • AWS Lambda runs a container with the zip file unpacked in the /var/task folder, so this dockerfile creates a VE and installs everything in /var/task.
  • This dockerfile generates a tiny module called lambda_venv_path. AWS Lambda doesn’t use the full Python VE created by this dockerfile, so the generated lambda_venv_path module mimics the VE by adding the sys.path entries that would normally be present there. 
  • The Lambda entry point module, lambda_function.py, imports lambda_venv_path before anything else, making it possible for AWS to load the library code as if everything were running in the VE.
    • Note: the generated lambda_venv_path module is usually very simple and predictable (see the sketch after this list). In your own projects, you might want to simplify and use a static lambda_venv_path module instead of a generated one.
  • Each Django application folder needs to be copied into the image. The example dockerfile copies the blog, home, search, and mysite apps prepared by the Wagtail tutorial, along with the generated static folder. If you change the set of apps to install, update the COPY command list.
  • This is a two stage dockerfile:
    • The first stage is called build-stage; it starts from an Amazon Linux image, runs the build steps, and produces /tmp/lambda.zip
    • The second stage is called export-stage; it starts from an empty image (a virtual filesystem containing no files) and grabs a copy of the zip file from the build stage.
    • The docker command outputs the full contents of the export stage, which consists of just the zip file.
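
For reference, here’s roughly what the generated lambda_venv_path.py looks like on a Python 3.8 build. The exact paths vary with the Python version and installed packages, so treat this as an illustration rather than the literal output:

# Hypothetical contents of the generated lambda_venv_path.py:
import sys; sys.path[:0] = ['/var/task/venv/lib/python3.8/site-packages']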

Once the lambda.dockerfile file exists, run the following command to build the zip file:

DOCKER_BUILDKIT=1 docker build -o out -f lambda.dockerfile .

If successful, that command produces the file out/lambda.zip. You can open the zip file to ensure it contains installed versions of Django, Wagtail, your code, and all the necessary libraries. It even includes compiled shared library code as .so files. The zip file is now ready to use as the code for Lambda functions.
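
If you’d like to peek inside the archive programmatically instead of opening it by hand, Python’s standard zipfile module does the job. A quick sketch, run from the folder containing out/lambda.zip:

# Print a quick summary of the Lambda archive contents.
import zipfile

names = zipfile.ZipFile('out/lambda.zip').namelist()
print(len(names), 'files')
# Compiled extension modules are included as .so files:
print([n for n in names if n.endswith('.so')][:5])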

To make the project easy to build, I recommend you create a Makefile similar to the following (remember that Makefile recipe lines must be indented with tabs):

default: out/lambda.zip

out/lambda.zip: lambda.dockerfile lambda_function.py requirements.txt mysite/settings/production.py static/staticfiles.json
        mkdir -p out && \
        DOCKER_BUILDKIT=1 docker build -o out -f lambda.dockerfile .

static/staticfiles.json:
        rm -rf static && \
        ../venv/bin/python manage.py collectstatic --no-input

upload-static: static/staticfiles.json
        aws s3 sync static "s3://$(shell cd tf; terraform output -raw static_bucket)/s" \
                --exclude staticfiles.json --delete

.PHONY: default upload-static

Use the standard make command to run the Makefile.

5. Run Terraform

We are almost ready to deploy, but first let’s talk about Terraform. Terraform is a great tool for deploying code and keeping cloud infrastructure maintained. Terraform configurations are designed to be portable: written correctly, the configuration I create for my AWS account can be applied to other AWS accounts unchanged. Terraform has the following responsibilities:

  • keep track of the IDs of resources it creates,
  • take care of dependencies between resources,
  • and keep the resources it manages in sync with the configuration. 

Unfortunately, the Terraform code created for this article is too much to include inline. You can find it on GitHub at:

https://github.com/hathawsh/wagtail_lambda_demo/tree/master/mysite/tf

If you created the previous code from scratch, you can now copy the tf folder to your own folder. The tf folder should be a subfolder of the folder that contains mysite, lambda_function.py, lambda.dockerfile, and Makefile, among other Wagtail and Django files and folders.

Next, review the variables you can set; see variable.tf. If you want to change any variables from the default, create a file called my.auto.tfvars and format it like this:

default_from_email = "wagtaildemo2@example.com"
vpc_cidr_block = "10.130.0.0/16"

From the tf folder, run Terraform to deploy:

terraform init && terraform apply

 

Unfortunately, Terraform cannot complete successfully without some manual help. Terraform creates many resources, including the RDS database cluster that will host the data. However, because the created cluster is in a virtual private cloud (VPC) subnet, there isn’t a simple way for Terraform to connect to it and create the application role and database —  even though Terraform knows the location of the cluster and its master password. 

To fix that, once Terraform creates the RDS cluster, do the following in the AWS console:

  1. Locate the secret in AWS Secrets Manager named wagtaildemo_rds_master_credentials. Copy the Secret ARN of that secret. It might be too long to fit on one line, so make sure you copy all of it.
    Note: the Secret ARN is not very sensitive on its own; it’s only the ID of a sensitive value.
  2. Locate the RDS cluster. From the Actions menu, choose Query. A dialog will pop up.
  3. Choose Connect with a Secrets Manager ARN in the Database username selector.
  4. Paste the Secret ARN you copied earlier into the Secrets manager ARN field that appears.
  5. Enter postgres as the name of the database and click the Connect to database button.
  6. Locate the secret in AWS Secrets Manager named wagtaildemo_env_secret. Click the Retrieve secret value button. Copy the value of the DJANGO_DB_PASSWORD.
  7. In the database query window, enter the following statements, replacing PASSWORD with the DJANGO_DB_PASSWORD you copied in the previous step:
create role appuser with password 'PASSWORD' login inherit;
create database appdb owner appuser;

Click the Run button. Once it succeeds, the database will be ready to accept connections from the Lambda functions. You may need to run terraform apply more than once, as some AWS resources are eventually consistent.

When Terraform completes successfully, it generates a few outputs, including:

  • app_endpoint: The URL of the Wagtail site. (You can customize the URL later.)
  • init_superuser_password: The initial superuser password. (You should change it after logging in.)

6. Publish the Static Resources

Once Terraform completes successfully, in the folder containing Makefile, type:

make upload-static

The Makefile performs the following steps:

  • runs the collectstatic Django management command to populate the static folder,
  • consults Terraform to identify the S3 bucket that should contain the static files, and
  • uses aws s3 sync to efficiently upload the static files.

Once that step is complete, the static files are available through AWS CloudFront.

7. Set Up the Site

In the AWS console, visit the Lambda service. Terraform added three new functions:

  • Use the wagtaildemo_hello function as a quick test to verify that AWS can call the Python code without calling Django. Use the Test tab to call it with {} as the event.
  • Use the wagtaildemo_hello function again to verify the Python code can read the secret containing environment passwords. Use the Test tab to call it with {"install_secrets":true} as the event.
  • Use the wagtaildemo_manage function to complete installation.
    • Call it with "migrate" as the event (it’s JSON, so include the double quotes) to run the Django migrations on the database. If it times out, try running it again.
    • Call it with "createsuperuser --no-input --username admin" as the event to create the initial superuser.

The wagtaildemo_wsgi function provides the main service used by API Gateway.
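
If you’d rather script these calls than click through the console, the same invocations can be made with boto3. A minimal sketch, assuming your AWS credentials and region are configured locally and the function names match the demo’s Terraform:

import json
import boto3

client = boto3.client('lambda')

def invoke(function_name, event):
    """Invoke a Lambda function synchronously and decode its JSON result."""
    response = client.invoke(
        FunctionName=function_name,
        Payload=json.dumps(event).encode(),
    )
    return json.loads(response['Payload'].read())

print(invoke('wagtaildemo_hello', {'install_secrets': True}))
print(invoke('wagtaildemo_manage', 'migrate'))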

Now you’re ready to view the site. Get the app_endpoint and init_superuser_password from the terraform apply command; they are shown near the end of the output. Visit the URL specified by app_endpoint. The page will likely be blank, as there’s no content yet, but the title of the page — shown in the browser title — should be Home. Add /admin to the URL and log in as user admin with the init_superuser_password. Add some text content and verify your content is published live. Congratulations!

8. Evaluation

In the process of writing this blog, I learned a lot about Wagtail, AWS and Django production settings. The goal was to learn how close I could get to publishing a production Django site using only serverless patterns with minimal costs. The project was a success: the demo site works, it can scale fluidly to thousands of concurrent users — all built on AWS resources that are easy to maintain and change — and I learned about AWS features and limitations.

 My total AWS bill for this project — even with all the different approaches I tried, as well as a few mistakes — was $5.39 USD:

  • RDS: $4.65 USD
  • VPC: $0.72 USD
  • S3: $0.02 USD

All other services I used stayed within the AWS free tier. The RDS cost was higher than I intended because I initially forgot to configure the database to scale to zero instances. Once I set that up correctly, the RDS costs grew very slowly and only increased on days I used the site.

The VPC cost is the only cost that draws my attention. It consists entirely of VPC endpoint hours. AWS charges $0.01 per hour per endpoint, and it was necessary to create 2 endpoints to run RDS in a way that scales to zero database instances while following recommended security practices — especially storing secrets in Secrets Manager. Two pennies per hour doesn’t sound like much, but it adds up to $14.40 USD over 30 days.
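(The arithmetic: 2 endpoints × $0.01 per hour × 720 hours in a 30-day month = $14.40.)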

$14.40 USD is significantly more than the price of a small, always-on virtual machine (VM) that does the same thing as I built, but the VM would have no scaling ability or redundancy. Therefore, the minimal cost of building something that depends on RDS in serverless mode seems to be around $15 USD/month, even if no one uses the service. I conclude that if I want to provide a service to customers based on AWS, I can’t offer a single-tenant scalable solution for less than $15 USD per month. Knowing the minimal costs is helpful for identifying the kinds of business models that can be created on AWS. 

One additional takeaway from this project is how the VPC concept works at AWS. A VPC is like a private network. I came to understand how well VPC components are isolated, making them behave much like physical network components. In particular, if you want a Lambda function in a VPC to be able to connect to the Internet, you need more VPC components than you would need with Elastic Compute Cloud (EC2). 

EC2 instances in a VPC can be assigned ephemeral public IP addresses, making them easy to wire up to the Internet. To get the same thing with a Lambda function in a VPC, you need a few more AWS resources: a private subnet — preferably 2 or more — that routes to a NAT Gateway with its own Elastic IP address, which forwards packets to a public subnet connected to an Internet Gateway. No resources can be removed from that stack. The NAT Gateway assigns a public IP address to packets but can’t connect to the Internet on its own; the Internet Gateway connects a subnet to the Internet but can’t assign a public IP address to packets on its own. The correct VPC structure for Lambda is more complicated than the official AWS documentation makes it seem.

9. Conclusion

A few technical issues still remain in this project:

  • The service relies on API Gateway — which has a fixed 30 second timeout — even though the Lambda functions it calls are allowed to run for up to 15 minutes.
    • In online forums, AWS has repeatedly refused to raise the limit.
    • This means the site appears to time out on the first request because it’s waiting for database instances to spin up, which takes about 35 seconds. The request does succeed, but the user doesn’t get to see the result. 
    • A possible solution is to switch from API Gateway to AWS Elastic Load Balancer (ELB) — which allows up to an hour timeout — but ELB has a fixed hourly cost.
  • If you try to upload an image, the image won’t be stored in a permanent place. Wiring up media storage on S3 is left as an exercise for the reader.
  • I tried to use a Terraform provider that would set up the database using the RDS Data API to avoid the VPC database configuration issue, but that provider isn’t finished, especially its “role” resource.
  • It would’ve been nice to avoid the VPC costs, but Aurora Serverless only runs on a VPC.
  • When a Lambda function tries to reach an AWS service from a VPC, and there’s no endpoint in that VPC for that AWS service, the function hangs until Lambda kills it.
    • Timeouts exist in the code, but they don’t seem to work when running in Lambda. This makes it difficult to understand why AWS isn’t working until you are aware of the broken timeouts.
  • Because the SES (Simple Email Service) VPC endpoint accepts only SMTP connections (not SES API connections), the django-ses library does not work on an AWS VPC. 
    • My solution was to fall back to the default Django SMTP connector — possibly losing some functionality — but it did work as expected.

There are a few alternatives I would be interested in trying/researching further:

  • Wagtail can run on MySQL instead of Postgres; Aurora Serverless supports MySQL.
    • The next preview release of Aurora Serverless is currently MySQL only, which suggests AWS may have a preference for MySQL.
  • I’d consider replacing API Gateway with Elastic Load Balancer, which would fix the 30 second timeout issue, despite ELB’s significant fixed costs.
  • Is there an alternative to VPC Endpoints that’s free with low usage? Is it possible to make an API Gateway (or something else) listen to an IP address in a VPC and use that to proxy to AWS services?
  • I would like to try building a similar stack on Azure and GCP.

I hope this blog inspires you to build Django apps and connect with us at Six Feet Up! We love solving complex problems, learning how to build things better and networking with fellow do-gooders. There is so much left to do.
