As web developers, we have barely scratched the surface of what’s possible with Amazon Web Services (AWS) Lambda and its cousins at Google and Azure. As deployment options for web apps have multiplied in the past few years, AWS Lambda continues to stand out as a powerful, simple way to run a web app with minimal maintenance and cost — as long as traffic is low.
If you’re new to AWS Lambda — but not new to web development — it can be helpful to think of a Lambda function as the combination of:
I’m impressed by the people who decided to combine all of these ideas as a single unit, price it by the millisecond, then give away a reasonable number of compute-seconds for free. It’s an interesting 21st century invention.
Now that you understand the basics of Lambda, I’ll explain a solution to host a Wagtail app using Lambda at its core. Wagtail is a framework used for building Django apps focused on content management. I’ll also explore how to deploy and manage the app using Terraform, an infrastructure configuration management tool. AWS Lambda alone doesn’t expose a website or provide a database; you’ll need Terraform (or a similar tool) to deploy the rest and connect the pieces together in a repeatable and testable way.
There is already a demo of deploying Wagtail to AWS Lambda, but that tutorial creates a non-production-ready site. I’d like to achieve a more flexible and secure deployment using Terraform instead of Zappa. By the end, we’ll be able to create a web site that runs locally, deploy it as a Lambda function behind AWS API Gateway, and keep the state in AWS RDS Aurora and S3.
We’ll divide this process into nine sections:
All the code created for this blog is publicly available on GitHub at:
https://github.com/hathawsh/wagtail_lambda_demo/tree/master/mysite
I recommend either cloning that repository or starting from scratch and following the next steps in detail until you start the Terraform deployment. Note: The Terraform steps are too complex to include here in full detail.
Using Ubuntu 20.04 or similar, create a folder called wagtail_lambda_demo. Create and activate a Python virtual environment (VE) inside it, like this:
mkdir wagtail_lambda_demo
cd wagtail_lambda_demo
python3 -m venv venv
. venv/bin/activate
Install the wagtail library in that environment using pip install.
pip install wagtail
Wagtail isn’t a complete content management system (CMS) on its own; it only becomes interesting once you have added your own code. For this blog, I followed Wagtail’s tutorial, “Your first Wagtail site” — starting after the pip install wagtail command — resulting in a simple blog site hosted locally. You can follow the Wagtail tutorial without worrying about the security of local passwords and secrets because the local site and the Lambda site will use different databases. I recommend following Wagtail’s tutorial; it teaches a lot about both Django and Wagtail that will help later in this process.
An interesting feature of Wagtail sites is the bird icon in the bottom right corner. It opens a menu for editing the page, as well as other administrative functions.
The bird menu is a clue that solidifies the purpose of Wagtail: while Django allows arbitrary database models, Wagtail is focused on editable, hierarchical, publishable web pages. There’s a lot of demand for web apps that start with that narrow focus, but there’s also a high demand for apps with a different focus, so it’s great that Django and Wagtail are kept distinct.
Before moving on to step 2, ensure your app works locally.
Replace mysite/settings/production.py with the following Python code:
from .base import *
import os
import urllib.parse
DEBUG = False
SECRET_KEY = os.environ['DJANGO_SECRET_KEY']
DATABASES = {
    'default': {
        'ENGINE': os.environ['DJANGO_DB_ENGINE'],
        'NAME': os.environ['DJANGO_DB_NAME'],
        'USER': os.environ['DJANGO_DB_USER'],
        'PASSWORD': os.environ['DJANGO_DB_PASSWORD'],
        'HOST': os.environ['DJANGO_DB_HOST'],
        'PORT': os.environ['DJANGO_DB_PORT'],
    }
}
ALLOWED_HOSTS = []
for spec in os.environ['ALLOWED_HOSTS'].split():
    if '://' in spec:
        host = urllib.parse.urlsplit(spec).hostname
        ALLOWED_HOSTS.append(host)
    else:
        ALLOWED_HOSTS.append(spec)
STATIC_URL = os.environ['STATIC_URL']
# The static context processor provides STATIC_URL to templates
TEMPLATES[0]['OPTIONS']['context_processors'].append(
    'django.template.context_processors.static')
DEFAULT_FROM_EMAIL = os.environ['DEFAULT_FROM_EMAIL']
EMAIL_BACKEND = 'django.core.mail.backends.smtp.EmailBackend'
EMAIL_HOST = os.environ['EMAIL_HOST']
EMAIL_HOST_USER = os.environ.get('EMAIL_HOST_USER', '')
EMAIL_HOST_PASSWORD = os.environ.get('EMAIL_HOST_PASSWORD', '')
EMAIL_PORT = int(os.environ.get('EMAIL_PORT', 587))
EMAIL_USE_TLS = True
As shown above, the development and production settings differ in the following ways:

- the SECRET_KEY is different;
- the development site accepts any Host header, but the production site locks it down for security.

The majority of the environment variable values will be provided by settings on the Lambda function, while a few will be provided by a secret stored in AWS Secrets Manager.
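The ALLOWED_HOSTS loop in production.py accepts either bare hostnames or full URLs in a space-separated string and normalizes both forms to hostnames. Here is a standalone sketch of that logic (the hostnames below are hypothetical examples):

```python
import urllib.parse

def parse_allowed_hosts(raw):
    """Mimic the loop in production.py: accept bare hostnames or full
    URLs in a space-separated string, keeping only the hostnames."""
    hosts = []
    for spec in raw.split():
        if '://' in spec:
            # Full URL: extract just the hostname portion.
            hosts.append(urllib.parse.urlsplit(spec).hostname)
        else:
            hosts.append(spec)
    return hosts

print(parse_allowed_hosts('https://demo.example.com/ www.example.com'))
# → ['demo.example.com', 'www.example.com']
```

This lets the same ALLOWED_HOSTS environment variable hold either the API Gateway URL or a plain custom domain.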
Create a Python file called lambda_function.py in the mysite folder with the following content:
# The lambda_venv_path module is generated by `lambda.dockerfile`.
import lambda_venv_path  # noqa


def hello(event, context):
    """Entry point for minimal testing"""
    if event.get('install_secrets'):
        install_secrets()
    return {
        'message': 'Hello from the Wagtail Lambda Demo',
        'event': event,
        'context': repr(context),
    }


def install_secrets():
    """Add the secrets from the secret named by ENV_SECRET_ID to os.environ"""
    import os
    secret_id = os.environ.get('ENV_SECRET_ID')
    if not secret_id:
        return
    import boto3
    import json
    session = boto3.session.Session()
    client = session.client('secretsmanager')
    response = client.get_secret_value(SecretId=secret_id)
    overlay = json.loads(response['SecretString'])
    os.environ.update(overlay)


def manage(event, context):
    """Entry point for running a management command. Supported formats:

    - "migrate"
    - ["migrate"]
    - {"command": ["migrate"]}
    """
    if isinstance(event, dict):
        command = event['command']
    else:
        command = event
    if isinstance(command, str):
        command = command.split()
    install_secrets()
    from django.core.wsgi import get_wsgi_application
    get_wsgi_application()  # Initialize Django
    from django.core import management
    return management.call_command(*command)


_real_handler = None


def lambda_handler(event, context):
    """Entry point for web requests"""
    global _real_handler
    if _real_handler is None:
        install_secrets()
        from apig_wsgi import make_lambda_handler
        from django.core.wsgi import get_wsgi_application
        application = get_wsgi_application()
        _real_handler = make_lambda_handler(application)
    return _real_handler(event, context)
This module provides three entry points for three different Lambda functions. Note: Don’t try to run it locally; it’s designed to run in AWS Lambda only. Here’s what the module makes available to AWS Lambda:

- lambda_venv_path is a module that will be generated automatically. It needs to be imported first because it alters sys.path to make other libraries importable.
- The hello() function is a Lambda entry point used for minimal testing. It’s useful for verifying that AWS can successfully import the module and optionally read the environment secrets.
- The install_secrets() function reads the secret named by the ENV_SECRET_ID environment variable from AWS Secrets Manager and overlays its values onto os.environ.
- The manage() function provides access to Django management commands. Once everything is set up, you’ll be able to invoke management commands using the Test tab of the AWS Lambda console.
- The lambda_handler() function is a Lambda entry point that adapts events from the AWS API Gateway to the Django WSGI interface. It creates the handler once per process and stores it in a module global called _real_handler.
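The three event formats accepted by manage() are normalized by the first few lines of that function. The parsing logic can be exercised on its own, without Django:

```python
def normalize_command(event):
    """Mirror the event parsing in manage(): accept a string, a list,
    or a dict with a 'command' key, and return a list of arguments."""
    if isinstance(event, dict):
        command = event['command']
    else:
        command = event
    if isinstance(command, str):
        # A bare string is split on whitespace into arguments.
        command = command.split()
    return command

# All three supported event formats produce the same argument list.
print(normalize_command('migrate'))                # → ['migrate']
print(normalize_command(['migrate']))              # → ['migrate']
print(normalize_command({'command': ['migrate']})) # → ['migrate']
```

This is why, later in the setup, you can type a plain quoted string like "migrate" into the Lambda console’s Test tab and have it run as a management command.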
Next, you’ll need to generate a zip file for AWS Lambda. Docker is a great way to produce that zip file because it lets you build in an environment very similar to the Lambda environment. The app won’t use Docker in production; Docker is only used for building the zip file.
Create a file called lambda.dockerfile next to lambda_function.py:
FROM amazonlinux:2.0.20210219.0 AS build-stage
RUN yum upgrade -y
RUN yum install -y gcc gcc-c++ make freetype-devel yum-utils findutils openssl-devel git zip
ARG PYTHON_VERSION_WITH_DOT=3.8
ARG PYTHON_VERSION_WITHOUT_DOT=38
RUN amazon-linux-extras install -y python${PYTHON_VERSION_WITH_DOT} && \
    yum install -y python${PYTHON_VERSION_WITHOUT_DOT}-devel
ARG INSTBASE=/var/task
WORKDIR ${INSTBASE}
RUN python${PYTHON_VERSION_WITH_DOT} -m venv venv
COPY requirements.txt lambda_function.py ./
RUN venv/bin/pip install \
    -r requirements.txt \
    psycopg2-binary \
    apig-wsgi
# Create lambda_venv_path.py
RUN INSTBASE=${INSTBASE} venv/bin/python -c \
    'import os; import sys; instbase = os.environ["INSTBASE"]; print("import sys; sys.path[:0] = %s" % [p for p in sys.path if p.startswith(instbase)])' \
    > ${INSTBASE}/lambda_venv_path.py
COPY blog blog
COPY home home
COPY search search
COPY mysite mysite
COPY static static
# Remove artifacts that won't be used.
# If lib64 is a symlink, remove it.
RUN rm -rf venv/bin venv/share venv/include && \
    (if test -h venv/lib64 ; then rm -f venv/lib64 ; fi)
RUN zip -r9q /tmp/lambda.zip *
# Generate a filesystem image with just the zip file as the output.
# See: https://docs.docker.com/engine/reference/commandline/build/#custom-build-outputs
FROM scratch AS export-stage
COPY --from=build-stage /tmp/lambda.zip /
This dockerfile expresses a lot in only a few lines. Most of it is straightforward if you’re familiar with dockerfiles. Here are a few notes on what’s going on:

- The amazonlinux base image mirrors the environment where AWS Lambda runs. The amazon-linux-extras and yum commands install Python 3.8, the preferred version of Python on AWS Lambda at the time of writing. When AWS makes a new version of Python available in Lambda, you’ll need to update the version here too.
- AWS Lambda runs code from the /var/task folder, so this dockerfile creates a VE and installs everything in /var/task.
- The build generates a module called lambda_venv_path. AWS Lambda doesn’t use the full Python VE created by this dockerfile, so the generated lambda_venv_path module mimics the VE by adding the sys.path entries that would normally be present there. The entry point module, lambda_function.py, imports lambda_venv_path before anything else, making it possible for AWS to load the library code as if everything were running in the VE.
- The generated lambda_venv_path module is usually very simple and predictable. In your own projects, you might want to simplify and use a static lambda_venv_path module instead of a generated module.
- The COPY commands copy the blog, home, search, and mysite apps prepared by the Wagtail tutorial, along with the generated static folder. If you change the set of apps to install, update the COPY command list.
- The first build stage is called build-stage; it starts from an Amazon Linux image, runs the build steps, and produces /tmp/lambda.zip.
- The second stage is called export-stage; it starts from an empty image (a virtual filesystem containing no files) and grabs a copy of the zip file from the build stage.
- The docker command outputs the full contents of the export stage, which consists of just the zip file.

Once the lambda.dockerfile file exists, run the following command to build the zip file:
file exists, run the following command to build the zip file:
DOCKER_BUILDKIT=1 docker build -o out -f lambda.dockerfile .
If successful, that command produces the file out/lambda.zip. You can open the zip file to ensure it contains installed versions of Django, Wagtail, your code, and all the necessary libraries. It even includes compiled shared library code as .so files. The zip file is now ready to use as the code for Lambda functions.
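The one-line Python command embedded in the dockerfile is dense, so here is the same logic sketched as a small function (the sys.path entries below are hypothetical examples of what the build container might contain):

```python
def make_venv_path_module(sys_path, instbase):
    """Mimic the generator in lambda.dockerfile: keep only the sys.path
    entries under the install base and emit a one-line module that
    prepends them to sys.path at import time."""
    kept = [p for p in sys_path if p.startswith(instbase)]
    return "import sys; sys.path[:0] = %s" % kept

# Hypothetical sys.path as seen inside the build container:
example_path = [
    '/usr/lib64/python3.8',
    '/var/task/venv/lib/python3.8/site-packages',
]
print(make_venv_path_module(example_path, '/var/task'))
# → import sys; sys.path[:0] = ['/var/task/venv/lib/python3.8/site-packages']
```

System paths are dropped because Lambda provides its own Python runtime; only the paths under /var/task need to be restored when lambda_function.py imports lambda_venv_path.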
To make the project easy to build, I recommend you create a Makefile similar to the following:
default: out/lambda.zip

out/lambda.zip: lambda.dockerfile lambda_function.py requirements.txt mysite/settings/production.py static/staticfiles.json
	mkdir -p out && \
	DOCKER_BUILDKIT=1 docker build -o out -f lambda.dockerfile .

static/staticfiles.json:
	rm -rf static && \
	../venv/bin/python manage.py collectstatic --no-input

upload-static: static/staticfiles.json
	aws s3 sync static "s3://$(shell cd tf; terraform output -raw static_bucket)/s" \
		--exclude staticfiles.json --delete

.PHONY: default upload-static
Use the standard make command to run the Makefile.
We are almost ready to deploy, but first let’s talk about Terraform. Terraform is a great tool for deploying code and keeping cloud infrastructure maintained. Terraform is designed to be portable: if I write the configuration correctly, the configuration I create for my AWS account can be applied to other AWS accounts without changes. Terraform has the following responsibilities:
Unfortunately, the Terraform code created for this article is too much to include inline. You can find it on GitHub at:
https://github.com/hathawsh/wagtail_lambda_demo/tree/master/mysite/tf
If you created the previous code from scratch, you can now copy the tf folder to your own folder. The tf folder should be a subfolder of the folder that contains mysite, lambda_function.py, lambda.dockerfile, and Makefile, among other Wagtail and Django files and folders.
Next, you’ll need to:

- Configure credentials for your AWS account using aws configure.
- Run aws sts get-caller-identity to find out who you're currently authenticated as. If you manage multiple AWS profiles, see https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-profiles.html.

There are a few variables you can set; to do so, see variable.tf.
If you want to change any variables from the default, create a file called my.auto.tfvars and format it like this:
default_from_email = "wagtaildemo2@example.com"
vpc_cidr_block = "10.130.0.0/16"
From the tf folder, run Terraform to deploy:
terraform init && terraform apply
Unfortunately, Terraform cannot complete successfully without some manual help. Terraform creates many resources, including the RDS database cluster that will host the data. However, because the created cluster is in a virtual private cloud (VPC) subnet, there isn’t a simple way for Terraform to connect to it and create the application role and database — even though Terraform knows the location of the cluster and its master password.
To fix that, once Terraform creates the RDS cluster, do the following in the AWS console:

- In the Secrets Manager console, open the secret called wagtaildemo_rds_master_credentials. Copy the Secret ARN of that secret. It might be too long to fit on one line, so make sure you copy all of it.
- In the RDS console, select the database cluster; from the Actions menu, choose Query. A dialog will pop up.
- Choose Connect with a Secrets Manager ARN in the Database username selector.
- Paste the Secret ARN you copied earlier into the Secrets manager ARN field that appears.
- Click the Connect to database button.
- Back in the Secrets Manager console, open the secret called wagtaildemo_env_secret. Click the Retrieve secret value button. Copy the value of DJANGO_DB_PASSWORD.
- In the query editor, enter the following SQL, replacing PASSWORD with the DJANGO_DB_PASSWORD you copied in the previous step:

create role appuser with password 'PASSWORD' login inherit;
create database appdb owner appuser;

Click the Run button. Once it succeeds, the database will be ready to accept connections from the Lambda functions. You may need to run terraform apply more than once, as some AWS resources are eventually consistent.
When Terraform completes successfully, it generates a few outputs, including:

- app_endpoint: The URL of the Wagtail site. (You can customize the URL later.)
- init_superuser_password: The initial superuser password. (You should change it after logging in.)

Once Terraform has completed, go to the folder containing the Makefile and type:
make upload-static
The Makefile performs the following steps:

- runs the collectstatic Django management command to populate the static folder, and
- uses aws s3 sync to efficiently upload the static files.

Once that step is complete, the static files are available through AWS CloudFront.
In the AWS console, visit the Lambda service. Terraform added three new functions:

- Call the wagtaildemo_hello function as a quick test to verify that AWS can call the Python code without calling Django. Use the Test tab to call it with {} as the event.
- Call the wagtaildemo_hello function again to verify the Python code can read the secret containing environment passwords. Use the Test tab to call it with {"install_secrets": true} as the event.
- Call the wagtaildemo_manage function to complete installation.
  - Use "migrate" as the event (it’s JSON, so include the double quotes) to run the Django migrations on the database. If it times out, try running it again.
  - Use "createsuperuser --no-input --username admin" as the event to create the initial superuser.

The wagtaildemo_wsgi function provides the main service used by API Gateway.
Now you’re ready to view the site. Get the app_endpoint and init_superuser_password from the terraform apply command; they are shown near the end of the output. Visit the URL specified by app_endpoint. The page will likely be blank, as there’s no content yet, but the title of the page — shown in the browser tab — should be Home. Add /admin to the URL and log in as user admin with the init_superuser_password. Add some text content and verify your content is published live. Congratulations!
In the process of writing this blog, I learned a lot about Wagtail, AWS and Django production settings. The goal was to learn how close I could get to publishing a production Django site using only serverless patterns with minimal costs. The project was a success: the demo site works, it can scale fluidly to thousands of concurrent users — all built on AWS resources that are easy to maintain and change — and I learned about AWS features and limitations.
My total AWS bill for this project — even with all the different approaches I tried, as well as a few mistakes — was $5.39 USD.
All other services I used stayed within the AWS free tier. The RDS cost was higher than I intended because I initially forgot to configure the database to scale to zero instances. Once I set that up correctly, the RDS costs grew very slowly and only increased on days I used the site.
The VPC cost is the only cost that draws my attention. The VPC cost consists only of VPC endpoint hours. AWS charges $0.01 per hour per endpoint, and it was necessary to create two endpoints in order to run RDS in a way that lets me scale to zero database instances while keeping with recommended security practices — especially storing secrets in Secrets Manager. Two cents per hour doesn’t sound like much, but it adds up to $14.40 USD in 30 days.
$14.40 USD is significantly more than the price of a small, always-on virtual machine (VM) that does the same thing as I built, but the VM would have no scaling ability or redundancy. Therefore, the minimal cost of building something that depends on RDS in serverless mode seems to be around $15 USD/month, even if no one uses the service. I conclude that if I want to provide a service to customers based on AWS, I can’t offer a single-tenant scalable solution for less than $15 USD per month. Knowing the minimal costs is helpful for identifying the kinds of business models that can be created on AWS.
One additional takeaway from this project is how the VPC concept works at AWS. A VPC is like a private network. I came to understand how well VPC components are isolated, making them behave much like physical network components. In particular, if you want a Lambda function in a VPC to be able to connect to the Internet, you need more VPC components than you would need with Elastic Compute Cloud (EC2).
EC2 instances in a VPC can be assigned ephemeral public IP addresses, making them easy to wire up to the Internet. To get the same thing with a Lambda function in a VPC, you need a few more AWS resources: you need a private subnet — preferably 2 or more — that routes to a NAT Gateway with its own Elastic IP address, which forwards packets to a public subnet connected to an Internet Gateway. No resources can be removed from that stack. The NAT Gateway assigns a public IP address to packets, but can’t connect to the Internet on its own; the Internet Gateway connects a subnet to the Internet, but can’t assign a public IP address to packets on its own. The correct VPC structure for Lambda is less complicated than the official AWS documentation makes it seem.
A few technical issues still remain in this project:

- The django-ses library does not work in an AWS VPC. There are a few alternatives I would be interested in trying and researching further.
I hope this blog inspires you to build Django apps and connect with us at Six Feet Up! We love solving complex problems, learning how to build things better and networking with fellow do-gooders. There is so much left to do.