1. Scaling Django Apps using AWS Elastic Beanstalk
Oct 22nd, 2015
Lushen Wu
Founder, Bebop
http://bebop.com
2. What we will cover
• What is AWS Elastic Beanstalk (EB)?
• What are the advantages of using EB over managing EC2 instances / Load-balancing / Auto-scaling myself?
• What are some common issues I might run into when deploying my Django app to EB?
4. Scaling option 1: manage it yourself
[Diagram: Users, Load-balancing + Auto-scaling, three Amazon EC2 (cloud computing) instances, Amazon RDS (database), Amazon S3 (static assets)]
• Instances: either manually boot up instances or configure an Auto-scaling policy
• Load-balancing: point your DNS record at the Load Balancer’s DNS name (a CNAME or Route 53 alias, since an ELB has no fixed IP address) instead of an EC2 public IP
• Deployment: use a deployment tool like Fabric to deploy new application code to multiple EC2 instances (e.g. run git pull on every instance)
5. Core concepts primer
What is AWS auto-scaling?
• AWS monitors your EC2 instances based on metrics that you specify (e.g. CPU usage, # of HTTP requests per minute; RAM usage needs a custom CloudWatch metric)
• AWS adds or terminates EC2 instances based on your policy thresholds (e.g. add an instance if average CPU > 60% for 5 minutes, remove one if average CPU < 20% for 5 minutes)
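The threshold rule above (add at > 60% average CPU, remove at < 20%) can be sketched in a few lines of Python. This is only an illustration of the decision logic, not how AWS implements it; the function names and the one-sample-per-minute window are assumptions.

```python
from statistics import mean


def scaling_decision(cpu_samples, high=60.0, low=20.0):
    """Return +1 (add an instance), -1 (remove one) or 0 (do nothing).

    cpu_samples holds average-CPU percentages for the evaluation window,
    assumed here to be one sample per minute (5 samples = 5 minutes).
    """
    window_average = mean(cpu_samples)
    if window_average > high:
        return 1
    if window_average < low:
        return -1
    return 0


def clamp(count, minimum, maximum):
    """Keep the instance count inside the policy's min/max bounds."""
    return max(minimum, min(maximum, count))


instances = 3
instances = clamp(instances + scaling_decision([72, 65, 80, 61, 70]), 1, 6)
print(instances)  # 4
```

The clamp step mirrors the minimum / maximum instance counts you set in the policy: a trigger can fire, but the group never grows or shrinks past those bounds.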
What is AWS load-balancing?
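• Distributes incoming requests among multiple EC2 instances
• For HTTP/HTTPS, a Classic Load Balancer routes each request to the healthy instance with the fewest outstanding requests (round robin for TCP); it does not look at CPU load. Latency-based routing is a Route 53 (DNS) feature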
6. Scaling option 2: AWS Elastic Beanstalk
[Diagram: Users, Amazon EB (Load-balancing + Auto-scaling), three Amazon EC2 (cloud computing) instances, Amazon RDS (database), Amazon S3 (static assets)]
• EB web console allows you to easily configure Load Balancer and Autoscaling policy
• EB command line tool lets you easily deploy an application (in this case, Django code) to all EC2 instances
7. Quick aside: thoughts on “scaling”
Why is scaling necessary?
• AWS EB/EC2 supports horizontal scaling (adding more instances) as well as vertical scaling (using more powerful instances)
• For speed: you want your web app to load quickly and avoid downtime
• For availability: if one datacenter goes down, people can still access your web app
Other things to keep in mind:
• Scaling can be expensive, and many site speed bottlenecks are not solved by adding or upgrading instances (e.g. render-blocking JavaScript/CSS, uncompressed or unnecessarily large static assets, too many HTTP requests)
• Make sure you optimize the basics, e.g.: minified/gzipped assets, backend caching (e.g. memcached), frontend caching (e.g. Varnish) with selective AJAX fetching
• We are not using EB to scale the database!
8. Summary of scaling options
AWS Elastic Beanstalk
• Easier to configure than manually setting up Load Balancer / Autoscaling group
• EB handles source code deployment so you don’t have to use tools like Fabric
Docker
• Briefly looked into it and seemed like EB was sufficient for our needs at the moment
• AWS also supports Docker containers
• Docker just acquired Tutum, potentially easier integrated deployment process
Heroku
• Google search for deploying / scaling Django on Heroku returned some worrying forum posts
Others … ?
9. EB Overview
1. Creating an EB Environment
2. Configuring an EB Environment
• Instances and scaling
• Environment properties
3. Deployment
4. Recap & tricky snags
10. 1. Creating an EB environment
# install the EB command line tool
pip install awsebcli
# check we’re good to go
eb --version
# configure eb with your AWS credentials (Access Key ID, Secret Access Key, default region, etc.); this adds a .elasticbeanstalk folder to your project root
eb init
# create an EB environment (custom .config files go in a .ebextensions folder that you add to your project root yourself)
eb create
Getting started with EB
11. 1. Creating an EB environment
Environment type
• Use ‘Web Server’ environment type for an external-facing web application
• Make sure the Configuration you use has a PostgreSQL version matching your RDS instance (otherwise you may run into trouble executing Postgres commands like ‘pg_dump’)
12. 1. You can have more than one environment!
• Testing / Staging -> use smallest EC2 size, 1 or 2 instances in group
• Production -> can be larger EC2 size, at least 3 instances in group
• Workers -> integrate with Amazon SQS, do cronjobs, etc.
13. 2. Instances -> scaling -> deployment
EC2 instance management with Elastic Beanstalk
• Pick the type/size of the EC2 instances that EB will automatically launch
• All instances launched by the auto-scaling policy will be of this type/size!
• You can still manage all EC2 instances as you normally would in the AWS web console for EC2
What size should I pick?
• Remember that you are paying for EC2 instances by the hour for as long as they are running, whether or not they are serving traffic
• If you spin up a ton of large instances thinking “Just in case I get a ton of traffic” … you’re gonna get a big bill
How many should I get?
• Having 3 means that you can have one instance down/overwhelmed, one instance deploying a new application version, and still have 1 for failover
14. 2. Instances -> scaling -> deployment
Scaling policy
• Specify minimum and maximum # of instances
• Specify triggers to add or terminate EC2 instances (typically based on CPU usage or # of HTTP requests; RAM usage needs a custom CloudWatch metric)
• Cooldown: minimum amount of time after a scaling event before another scaling event can happen; prevents “see-sawing” due to traffic micro-spikes
Deployment
• Specify batch size (what # or % of instances to deploy to at the same time)
• Maximum # of instances to deploy to simultaneously
• Minimum # of instances in service
• What could be an issue here?
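One way to reason about how batch size and minimum instances in service interact is a small check like the one below. It is a simplified model for illustration only, not EB's actual scheduling logic, and both function names are made up.

```python
def can_deploy_batch(running, batch_size, min_in_service):
    """True if taking batch_size instances out of rotation still leaves
    at least min_in_service instances serving traffic."""
    return running - batch_size >= min_in_service


def largest_safe_batch(running, min_in_service):
    """Largest batch that keeps min_in_service instances serving traffic."""
    return max(0, running - min_in_service)


print(can_deploy_batch(running=3, batch_size=1, min_in_service=2))  # True
print(can_deploy_batch(running=3, batch_size=2, min_in_service=2))  # False
print(largest_safe_batch(running=1, min_in_service=1))              # 0
```

The last line is the case to watch for: with a single running instance and a minimum of one in service, no batch can be taken out of rotation under this model.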
15. 2. EB Environment Properties
A great way to:
• Avoid committing sensitive credentials (e.g. social API keys, DB passwords) into source code
• Deploy same source to different EB environments (e.g. testing / staging / production may each target different databases)
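In Django, environment properties are typically read from os.environ in settings.py. The sketch below shows the idea; the property names (DB_NAME, DB_USER, DB_PASSWORD, DB_HOST, DB_PORT) are hypothetical and should match whatever you define in the EB console.

```python
import os


def database_settings(env=os.environ):
    """Build a Django DATABASES dict from environment properties.

    The DB_* names are hypothetical; each EB environment (testing,
    staging, production) would define its own values for them.
    """
    return {
        "default": {
            # named django.db.backends.postgresql_psycopg2 before Django 1.9
            "ENGINE": "django.db.backends.postgresql",
            "NAME": env["DB_NAME"],
            "USER": env["DB_USER"],
            "PASSWORD": env["DB_PASSWORD"],
            "HOST": env["DB_HOST"],
            "PORT": env.get("DB_PORT", "5432"),
        }
    }


# In settings.py you would write: DATABASES = database_settings()
example_env = {
    "DB_NAME": "musicserver",
    "DB_USER": "app",
    "DB_PASSWORD": "not-a-real-password",
    "DB_HOST": "db.example.internal",
}
print(database_settings(example_env)["default"]["HOST"])  # db.example.internal
```

Because the same source reads different values in each environment, nothing sensitive has to be committed, and a missing property fails loudly with a KeyError at startup.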
16. 3. Deployment Process
pip install awsebcli
# specify which EB environment to target with subsequent commands
eb use {environment-name}
# output status (e.g. whether currently deploying) and health of target EB environment
eb status
# deploy the latest commit in the current git repository
eb deploy
# dumps logs for whole environment (all instances) into a gzipped file in your project folder
eb logs -z
Using EB command line to deploy
17. 3. What actually happens when you deploy?
What happens in your environment during deployment?
1. ‘eb deploy’ creates a zip file of the latest git commit
• The version you are deploying doesn’t need to be pushed to GitHub, it just needs to be locally committed
• You can use another revision control platform (e.g. Mercurial) if you want
2. EB uploads the zip file to an S3 bucket associated with your Elastic Beanstalk account, so it’s accessible by the EB environment
3. EB cycles through your instances and deploys the application to each ‘deployment batch’ (according to your deployment policy)
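Step 3 can be pictured as splitting the instance list into batches and working through them in order. A rough sketch follows, with made-up instance IDs and a hypothetical deploy_one callback standing in for the per-instance work; it is not EB's real implementation.

```python
def deployment_batches(instance_ids, batch_size):
    """Split instance ids into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        instance_ids[start:start + batch_size]
        for start in range(0, len(instance_ids), batch_size)
    ]


def rolling_deploy(instance_ids, batch_size, deploy_one):
    """Deploy batch by batch and stop at the first failure.

    deploy_one is a hypothetical callback returning True on success.
    Returns the ids that were updated before any failure.
    """
    updated = []
    for batch in deployment_batches(instance_ids, batch_size):
        for instance_id in batch:
            if not deploy_one(instance_id):
                return updated
            updated.append(instance_id)
    return updated


print(deployment_batches(["i-a", "i-b", "i-c"], 2))  # [['i-a', 'i-b'], ['i-c']]
```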
18. 3. What actually happens when you deploy?
What happens on an instance during deployment?
• EB runs any server commands you specified (e.g. yum install)
• Extracts source of new version into /opt/python/ondeck
• EB runs any container_commands you specified (e.g. manage.py syncdb)
• If an error is encountered, the deployment aborts, no other instances are attempted, and the current application version remains active
• If no errors are encountered, EB proceeds with deployment:
• Links /opt/python/current/ to the /ondeck/ folder above and updates the WSGI configuration
• Django project root is always /opt/python/current/app
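The stage-then-switch flow above is why a failed container command leaves the running version untouched. Here is a toy model of it using symlinks in a temporary directory; the bundle-<version> folder names and the callables standing in for container commands are assumptions for illustration, not what EB runs internally.

```python
import os
import tempfile


def point_link(link, target):
    """Atomically make link a symlink to target (relies on POSIX rename)."""
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(target, tmp)
    os.replace(tmp, link)


def deploy_on_instance(root, version, container_commands):
    """Stage a version under 'ondeck', run commands, then switch 'current'."""
    staged = os.path.join(root, "bundle-" + version)
    os.makedirs(staged)  # stands in for extracting the uploaded zip
    point_link(os.path.join(root, "ondeck"), staged)
    # every command must succeed before the live version changes
    if not all(command(staged) for command in container_commands):
        return False  # abort: 'current' still points at the old version
    point_link(os.path.join(root, "current"), staged)
    return True


root = tempfile.mkdtemp()
deploy_on_instance(root, "v1", [lambda path: True])
deploy_on_instance(root, "v2", [lambda path: False])  # simulated failure
print(os.path.basename(os.readlink(os.path.join(root, "current"))))  # bundle-v1
```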
19. 3. Helping EB find your Django app
• Options specified in .config files stored in .ebextensions folder in project root
• Adds application root to PYTHONPATH
• Points EB to WSGI application and configures threading
option_settings:
  "aws:elasticbeanstalk:application:environment":
    DJANGO_SETTINGS_MODULE: "musicserver.settings"
    "PYTHONPATH": "/opt/python/current/app:$PYTHONPATH"
  "aws:elasticbeanstalk:container:python":
    WSGIPath: musicserver/wsgi.py
    NumProcesses: 3
    NumThreads: 20
  "aws:elasticbeanstalk:container:python:staticfiles":
    "/static/": "static/"
Option Settings (Django WSGI config)
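As a rough capacity check for NumProcesses and NumThreads: assuming each WSGI thread handles one request at a time, the product of the two is an upper bound on requests in flight per instance. The helper below is hypothetical and ignores everything else (CPU, database, memory) that limits real throughput.

```python
def max_concurrent_requests(num_processes, num_threads, instances=1):
    """Upper bound on in-flight requests, assuming one request per thread."""
    return num_processes * num_threads * instances


print(max_concurrent_requests(3, 20))               # 60
print(max_concurrent_requests(3, 20, instances=3))  # 180
```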
20. 3. Customizing server
Server Commands (this page) and Container Commands (next page)
• Executed in alphabetical order (name them whatever you want)
• You can run arbitrary bash commands, manage Linux packages & services, run arbitrary Python code…
More: http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-ec2.html
01_packages.config
packages:
  yum:
    gcc: []
    git: []
    postgresql93-devel: []
02_timezone.config
commands:
  set_time_zone:
    command: ln -f -s /usr/share/zoneinfo/US/Eastern /etc/localtime
21. 3. Customizing server
Container Commands
• leader_only means only the first instance in a deployment will run the command
• Other useful parameters: ignoreErrors (true|false), env (name/value pairs), cwd
• This is also the time to set up dependencies like running chmod on executables (targeting the /ondeck/app folder)
03_django.config
container_commands:
  01_migrate:
    command: "source /opt/python/run/venv/bin/activate && python manage.py migrate --noinput"
    leader_only: true
  02_collectstatic:
    command: "source /opt/python/run/venv/bin/activate && python manage.py collectstatic --noinput"
    leader_only: true
  03_compress:
    command: "source /opt/python/run/venv/bin/activate && python manage.py compress --force"
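To see which of these commands a given instance would run, here is a small model of the two rules mentioned in this deck: alphabetical ordering by name, and leader_only. The dictionary only mirrors the names in 03_django.config; it is an illustration, not how EB parses the file.

```python
def commands_to_run(container_commands, is_leader):
    """Names of the commands one instance runs, in alphabetical order."""
    return [
        name
        for name in sorted(container_commands)
        if is_leader or not container_commands[name].get("leader_only", False)
    ]


commands = {
    "02_collectstatic": {"leader_only": True},
    "01_migrate": {"leader_only": True},
    "03_compress": {},
}
print(commands_to_run(commands, is_leader=True))
# ['01_migrate', '02_collectstatic', '03_compress']
print(commands_to_run(commands, is_leader=False))
# ['03_compress']
```

Under this model only the leader migrates the database and collects static files, while every instance runs compress.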
22. 4. Recap & tricky snags
• Scaling EC2 instances is only one part of the solution!
• Select the right EC2 image version for EB (with same PostgreSQL version as RDS)
• Watch out for devtools-related deployment errors (e.g. collectstatic or compress)
• Get familiar with the deployment flow
• Zip & upload -> run server commands -> extract to /ondeck folder -> run container commands -> link from /current folder
• Watch out for deployment batch size and minimum instances settings (avoid “stalled” deployments)
23. Sources
• Google / various blogs
• AWS walkthrough on EB and Django
• About 3 days of head-banging and a dozen Monster energy drinks in August 2015
Bonus snippet:
# after SSHing into an instance, open a Django shell inside the virtualenv
sudo su
cd /opt/python/current/app
source /opt/python/current/env
source /opt/python/run/venv/bin/activate
python manage.py shell