Walker Blogs

New Media Initiatives

Continuous Deployment with Fabric

Posted February 22, 2012 at 9:25 am

We have been using Fabric to deploy changes to walkerart.org. Fabric is a library that enables a string of commands to be run on multiple servers. Though similar things could be done with shell scripts, we enjoy staying in one language as much as possible. In Fabric, strings are composed and sent to remote servers as commands over an SSH connection. Our Fabric scripts have been evolving over time with the project using the mentality: “If you know you are going to be doing something more than twice, script it!”

With Fabric we can tailor our deployments precisely. We deploy often with one of two commands:
fab production simple_deploy or fab production deploy.
simple_deploy simply pulls new code from the repo and restarts the web server.
deploy does many things, each of which can be executed independently, and is explained below.
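As a rough illustration of how a task like simple_deploy boils down to composed command strings, here is a minimal sketch. It is not our actual fabfile: the project path and the restart command are hypothetical, and the run function is passed in so the sketch works without Fabric installed (in a real fabfile it would be fabric.api.run).

```python
def simple_deploy_commands(project_dir, restart_cmd):
    """Return the shell commands a simple deploy would send, in order."""
    return [
        # Fabric's cd() context manager has the same effect as this prefix.
        "cd %s && git pull" % project_dir,
        restart_cmd,
    ]


def simple_deploy(run, project_dir, restart_cmd):
    """Send each command through `run` (fabric.api.run in a real fabfile)."""
    for command in simple_deploy_commands(project_dir, restart_cmd):
        run(command)


if __name__ == "__main__":
    # Stand-in for fabric.api.run that only prints what would be sent.
    def fake_run(command):
        print("would run: " + command)

    # Both arguments below are made-up values for illustration.
    simple_deploy(fake_run, "/srv/example_project", "sudo service gunicorn restart")
```

Keeping the command list separate from the function that sends it makes it easy to inspect what a deploy would do before pointing it at a server.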

The scripts we run go both ways: code goes up to the server and data comes back to the workstation. We have fab sync_with_production, which pulls the database and images. The images arrive locally in a directory specified by an environment variable, or in a default directory if the variable is not set. Conventional naming schemes keep values such as the database name the same across systems. Except for some development settings, our workstation environments are identical to the production environment, which means we can immediately replicate a bug or feature locally.
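The "environment variable or default directory" lookup can be sketched in a few lines. The variable name WALKER_IMAGE_DIR and the default path are assumptions made up for this example, not the names we actually use.

```python
import os

# Hypothetical default location for synced images.
DEFAULT_IMAGE_DIR = os.path.join(os.path.expanduser("~"), "walker_images")


def local_image_dir(environ=None):
    """Return the directory where synced images should land.

    Uses WALKER_IMAGE_DIR (a made-up name) when it is set and non-empty,
    otherwise falls back to DEFAULT_IMAGE_DIR.
    """
    environ = os.environ if environ is None else environ
    return environ.get("WALKER_IMAGE_DIR") or DEFAULT_IMAGE_DIR


if __name__ == "__main__":
    print(local_image_dir())
```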

We have been collecting all of the commands we normally run on the servers into our fabfile, and we group them by calling tasks from other tasks. Our deployment consists of 12 tasks. With the deploy task, one can deploy to the production or staging server with a single command:
fab production deploy.

This makes it incredibly simple to put code that is written on developer workstations into production in a safe and secure way. Here is our deployment in Fabric:

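from fabric.api import cd, env, run

# The helper tasks called below are defined elsewhere in the fabfile.
def deploy():
    # Run every step from the project directory on the remote server.
    with cd(env.project):
        run('git pull')
        get_settings()
        install_requirements()
        production_templates()
        # Stop the celery workers before the code underneath them changes.
        celerybeat('stop')
        celeryd('stop')
        synccompress()
        migrate()
        gunicorn('restart')
        # Bring celery back up with the fresh code and config.
        celeryd('start')
        celerybeat('start')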

First the “with” blocks put us onto the remote server, into the right directory and within Python’s virtual environment. From there “git pull” gets the new code, which contains the settings files, and “get_settings” copies any new settings into place. The task called “install_requirements” calls on pip to validate our virtual environment’s packages against the requirements file. All third party packages are locked to a version so we aren’t surprised by new “features” that have adverse effects. We use celery to harvest data from other sites, so we stop its processes before the update and start them again afterward to make sure they are running with fresh config files. The task “synccompress” does our compressing of css and js, “migrate” alters the database per our migration files, and gunicorn is the program that is running django.

It takes about 60 seconds for a new version of the website to get into production. From there it can take up to 10 minutes for the memcached values to expire before the changes are visible to the public. We are deploying continuously, so watch closely for updates!

Optimizing page load time

Posted December 14, 2011 at 8:02 am

We launched the new walkerart.org late on December 1, and it’s been a great ride. The month leading up to launch (and especially the final week starting Thanksgiving Day, when I was physically moving servers and virtualizing old machines) was incredibly intense and really brought the best out of our awesome team. I would be remiss if I didn’t start this post by thanking Eric & Chris for their long hours and commitment to the site, Robin for guiding when needed and deflecting everything else so we could do what we do, and Andrew and Emmet for whispering into Eric’s ear and steering the front-end towards the visual delight we ended up with. And obviously Paul and everyone writing for the site, because without content it’s all just bling and glitz.

Gushy thanks out of the way, the launch gave us a chance to notice the site was a little … slow. Ok, a lot, depending on your device and connection, etc. Not the universally fast experience we were hoping for. The previous Walker site packed all the overhead into the page rendering, so with the HTML cached the rest would load in under a second, easy. The new site is heavy even if the HTML is cached. Just plain old heavy: custom fonts, tons of images popping and rotating, javascript widgets willy-nilly, third-party API calls…

Here’s the dirty truth of the homepage when we kicked it out the door December 1:

12/1: 2.6 MB over 164 requests. Load times are pretty subjective depending on a lot of things, but we had good evidence of the page taking 4+ seconds from click to being usable, and MUCH longer in some cases. Everyone was clearly willing to cut us some slack with a shiny new site, but once the honeymoon is over we need to be usable every day, and that means fast. This issue pretty quickly moved to the top of our priority list the Monday after launch, December 5.

The first thing to tackle was the size: 2.6 MB is just way too big. Eric noticed our default image scaling routine was somehow not compressing jpgs (I know, duh), so that was an easy first step and made a huge difference in download size.

12/5: 1.9 MB.

On the 6th we discovered (again, duh) lossless jpeg and png compression and immediately applied it to all the static assets on the site, but not yet to the dynamically-generated versions. Down to 1.8 MB. We also set up a fake Content Delivery Network (CDN) to help “parallelize” our image downloads. Modern browsers allow roughly six simultaneous connections to a single domain, so by hosting all our images at www.walkerart.org we were essentially trying to send all our content through one tiny straw. Chris was able to modify our image generator code to spread requests across three new domains: cdn0.walkerart.org, cdn1.walkerart.org, and cdn2.walkerart.org. This lacks the geographic distribution and fat pipe of a real CDN, but does give the end user a few more straws to suck content through.

Requests per Domain

www.walkerart.org 26
cdn1.walkerart.org 24
cdn0.walkerart.org 24
cdn2.walkerart.org 21
f.fontdeck.com 4
other 7
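For anyone curious how requests can be spread across domains like this, here is a minimal sketch of one way to do it. It is not Chris's actual code: the function name is hypothetical, and hashing the path is an assumption on my part. The point of hashing is that a given image always maps to the same hostname, so browser caches keep working.

```python
import zlib

CDN_HOSTS = ["cdn0.walkerart.org", "cdn1.walkerart.org", "cdn2.walkerart.org"]


def cdn_url(image_path, hosts=CDN_HOSTS):
    """Map an image path to one of the CDN hostnames, always the same one."""
    # crc32 is stable across processes and servers, unlike the built-in
    # hash() for strings, which can be randomized per process.
    index = zlib.crc32(image_path.encode("utf-8")) % len(hosts)
    return "http://%s/%s" % (hosts[index], image_path.lstrip("/"))


if __name__ == "__main__":
    # Made-up paths, just to show the mapping.
    for path in ["/media/a.jpg", "/media/b.jpg", "/media/c.png"]:
        print(cdn_url(path))
```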

By the 8th we were ready to push out global image optimization and blow away the cache of too-big images we’d generated. I’m kind of astounded I’d never done this on previous sites, considering what an easy change it was and what a difference it made. We’re using jpegoptim and optipng, and it’s fantastic: probably 30% lossless saving on already compressed jpegs and pngs. No-brainer.
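A batch pass over a directory of images might look something like the sketch below. It assumes jpegoptim and optipng are installed and on the PATH, and it runs both with their default settings, which rewrite files in place. The function names are made up for the example; this is not our image pipeline.

```python
import os
import subprocess

# File extension -> optimizer command (default options only).
OPTIMIZERS = {
    ".jpg": ["jpegoptim"],
    ".jpeg": ["jpegoptim"],
    ".png": ["optipng"],
}


def optimizer_command(path):
    """Return the command list for `path`, or None if it is not a jpg/png."""
    ext = os.path.splitext(path)[1].lower()
    tool = OPTIMIZERS.get(ext)
    return tool + [path] if tool else None


def optimize_tree(root):
    """Run the matching optimizer on every jpg and png under `root`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            command = optimizer_command(os.path.join(dirpath, name))
            if command:
                subprocess.check_call(command)
```

Calling optimize_tree on a cache directory would then optimize everything beneath it; try it on a copy first.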

12/8: 1.4 MB, almost half of what we launched with.

Next we needed to reduce the number of requests. We pushed into the second weekend with a big effort to optimize the Javascript and CSS. Earlier attempts using minify had blown up and were abandoned. Eric and Chris really stepped up to find a load order that worked and a safe way to combine and compress files without corrupting the contents. Most of the work was done Friday, but we opted to wait for Monday to push it out.

Meanwhile, I spent the weekend pulling work from the client’s browser back to the server where we could cache it site-wide. This doesn’t really impact bytes transferred, but it does remove a remote API call, which could take anywhere from a fraction of a second (unnoticeable) to several seconds in a worst-case scenario (un-usable). This primarily meant writing code to regularly call and cache all of our various Twitter feeds and the main weather widget. These are now served in the cached HTML, so their cost in the load time is negligible instead of 200+ ms on average. It all adds up!
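The general idea of caching a remote call on the server can be sketched as below. Note this is a simpler variation than what is described above: it refreshes on demand once a time limit has passed, rather than on a regular schedule. The class name and parameters are hypothetical.

```python
import time


class CachedFeed(object):
    """Cache the result of a slow remote call for `ttl_seconds`."""

    def __init__(self, fetch, ttl_seconds, clock=time.time):
        self._fetch = fetch          # callable that makes the remote call
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the class is testable
        self._value = None
        self._expires_at = None      # None means nothing is cached yet

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            try:
                self._value = self._fetch()
                self._expires_at = now + self._ttl
            except Exception:
                # With nothing cached there is nothing to fall back on.
                if self._expires_at is None:
                    raise
                # Otherwise keep serving the stale value; the next call
                # will try the remote service again.
        return self._value
```

Serving a stale value when the remote service fails means a slow or broken third party never holds up the page.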

[Image: CSS sprite for the header and footer nav images. It has a transparent background, so it’s supposed to look like that.]

So Monday, 12/12, we pushed out our first big changes to address the number of requests. Eric had combined most of the static pngs into a CSS Sprite, the javascript and CSS were reduced to fewer files, and the third party APIs were no longer called in the browser. Really getting there, now.

12/12: 1.37 MB, and 125 requests

Happily (as I was writing this) Eric just pushed out the last (for now) CSS sprite, giving us these final numbers:

12/13: 1.37 MB, and 110 requests! (down to 53% and 67% of the launch figures, respectively)
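For the record, here is the arithmetic comparing the launch numbers (2.6 MB, 164 requests) with the final ones (1.37 MB, 110 requests), as a quick sketch:

```python
def percent_of(original, current):
    """Return `current` as a percentage of `original`."""
    return 100.0 * current / original


size_pct = percent_of(2.6, 1.37)      # page weight in MB
requests_pct = percent_of(164, 110)   # number of requests

print("size: %.0f%% of launch, %.0f%% smaller" % (size_pct, 100 - size_pct))
print("requests: %.0f%% of launch, %.0f%% fewer" % (requests_pct, 100 - requests_pct))
# size: 53% of launch, 47% smaller
# requests: 67% of launch, 33% fewer
```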

This isn’t over, but it’s gotten to the point of markedly diminishing returns. We’re fast enough to be pretty competitive and no longer embarrassing on an iPad, but there are a few more things to pick off around the edges. We’re heavier and slower than most of our museum peers, but lighter and faster than a lot of similar news sites. Which one are we? Depends which stats I want to compare. :)

We used the following tools to help diagnose and prioritize page loading time issues:
http://tools.pingdom.com/fpt/
https://developers.google.com/pagespeed/
http://developer.yahoo.com/yslow/