Scaling servers is hard work, and neither of us comes from a devops background. Our approach to scaling starts well before that – with every decision a developer makes when structuring the app.
For a campaign site we made recently, we expected huge amounts of traffic. The site itself was one of those one-page things, but it had a lot of dynamic elements: Instagram pictures with a certain hashtag from a specific area, tweets from a predefined list of users, tracking data for 10 different objects rendered on a map, and a countdown that needed the server time, down to the second, to be accurate.
Budget was tight and we wanted the server environment to be dead simple.
Amazon S3 to the rescue. We set up an S3 bucket for static site hosting, pointed a domain at it and were on our way. What about the dynamic content, then? And the server timestamp for the countdown?
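For the curious, enabling static hosting on a bucket is a one-call affair. This is just a sketch using the AWS SDK for JavaScript – the bucket name and document keys below are made up, not our actual setup:

```javascript
// Sketch: turn an S3 bucket into a static website host.
// Assumes credentials and region are configured in the environment.
var AWS = require('aws-sdk');
var s3 = new AWS.S3();

s3.putBucketWebsite({
  Bucket: 'campaign-site-example',            // hypothetical bucket name
  WebsiteConfiguration: {
    IndexDocument: { Suffix: 'index.html' },  // served for the root of the site
    ErrorDocument: { Key: 'error.html' }
  }
}, function (err) {
  if (err) console.error('Could not enable static hosting:', err);
});
```

After that, it's just a matter of pointing DNS at the bucket's website endpoint.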
The dynamic content is written to S3 by a cronjob on an admin server. This box runs normal LAMP stuff, and every minute (or when Instagram makes a callback for a new image) it writes the new content as a JSON file to S3. The static site runs some simple AJAX to poll this file, and when new content is found, JavaScript pulls it into the static HTML. This admin server never takes unexpected load, as it isn’t even visible to end users. It just chugs along in the background, preparing data for the static site to render.
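To make the client side of that concrete, here is roughly what the polling could look like. This is a hedged sketch, not our production code: the file name content.json, the field names and the render*() helpers are made up for illustration.

```javascript
// Hypothetical polling loop on the static page. Assumes the admin box
// writes its output to /content.json in the same bucket that serves the site.
function pollContent() {
  var xhr = new XMLHttpRequest();
  // Cache-buster so browser/proxy caches don't hide a freshly written file
  xhr.open('GET', '/content.json?_=' + Date.now());
  xhr.onload = function () {
    if (xhr.status !== 200) return; // keep showing the old content on errors
    var data = JSON.parse(xhr.responseText);
    renderInstagram(data.instagram); // render*() stand in for the page's own templating
    renderTweets(data.tweets);
    renderTracking(data.tracking);
  };
  xhr.send();
}

setInterval(pollContent, 60 * 1000); // the cron runs every minute, so polling faster gains nothing
pollContent();
```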
And the timestamp? We could just make a simple AJAX request to the admin server, right? Getting the server date is a very cheap request. Nah – we reused the HTTP headers from the AJAX request that polls the dynamic content. The Date header S3 serves with every response is accurate and predictable.
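In code, that boils down to reading the Date header off the same XHR and keeping a clock offset around. Again a sketch with made-up names – updateServerOffset(xhr) would be called from the onload handler in the poll above. One detail worth noting: reading the Date header cross-origin would require a CORS exposed-headers setting, which is another reason serving the JSON from the same bucket keeps things simple.

```javascript
// serverOffset is the difference between S3's clock (from the Date response
// header) and the visitor's local clock, in milliseconds.
var serverOffset = 0;

function updateServerOffset(xhr) {
  // S3 sends a Date header with every response; a same-origin XHR can read it.
  var dateHeader = xhr.getResponseHeader('Date');
  if (dateHeader) {
    serverOffset = new Date(dateHeader).getTime() - Date.now();
  }
}

// The countdown then ticks against corrected time rather than the local clock.
// launchTimestampMs is a made-up name for the campaign's go-live moment.
function secondsLeft(launchTimestampMs) {
  var now = Date.now() + serverOffset;
  return Math.max(0, Math.round((launchTimestampMs - now) / 1000));
}
```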
This setup means that as long as Amazon S3 can take the load, the campaign is fine. If the admin box goes down for some reason, the site will still work, just with stale content.
Unfortunately we can’t tell you which site it is, as we are prevented by an NDA.
Ask us more if you’re interested.
Comments
One response to “Scaling isn’t tweaking httpd.conf – our approach to high volume sites”
Very cool approach! I’ve heard people use similar setups before and in this day and age it makes perfect sense.
I’m building a couple of my own sites using Jekyll, which generates static HTML using very basic templating together with the Markdown format. It makes less and less sense to recompute all of that for every request, and this approach fills the gap before you’d slap on edge-side caching to handle it.