The Quick & Dirty:

  • 9+ years in the biz, 10 messing around
  • All the latest in HTML, CSS, JS & PHP (Sorry, no .net or Ruby skills yet)
  • Cross-browser compatible code from Photoshop or Illustrator files… or napkins!
  • Custom Specialities in Wordpress theming & plugins, Twitter & jQuery
  • Subversion, server-log analysis, and blocking hack-attempts (of late).
  • Data/Project Geek. I ♥ timelines.

I’ve compiled some handy PHP functions I’ve had to whip up. More extensive code-samples are also available.

Wordpress-from-Photoshop/Illustrator

Wordpress-from-static HTML

Wordpress-from-Photoshop

How not to build a Load-Balancer

I’m not interested in starting a hosting company. But handling upwards of 50 domains for work, I get a little nervous about uptime.

“There’s gotta be a simple way to do automatic failover”
Famous Last Words

Now, the general idea for automatic failover is that:

  1. when the main server is down,
  2. the backup server takes its place.

Sounds great! Here’s the setup:

  • Cloudflare for www-records
  • A-record to the webserver
  • *.domain.com to the webserver
  • dev.domain.com to the dev server
  • Cheap dev server for testing, running nginx.

So I’ve got an Nginx server ready, running 1 test domain. The great part of Nginx is its proxy-upstream capabilities: traffic can be passed through to the main server, or served from local when a local copy is available. Here’s where I have things backwards: Nginx is much better suited to serve-from-local-if-available with failover to the proxy than it is vice-versa. Lovely ‘try_files’ setting.
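
For illustration, a minimal sketch of that serve-local-first, proxy-second pattern (the pool name, IPs, and paths here are hypothetical):

    upstream primary_pool {
            server 10.0.0.1;                  # hypothetical main webserver
    }

    server {
            server_name www.domain.com;
            root /var/www/mirror;             # local copy of the content

            location / {
                    # serve the local file if it exists, otherwise proxy
                    try_files $uri $uri/ @primary;
            }

            location @primary {
                    proxy_pass http://primary_pool;
            }
    }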

However, that would mean all the traffic would have to flow through this poor little dev server. Not happening. And I’m not willing to go full-bore the other way: trying my own hosting servers first and failing over to someone else’s (a professional’s). So I’ll keep the proxy-first, serve-local-next setup going forward.

While I was digging through how to do this, I came across some other possible “things that sound like good ideas”:

DNS:

  • If the dev copy is not available, forward on to www. (Handy that the wildcard is set up, right?) This also means I don’t have to copy content over to the dev server – woot! Turns out you don’t want to use the proxy_pass version of this; use a separate server{} config block and do a rewrite/302 forwarder instead. The wildcard domain is useful only for typos, so I’ve also set up all non-www requests to be forced to www (aka, over to Cloudflare).
    server {
            # a regex server_name is needed for the captures to work;
            # a plain wildcard (*.domain.com) never sets $1/$2
            server_name ~^(?<subdomain>[^.]+)\.(?<servername>domain\.com)$;

            # anything without a local copy gets bounced over to www
            try_files $uri @devtowww;

            location @devtowww {
                 return 302 $scheme://www.$servername$request_uri;
            }
    }
  • StackOverflow seemed to indicate that “modern browsers” will automatically try the second A-record if the first is unavailable. That sounds like I can have 2 A-records – 1 for the primary hosting, and 1 for the load-balanced/backup hosting.
  • Other people said that the A-record would have to be swapped when one is down. That sounded like scary-scripting to me, so I opted to try the dual-A-record. “Round-robin” is indeed supported by Cloudflare.
  • Since modern browsers will use whichever A-record is available, why not Cloudflare? Surely it wouldn’t cache a 400 or 500 response when another server is returning a 200, right? Wrong. This is explicitly stated on the round-robin page: “Note: the system currently does not have the functionality to automatically select the next available server if one of the servers in the group goes down.”

Nginx:

  • Why not set up Nginx to use 1 pool with everything in it! Just put all the primary servers in 1 pool, and obviously the next server will be tried until the correct one is found! This was destroyed thanks to my primary hosting being kind enough to return a 200 status and a blank page when the wrong domain was pointed at it. Either content-filtering or domain-lists were the answer – I wish Nginx had content-filtering – so I went with nightly domain-list building, using Nginx’s handy map-with-include feature (see the first sketch after this list).
  • Nginx’s proxying has a nice setting called “ip_hash” that will keep users on the same server, provided it is “up”. I assumed “up” meant a non-500, non-400 response (or whatever is defined in proxy_next_upstream). What really qualifies as up is simply not having the “down” attribute (see the second sketch below). This is terrible. Absolutely terrible. Especially when “it works for me” and not for the client. Totally destroyed my morning. At one point I even had to change localhost to 127.0.0.1 in the upstream config, just to force users over. Never again.
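
Here’s roughly what that nightly domain-list setup looks like – a minimal sketch, assuming hypothetical pool names, IPs, and map-file path; the map file itself gets rebuilt by a nightly cron job:

    # /etc/nginx/domains.map is rebuilt nightly with lines like:
    #   example.com    pool_a;
    #   example.org    pool_b;
    map $host $backend_pool {
            default  pool_a;                 # unknown hosts fall back to primary
            include  /etc/nginx/domains.map; # nightly-built host -> pool list
    }

    upstream pool_a { server 10.0.0.1; }     # hypothetical primary host
    upstream pool_b { server 10.0.0.2; }     # hypothetical secondary host

    server {
            listen 80;
            location / {
                    proxy_pass http://$backend_pool;  # variable resolves to an upstream
            }
    }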
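
And the ip_hash gotcha, sketched out (IPs hypothetical):

    upstream primary_pool {
            ip_hash;               # pin each client IP to one backend
            server 10.0.0.1;       # counts as "up" no matter what it returns
            server 10.0.0.2 down;  # the only state ip_hash actually respects
    }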


In the end, I set up a ton of custom error pages all over the place, to identify which server (Apache backend, production-external, or load-balancer) was throwing the error.

Other handy bits:

  • Nginx’s add_header directives stop inheriting as soon as you define one at a lower level – effectively you only get 1 shot. Choose wisely.
  • HTTP header variables: remember which are for the request, and which are for the response.
  • location / {} is not the same as location = / {}
  • error_page can dump out to a named location, which can dump out to a proxy upstream (see the sketch after this list)! But you definitely want recursive_error_pages to be turned on.
  • error_page 404 403 402 = @devtowww; is not the same as error_page 404 403 402 =404 @devtowww; (the first returns whatever status @devtowww produces, the second forces a 404).
  • AWS for user-provided content/uploads. Provides server-independence.
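
Putting those error_page bits together – a minimal sketch of the error_page-to-named-location-to-upstream chain (pool names and IPs are hypothetical):

    upstream primary_pool { server 10.0.0.1; }   # hypothetical main host
    upstream backup_pool  { server 10.0.0.9; }   # hypothetical backup host

    server {
            server_name www.domain.com;
            recursive_error_pages on;       # without this the second hop never fires

            location / {
                    proxy_pass http://primary_pool;
                    proxy_intercept_errors on;  # let error_page see backend errors
                    error_page 500 502 503 504 = @backup;
            }

            location @backup {
                    proxy_pass http://backup_pool;  # named location -> upstream
            }
    }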

Ultimately there’s 1 major trick: what defines “down”? Most 404 pages are actually 200 responses! Nginx’s proxy_next_upstream can handle 500, 502, 503, 504, 403 and 404. Note: not 401 Unauthorized, something certain webservers without an index file like to return. And my particular Nginx installation seemed to hate the 403 option as well; a rough sketch of the directive follows.
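
A minimal sketch of that directive in a proxying location (pool name hypothetical; http_403 left out, since my install rejected it):

    location / {
            proxy_pass http://primary_pool;
            # only these conditions trigger a retry on the next server in the
            # pool; a blank page with a 200 status still counts as "up"
            proxy_next_upstream error timeout http_500 http_502 http_503 http_504 http_404;
    }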
