A while back I wrote a post about using NginX as a reverse-proxy cache for PHP (or whatever your backend is) and mentioned how I was using HAProxy to load balance. The main author of HAProxy wrote a comment about keep-alive support and how it would make things faster.

At the time, I thought “What’s the point of keep-alive for front-end? By the time the user navigates to the next page of your site, the timeout has expired, meaning a connection was left open for nothing.” This assumed that a user downloads the HTML for a site, and doesn’t download anything else until their next page request. I forgot about how some websites actually have things other than HTML, namely images, CSS, javascript, etc.

Well in a recent “omg I want everything 2x faster” frenzy, I decided for once to focus on the front-end. On beeets, we’re already using S3 with CloudFront (a CDN), aggressive HTTP caching, etc. I decided to try the latest HAProxy (1.4.4) with keep-alive.

I got it, compiled it, reconfigured:

defaults
	...
	option httpclose

became:

defaults
	...
	timeout client  5000
	option http-server-close

Easy enough…that tells HAProxy to close the server-side connection after each response, but leave the client connection open for 5 seconds (HAProxy timeouts default to milliseconds, so 5000 means 5s).

Well, a quick test and site load times were down by a little less than half…from about 1.1s client load time (empty cache) to 0.6s. An almost instant benefit. How does this work?

Normally, your browser hits the site. It requests /page.html, and the server says “here u go, lol” and closes the connection. Your browser reads page.html and says “hay wait, I need site.css too.” It opens a new connection and the web server hands the browser site.css and closes the connection. The browser then says “darn, I need omfg.js.” It opens another connection, and the server rolls its eyes, sighs, and hands it omfg.js.

That’s three connections your browser made to the server, each with its own latency. Connection latency is something that, no matter how hard you try, you cannot control…and you pay it for every connection your browser opens. Let’s say you have a connection latency of 200ms (not uncommon)…that’s 600ms you just waited to load a very minimal HTML page.

There is hope though…instead of trying to lower latency, you can open fewer connections. This is where keep-alive comes in.

With the new version of HAProxy, your browser says “hai, give me /page.html, but keep the connection open plz!” The web server hands over page.html and holds the connection open. The browser reads all the files it needs from page.html (site.css and omfg.js) and requests them over the connection that’s already open. The server keeps this connection open until the client closes it or until the timeout is reached (5 seconds, using the above config). In this case, the latency is a little over 200ms, so the total time to load the page is 200ms + the download time of the files (usually less than the latency).
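If you want to watch connection reuse happen, here is a minimal sketch in Python. It does not talk to HAProxy at all: `KeepAliveHandler` is a throwaway local server standing in for the real thing, and the helper names are made up for this example. The idea is simply that if all three requests leave from the same local port, they went over one connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 makes the stdlib server hold connections open between requests.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = ("contents of " + self.path).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_over_one_connection(host, port, paths):
    """Request each path on one HTTPConnection; return the local port used for each."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    local_ports = []
    try:
        for path in paths:
            conn.request("GET", path)
            # A reconnect (no keep-alive) would show up as a new local port here.
            local_ports.append(conn.sock.getsockname()[1])
            conn.getresponse().read()
    finally:
        conn.close()
    return local_ports


def demo():
    """Start the stand-in server on a free port, fetch three files, shut down."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_over_one_connection(
            "127.0.0.1",
            server.server_address[1],
            ["/page.html", "/site.css", "/omfg.js"],
        )
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns three port numbers. With keep-alive working they are all the same, meaning one connection carried all three files.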

So with keep-alive, you just turned a 650ms page-load time (600ms of latency plus, say, 50ms of downloading) into a 250ms page-load time…a much larger margin than any sort of back-end tweaking you can do. Keep in mind most servers already support keep-alive…but I’m compelled to write about it because I use HAProxy and it’s now fully implemented.
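The arithmetic is simple enough to sketch in a few lines of Python. This is a deliberately naive model: it assumes the three files load one after another, that each new connection costs exactly one latency hit, and that all the downloading adds up to about 50ms (my guess at the gap between 600ms and 650ms). The function name is made up for the example.

```python
def page_load_ms(num_files, latency_ms, download_ms, keep_alive):
    """Naive sequential model: every new connection costs one latency hit."""
    connections = 1 if keep_alive else num_files
    return connections * latency_ms + download_ms


# page.html, site.css and omfg.js at 200ms latency, ~50ms total download (assumed)
without_keep_alive = page_load_ms(3, 200, 50, keep_alive=False)  # 650
with_keep_alive = page_load_ms(3, 200, 50, keep_alive=True)      # 250
```

Plug in your own latency and file count to see how quickly the per-connection cost comes to dominate.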

Also keep in mind that the above scenario isn’t necessarily correct. Most browsers will open up to 6 concurrent connections to a single domain when loading a page, but you also have to factor in the fact that the browser blocks downloads when it encounters a javascript include, and then attempts to download and run the javascript before continuing the page load.

So although your connection latency with multiple requests goes down with keep-alive, you won’t get a 300% speed boost, more likely a 100% speed boost depending on how many scripts are loading in your page along with any other elements…100% is a LOT though.

So for most of us webmasters, keep-alive is a wonderful thing (assuming it has sane limits and timeouts). It can really save a lot of page load time on the front-end, which is where users spend most of their time waiting. But if you happen to have a website that’s only HTML, keep-alive won’t do you much good =).

Recently I’ve been working on speeding up the homepage of beeets.com. Most speed tests say it takes between 4 and 6 seconds. Obviously, all of them are somehow fatally flawed. I digress, though.

Everyone (who’s anyone) knows that gzipping your content is a great way to reduce download time for your users. It can cut the size of html, css, and javascript by about 60-90%. Everyone also knows that gzipping can be very CPU-intensive. Not anymore.

I just installed nginx’s Gzip Static Module (compile nginx with --with-http_gzip_static_module) on beeets.com. It allows you to pre-cache your gzip files. What?

Let’s say you have the file /css/beeets.css. When a request for beeets.css comes through, the static gzip module will look for /css/beeets.css.gz. If it finds it, it will serve that file as gzipped content. This allows you to gzip your static files using the highest compression ratio (gzip -9) when deploying your site. Nginx then has absolutely no work to do besides serving the static gzip file (it’s very good at serving static content).

Wherever you have a gzip section in your nginx config, you can do:

gzip_static on;

That’s it. Note that you will have to create the .gz versions of the files yourself, and it’s mentioned in the docs that it’s better if the original and the .gz files have the same timestamp; so it may be a good idea to “touch” the files after both are created. It’s also a good idea to turn the gzip compression down (gzip_comp_level 1..3). This will minimally compress dynamic content without putting too much strain on the server.
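Creating the .gz files is easy to script as part of a deployment. Here is a rough sketch in Python, not the script beeets actually uses: the `precompress` name and the list of file extensions are assumptions made for illustration. It gzips each static file at level 9 and copies the original’s timestamps onto the .gz so the two match.

```python
import gzip
import os
import shutil
from pathlib import Path

# Assumed set of static file types worth pre-compressing; adjust for your site.
STATIC_SUFFIXES = {".css", ".js", ".html"}


def precompress(root):
    """Write a max-compression .gz next to each static file under root."""
    written = []
    for src in sorted(Path(root).rglob("*")):
        if not src.is_file() or src.suffix not in STATIC_SUFFIXES:
            continue
        dst = src.with_name(src.name + ".gz")  # beeets.css -> beeets.css.gz
        with open(src, "rb") as fin, gzip.open(dst, "wb", compresslevel=9) as fout:
            shutil.copyfileobj(fin, fout)
        # Give the .gz the same timestamps as the original file.
        st = src.stat()
        os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns))
        written.append(dst)
    return written
```

Run it against your document root after each deploy, e.g. `precompress("/path/to/static")` (the path here is a placeholder).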

This is a great way to get the best of both worlds: gzipping (faster downloads) without the extra load on the server. Once again, nginx pulls through as the best thing since multi-cellular life. Keep in mind that this only works on static content (css, javascript, etc etc). Dynamic pages can and should be gzipped, but with a lower compression ratio to keep load off the server.

I never thought I’d see the day where people who build web servers would care what other people use them to host. In section 1 of LiteSpeed’s licence agreement you will see “You cannot use the SOFTWARE PRODUCT for any illegal activity or to host pornographic content.” HA!

That’s the stupidest thing I’ve ever seen. What kind of business limits the usage of its products to upstanding citizens only? Last I checked it was the government’s job to impose its views on businesses, not businesses imposing their views on their customers.

I have to say, it’s nice that someone is using their business to take a stand, I guess I’d just prefer it to be in defense of free speech and expression. Sure ALL porn is smutty and violent, but that’s expression in itself. Also, can you fight basic human nature? Perhaps, on a personal level. Repression of sexual tendencies is a lot different than acceptance and non-action though. My point is that pornography is the one place where sexual fantasies are allowed to exist in any way shape and/or form, and Americans, being extremely sexually self-repressed, need that outlet, not more repression.

I also think it’s funny when someone tries to be exclusive because they’re SO awesome when someone else is doing it way better.

Here’s a good tip I just found. Note that this may not be for all cases. In fact, I may have stumbled on a freak coincidence. Here’s the story:

I hate java. I hate having java on a server, but hate it even more if it’s only for running one small script. Forever, beeets.com has used the YUI compressor to shrink its javascript before deployment. Well, YUI won’t run without java, so for the longest time, jre has been installed collecting dust, only to be brushed off and used once in a while during a deployment. This seems like a huge waste of space and resources.

Well, first I tried gcj. Compiling gcj was fairly straightforward, thankfully. After installing, I realized I needed to know a lot more about java in order to compile the YUI compressor with it. I needed knowledge I did not have the long-term need for, nor the will to learn in the first place. I, although revering myself as extremely tenacious, gave up.

I decided to try JSMin. This nifty program is simple, elegant, and it works well. It also has a much worse compression ratio than YUI. However, I trust any site that hosts C code and has no real layout whatsoever. Knowing the compression wasn’t as good, I still wanted to see what kind of difference gzipping the files would have.

I recorded the size of the GZipped JS files that used YUI. I then reconfigured the deployment script to use JSMin instead of YUI. I looked at the JS files with JSMin compression:

YUI:
mootools.js     88.7K (29.6K gz)
beeets.js       61.5K (20.5K gz)

JSMin:
mootools.js    106.1K (29.5K gz)
beeets.js       71.0K (17.7K gz)

Huh? GZip is actually more effective on the JS files using JSMin vs YUI! The end result is LESS download time for users.

I don’t know if this is a special case, but I was able to derive a somewhat complex formula:

YUI > JSMin
YUI + GZip < JSMin + GZip
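If you want to check whether this holds for your own files, the comparison is easy to automate. Below is a small Python sketch; the function names are invented for the example, and it works on bytes you have already read from the minified files. It reports the raw and gzipped size of each candidate and picks whichever is smallest after gzip, since that is what users actually download.

```python
import gzip


def sizes(data):
    """Return (raw_bytes, gzipped_bytes) for one minified file's contents."""
    return len(data), len(gzip.compress(data, compresslevel=9))


def smallest_download(candidates):
    """candidates maps a label (e.g. "yui", "jsmin") to file contents as bytes.

    Returns the label whose contents are smallest after gzip.
    """
    return min(candidates, key=lambda label: sizes(candidates[label])[1])
```

Feed it the output of each minifier for the same source file and it will tell you which one wins on the wire. Your results may well differ from mine.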

Who would have thought. See you in hell, java.