201002.04

NginX as a caching reverse proxy for PHP

So I got to thinking. There are some good caching reverse proxies out there, maybe it's time to check one out for beeets. Not that we get a ton of traffic or we really need one, but hey what if we get digged or something? Anyway, the setup now is not really what I call simple. HAproxy sits in front of NginX, which serves static content and sends PHP requests back to PHP-FPM. That's three steps to load a fucking page. Most sites use apache + mod_php (one step)! But I like to tinker, and I like to see requests/second double when I'm running ab on beeets.

So, I'd like to try something like Varnish (sorry, Squid) but that's adding one more step in between my requests and my content. Sure it would add a great speed boost, but it's another layer of complexity. Plus it's a whole nother service to ramp up on, which is fun but these days my time is limited. I did some research and found what I was looking for.

NginX has made me cream my pants every time I log onto the server since the day I installed it. It's fast, stable, fast, and amazing. Wow, I love it. Now I read that NginX can cache FastCGI requests based on response caching headers. So I set it up, modified the beeets api to send back some Cache-Control junk, and voilà...a %2800 speed boost on some of the more complicated functions in the API.

Here's the config I used:

# in http {}
fastcgi_cache_path /srv/tmp/cache/fastcgi_cache levels=1:2
                           keys_zone=php:16m
                           inactive=5m max_size=500m;
# after our normal fastcgi_* stuff in server {}
fastcgi_cache php;
fastcgi_cache_key $request_uri$request_body;
fastcgi_cache_valid any 1s;
fastcgi_pass_header Set-Cookie;
fastcgi_buffers 64 4k;

So we're giving it a 500mb cache. It says that any valid cache is saved for 1 second, but this gets overriden with the Cache-Control headers sent by PHP. I'm using $request_body in the cache key because in our API, the actual request is sent through like:

GET /events/tags/1 HTTP/1.1
Host: ...
{"page":1,"per_page":10}

The params are sent through the HTTP body even in a GET. Why? I spent a good amount of time trying to get the API to accept the params through the query string, but decided that adding $request_body to one line in an NginX config was easier that re-working the structure of the API. So far so good.

That's FastCGI acting as a reverse proxy cache. Ideally in our setup, HAproxy would be replaced by a reverse proxy cache like Varnish, and NginX would just stupidly forward requests to PHP like it was earlier today...but I like HAproxy. Having a health-checking load-balancer on every web server affords some interesting failover opportunities.

Anyway, hope this helps someone. NginX can be a caching reverse proxy. Maybe not the best, but sometimes, just sometimes,  simple > faster.