All these years, since the day I first turned on a linux distribution, I’ve ignored vi/vim. Sure, there are swarms of geeks covering you with saliva as they spew fact after fact about how superior vim is to everything else, but to me it’s always been “that editor that is on every system that I eventually replace with pico anyway.”

Not anymore. Starting a few years back, I’ve done all of my development in Eclipse. It has wonderful plugins for PHP, C++, JavaScript, etc. The past week or so I’ve been weaning myself off of it and diving into vim. What actually got me started is I bought a Droid 2 off eBay for various hacking projects (I’m planning on reviewing it soon). Well, it was really easy to get vim working on it (sorry, lost the link already). I thought, well, shit, I’ve got vim, what the hell can I do with it? First things first, let’s get a plugin for syntax coloring/indentation for a few of my favorite languages. What?! It has all of them already.

Ok, now I’m interested. I installed vim for Windows (gvim), which was followed by a slow-but-steady growing period of “well, how do I do this” and “HA…I bet vim can’t do THI…oh, it can.” There are “marks” for saving your place in code, you can open the same file in multiple views (aka “windows”), you can bind just about any key combination to run any command or set of commands, etc. I even discovered tonight there’s a “windows” mode for vim that mimics how any normal editor works. I hate to admit it, but I’ll be using that a lot. One feature that blew my mind is the undo tree. Not stack, tree. Make a change, undo, make a new change, and the first change you did before your undo is still accessible (:undolist)!

The nice thing about vim is that it saves none of its settings. Every change you make to it while inside the editor is lost after a restart. This sounds aggravating, but it actually makes playing with the editor really fun and easy. If I open 30 windows and don’t know how to close them, just restart the editor. There are literally hundreds of trillions of instances when I was like “oh, shit” *restart*.

Once you have a good idea of what you want your environment to be like, you put all your startup commands in .vimrc (_vimrc on Windows) and vim runs it before it loads. Your settings file uses the same syntax as the commands you run inline in the editor, which is awesome and makes it easy to remember how to actually use vim.

So far I’m extremely impressed. The makers of vim have literally thought of everything you could possibly want to do when coding. And if they haven’t thought of it, someone else has and has written a plugin you can drop into your plugins directory and it “just works.” Speaking of plugins, vim.org’s plugin list seems never-ending. I was half expecting to see most plugins have a final mod date of 2002 or something, but a good portion have newer versions released within the past two weeks. It seems the ones that are from 2002 never get updated because they’re mostly perfect. Excellent.

I do miss a few things though. First off, the project file list most editors have on the left side. I installed NERDTree to alleviate that pain, but honestly it’s not the same as having my right click menus and pretty icons. I’m slowly getting used to it though. The nice thing about a text-only file tree is that in those instances where you only have shell access and need to do some coding, there isn’t a dependency on a GUI.

Tabs are another thing I miss. Gvim has tabs, but they aren’t one tab == one file (aka “buffer”) like most editors. You can hack it to do this, sort of, but it’s really janky. Instead I’m using MiniBufExplorer, which takes away some of the pain. I actually hacked it a bit because I didn’t like the way it displays the tabs, which gave me a chance to look at some real vim script. It’s mostly readable to someone who’s never touched it before.

That about does it for my rant. Vim is fast, free, customizable, extendable, scriptable, portable, wonderful, etc…and I’ve barely scratched the surface.

Let me preface this by saying I know neither of the two people involved in this situation nor have any connection to them other than the fact that I use both Google and Twitter.

A Google tech writer recently accused a Twitter engineer of sexual assault on her blog, and given the responses shot at both sides (Noirin Shirley, accuser, and Florian Leibert, accused) I thought I’d inject my personal thoughts on both the actual report given by Noirin and the responses to the incident.

First off, it’s a big deal to make an accusation like this. Careers hang in the balance, and blah blah blah, we’ve all heard this already. That said, a lot of women are sexually assaulted and never mention it. A lot tell a few people and it never goes anywhere. A lot try to get help but it never comes.

I think it’s not only amazing, but brave that Noirin had the guts to stand up to her assaulter and accuse him in public. It takes brass balls to do this. It also takes brass balls to do this knowing full well the responses you’re going to get because of it. I’m not one to not take stands on things, so I will say I think she’s awesome. I’m sick of women getting pushed around and there being no consequences for the men doing it.

I also do know that women make false accusations, but in my experience the ones who do so have a history of doing so and don’t start off doing it later on in life.

Now, at least one publication is saying that although it’s great to be public about this matter, it’s not ok to be public about the assailant’s name. I have to disagree. So many assaults go unresolved because it’s hard to prove unless you have a police officer right there watching, or at least 10 witnesses. Something like this wouldn’t hold up in court. It’s important that the person who did it be publicly recognized for his actions, because otherwise there very well may be no consequences, ever.

A lot of people are saying that she should say absolutely nothing until the police investigate and the courts make a decision. I have to wonder if they are batshit insane. First off, the police generally have “more important” things to worry about than “hey sum guy jus touched my privatz,” unfortunately. And without any material evidence, it will never hold up in court. What I’m getting at is that even though I love our justice system here in good old USA, there are many things that will fall through the cracks. Does Noirin really need the police or court system to validate what she experienced that night? That’s fucking insane! She knows what happened better than the police or courts, and has every right to talk about it. Plus, she’s opening herself up to a world of legal trouble by doing this, which is just one more reason she’s brave for doing it (and one more incentive to NOT do it falsely).

Let me put it this way: If somebody assaults you, you have the right to fucking let the world know who did it and what happened! Just because it won’t hold up in court (and believe me, it won’t) doesn’t mean it didn’t happen, and doesn’t mean the assailant shouldn’t suffer the social consequences. If a rape happens in the woods and nobody is there to witness it, did it happen? The courts, rightfully so, say “No.” But it still happened, and the aggressor needs to pay for it in some way.

If she lied about it, then that’s another issue entirely. If it did happen, as she said it did, then good for her for letting the world know and making the world that much safer for women.

Either way, there are some very good counter-arguments and discussion on the reddit comments page for the post, which I spent a good amount of time reading before making this post.

Let’s talk failover. Most tools for failover (keepalived, heartbeat, wackamole/spread) use a protocol known as multicast. Multicast acts as a sort of “bulletin board” between computers. Anybody on the network can look at the bulletin board, and anybody on the network can post to the bulletin board. Normally, failover tools use multicast to pass messages between computers. For instance, you could have three computers on a network, all posting and listening to the same multicast group: “Hey, I’m alive!” If one of the machines stops sending this repetitive message, the others know that something is wrong…either it has been disconnected or gone down, etc. They can use that information to act: was that computer hosting a shared IP? Give the IP to one of the computers that are still responding. This is the general idea behind IP-based failover.
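To make the “bulletin board” idea concrete, here’s a minimal Python sketch of just the decision logic: who counts as alive, and who should hold the shared IP. There’s no networking in it, and the function names, host names, and timeout are all made up for illustration. Real tools are more involved (keepalived, for example, elects a master using VRRP priorities), so treat this as the general idea rather than how any of them actually work.

```python
from typing import Dict, List, Optional


def alive_hosts(last_seen: Dict[str, float], now: float, timeout: float) -> List[str]:
    """Return hosts whose latest "I'm alive!" message is at most `timeout` seconds old."""
    return sorted(host for host, seen in last_seen.items() if now - seen <= timeout)


def choose_ip_holder(
    last_seen: Dict[str, float],
    now: float,
    timeout: float,
    current_holder: Optional[str],
) -> Optional[str]:
    """Keep the shared IP where it is if its holder is still alive.

    Otherwise hand it to the first responding host (alphabetical order is an
    arbitrary tie-break chosen for this sketch). Returns None if nobody is alive.
    """
    alive = alive_hosts(last_seen, now, timeout)
    if current_holder in alive:
        return current_holder
    return alive[0] if alive else None


# Hypothetical cluster: web1 holds the shared IP but went quiet 10 seconds ago.
heartbeats = {"web1": 100.0, "web2": 109.0, "web3": 108.5}
new_holder = choose_ip_holder(heartbeats, now=110.0, timeout=3.0, current_holder="web1")
```

With those numbers, web1 has missed its heartbeats, so the shared IP moves to one of the two machines that are still talking.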

Now, there’s no inherent problem with multicast. It’s generally known for being unreliable, but when all you’re sending is “Hi!” over the wire, data integrity isn’t a high priority. The problem with multicast in reality is that most “cloud” (VPS) providers (AWS, Linode, Slicehost, Rackspace, etc) don’t support it on their networks. You can send a multicast message to a group, but your other machines listening on that group won’t hear it. The other problem with multicast is that the failover tools mentioned above ONLY support multicast. There is no way to tell them to listen to another machine directly over unicast, which is supported by cloud providers.

One way you can solve this is by using GRE tunnels, which let you create a tunnel to another computer by wrapping packets inside ordinary unicast packets (GRE only encapsulates traffic, it doesn’t encrypt it). This allows multicast communications to pass between two computers, even if the router blocks them normally.

I recently tried to get this set up on my current host, Linode. I was not successful, even with the help of another member who had the same problem (but solved it with GRE). I just could not get two machines to talk to each other over a GRE tunnel with keepalived.

The solution

I posted my question to serverfault.com as a last resort (video). I’d asked more or less the same question there before, but didn’t get the answer I wanted. This time, I hit the jackpot though.

Willy Tarreau, creator of HAProxy, responded with a patch to keepalived that allows it to communicate over unicast. I applied it, recompiled, set up the new options the patch gives (“vrrp_unicast_bind” and “vrrp_unicast_peer”), and spun it up on both machines.

Yesss!! It works! Stopping HAProxy on the first server made the second machine take the shared IP.

Now, ideally there would be a bunch of machines, namely all my web servers, standing by ready to take the shared IP. This patch only allows me two machines. Failover is failover though: when one instance goes down, I get an email and can go in and investigate. I’d still like to know if there is a way to do failover on a cluster of servers without multicast, but for now this works great.

We were writing some parsing code for a client today. It takes a long string (html) and parses it out into array items. It loops over the string recursively, running a few preg_replaces on it every pass. We got “out of memory” errors when running it. After putting in some general stats, we found that memory usage was climbing 400k after each block of preg_replaces, which was being added on each loop (there were around 600 loops or so). This memory just grew and grew, even though the recursion at most got 6 levels deep. It was never being released.

I did some reading and found that the preg* functions cache up to 4096 regex results in a request. This is the problem…a pretty stupid one too. It would be nice if they made this a configurable option or at least let you turn it off when, say, you are running a regex on a different string every time (why the hell would I run the same regex on the same string twice…isn’t that what variables are for?) Unless I’m misunderstanding and PHP caches the compiled regex (but not its values)…but either way, memory was climbing based on the length of the string.

Since the regex was only looking at the beginning of the string and disregarding the rest (thank god), the fix was easy (although a bit of a hack):

$val = preg_replace('/.../', '', $long_string);

Becomes:

$short_string = substr($long_string, 0, 128);
$val = preg_replace('/.../', '', $short_string);

PHP guys: how about an option to make preg* NOT have memory leaks =).

After reading an article about how the number of phone calls made is decreasing, I feel I have to interject something. This obviously shouldn’t be news to most people, because most of us are right in the middle of it (in North America, anyway). The fact is that people are talking less and less in favor of texting each other. While this is an interesting shift in our culture, I’m starting to think things are going a bit too far.

It seems that since widespread adoption of the internet, although more and more people have become seemingly connected through social networking and other mediums, people are drifting further and further apart. A friend is no longer a friend. A real friend is now what a friend was, and a friend is someone you say “damn we haven’t talked in years, how r u?” to. Communities are popping up everywhere online that replace the communities around us physically.

This in itself I don’t feel is bad. A lot of people who never would have met are meeting and sharing new ideas. Information spreads more rapidly. Cultural consciousness is more global, which in most cases is a very good thing.

I think things start to go wrong when people get addicted to this information overload though. They use it as a fuel for everyday distraction, a replacement for the communities they live in, and a tool to deliver opinions and beliefs to them when they would have otherwise had to think (although this last item is true of most media).

Also, it’s one thing to not be in front of someone when you talk to them. A voice conversation can have emotion and depth, but it can also be quick and effortless. The fact that it’s being replaced by one-off messages that are 100% ignorable and have no real content to them is kind of sickening. I’ve heard arguments that “I text someone when it doesn’t make sense to have a whole conversation,” but I’ll see the same person texting back and forth with someone for 10 minutes straight. Or a text is delivered and the person who sent it squirms in anticipation for the reply, which may never come.

What’s wrong with a phone call? Granted, if you’re in a bar and it’s very loud, texting would be appropriate. If you call someone and they don’t pick up, either they don’t want to talk or, god forbid, they aren’t right next to their phone all times of the day. If you want to talk to someone, just call them. I don’t believe texting is a viable replacement for what was the last string of human contact we had.

That all said, I know it’s a giant ball and it rolls where it rolls and there’s no stopping it. There’s no problem with being aware of things that are going on around us though. I feel like each time a real connection between two real people is replaced with something artificial, our culture as a whole goes just a little bit more insane. I’m interested to see how this all pans out, mainly because I don’t have a whole lot of attachment to what our culture is now.

In my latest frenzy, which was focused on HA more than performance, I installed some new servers, new services on those servers, and the general complexity of the entire setup for beeets.com doubled. I was trying to remember a utility that I saw a while back that would restart services if they failed. I checked my delicious account, praying that I had thought of my future self when I originally saw it. Luckily, I had saved it under my “linux” tag. Thanks, Andrew from the past.

The tool is called monit, and I’m surprised I ever lived without it. Not only does it monitor your services and keep them running, it can restart them if they fail, use too much memory/cpu, stop responding on a certain port, etc. Not only that, but it will email you every time something happens.

While perusing monit’s site, I saw M/Monit which allows you to monitor monit over web, essentially. The only thing I scratched my head about was that M/Monit uses port 8080 (which is fine) but NginX already uses port 8080, and I wasn’t about to change that, so I opened conf/server.xml and looked for 8080, replaced with 8082 (monit runs on 8081 =)). Then I reconfigured monit to communicate with M/Monit and vice versa, and now I have a kickass process monitor that alerts me when things go wrong, and also sends updates to a service that allows me to monitor the monitor.

I can’t look at things like queries/sec as I can with Cacti (which is awesome but a little clunky) but I can see which important services are running on each of my servers, and even restart them if I need to straight from M/Monit. The free download license allows you to use M/Monit on one server, which is all I need anyway.

Great job monit team, you have gone above and beyond.

I decided this weekend I wanted to go down the road of trying out MySQL Cluster for beeets.com. The reason isn’t speed, it’s availability. After countless hours of research, I decided I’d rather have a plate of turds for breakfast than have to worry about Master-Master replication (or DRBD) w/heartbeat, not to mention what to do when things get out of sync. Not my cup of tea. MySQL Cluster may be a bit slower than a replicated setup (in almost all cases except for primary key lookup, I suspect), but to me it’s worth it to have a more set-it and forget-it approach. There are many benefits of cluster over replication:

  • Any server can go down. Assuming you have more than one replica of your data, you can lose any server in your setup and still be up and running. This can be achieved with replication, but it’s not as easy. You have to have some form of Master-Master replication, perhaps with DRBD, and some form of failover (usually heartbeat).
  • Your data set scales. If you start running out of disk space with a cluster, just add a few more data nodes and your data will be spread out over them. With replication, each replicated server has to have enough storage to fit the entire database. That means if your dataset grows too large, you have to either partition (a hack, essentially) or upgrade your servers.
  • Your bandwidth scales. With a cluster, if you are running out of bandwidth, you can add more mysqld processes on your www servers or add more data nodes and your bandwidth scales almost linearly. With replication, you can only add so many slaves before your writes are the bottleneck. Then, once again, you have to look into things like circular replication (dangerous) or partitioning your data set (large updates to your app unless you have an insanely good ORM, big infrastructure change).

These are the main points that helped me decide. Historically, with a clustered approach, the entire dataset would have to fit in the memory of all the data nodes, which is somewhat restrictive if the dataset gets too large. Nowadays, the cluster only needs to store indexes in memory, and can store all non-indexed data on disk. There is talk of having a completely disk-based store as well.

All that being said, I set up cluster, which was surprisingly easy. I’m not going to go over how to set it up or anything, just read the manual. After some benchmarking with the web API for beeets.com, the cluster setup appeared to be running about the same speed as the InnoDB setup when testing various commands…a pleasant surprise. It also appeared to handle concurrency a bit better.

Obviously once the dataset grows past a few megs and the traffic bumps up, we’ll revisit the benchmarking, but my hope is that what cluster loses in speed on your everyday general query, it gains back by handling higher concurrency.

This weekend I went on a frenzy. I turned beeets.com from a single VPS enterprise to 4 VPSs: 2 web (haproxy, nginx, php-fpm, sphinx, memcached, ndb_mgmd) and 2 database servers (ndbmtd). There’s still some work to do, but the entire setup seems to be functioning well.

I had a few problems though. In PHP (just PHP, and nothing else) hosts were not resolving. The linux OS was resolving hosts just fine, but PHP couldn’t. It was frustrating. Also, I was unable to sudo. I kept checking permissions on all my files in /etc, rebooting, checking again, etc.

The fix

Then I looked again. /etc itself was owned by andrew:users. Huh? I changed ownership back to root:root, chmod 755. Everything works. Now some background.

A while back, I wrote some software (bash + php) that makes it insanely easy to install software to several servers at once, and sync configurations for different sets of servers. It’s called “ssync.” It’s not ready for release yet, but I can say without it, I’d have about 10% of the work done that I’d finished already. Ssync is a command-line utility that lets you set up servers (host, internal ip, external ip) and create groups. Each group has a set of install scripts and configuration files that can be synced to /etc. The configuration files are PHP scriptable, so instead of, say, adding all my hosts by hand to the /etc/hosts file, I can just loop over all servers in the group and add them automatically. Same with my www group, I can add a server to the “www” group in ssync, and all of a sudden the HAproxy config knows about the server.
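To give a rough idea of what “scriptable config files” buys you, here’s a small sketch of generating /etc/hosts entries from a list of servers. It’s in Python rather than PHP, it is not ssync’s actual code, and the function name, server names, and IPs are all invented for the example.

```python
from typing import Iterable, Tuple


def render_hosts(servers: Iterable[Tuple[str, str]]) -> str:
    """Build /etc/hosts content from (hostname, internal_ip) pairs.

    The localhost line is always emitted first; every server in the group
    then gets one "ip<TAB>hostname" line.
    """
    lines = ["127.0.0.1\tlocalhost"]
    for hostname, internal_ip in servers:
        lines.append(f"{internal_ip}\t{hostname}")
    return "\n".join(lines) + "\n"


# Hypothetical group: adding a server here is all it takes to update every hosts file.
group = [("web1", "10.0.0.1"), ("web2", "10.0.0.2"), ("db1", "10.0.0.3")]
hosts_file = render_hosts(group)
```

The point is that the server list lives in one place, and every generated config picks up a new machine the next time it’s synced.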

Here’s the problem. When ssync was sending configuration files to /etc on remote servers, it was also setting permissions on those files (and folders) by default. This was because I was using -vaz, which attempts to preserve ownership, group, and permissions from the source (not good). I added some new params (so now it’s “-vaz --no-p --no-g --no-o”). Completely fixed it.

A while back I wrote a post about using NginX as a reverse-proxy cache for PHP (or whatever your backend is) and mentioned how I was using HAProxy to load balance. The main author of HAProxy wrote a comment about keep-alive support and how it would make things faster.

At the time, I thought “What’s the point of keep-alive for front-end? By the time the user navigates to the next page of your site, the timeout has expired, meaning a connection was left open for nothing.” This assumed that a user downloads the HTML for a site, and doesn’t download anything else until their next page request. I forgot about how some websites actually have things other than HTML, namely images, CSS, javascript, etc.

Well in a recent “omg I want everything 2x faster” frenzy, I decided for once to focus on the front-end. On beeets, we’re already using S3 with CloudFront (a CDN), aggressive HTTP caching, etc. I decided to try the latest HAProxy (1.4.4) with keep-alive.

I got it, compiled it, reconfigured:

defaults
	...
	option httpclose

became:

defaults
	...
	timeout client  5000
	option http-server-close

Easy enough…that tells HAProxy to close the server-side connection, but leave the client connection open for 5 seconds.

Well, a quick test and site load times were down by a little less than half…from about 1.1s client load time (empty cache) to 0.6s. An almost instant benefit. How does this work?

Normally, your browser hits the site. It requests /page.html, and the server says “here u go, lol” and closes the connection. Your browser reads page.html and says “hay wait, I need site.css too.” It opens a new connection and the web server hands the browser site.css and closes the connection. The browser then says “darn, I need omfg.js.” It opens another connection, and the server rolls its eyes, sighs, and hands it omfg.js.

That’s three connections, with high latency each, your browser made to the server. Connection latency is something that, no matter how hard you try, you cannot control…and there is a certain amount of latency for each of the connections your browser opens. Let’s say you have a connection latency of 200ms (not uncommon)…that’s 600ms you just waited to load a very minimal HTML page.

There is hope though…instead of trying to lower latency, you can open fewer connections. This is where keep-alive comes in.

With the new version of HAProxy, your browser says “hai, give me /page.html, but keep the connection open plz!” The web server hands over page.html and holds the connection open. The browser reads all the files it needs from page.html (site.css and omfg.js) and requests them over the connection that’s already open. The server keeps this connection open until the client closes it or until the timeout is reached (5 seconds, using the above config). In this case, the latency is a little over 200ms, and the total time to load the page is 200ms + the download time of the files (usually less than the latency).

So with keep-alive, you just turned a 650ms page-load time into a 250ms page-load time… a much larger margin than any sort of back-end tweaking you can do. Keep in mind most servers already support keep-alive…but I’m compelled to write about it because I use HAProxy and it’s now fully implemented.
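Here’s that back-of-the-napkin math as a few lines of Python. It’s a deliberately naive model: every connection pays the full latency once, and the 50ms download time is my assumption, picked so the totals match the 650ms and 250ms figures above.

```python
def page_load_ms(connections: int, latency_ms: int, download_ms: int) -> int:
    """Naive page-load estimate: one latency hit per connection, plus download time."""
    return connections * latency_ms + download_ms


LATENCY_MS = 200   # connection latency from the example above
DOWNLOAD_MS = 50   # assumed total download time for the three files

# page.html, site.css and omfg.js each get their own connection...
without_keepalive = page_load_ms(3, LATENCY_MS, DOWNLOAD_MS)

# ...versus all three sharing one kept-alive connection.
with_keepalive = page_load_ms(1, LATENCY_MS, DOWNLOAD_MS)

saved_ms = without_keepalive - with_keepalive
```

The savings come entirely from the two connections you never had to open, which is why they grow with every extra file a page pulls in.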

Also keep in mind that the above scenario isn’t necessarily correct. Most browsers will open up to 6 concurrent connections to a single domain when loading a page, but you also have to factor in the fact that the browser blocks downloads when it encounters a javascript include, and then attempts to download and run the javascript before continuing the page load.

So although your connection latency with multiple requests goes down with keep-alive, you won’t get a 300% speed boost, more likely a 100% speed boost depending on how many scripts are loading in your page along with any other elements…100% is a LOT though.

So for most of us webmasters, keep-alive is a wonderful thing (assuming it has sane limits and timeouts). It can really save a lot of page load time on the front-end, which is where users spend the most of their time waiting. But if you happen to have a website that’s only HTML, keep-alive won’t do you much good =).

Recently I’ve been working on speeding up the homepage of beeets.com. Most speed tests say it takes between 4-6 seconds. Obviously, all of them are somehow fatally flawed. I digress, though.

Everyone (who’s anyone) knows that gzipping your content is a great way to reduce download time for your users. It can cut the size of html, css, and javascript by about 60-90%. Everyone also knows that gzipping can be very cpu intensive. Not anymore.

I just installed nginx’s Gzip Static Module (compile nginx with --with-http_gzip_static_module) on beeets.com. It allows you to pre-cache your gzip files. What?

Let’s say you have the file /css/beeets.css. When a request for beeets.css comes through, the static gzip module will look for /css/beeets.css.gz. If it finds it, it will serve that file as gzipped content. This allows you to gzip your static files using the highest compression ratio (gzip -9) when deploying your site. Nginx then has absolutely no work to do besides serving the static gzip file (it’s very good at serving static content).

Wherever you have a gzip section in your nginx config, you can do:

gzip_static on;

That’s it. Note that you will have to create the .gz versions of the files yourself, and it’s mentioned in the docs that it’s better if the original and the .gz files have the same timestamp; so it may be a good idea to “touch” the files after both are created. It’s also a good idea to turn the gzip compression down (gzip_comp_level 1..3). This will minimally compress dynamic content without putting too much strain on the server.
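If you’d rather script the “create the .gz files yourself” step, here’s a minimal Python sketch of a deploy-time helper. It isn’t part of nginx. The function name and the list of file extensions are my own assumptions, so adjust them to whatever your site actually serves. It compresses at level 9 and copies the original’s timestamp onto the .gz file, which takes care of the “touch” step.

```python
import gzip
import os
import shutil
from pathlib import Path
from typing import List

# Assumed set of static file types worth pre-compressing.
STATIC_SUFFIXES = {".css", ".js", ".html"}


def precompress(root: str) -> List[Path]:
    """Write a maximum-compression .gz copy next to each static file under `root`.

    Each .gz file is given the same access/modification times as its original.
    Returns the list of .gz paths that were written.
    """
    written = []
    # sorted() snapshots the file list up front, so new .gz files are not revisited.
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in STATIC_SUFFIXES:
            continue
        gz_path = path.with_name(path.name + ".gz")
        with open(path, "rb") as src, gzip.open(gz_path, "wb", compresslevel=9) as dst:
            shutil.copyfileobj(src, dst)
        stat = path.stat()
        os.utime(gz_path, ns=(stat.st_atime_ns, stat.st_mtime_ns))
        written.append(gz_path)
    return written
```

Run it against your static directory as the last step of a deploy, after any minifying or concatenating, so the .gz files always match what’s on disk.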

This is a great way to get the best of both worlds: gzipping (faster downloads) without the extra load on the server. Once again, nginx pulls through as the best thing since multi-cellular life. Keep in mind that this only works on static content (css, javascript, etc etc). Dynamic pages can and should be gzipped, but with a lower compression ratio to keep load off the server.