This weekend I went on a frenzy. I turned beeets.com from a single-VPS enterprise into 4 VPSs: 2 web servers (haproxy, nginx, php-fpm, sphinx, memcached, ndb_mgmd) and 2 database servers (ndbmtd). There’s still some work to do, but the entire setup seems to be functioning well.

I had a few problems, though. In PHP (just PHP, and nothing else), hostnames were not resolving. The Linux OS itself was resolving hosts just fine, but PHP couldn’t. It was frustrating. Also, I was unable to sudo. I kept checking permissions on all my files in /etc, rebooting, checking again, and so on.

The fix

Then I looked again. /etc itself was owned by andrew:users. Huh? I changed the ownership back to root:root and ran chmod 755 on it. Everything works. Now some background.
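In hindsight, a check like the one below would have saved me a few reboots. It’s only a minimal sketch: it assumes GNU stat (the `-c` format option, as found on Linux), and `describe_path` and `check_path` are names I made up for this example. The repair itself has to run as root.

```shell
#!/bin/sh
# Print "owner:group mode" for a path, e.g. "root:root 755".
# Assumes GNU coreutils stat (the -c option).
describe_path() {
    stat -c '%U:%G %a' "$1"
}

# Compare a path against an expected "owner:group mode" string.
# Returns 0 on a match and 1 otherwise.
check_path() {
    path=$1
    expected=$2
    actual=$(describe_path "$path") || return 1
    if [ "$actual" = "$expected" ]; then
        echo "ok: $path is $actual"
    else
        echo "MISMATCH: $path is $actual, expected $expected"
        return 1
    fi
}

# Usage (as root):
#   check_path /etc "root:root 755" || { chown root:root /etc && chmod 755 /etc; }
```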

A while back, I wrote some software (bash + php) that makes it insanely easy to install software on several servers at once, and to sync configurations for different sets of servers. It’s called “ssync.” It’s not ready for release yet, but I can say that without it, I’d have finished only about 10% of the work I’ve done so far. Ssync is a command-line utility that lets you set up servers (host, internal ip, external ip) and create groups. Each group has a set of install scripts and configuration files that can be synced to /etc. The configuration files are PHP scriptable, so instead of, say, adding all my hosts by hand to the /etc/hosts file, I can just loop over all servers in the group and add them automatically. Same with my www group: I can add a server to the “www” group in ssync, and all of a sudden the HAProxy config knows about the server.
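To give a rough idea of what “scriptable configuration” means, here is a sketch of the same loop-over-servers idea. Note that it is plain shell, not ssync’s actual PHP templates, and everything in it is an assumption for illustration: the inventory format, the server names and addresses, the function names, and port 80 in the HAProxy lines.

```shell
#!/bin/sh
# Hypothetical inventory: one server per line as
# "host internal_ip external_ip group" (made-up names and addresses).
inventory() {
    cat <<'EOF'
web1 10.0.0.1 203.0.113.1 www
web2 10.0.0.2 203.0.113.2 www
db1 10.0.0.3 203.0.113.3 db
db2 10.0.0.4 203.0.113.4 db
EOF
}

# Emit one /etc/hosts entry (internal IP, tab, host name) per server on stdin.
render_hosts() {
    while read -r host internal external group; do
        [ -n "$host" ] || continue
        printf '%s\t%s\n' "$internal" "$host"
    done
}

# Emit HAProxy "server" lines for the members of one group.
# Port 80 is an assumption for this example.
render_haproxy_servers() {
    wanted=$1
    while read -r host internal external group; do
        [ "$group" = "$wanted" ] || continue
        printf '    server %s %s:80 check\n' "$host" "$internal"
    done
}

inventory | render_hosts
inventory | render_haproxy_servers www
```

Adding a line to the inventory is then enough for both generated files to pick up the new server the next time they are rendered.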

Here’s the problem. When ssync was sending configuration files to /etc on remote servers, it was also setting ownership and permissions on those files (and folders) by default, which is how /etc itself ended up owned by andrew:users. This was because I was using -vaz, which attempts to preserve owner, group, and permissions from the source (not good). I added some new params (so now it’s “-vaz --no-p --no-g --no-o”). That completely fixed it.
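As a sketch of what that looks like in practice, assuming the transfer is done with rsync (which is where the -vaz style flags come from): `sync_config` and the paths in the usage comment are hypothetical, and this is not ssync’s real code.

```shell
#!/bin/sh
# Archive mode, minus permission (-p), group (-g), and owner (-o) preservation.
RSYNC_FLAGS="-vaz --no-p --no-g --no-o"

# Copy the contents of a local config tree into a destination directory.
# Existing files and directories on the destination keep their owner,
# group, and permissions.
sync_config() {
    src=$1
    dest=$2
    # RSYNC_FLAGS is left unquoted on purpose so it splits into separate flags.
    # The trailing slash on the source copies its contents, not the directory.
    rsync $RSYNC_FLAGS "$src/" "$dest/"
}

# Usage (hypothetical paths and host):
#   sync_config ./groups/www/etc root@web1:/etc
```

With plain -vaz, the destination directory takes on the owner and mode of the source directory, which is the kind of surprise you don’t want happening to /etc.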

If you’re experiencing LVM snapshot problems, namely:

  LV system/rootsnap in use: not deactivating
  Couldn't deactivate new snapshot

It may be related to udev. In some kernel versions (from what I’ve been reading here and there), there’s a race condition between udev (the thing that makes /dev tick) and LVM. LVM creates a snapshot volume, udev grabs it, and LVM loses control. You can fix this by editing your udev rules under /etc/udev/rules.d (keep in mind this fix was made on Slackware 12):

  1. Open /etc/udev/rules.d/50-udev.rules
  2. Find the line with `LABEL="persistent_input_end"` and, after this line, add:
     KERNEL=="dm-[0-9]*", OPTIONS+="ignore_device"
  3. Restart
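If you’d rather not edit the file by hand, the steps above can be sketched as a small script. Treat it as a sketch only: `add_dm_rule` is a name I made up, it assumes the label line appears in the file exactly as shown, and it only prints the modified rules so you can look them over before replacing anything.

```shell
#!/bin/sh
# The rule to add, as given in the steps above.
RULE='KERNEL=="dm-[0-9]*", OPTIONS+="ignore_device"'

# Print the rules file with RULE inserted right after the
# persistent_input_end label. The input file is not modified.
add_dm_rule() {
    file=$1
    if grep -qF -- "$RULE" "$file"; then
        cat "$file"    # rule already present: pass the file through unchanged
        return 0
    fi
    awk -v rule="$RULE" '
        { print }
        /LABEL="persistent_input_end"/ { print rule }
    ' "$file"
}

# Usage (as root; keeps a backup copy first):
#   cp /etc/udev/rules.d/50-udev.rules /root/50-udev.rules.bak
#   add_dm_rule /root/50-udev.rules.bak > /etc/udev/rules.d/50-udev.rules
```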

That should fix it. Please keep in mind this is for Slackware 12, and even then it may not work. If you really want to solve all your problems, please download and use Slack 13 :).