We were writing some parsing code for a client today. It takes a long string (HTML) and parses it out into array items. It loops over the string recursively, running a few preg_replaces on it every pass. We got “out of memory” errors when running it. After putting in some general stats, we found that memory usage was climbing by about 400k after each block of preg_replaces, and that happened on every loop (there were around 600 loops or so). This memory just grew and grew, even though the recursion at most got 6 levels deep. It was never being released.

I did some reading and found that the preg* functions keep a cache of up to 4096 regexes. This looks like the problem…a pretty stupid one too. It would be nice if they made this a configurable option or at least let you turn it off when, say, you are running a regex on a different string every time (why the hell would I run the same regex on the same string twice…isn’t that what variables are for?). As far as I can tell, what PHP caches is the compiled regex and not its results…but either way, memory was climbing based on the length of the string.

Since the regex was only looking at the beginning of the string and disregarding the rest (thank god), the fix was easy (although a bit of a hack):

$val = preg_replace('/.../', '', $long_string);

Becomes:

$short_string = substr($long_string, 0, 128);
$val = preg_replace('/.../', '', $short_string);
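If you want to check this on your own code, here is a rough sketch of how I'd measure it. The pattern and the input are made up (the real regex is elided above), and the last line assumes you want the untouched remainder of the string glued back on, which the original fix did not need.

```php
<?php
// Hypothetical pattern standing in for the elided '/.../' above:
// it strips one leading HTML comment.
$pattern = '/^\s*<!--.*?-->/s';

// Hypothetical long input: a 15-byte comment followed by lots of markup.
$long_string = '<!-- header -->' . str_repeat('<p>body</p>', 10000);

$before = memory_get_usage();

// Only hand the regex engine the first 128 bytes of the input.
$short_string = substr($long_string, 0, 128);
$val = preg_replace($pattern, '', $short_string);

$after = memory_get_usage();
printf("memory delta around preg_replace: %d bytes\n", $after - $before);

// Assumption: if the rest of the string is still needed, append everything
// past byte 128 unchanged.
$rebuilt = $val . substr($long_string, 128);
```

This only works when the match is guaranteed to sit inside the first 128 bytes. If the pattern could match across that boundary, the cutoff has to grow.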

PHP guys: how about an option to make preg* NOT have memory leaks =).

UPDATE – Apparently closing the VM, unplugging the programmer, unselecting the programmer from the USB device menu, or pausing the VM after the programmer has been loaded by the VM makes Windows 7 bluescreen. So far I have not found a way around this, so the TOP2004 is effectively useless again. At least it’s able to program chips and stuff, but once loaded, the VM has to stay open and has to be running. Pretty lame. I’ll try to find a fix and update (BTW I’m using the latest VirtualBox as of this writing). Any ideas?


I love electronics. Building basic circuits, programming microcontrollers, making malicious self-replicating robots programmed to hate humans, and even so much as wiring up complete motherboards with old processors and LCDs. I had to find a USB flash/eeprom programmer that fit my hardcore lifestyle. On ebay a few years back, I bought the TOP2004. This wondrous piece of Chinese equipment is cheap, cheap, and USB. I needed USB because in the process of making my own flash programmer a while back, I destroyed half the pins on my parallel port. That programmer worked great, but only worked for one chip. I needed something a bit more versatile. The TOP2004 isn’t a bad piece of equipment. The manual was translated poorly from Chinese, as was the software that comes with it.

Well, for the longest time, I was a Windows XP guy. Nowadays it’s all about Windows 7. Don’t get me wrong, I’m Slackware through and through, but I need my gaming. So I installed 64-bit Windows and love it, but my programmer no longer works.

Requirements: a 64-bit OS that doesn’t let you use 32-bit drivers (namely Windows 7 x64), a 32-bit version of Windows lying around, virtualization software (check out VirtualBox) which is running your 32-bit version of Windows, a TOP2004 programmer, DSEO, and the infwizard utility with libusb drivers (virus free, I promise).

Here’s the fix:

  1. I remembered when jailbreaking my iPod a while back with Quickfreedom that there was a utility used to sniff out USB devices called infwizard, which I believe is part of the libusb package. I never liked libusb because I remember it royally messing up my computer, but the infwizard program was dandy. It can write very simple drivers for USB devices without any prior knowledge of what they are. I used this with the programmer plugged in to create a makeshift driver. Note: Make sure the libusb* files in that zip provided are in the same directory as the .INF file you create for the programmer.
  2. 64-bit Windows doesn’t like you to load unsigned drivers. In fact, it doesn’t allow it at all. You have to download a utility called DSEO (Driver Signature Enforcement Override) to convince Windows that it should let you load the driver you just created.
  3. Once you turn driver enforcement off and load up the driver, you should be able to see your TOP programmer in the device list. Boot your VM, which previously couldn’t use the programmer (because the host had no driver for it), and install v2.52 of the TopWin software. Once installed, you should be able to select the TOP2004 from the USB device list, and voilà…your programmer works.

Obviously running it in a VM is less than ideal, but it’s better than dropping $200 on a real programmer that might actually have 64-bit support. The great part about this version (2.52) of the TopWin software is that it supports the ATmega168, which is almost exactly the same as the ATmega328…meaning Arduino fans new and old can use it. I’m not an Arduino guy and use the chip just by itself with avr-gcc, but you can do whatever the hell you want once you get the TOP programmer working.

Capistrano is a sexy bitch. At least it was until I spent hours trying to figure out how to deploy to multiple servers. Updated Cap, updated Ruby, compiled Ruby from source twice, etc etc. Capistrano just kept hanging when pushing code to two or more servers at once. Note that I am in Cygwin, if that makes a difference. Also, when deploying with no password on my ssh key, it works…hmm.

Well I added this:

default_run_options[:max_hosts] = 1

To my deploy.rb, and although it now has to deploy to one server at a time, it works. Note that for two servers this is fine. For 200 it’s not so fine. I’ll worry about that when it comes though.

UPDATE!!!! Something I never thought about until now is that you can use ssh-agent to load your keys into memory before deploying. Then you have a password-protected key that works with Capistrano WITHOUT doing the max_hosts hack. This is tested (on Cygwin) and working for me.

This will be a short post, but pretty cool.

You can add arrays together:

	$test1	=	array('name' => 'andrew');
	$test2	=	array('status' => 'totally gnar, dude');

	print_r($test1 + $test2);
	---------------------------
	Array
	(
	    [name] => andrew
	    [status] => totally gnar, dude
	)
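One catch worth knowing before you lean on this: `+` is a union, not a merge. When both arrays have the same key, the left one wins. A quick sketch with made-up arrays ($defaults, $user, $a and $b are just for illustration) next to array_merge for comparison:

```php
<?php
$defaults = array('name' => 'anonymous', 'status' => 'offline');
$user     = array('name' => 'andrew');

// Union: keys from the left operand win, missing keys come from the right.
$union = $user + $defaults;
// array('name' => 'andrew', 'status' => 'offline')

// array_merge: for string keys, later arrays overwrite earlier ones.
$merged = array_merge($user, $defaults);
// array('name' => 'anonymous', 'status' => 'offline')

// Numeric keys collide too: both arrays have index 0, so 'z' is dropped.
$a = array('x', 'y');
$b = array('z');
$numeric_union = $a + $b;              // array('x', 'y')

// array_merge renumbers numeric keys and appends instead.
$numeric_merge = array_merge($a, $b);  // array('x', 'y', 'z')

print_r($union);
print_r($numeric_union);
```

So `$user + $defaults` is a handy way to fill in default values, while lists with numeric keys usually want array_merge.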

Wow…who would have thought. And my most recent favorite, converting objects to arrays. It’s a simple foreach($object as $key => $val) and putting each element into a separate array, right? WRONG:

	$array	=	(array)$object;

No fucking way. Casting actually works in this case. Why does nobody tell me anything?! This is great for parsing XML because any parser normally returns an object, and quite honestly, I hate dealing with objects. Database data usually comes back as an array by default, and it’s a pain having some data sources being objects while others are arrays. Now it doesn’t matter…if you like objects, cast an associative array as an (object), if you like arrays cast with (array). I love PHP…
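A small sketch of the round trip, using a plain stdClass with made-up properties. Two things to watch: the cast is shallow (nested objects stay objects), and private or protected properties come out with mangled keys, so this is cleanest on plain data objects. The JSON round trip at the end is my own workaround for nested data, not something the cast does for you.

```php
<?php
$object = new stdClass();
$object->name = 'andrew';
$object->status = 'totally gnar, dude';

// Public properties become array keys.
$array = (array)$object;

// And back again: keys become properties on a new stdClass.
$back = (object)$array;

// The cast is shallow: a nested object is still an object afterwards.
$object->meta = new stdClass();
$object->meta->tags = array('php');
$shallow = (array)$object;
var_dump(is_object($shallow['meta']));

// Assumption: the data is JSON-serializable. Encoding and decoding with
// the associative flag converts nested objects to arrays as well.
$deep = json_decode(json_encode($object), true);
var_dump(is_array($deep['meta']));
```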