Site Update and Software Publication

Just updated my software page to bring it a little more up to date with what I’ve been doing in the past few years. It’s funny to see how my coding style has evolved over that time. Some of the things I used to show off were almost totally procedural; even some of the things I’m still linking to (but which I wrote two or three years ago) make very little use of functional programming or anonymous functions in JavaScript.

It’s really tempting to completely refactor some of that stuff, but I haven’t got the time. I actually have at least one more software project that I should really package up and publish — a thing that might be helpful for system administrators who are being annoyed by SSH dictionary attacks.

Writing something that works on your own machine is one thing. Making a full release? That’s quite different! It needs documentation, and that documentation needs to explain all the things that you “just know”. It needs some kind of install mechanism, or else it needs to explain how to install the stuff manually — and you need to make absolutely sure that you haven’t forgotten any of the things you wrote, changed, or tweaked while you were making it work!

Workaround for PEAR/PECL Failure with Message “ERROR: `phpize’ failed”

When you try to upgrade or install various PEAR (or PECL) packages, you may get the rather unhelpful error message “ERROR: `phpize’ failed”. For example, here’s the result I get when I try to install the pecl_http package:

root@finrod:~# pecl install pecl_http
pecl/pecl_http can optionally use PHP extension "iconv"
downloading pecl_http-1.6.3.tar ...
Starting to download pecl_http-1.6.3.tar (Unknown size)
................................................................................
................................................................................
.........................................done: 1,015,808 bytes
71 source files, building
running: phpize
Configuring for:
PHP Api Version: 20041225
Zend Module Api No: 20060613
Zend Extension Api No: 220060519
ERROR: `phpize' failed
root@finrod:~#

The error is actually caused by a bug in PHP itself (filed in PHP’s bug database back in 2004, and currently marked “Won’t fix”): if your installation of PHP was compiled with the --enable-sigchild flag, then the return value from a pclose() call can’t be trusted. One of PEAR’s components, PEAR::Builder, uses pclose() during package installation to determine whether a given operation succeeded.

Even though the operation succeeds, pclose() returns -1, signaling a failure, and the rest of PEAR then takes pclose() at its word.

Is This Affecting Your Installation of PHP and PEAR?

If you’ve gotten an “ERROR: `phpize’ failed” message when trying to run a “pecl install” or “pear install” command, try running phpinfo() (or “php -i” from a shell) — if you see --enable-sigchild in the “Configure Command” section near the top, then you’re most likely being bitten by this bug.

Potential Fixes and Workarounds

The PHP dev team recommends recompiling without the offending flag.

However, you may not be able to do that, for any of various reasons. (You may have installed from a binary package, for instance — like most people these days.) Or it may simply seem like an excessive hassle. I offer the following patch as-is, without any guarantee or support.

First, ensure that you have the latest version of PEAR::Builder. Look in your PEAR/Builder.php file — on most Linux and Unix installations, this is likely to be in /usr/lib/php/PEAR/Builder.php, or possibly /usr/local/lib/php/PEAR/Builder.php.

On Windows systems, PHP might be installed nearly anywhere, but supposing it’s in c:\php, then the file you’re looking for will be in c:\php\PEAR\PEAR\Builder.php (yes, that’s two PEARs in a row).

Check the “@version” line in the big comment block at the beginning of the file; the line you want should be around line 19 or so. If it says the version is less than 1.38 (the latest one, at the time I’m writing this post), then try upgrading; running “pear upgrade pear” should work. Then you can apply this patch file:

patch-pear-builder-1.38.txt

Download the patch file and place it somewhere on your machine. Log in and cd to the PEAR directory that contains the Builder.php file. Then run the patch command. In the following example, I’ve placed the patch file in root’s home directory:

root@finrod:~# ls
loadlin16c.txt loadlin16c.zip patch-pear-builder-1.38.txt
root@finrod:~# cd /usr/lib/php/PEAR
root@finrod:/usr/lib/php/PEAR# cp Builder.php Builder.bak.php
root@finrod:/usr/lib/php/PEAR# patch -p0 < /root/patch-pear-builder-1.38.txt
patching file Builder.php
root@finrod:/usr/lib/php/PEAR#

Naturally, if the patch file doesn’t work for some reason, or it breaks things, you can just cp the backup file back into place.

Please let me know if this patch works for you — or if it fails horribly, for that matter.

[Updated 2009-06-03: Minor edits for clarity]

When Have You Accomplished Enough?

Okay, let me see if I can take stock of the day:

I started off by getting my /etc, /usr/local/bin and /usr/local/sbin, and /var/named directories under version control. That’s good. Plus I think I’ve got things set up to where I can upgrade WordPress plugins on my local setup, then reliably push the changes through version control to my live site.

Oh, and my Twitter feed importer is a little prettier, in terms of how it displays how long ago a tweet was posted.
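The importer’s actual code isn’t worth reproducing here, but the general shape of a “how long ago” formatter can be sketched in a few lines of JavaScript. (The function name and the exact thresholds are illustrative, not the real thing.)

```javascript
// Turn a posting Date into a rough "how long ago" string.
// Illustrative sketch only -- not the actual importer code.
function timeAgo(posted, now) {
  now = now || new Date();
  // Subtracting two Dates yields milliseconds.
  var seconds = Math.floor((now - posted) / 1000);
  if (seconds < 60) return "less than a minute ago";
  var minutes = Math.floor(seconds / 60);
  if (minutes < 60) return minutes + (minutes === 1 ? " minute ago" : " minutes ago");
  var hours = Math.floor(minutes / 60);
  if (hours < 24) return hours + (hours === 1 ? " hour ago" : " hours ago");
  var days = Math.floor(hours / 24);
  return days + (days === 1 ? " day ago" : " days ago");
}
```

The prettiness is all in picking sensible cutoffs: nobody wants to read “7,218 seconds ago” when “2 hours ago” will do.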

But then there’s the Live+Press plugin… I have high hopes that I’ll be able to use that to automatically crosspost from here to my new Dreamwidth account, but for now, it only seems to communicate with Livejournal. Since there’s a feature request open in the project’s wish list to make it work with other LJ-codebase sites, I figure I may as well pick that up and run with it.

Of course, that just slows me down on LJ Content Sieve… *sigh*

Because I don’t have everything done, I feel like I didn’t accomplish much today. That’s silly, but knowing that it’s silly doesn’t change my feelings much.

McAfee: Failing at Security Since 2005

Back in 2005, I was a “geek for hire” and did a lot of general troubleshooting for end-users, including malware removal and general PC tune-ups. One client wanted me to install some software, including McAfee’s main end-user product at the time — I don’t recall the name.

I do recall, however, that my head nearly exploded when I found that the product required the user to turn on ActiveX… and not even restrict it to local execution only! I informed the client (with as little ranting as I could) that this was an extremely bad idea, and strongly advised that he get a different security software suite, and ditch the McAfee product as quickly as he could.

It seems that McAfee has not improved their security practices since then. People are saying that a security company should know better. I agree, but I’m not all that surprised; this just looks like more of the same stuff I saw from them back in ’05.

TDD and Peace of Mind

Let’s face it, we’re not perfect. As much as I might realize that automated testing is a good practice, it still feels like a chore sometimes. In my latest round of personal-project development, just setting up a decent set of test fixtures and a working test framework turned into something of a hassle, as it’s my first attempt both at Greasemonkey scripting and at building a script that will act on Livejournal pages. (Since LJ users can customize their views with any of 36 basic “styles”, this means quite a few fixtures.)

So it was awfully tempting to say “screw this!” and just start writing code — you know, “code that works, code that gets stuff done”. Actual program logic, instead of testing tools. “Why don’t I just get something built to begin with,” I asked myself, “and then I can try to test that?”

Ha, ha. Of course, we all know that starting off without any tests just makes it easier to continue without them later. So I took the virtuous road, forced myself to get tests working, but allowed myself to skimp by only setting up fixtures for 3 of the 36 styles. The other styles are something I will have to go back and fill in before I can release, so there’s no chance I’ll “give myself permission” to blow them off.

When you really want to be writing code (because let’s face it, you consider yourself a coder, not a tester), it’s pretty annoying to write tests instead. But writing tests is at least a form of writing; if you’re not even writing tests, but rather setting up a test framework and fixtures, that’s almost excruciating.

But it all became worth it the moment I made a change in one of my basic data structures. I had a structure (what a Perlist would call a hash of arrays) that held information about how to identify which LJ style the page uses. Then I realized I didn’t want to have a separate structure for how to manipulate the page; instead, all information about LJ’s styles and DOM structure should live in one place.

So I altered the structure, then altered the functions created so far that rely on it. And that’s where I would then have to wonder, “How badly did I break everything?” Instead, I just ran my JSunit tests again. And seeing them pass was instant peace of mind. I don’t have to worry that there’s some hidden flaw waiting there, ready to be exposed by a user doing something unexpected. And as I add the other styles, I can easily be sure my code works for them, as well.
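To make that concrete, the consolidated table and a detection function that reads from it might look something like this sketch. (All of the names and class strings here are invented for illustration; the real script’s tables are bigger and messier.)

```javascript
// Hypothetical consolidated style table: everything about each LJ
// style -- how to detect it AND which DOM nodes to manipulate --
// lives in one place. Names and selectors are invented examples.
var STYLES = {
  generator:  { detectClass: "entrybox",   entrySelector: ".entrybox"   },
  bloggish:   { detectClass: "asset",      entrySelector: ".asset-body" },
  minimalism: { detectClass: "entry-wrap", entrySelector: ".entry-text" }
};

// Given the list of class names found on the page, return the name
// of the matching style, or null if nothing matches.
function detectStyle(pageClasses) {
  for (var name in STYLES) {
    if (pageClasses.indexOf(STYLES[name].detectClass) !== -1) {
      return name;
    }
  }
  return null;
}
```

The payoff is exactly what the tests verified: after reshaping the table, every function that reads from it either still works, or fails a test loudly.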

I’m still fairly early in the development of this code, and while unit testing has cost me a bit of time, it’s given me back peace of mind. The time profit? I have no doubt that will come later. (And had I made any errors in my data-structure change, the unit tests would have helped me find them more quickly than the usual debugging methods.)

On Complexity Versus Efficiency

I sometimes imagine how I would teach certain concepts, if I were put in charge of a class. (Not just in programming, either; many people who know me have said I’d make a great teacher; perhaps I’ve taken it to heart.) One of the concepts in programming that I feel has a particularly poor “ease-of-teaching versus importance-of-knowing” ratio is algorithmic complexity, best described using Big O Notation.

I just came across a great example of one use of Big O Notation, in examining the trade-off between conciseness and efficiency in Perl functions to find the smallest member of an array. While looking around for a good Levenshtein algorithm implementation, I noticed this helper function:

# minimal element of a list
#
sub min
{
    my @list = @{$_[0]};
    my $min = $list[0];

    foreach my $i (@list)
    {
        $min = $i if ($i < $min);
    }

    return $min;
}

And I thought, "Hmmm, that's a little odd. Why go through that entire loop, when you could just do:"

sub min {
    my @list = @{$_[0]};
    # sort numerically; a bare sort() would compare as strings
    return (sort { $a <=> $b } @list)[0];
}

Of course, hard on its heels came the thought: "Wait a second... how do those stack up in Big O?"

Without even bothering to look at the source code, I'll assume Perl's sort() is a decent comparison sort (quicksort in older perls, mergesort in newer ones) and will achieve O(n log n) performance on most sane data. But the min() function supplied at Merriam Park is O(n); as n grows, the n log n version necessarily does more work than the just-plain n algorithm.

This would make a wonderful "teachable moment": point out to the class that my function is shorter, and easier to read and understand, but the other one is more efficient. This is exactly the sort of thing that Big O Notation is useful for, and it's something that can't be easily expressed just by looking at the code itself.

(Not that I'd try to tell the class that one way or the other was automatically better. The real usefulness of this example is the way the psychological complexity and the computational efficiency are in opposition to each other. It makes a great springboard to talk about how we decide on what trade-offs to make in engineering.)
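Incidentally, the same trade-off translates directly into JavaScript; here's a sketch of both versions. (Note that JavaScript's built-in sort(), like Perl's, compares as strings unless you hand it a numeric comparator.)

```javascript
// O(n): a single pass, tracking the smallest element seen so far.
function minLoop(list) {
  var min = list[0];
  for (var i = 1; i < list.length; i++) {
    if (list[i] < min) min = list[i];
  }
  return min;
}

// O(n log n): sort a copy numerically, then take the first element.
// Shorter to read, but it does more work than the problem requires.
function minSort(list) {
  return list.slice().sort(function (a, b) { return a - b; })[0];
}
```

Same answer, different cost curves; exactly the kind of thing Big O Notation lets you talk about without profiling anything.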

Why Downloadable Documentation Is Critical

PHP is a kludged-together, ugly mess of a language. But its documentation is quite superlative: practically every function has documentation written in a more-or-less standardized format, plus whatever comments users have added. In addition, they have something near and dear to my heart:

Downloadable documentation.

This means that if I’m developing on an airplane at 30,000 feet, or on a BART train in the tunnel that runs underneath San Francisco Bay, I can just use my own local copy of the docs.

I’ve long been annoyed with the state of both Ruby’s documentation and Ruby On Rails’ docs. Both of them use the awkward, quadruple-framed RDoc format that breaks up screen real estate inefficiently, makes keystroke navigation difficult if not impossible, and forces the browser to load a 600K page listing every single Rails method, even if what I really want is just the Class hierarchy listing.

But the other thing that’s annoyed me about Ruby’s and Rails’ docs for a while is that there is no downloadable version available. If I haven’t got an Internet connection available, I can’t look up the docs. End of story.

Which is a penalty I suppose I’ve learned to accept when I’m travelling. But at my home workstation, I should be able to see the docs any time I want, right?

[Screenshot: api.rubyonrails.org showing a domain squatter’s ad page]

No, not if the people maintaining the Ruby On Rails web site screw up and let their domain registration lapse. Because of their mistake, I can’t do the development I was trying to get done tonight. (So I’m writing a snarky blog entry instead.)

Always make your docs downloadable. If the Rails team had done that, I’d be coding right now, instead of trying to shame them into making their documentation usable. (Heck, I might not even have noticed their error, because I’d just have opened my own local copy instead of going to their site in the first place.)

Making a Field Required Doesn’t Make It Truthful

Today’s lesson for people who make fields “required” on their web forms:

You can make it “required”, but you can’t force people to tell you the truth.

I recently filled out a form for a service that will eventually ship a book to me. I understand why they needed my street address, my credit card information, and so on… but my phone number? I’ve gotten quite accustomed to typing “you don’t need this” into phone number fields on web forms, because it’s the truth. The company I’m dealing with, 9 times out of 10, doesn’t need my phone number.

In this case, I suspect they wanted it in case they had to contact me regarding any shipping problems, so instead of my usual, I typed in “email me instead”. The nice thing is, the form actually accepted that; some forms demand actual digits in the field. Would you believe my phone number is 123-456-7890?
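That’s the heart of the problem: a format check can only verify shape, never truth. A hypothetical “strict” validator in JavaScript makes the point (the function name and pattern are my own invention for illustration):

```javascript
// A typical "strict" US phone-number check: demands exactly the
// NNN-NNN-NNNN shape, and nothing else.
function looksLikePhoneNumber(value) {
  return /^\d{3}-\d{3}-\d{4}$/.test(value);
}
```

“123-456-7890” sails right through a check like this. The format is valid; the number is not.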

As long ago as the fall of 2005, I wrote that personal information was a hot commodity, and that users were becoming aware of it. If you run a site that has forms that require certain information, you might want to check through the data you’ve been receiving. I suspect that an astonishing number of your users have the same phone number as I do… and were born on New Year’s Day of some year or other.

Then again, I’m awfully nice about using such “obviously fake” data. I wouldn’t be surprised if some users, when faced with a form that demands a phone number, just press number keys until they’ve got enough digits.

Two questions, then, for those whose databases have numbers like 123-456-7890 in them:

  1. Obviously, you can’t trust any phone number in the 123 area code. How much of the rest of your information can you trust?
  2. Given that at least some of the data you’re getting is untrustworthy, what purpose is being served by requiring such data at all?

Back to My Usual Server

After a couple of months of temporary hosting with A2 Hosting, I finally have my real server back online, at a colo space in San José. Not that I have any complaints against A2 (heck, I even just gave them a little more Google-juice); they were perfectly serviceable. But I’ve really gotten used to having my own box that I have root on and can do whatever I want with; by comparison, cPanel just isn’t enough.

Possibly most important — certainly more important than I would have expected just a few years ago — is the fact that now I have my SVN repository back. I’ve recently checked in two months’ worth of code changes. I wound up describing the extreme discontinuity as “the Spring 2009 Downtime”, because the first few ideas that drifted through my head just sounded a little too grandiose and overblown.

Still, I’m reminded of a scene in Joe Haldeman’s early novel Mindbridge. A team of interstellar explorers is sent off to an unknown planet via teleportation technology, wearing fully self-contained environment suits that keep them alive for two weeks. When they’re teleported back to Earth and a safe environment, Haldeman writes: “they scrambled out [of their suits] to an orgy of backscratching”.

I know the feeling.

Where Netbooks Are Taking Us

If you’re working on software development, you should absolutely read Clive Thompson’s article in Wired, The Netbook Effect: How Cheap Little Laptops Hit the Big Time. Thompson points out that the rise of netbooks showed us that “traditional PC users…. didn’t want more out of a laptop—they wanted less.”

Says Thompson:

I wrote this story on a netbook, and if you had peeked over my shoulder, you would have seen precisely two icons on my desktop: the Firefox browser and a trash can. Nothing else.

It turns out that about 95 percent of what I do on a computer can now be accomplished through a browser. I use it for updating Twitter and Facebook and for blogging. Meebo.com lets me log into several instant-messaging accounts simultaneously. Last.fm gives me tunes, and webmail does the email. I use Google Docs for word processing, and if I need to record video, I can do it directly from webcam to YouTube. Come to think of it, because none of my documents reside on the netbook, I’m not sure I even need the trash can.

This is big news to anyone designing applications, whether they’re Web-based, cloud-computing SaaS apps or standalone Windows apps in the old-school vein. If you’re designing for the old school, you need to remember: the game is changing. People are expecting different things, and measuring your app against a different yardstick.

And if you’re designing for the cloud? Hey, consider that your users might have a smaller screen… People designing for the Web have always tried to get away with assuming bigger screens than they should. I realize working in such constrained real estate is a hassle, but it’s long past time to get over it.

Thompson has an interesting response to those who say, “But what about Photoshop?”, too.