Several years ago, Gary Hart said something that has stuck with me ever since: “just because it’s messy doesn’t mean you don’t want to do it.” We were discussing plans for driving throughout the UK while there on business — but time and again I find this principle applies to other situations as well. Besides the obvious, um, recreational possibilities, I can think of many more: eating sloppy sandwiches, parenting, electing public officials, starting a relationship, and coding in PHP.
PHP powers a lot of the web, including such popular platforms as WordPress, so you really need to have it in your programming portfolio. But it’s a messy language. I’m not just talking about the way that PHP tends to sprawl like semicolon-laden vines around your HTML, earning an alternative interpretation of “Page Hacked to Pasta”. Nor am I referring (this time) to its sucky object model. The language itself contains many little gotchas and “almost right”s that wait like snares for the feet of the unwary coder.
One I ran into yesterday (not for the first time) is PHP’s partial distinction between false and zero. In PHP, false is boolean, and zero is numeric. Fine and good. But if you treat a zero value as a boolean expression, the zero gets converted to a boolean false.
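To see that partial distinction in action, here is a minimal sketch. The helper name describe_zero_vs_false is made up for this illustration; it simply reports how a value looks in a conditional, under a loose comparison, and under a strict one.

```php
<?php
// Hypothetical helper (not part of PHP): report how a value behaves
// in boolean context versus under loose and strict comparison with false.
function describe_zero_vs_false($value): array
{
    return [
        'truthy'         => (bool) $value,     // what an if () would see
        'loosely_false'  => $value == false,   // comparison with type conversion
        'strictly_false' => $value === false,  // same type AND same value
    ];
}

var_dump(describe_zero_vs_false(0));      // truthy: false, loosely_false: true, strictly_false: false
var_dump(describe_zero_vs_false(false));  // truthy: false, loosely_false: true, strictly_false: true
```

Zero and false only part ways in the last column, which is the one most code never looks at.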
This seems innocent and helpful enough until you get to a function like strpos, which returns the position of one string within another. If the target is not found in the source, it returns boolean false. So, to make sure that one string doesn’t contain another, you’d think you’d do something like this:
if (!strpos($source, $target))
But there’s a problem here. If $target starts at the beginning of $source, strpos will return 0. Because it’s in a conditional, that 0 gets converted to boolean false, the ! flips it to true, and the test passes even though $target is sitting right there at the front. So, you need to explicitly test for false instead, right?
if (strpos($source, $target) == false)
Wrong. Because you’re comparing it against a boolean value, the zero still gets converted to a boolean false. The only right way to do this is:
if (strpos($source, $target) === false)
The triple equal sign tells PHP to only evaluate to true if both operands have the same type and the same value, so no conversion takes place. Granted, the PHP docs have a great big warning on strpos to this effect, but the net result is something less than intuitive coding.
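Here is a rough side-by-side of the three tests, as a sketch rather than anything official. The wrapper function not_found_tests is a name invented for this example; strpos itself is the real thing.

```php
<?php
// Hypothetical helper: run the three "does not contain" tests on the
// same strpos result so their answers can be compared directly.
function not_found_tests(string $source, string $target): array
{
    $pos = strpos($source, $target);  // int position, or boolean false
    return [
        'negation' => !$pos,            // if (!strpos(...))
        'loose'    => $pos == false,    // if (strpos(...) == false)
        'strict'   => $pos === false,   // if (strpos(...) === false)
    ];
}

print_r(not_found_tests('abcde', 'a'));  // negation and loose say "not found"; strict says found
print_r(not_found_tests('abcde', 'f'));  // all three agree: not found
```

With the target at position 0, the first two tests claim it is missing and only the strict one gets it right. If you are on PHP 8.0 or later and only care about yes or no, str_contains($source, $target) skips the position question altogether.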
Languages like Java and C# solve this problem by not allowing automatic conversion between numeric and boolean values. That’s sort of like castrating the entire population to prevent birth defects. So why doesn’t this create problems for other languages?
In Synergy/DE, indices start at 1, so the instr function returns zero for not found:
if (!instr(1,source, target))
Quite logical and linguistically elegant (except for having the starting index as the first parameter), but then you don’t get to have the secret handshake.
Even though Ruby uses a starting index of 0, it prevents confusion in routines like String#index by returning nil instead of a number for not found. Zero is an object like any other, and in Ruby only nil and false evaluate to false, so zero counts as true. Thus,
print "good" if ("abcde".index("a"))
print "bad" if ("abcde".index("f"))
prints “good”, even though the index of “a” within “abcde” is 0. Why should 0 be false anyway?
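In C and C++, indices start at 0, and 0 can be used as a stand-in for false (in fact, all falses are zero), yet I hardly ever make the mistake of “if (strpos(…))” in C. Why? Because the documented return value for “target not found” isn’t FALSE, it’s -1. (Standard C has no strpos of its own, and strstr returns a pointer or NULL; I mean the index-returning search routines you typically find in C-family libraries.) I immediately know that to test for this return value I must say: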
if (strpos(source, target) < 0)
The significance of the result does not rely on a distinction between returned values that can be automatically converted from one to the other. PHP, on the other hand, requires the programmer to explicitly prevent that conversion from happening. Automatic conversions are supposed to be a convenience for the programmer, not a trap.
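If you would rather have the -1 convention in PHP, one option is a thin wrapper. This is only a sketch, and index_of is a hypothetical name, not a built-in function; it does the strict check once so that callers never have to.

```php
<?php
// Hypothetical wrapper: return the position of $target in $source,
// or -1 when it is not found, so the result is always an int.
function index_of(string $source, string $target): int
{
    $pos = strpos($source, $target);
    return $pos === false ? -1 : $pos;  // the one place === is needed
}

if (index_of('abcde', 'a') < 0) {
    echo "not found\n";
} else {
    echo "found\n";  // this branch runs: 'a' is at position 0
}
```

Now "found at the start" and "not found" are two different integers, and no automatic conversion can blur them.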
Maybe PHP stands for Purposely Hampers Programming.