Friday, September 28, 2007

Operators and types

Ruby and Python overload the + operator for a large number of things, the most common ones being addition of numbers, concatenation of strings, and concatenation of tuples. Very different things are represented by the same syntax. In Perl these three roles are occupied by three different operators (+, . and ,). For that matter, almost all operators on numbers are separate from operators on strings in Perl. This leads to one of the most common misconceptions among non-natives about Perl's type system: that it is weakly typed. It isn't. This piece of code:

my $foo = "1";
return $foo + 0;

does not cause an implicit type conversion, nor is the last statement in any way ambiguous. The addition operator causes an explicit conversion of its argument. It's not an implicit one for a simple reason: it's the Perl idiom for converting any variable to a number.
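A small sketch of the point, with sub and variable names made up for illustration: the operator, not the operand, decides whether a value is treated as a number or as a string.

```perl
use strict;
use warnings;

# Hypothetical helper names, for illustration only.
sub add_values  { my ($x, $y) = @_; return $x + $y; }   # + always means numeric addition
sub join_values { my ($x, $y) = @_; return $x . $y; }   # . always means string concatenation

my $foo = "1";
my $bar = 2;

print add_values($foo, $bar), "\n";    # 3
print join_values($foo, $bar), "\n";   # 12
print $foo + 0, "\n";                  # 1, explicitly converted to a number
```

Reading either sub, you know which operation happens without knowing what was passed in.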

I think this is an excellent example of the waterbed theory of complexity. To reduce the number of operators in the language, Python and Ruby use runtime polymorphism on data whose behavior is already known to the programmer (though not to the compiler) at compile time. I cannot think of real-world code where you don't know whether your variable is a number, a string or a tuple but want to do addition/concatenation nonetheless. It is trading semantic clarity for syntactic clarity. It's a valid choice, just as Perl's choice to separate them is. I think a lot of rubyists and pythonistas fail to see that their choice has its disadvantages too.

Monday, September 17, 2007

Control structures in Perl

As I said in my previous entry, there is a need for education in social coding skills in Perl. Therefore I'll put my money where my mouth is ;-)

Basically there are four patterns for branching in Perl.

  1. Conditional statements: if(condition) { true_action }
    else { false_action }
  2. Statement modifiers: action if condition;
  3. Logical operators: condition and action
  4. Ternary operator: condition ? true_action : false_action
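To make the four patterns concrete, here is one and the same decision (drop comment lines, keep everything else) written each way. The sub names are made up for this sketch.

```perl
use strict;
use warnings;

# Each sub returns the line, or an empty string if the line is a comment.

# 1. Conditional statement
sub with_if {
    my ($line) = @_;
    if ($line =~ /^#/) {
        return "";
    }
    else {
        return $line;
    }
}

# 2. Statement modifier
sub with_modifier {
    my ($line) = @_;
    return "" if $line =~ /^#/;
    return $line;
}

# 3. Logical operator
sub with_logical {
    my ($line) = @_;
    $line =~ /^#/ and return "";
    return $line;
}

# 4. Ternary operator
sub with_ternary {
    my ($line) = @_;
    return $line =~ /^#/ ? "" : $line;
}

# All four behave the same; they differ only in what the reader sees first.
for my $filter (\&with_if, \&with_modifier, \&with_logical, \&with_ternary) {
    print $filter->("# a comment\n"), $filter->("real code\n");
}
```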

How to decide which one to use? That depends on the situation, of course. Only conditional statements and the ternary operator can provide an else clause, so if you need one your choices are already limited. Beyond that, all four place their emphasis differently. For example, print $line unless $line =~ /^#/; may be better than $line =~ /^#/ or print $line; because of the principle of prominence. This is a linguistic notion that tells us that people tend to prefer important things at the front and details at the end, so they can skip over the details when scanning the code. If you skip the second part, the code still makes sense (even though it may no longer be correct). In general, statement modifiers are best used when the action is much more important than the condition.

Logical operators are useful in two situations. The first one is exactly the opposite of the statement modifier: when the condition is vastly more important than the action. This usage typically uses the low-precedence version. For example, open my $filehandle, $filename or die "Can't open $filename: $!"; is better than die "Can't open $filename: $!" unless open my $filehandle, $filename; because the first one communicates the intent of the programmer (opening a file) better. Error handling is not important for understanding the big picture of the code.
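Why the low-precedence version matters can be shown with a short sketch. It assumes a made-up path that does not exist, and it uses the three-argument form of open rather than the two-argument form quoted above.

```perl
use strict;
use warnings;

my $filename = "/nonexistent/example.txt";   # assumed to be missing, for illustration

# 'or' binds more loosely than the argument list of open, so this parses as
#   open(my $fh, '<', $filename) or die ...
# and die runs when open fails.
my $ok_with_or = eval {
    open my $fh, '<', $filename or die "Can't open $filename: $!\n";
    1;
};
print "or: ", ($ok_with_or ? "opened\n" : "died: $@");

# '||' binds more tightly than the comma, so this parses as
#   open(my $fh, '<', ($filename || die ...))
# and die only runs when $filename itself is false; a failed open goes unnoticed.
my $ok_with_pipes = eval {
    open my $fh, '<', $filename || die "Can't open $filename: $!\n";
    1;
};
print "||: ", ($ok_with_pipes ? "no error raised\n" : "died: $@");
```

If you prefer ||, parenthesizing the call as open(...) || die ... restores the intended grouping.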

Logical operators have another trait that is important for deciding when to use this pattern. Unlike the previous two patterns, logical operators are expressions, not statements. As such they can be used in places where the former cannot. parse($filename || "default.filename") is significantly easier to read than if(!$filename) { $filename = "default.filename"; } parse($filename);
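A runnable sketch of the default-value idiom follows. The parse sub here is a hypothetical stand-in, since the original does not show one. It also shows one caveat worth knowing: || tests truth rather than definedness.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parse() call in the text.
sub parse { my ($name) = @_; return "parsing $name"; }

my $filename = "";
print parse($filename || "default.filename"), "\n";   # parsing default.filename

# Caveat: "0" is false in Perl, so a file literally named "0"
# is also replaced by the default.
my $zero = "0";
print parse($zero || "default.filename"), "\n";       # parsing default.filename

# On Perl 5.10 or later, the defined-or operator // only falls back
# when the left side is undef.
print parse($zero // "default.filename"), "\n";       # parsing 0
```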

Similarly, my $id; if($input >= 0) { $id = $input; } else { $id = 1; } can be simplified using the ternary operator to my $id = $input >= 0 ? $input : 1;
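Because the ternary operator is an expression, it also chains. A minimal sketch, with sub names invented for illustration:

```perl
use strict;
use warnings;

# The example from the text, wrapped in a hypothetical sub.
sub pick_id {
    my ($input) = @_;
    return $input >= 0 ? $input : 1;
}

# A chained ternary reads top to bottom like an if/elsif/else ladder.
sub describe {
    my ($n) = @_;
    return $n < 0  ? "negative"
         : $n == 0 ? "zero"
         :           "positive";
}

print pick_id(5), " ", pick_id(-3), "\n";                             # 5 1
print join(", ", describe(-2), describe(0), describe(7)), "\n";       # negative, zero, positive
```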

You may wonder now: when should I use old-fashioned conditional statements, then? First of all, when the action contains multiple statements and isn't suitable for putting in a function. Beyond that, a conditional statement puts equal emphasis on the condition and the action, so it makes sense to use this pattern if you don't have a reason to do otherwise.

                     Conditional statements  Statement modifiers  Logical operators  Ternary operator
Emphasis             None                    Action               Condition          Condition
Expression           No                      No                   Yes                Yes
Else clause          Yes                     No                   No                 Yes
Nestable/Chainable   Yes, very well          No                   Yes                Yes
Multiple statements  Yes                     No                   Yes                No