One of the things that's been bugging me lately is the lack of strong typing in perl. Sure, you start off by thinking "OMG! There's only three data structures in perl! Great!" Later you make the realisation that there are, in fact, references, and you can build complex data structures and make useful programs. Later on, you go "what's this 'bless' thingy?" and learn about perl's module encapsulation, and the dirty, unclean "implementation" of the Object Oriented paradigm.
And you are happy. You can make nice, useful programs that work, with relative ease.
Until you realise that there are bastards out there that are really slack with their coding.
One of the concepts that seems foreign to perl programming is the idea of "robustness". That is, your code works the way it says it should, and tolerates errors gracefully. One reason for this is that perl is a weakly typed language: you only have three datatypes, and the best you can do with that is use subroutine prototyping to specify how many of each you want. This means that if you want your code to be "robust", you have to check everything yourself, in your code, to make sure it's sane before continuing. And you have to do it over, and over, and over. So people forget about it, or don't do it because it's so hard. The downside is that you end up making assumptions about what gets handed to you. That might be fine when there's a small, closed group of people working on the code (with good style guides and peer review, etc), but when you're in the business of providing libraries that other people use, you can't make this assumption. In essence, any interface contract that you may define is not enforceable unless you take on that heavy overhead yourself. Strongly typed languages allow for interface contract, or specification, enforcement in code, because when you define a function or method, you define the contract as well.
So, in perl we can't trust our given input, and verifying it is long and tedious. What can you trust? We can write a whole bunch of routines to verify this for us, but that can get cumbersome when writing your code. You can trust your own libraries, so there should be a way to use that. So I've been trying to think of ways to enforce stronger typing that are easy to use and fairly transparent.
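To make the "over, and over, and over" part concrete, here is a minimal sketch of the hand-written checking a single function tends to need. The function summarise() and its arguments are made up purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical function: expects a hash ref of options and an array ref of items.
sub summarise {
    my ($opts, $items) = @_;

    # Every public function ends up repeating checks like these by hand.
    die "summarise: expected 2 arguments, got " . scalar(@_) . "\n"
        unless @_ == 2;
    die "summarise: first argument must be a HASH reference\n"
        unless ref($opts) eq 'HASH';
    die "summarise: second argument must be an ARRAY reference\n"
        unless ref($items) eq 'ARRAY';

    my $sep = defined $opts->{sep} ? $opts->{sep} : ', ';
    return join($sep, @$items);
}

print summarise({ sep => '-' }, [1, 2, 3]), "\n";

# A sloppy caller is caught at the boundary instead of deep inside the function.
my $ok = eval { summarise('oops', [1]); 1 };
print "rejected: $@" unless $ok;
```

Three checks for two arguments, and that is before looking inside the hash or the array. Multiply that by every public function in a library and it is easy to see why it gets skipped.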
One method that I've come across is by using Attribute Handlers. These are funky bits of code that are called during compile time when you declare a function, variable, etc. You can use them for all sorts of things, from debugging, error handling, watching, enforcing calling conventions, permissions (public/private/protected), etc.
The first prototype I made was a simple method debugging handler. The idea is: when a function is called, the call is intercepted, and information about the call is gathered and logged. Then the call is executed, the results are inspected and logged, and then handed back to the caller. This is the code to do it:
    package MyDebug;
    use strict;
    use Attribute::Handlers;
    use Data::Dumper;
    sub Debug :ATTR {
        my ($package, $symbol, $referent, $attr, $data, $phase) = @_;
        my $name = join '::', *{$symbol}{PACKAGE}, *{$symbol}{NAME};
        no warnings 'redefine';
        # replace the declared sub with a wrapper that logs, then calls the original
        *{$symbol} = sub {
            print "DEBUG: Entering: $name\n";
            my @arr = @_;
            # wantarray is true in list context, false in scalar, undef in void
            my $wantarray = wantarray;
            print "DEBUG: Called: $name(\n" . Dumper(@arr) ."\t)\n";
            print "DEBUG: in " .($wantarray ? 'array' : defined $wantarray ? 'scalar' : 'void') ." context\n";
            if ( $wantarray ) {
                my @res = $referent->(@arr);
                print "\nDEBUG: Returning array: (" . Dumper(@res) .")\n";
                return(@res);
            } else {
                # void calls are run as scalar calls, which is harmless here
                my $res = $referent->(@arr);
                print "\nDEBUG: Returning scalar: " . Dumper($res) ."\n";
                return($res);
            }
        };
    }
    1;
How to use it:
    #!/usr/bin/perl -w
    use strict;
    use base qw( MyDebug );
    sub foo : Debug {
        print "LOOK AT ME I'M THE FUNCTION AND I'M DOING STUFF!\n";
    }
    foo();
So what's it doing? Note the magic "sub foo : Debug" in there? The ": Debug" in the declaration says to use the attribute handler. The handler itself (in the first block of code) accepts a bunch of information about the function. Then it overwrites the function with a function of its own. That function collects and outputs information about how it was called, then runs the original code and collects information about the results too. It finally hands the results back to the caller.
By adding a simple handler to a subroutine declaration, I can easily collect useful debugging information. I could add all sorts of things, like catching die statements, warnings, etc, recording them, then propagating them. You can do similar things with variables, such as monitoring them for changes (although I've got some work to do on making that a reality). With the additional knowledge that I can pass arguments into the handler, I can see that I can start doing much more. Type checking, for example...
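The "overwrites the function with a function of its own" step is ordinary glob assignment, and it can be tried without any attribute machinery. The sketch below assumes a made-up helper, wrap_sub(), that takes a fully qualified sub name and an array ref to log into. It is not part of Attribute::Handlers, just the same trick done by hand.

```perl
use strict;
use warnings;

# wrap_sub() is a hypothetical helper, not part of any module.
sub wrap_sub {
    my ($name, $log) = @_;       # $name is fully qualified, e.g. 'main::add'
    no strict 'refs';            # we look the glob up by name
    my $orig = \&{$name};        # keep a reference to the original code
    no warnings 'redefine';
    *{$name} = sub {
        # wantarray: true in list context, false in scalar, undef in void
        my $context = wantarray;
        push @$log, "enter $name in "
            . ($context ? 'list' : defined $context ? 'scalar' : 'void')
            . ' context';
        if ($context) {
            my @res = $orig->(@_);
            push @$log, 'return list of ' . scalar(@res);
            return @res;
        }
        elsif (defined $context) {
            my $res = $orig->(@_);
            push @$log, 'return scalar';
            return $res;
        }
        $orig->(@_);             # void context: nothing to hand back
        push @$log, 'return nothing';
        return;
    };
    return;
}

sub add { return $_[0] + $_[1] }

my @log;
wrap_sub('main::add', \@log);

my $sum = add(2, 3);             # this call now goes through the wrapper
print "$sum\n";
print "$_\n" for @log;
```

The attribute handler does the same replacement, except that perl hands it the glob and the original code reference at declaration time, so nobody has to call a helper by name.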
So I wrote this rudimentary prototype:
    package MyTypes;
    use strict;
    use Attribute::Handlers;
    sub Type :ATTR {
        my ($package, $symbol, $referent, $attr, $data, $phase) = @_;
        my @tests = (ref($data)?@$data:$data);
        my $name = join '::', *{$symbol}{PACKAGE}, *{$symbol}{NAME};
        no warnings 'redefine';
        *{$symbol} = sub {
            my @arr = @_;
            print "TYPES: Checking types for $name\n";
            print "TYPES: Check rule: ".join(', ',map { "'$_'" } @tests)."\n";
            # go through each attribute, and check for it
            my $i=0;
            foreach my $atty ( @tests ) {
                my $pass = 0;
                # this is rudimentary, and can be all sorts of things
                # and lookup tables
                MAGICSPRAY: foreach my $test ( split(/\|/,$atty) ) {
                    if ( $test eq 'undef' && not defined $arr[$i] ) {
                        # well, that's okay then
                        $pass = 1;
                        last MAGICSPRAY;
                    }
                    # catches an extra case that UNIVERSAL::isa doesn't
                    # (eg: ref(blah) eq "")
                    if ( defined $arr[$i] and ref($arr[$i]) eq $test ) {
                        $pass = 1;
                        last MAGICSPRAY;
                    }
                    # allows for polymorph too
                    if ( defined $arr[$i] and UNIVERSAL::isa($arr[$i], $test) ) {
                        $pass = 1;
                        last MAGICSPRAY;
                    }
                }
                if ( ! $pass ) {
                    die("Argument $i is not valid (does not match test '$atty')");
                }
                $i++;
            }
            return $referent->(@arr);
        };
    }
    1;
Used:
    #!/usr/bin/perl -w
    use strict;
    use base qw( MyTypes );
    sub foo : Type( "Foo::Bar", "|undef|SCALAR", "SCALAR", "HASH", "ARRAY" ) {
        print "LOOK AT ME I'M THE FUNCTION AND I'M DOING STUFF!\n";
    }
    ...
    foo($foobar, "", $scalarref, {}, []);
This does something similar to the Debug handler above, but it accepts input and stores that in the generated subroutine. At runtime, when the function is executed, it checks the supplied arguments against the argument rules ("Foo::Bar", "|undef|SCALAR", "SCALAR", "HASH", "ARRAY"), and dies if they don't match. By using perl's inbuilt ref() and UNIVERSAL::isa functions, you can start using the trust in your code to verify input (sure, someone could bless something to "Foo::Bar" themselves, but they're being willfully negligent and thus you can argue that you don't care anymore). This is rudimentary, and so is the grammar defining the argument rules, but it works (for this simple example).
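For reference, this is roughly what ref() and UNIVERSAL::isa report for the kinds of values the rules above deal with. Foo::Bar here is a stand-in class and Foo::Baz is a made-up subclass, both defined only for this sketch.

```perl
use strict;
use warnings;

{
    # Stand-in classes for illustration only.
    package Foo::Bar;
    sub new { my $class = shift; return bless {}, $class }

    package Foo::Baz;
    our @ISA = ('Foo::Bar');
}

my $text = 'plain';
my $obj  = Foo::Baz->new;

my %seen = (
    plain_scalar => ref($text),     # '' because it is not a reference at all
    scalar_ref   => ref(\$text),    # 'SCALAR'
    hash_ref     => ref({}),        # 'HASH'
    array_ref    => ref([]),        # 'ARRAY'
    object       => ref($obj),      # 'Foo::Baz', the exact class only
);
printf "%-12s => '%s'\n", $_, $seen{$_} for sort keys %seen;

# ref() only sees the exact class; isa also follows inheritance.
print "ref eq Foo::Bar: ", (ref($obj) eq 'Foo::Bar' ? 'yes' : 'no'), "\n";
print "isa Foo::Bar:    ", (UNIVERSAL::isa($obj, 'Foo::Bar') ? 'yes' : 'no'), "\n";
```

That is why the handler runs both tests: the ref() comparison covers the empty-string case for plain scalars, and isa covers subclasses.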
You can then see that you can do all sorts of things here, in this "pre-function" stage. I can define arbitrary data 'types', like "string_email", which checks that the argument is a string that contains a 'valid' email address. I can automatically instantiate objects if I'm only supplied an object ID, etc etc. The possibilities are endless.
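As a rough sketch of what such arbitrary 'types' might look like, the lookup table below maps rule names to small check subs. The names %TYPE_CHECK and check_value() are invented for this example, and the email pattern is deliberately naive (something@something.tld), nowhere near a real address validator.

```perl
use strict;
use warnings;

# Hypothetical registry: each named check takes one value and returns true or false.
my %TYPE_CHECK = (
    'undef'        => sub { !defined $_[0] },
    'string'       => sub { defined $_[0] && !ref $_[0] },
    'string_email' => sub {
        defined $_[0] && !ref $_[0]
            && $_[0] =~ /\A[^\@\s]+\@[^\@\s]+\.[^\@\s]+\z/;
    },
);

# check_value() is a hypothetical helper. A rule is a '|'-separated list of
# alternatives; each one is either a registered name or a ref()/class name.
sub check_value {
    my ($rule, $value) = @_;
    for my $alt (split /\|/, $rule) {
        if (my $check = $TYPE_CHECK{$alt}) {
            return 1 if $check->($value);
        }
        elsif (defined $value
            && (ref($value) eq $alt || UNIVERSAL::isa($value, $alt))) {
            return 1;
        }
    }
    return 0;
}

for my $candidate ('alice@example.org', 'not an address', undef) {
    my $shown = defined $candidate ? "'$candidate'" : 'undef';
    printf "%-22s %s\n", $shown,
        check_value('string_email', $candidate) ? 'ok' : 'rejected';
}
```

Inside the Type handler, the body of the MAGICSPRAY loop could call something like check_value() instead of the hard-coded tests, so new 'types' become one more entry in the table.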
And hey, look, you can stack them:
    #!/usr/bin/perl -w
    use strict;
    use base qw( MyTypes );
    use base qw( MyDebug );
    sub foo : Debug : Type( "Foo::Bar", "|undef|SCALAR", "SCALAR", "HASH", "ARRAY" ) {
        print "LOOK AT ME I'M THE FUNCTION AND I'M DOING STUFF!\n";
    }
    ...
    foo($foobar, "", $scalarref, {}, []);
Sure, this can be achieved inline using something like:
    sub foo {
        @_ = Debug(Type( ["Foo::Bar", "|undef|SCALAR", "SCALAR", "HASH", "ARRAY"], @_ ));
        ...
    }
But this is way cooler, and I think it's also way easier. The Debug example is a good case for doing it this way, as it's much easier to achieve using handlers than inline in the code. Handlers give you powerful tools for doing all sorts of things that would otherwise be impossible, or very difficult.
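As one more taste of those "all sorts of things", here is a sketch of the die and warning catching idea mentioned earlier: record what happened, then pass it on unchanged. The helper guarded() and the journal array are assumptions made up for this example. It wraps a plain code reference, so the same body could be dropped into an attribute handler.

```perl
use strict;
use warnings;

# guarded() is a hypothetical helper: it returns a wrapped version of a code ref.
sub guarded {
    my ($name, $code, $journal) = @_;
    return sub {
        my @args    = @_;
        my $context = wantarray;
        # Record warnings raised while the wrapped code runs, then pass them on.
        local $SIG{__WARN__} = sub {
            push @$journal, "warning in $name: $_[0]";
            warn $_[0];
        };
        my @res;
        my $ok = eval {
            if    ($context)         { @res    = $code->(@args) }
            elsif (defined $context) { $res[0] = $code->(@args) }
            else                     { $code->(@args) }
            1;
        };
        if (!$ok) {
            my $err = $@;
            push @$journal, "died in $name: $err";
            die $err;            # propagate the original error to the caller
        }
        return $context ? @res : $res[0];
    };
}

my @journal;
my $risky = guarded('risky', sub {
    my ($n) = @_;
    die  "negative input\n" if $n < 0;
    warn "zero input\n"     if $n == 0;
    return $n * 2;
}, \@journal);

print $risky->(21), "\n";
$risky->(0);                                  # warns, but still returns
my $survived = eval { $risky->(-1); 1 };      # dies, and we see it again here
print "caller still sees: $@" unless $survived;
print "journal: $_" for @journal;
```

The wrapped code behaves the same as before from the outside, and the journal ends up with a record of everything that went wrong on the way.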