Friday, July 3, 2009

If this isn't an adventure, I'm not sure what is..

I named this blog "adventures in perl" because of the work that I do, being in Perl, and the fact that it's fairly zany stuff compared to my previous work experience. Here we go.

I have applications that start up in the morning, and shut down in the afternoon. They log very important data throughout the day detailing their actions. The log files are gargantuan in size (think TB), every day. We are forced to use LZMA to compress these suckers down so we're not in the business of inflating Seagate's stock price ;-) ... Anyway.. The compression normally starts shortly after 6pm. This time a "spare" system had mysteriously crashed, causing one of the prerequisite ssh calls to hang (our engineer forgot to put a timeout wrapper, or a timeout argument, on the ssh call).

The script "unstuck itself" the next morning, between me troubleshooting why the logs weren't transferred, and getting IT to fix the machine that crashed. So, the program went ahead and lzma'd all the log files for the current day, instead of yesterday.

I sat in horror as I realized that our programs were now writing to deleted files (lzma is kind enough to remove the file after it compresses it). Most of our programs shut down immediately at 4pm, so we had to act fast.

The solution ended up being a simple Perl script that interrogated "lsof" for deleted files in our log directory, on all of our machines, and simply opened a file handle to the /proc/pid file descriptor matching the file in question. This incremented the "open" count, so even if the program that was writing to it went away, the kernel would not free the deleted file's data.

We had a hook on our script that allowed us to signal it to copy files in /proc back to the file system, after we had cleaned up the lzma files that were already there. The best part: we got to use the totally underrated "cluster ssh" to take command of all of our servers at once and fix the problem one time, for all of our machines.
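Here is a minimal sketch of the trick, not the script we actually ran. It assumes Linux, and instead of parsing lsof output it reads the symlinks under /proc/<pid>/fd directly (the kernel appends " (deleted)" to the link target once a file is unlinked). The helper name deleted_fds and the demo file app.log are made up for illustration, and the demo plays both roles (writer and rescuer) in one process.

```perl
use strict;
use warnings;
use Cwd        qw( abs_path );
use File::Copy qw( copy );
use File::Temp qw( tempdir );
use IO::Handle;

# Hypothetical helper: map "/proc/<pid>/fd/<n>" => original path for every
# descriptor of $pid that points at a deleted file under $dir (Linux only).
sub deleted_fds {
    my ($pid, $dir) = @_;
    my %found;
    opendir(my $dh, "/proc/$pid/fd") or return {};
    for my $fd (grep { /^\d+$/ } readdir $dh) {
        my $target = readlink("/proc/$pid/fd/$fd");
        next unless defined $target;
        next unless $target =~ m{^(\Q$dir\E/.+) \(deleted\)$};
        $found{"/proc/$pid/fd/$fd"} = $1;
    }
    closedir $dh;
    return \%found;
}

# Demo: write a log line, then delete the file while it is still open.
my $dir = abs_path(tempdir(CLEANUP => 1));
my $log = "$dir/app.log";
open(my $writer, '>', $log) or die "open $log: $!";
print {$writer} "important line\n";
$writer->flush;
unlink $log or die "unlink $log: $!";

# Step 1: pin every deleted file by opening its /proc entry.
my $deleted = deleted_fds($$, $dir);
my %pinned;
for my $proc_path (keys %$deleted) {
    open(my $fh, '<', $proc_path) or die "open $proc_path: $!";
    $pinned{ $deleted->{$proc_path} } = $fh;
}

# Step 2: the writer can now go away without the data being freed.
close $writer;

# Step 3: when told to, copy the pinned data back to its original path.
for my $orig (keys %pinned) {
    copy($pinned{$orig}, $orig) or die "copy to $orig: $!";
}
print "recovered: ", join(', ', sort keys %pinned), "\n";
```

In real life step 3 would sit behind a signal handler, and the pid list would come from lsof or from walking /proc.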

Problem solved. The program was easy to write, easy to verify, and performed perfectly on the first try. Hooray Perl!
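For what it's worth, the timeout wrapper that would have prevented the original hang is not much code either. This is a rough sketch, assuming a Unix-like system; run_with_timeout is a name I made up, and "sleep" stands in for the stuck ssh call. (ssh also has a ConnectTimeout option, but as far as I know that only covers setting up the connection, not a command that hangs afterwards.)

```perl
use strict;
use warnings;

# Run @cmd, giving up after $timeout seconds. Returns the wait status ($?)
# when the command finishes, or undef if it had to be killed.
sub run_with_timeout {
    my ($timeout, @cmd) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        exec { $cmd[0] } @cmd;
        exit 127;    # only reached if exec itself failed
    }

    my $status = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $timeout;
        waitpid($pid, 0);
        alarm 0;
        $?;
    };
    return $status if defined $status;

    alarm 0;
    die $@ unless $@ eq "timeout\n";    # rethrow anything unexpected
    kill 'KILL', $pid;                  # stop the stuck command
    waitpid($pid, 0);                   # and reap it
    return;
}

my $hung = run_with_timeout(1, 'sleep', '5');   # stand-in for a hung ssh
my $ok   = run_with_timeout(5, 'true');

print defined $hung ? "sleep finished\n" : "sleep timed out\n";
print "true exited with wait status $ok\n";
```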

Saturday, June 20, 2009

Benchmark.pm

Tim Bunce++ !! Benchmark.pm is one of the great libraries that separates Perl from other dynamic languages. It's my favorite module for determining the performance of code snippets that run outside my full application environment.
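A small sketch of the kind of snippet comparison I mean, using timethese and cmpthese from Benchmark. The two subs and the word list are made up for illustration; the numbers you get will depend entirely on your machine.

```perl
use strict;
use warnings;
use Benchmark qw( timethese cmpthese );

my @words = map { "word$_" } 1 .. 1000;

# Time each snippet for a fixed number of iterations. The 'none' style
# suppresses the per-snippet report so that only the comparison prints.
my $results = timethese(2000, {
    join_builtin => sub { my $s = join(',', @words); return $s },
    concat_loop  => sub { my $s = ''; $s .= "$_," for @words; return $s },
}, 'none');

# Print a rate table comparing the snippets against each other.
cmpthese($results);
```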

Sometimes you need to know what your response time is at a given rate of input, so you know the limits within which you can maintain a specified level of response time (a webserver's response time to a dynamically generated request, for example). Averages, rates, etc. tend to hide the fat tails on either end of the response time spectrum (at a given rate).

Taking your result set and looking at it from a percentile perspective gives you the fat tails and an idea of what your median execution time is, and it allows you to more easily compare two different rates (say, 100 page requests per second vs 1000 requests per second).

In my world, it's how many quotes, or orders per second, and every microsecond counts. Finding a fat tail helps us expose design flaws, technical limitations, or simply algorithms that don't scale terribly well.

So, to summarize...

Benchmark is an awesome tool for comparing snippets of code.
Calculating percentiles is a great way to understand how a system handles varying levels of load, and how "fat" the tails are, so you can understand a system's capacity.
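To make the point about averages hiding fat tails concrete, here is a small sketch using the nearest-rank definition of a percentile (one of several definitions in use). The latency samples are invented for illustration, not real measurements.

```perl
use strict;
use warnings;
use POSIX      qw( ceil );
use List::Util qw( sum );

# Nearest-rank percentile: the smallest sample such that at least $p
# percent of all samples are less than or equal to it.
sub percentile {
    my ($p, @samples) = @_;
    die "no samples\n" unless @samples;
    my @sorted = sort { $a <=> $b } @samples;
    my $rank = ceil($p * @sorted / 100);
    $rank = 1 if $rank < 1;
    return $sorted[$rank - 1];
}

# Invented response times in microseconds, with one fat-tail outlier.
my @latency_us = (120, 95, 110, 105, 98, 102, 99, 101, 97, 2500);

my $mean = sum(@latency_us) / @latency_us;
printf "mean %.1f us\n", $mean;
printf "p%-2d  %d us\n", $_, percentile($_, @latency_us) for (50, 90, 99);
```

The mean lands at 342.7 us, which describes none of the requests: the median is 101 us, and the single 2500 us outlier only shows up once you look at p99.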

Thursday, June 18, 2009

Why Parrot is important, or not..

http://www.linux-mag.com/cache/7373/1.html

Here's the summary that I got from this...

1. Parrot is neat, you can write languages very quickly and easily.
2. Parrot is a register based VM, which is better than a stack based one.
3. Concurrency is hard, Parrot isn't mentioned as being part of the solution, but if we ever figure it out, why not do it in Parrot so we don't have to do it again and again going forward?
4. Parrot is going up against Microsoft's CLR/DLR, and Sun's JVM. Parrot is forcing their hand to make improvements.

I feel like reading between the lines there is a defeatist attitude in the last three paragraphs, that MSFT and ORCL will just hulk smash Parrot with $$ and Parrot is one big academia project that won't end with a system capable of generating production level code. Seriously, that's ok though, they're doing all this for FREE, they don't owe me anything. I'm just thankful for Perl5, my entire livelihood is based on it, and it's an honor to stand on the shoulders of giants to do what I do.

I've been using what little free time I have to try to learn Perl 6 and Parrot (not exactly easy with our first baby arriving a little over a month ago). I'm not sure I want to invest, quite literally, my future in it anymore. Hence my intrigue over Vala. Being a Perl/Java programmer has gotten me pretty far, but seeing how I hate Java, I'd better find a decent replacement, and I'm not sure Parrot/Perl6 is it.

Vala?! Genie?!

Ok, let me be totally clear. I loathe and detest Java, in all forms. I love Perl. The dynamic nature of scalars, the quick and easy lists and hashes, it's all just too good.. The bonus is "strict" and "fields", which give you that warm fuzzy compiler error instead of having to hunt issues down with endless unit tests (and even then, you'll probably never find them all)...

But, Perl isn't C++, and well, coding in C++ is about as much fun as pouring gasoline in my eyes. I NEED a language that is faster than Perl, but isn't C or C++, and for the love of god don't tell me about some VM that has magical powers. (I have more to say about Allison's post here in a second).

Sooo, a guy in my office showed me a link to Vala. Wow...

Reference counted memory management..
Braces / semicolons based syntax (or python, if you prefer, use Genie)
Compiles down to a binary (no virtual machine)..
Its only prerequisite is glib..
Lambda expressions..

Wow wow wow.. So, I give up my untyped scalars, to get C level performance, no more ridiculous XS calls, and I get even tighter compile time safety?! Where do I sign up?

Sunday, May 31, 2009

The promises of Parrot?

I own both Parrot / Perl 6 books. I'm totally aware that they're totally out of date now, but I'm curious how much of the "original" ideas are going to hold on?

1. If you have a "Python" library, can you compile that to parrot bytecode, and then import that into Perl6 transparently and use it transparently in your Perl application?
2. Is the garbage collection system going to support a fallback "reference counted" system? Perl seems to have a "constant" burn (when it comes to memory management) that doesn't hiccup like Python, Java or .NET seems to have (without some tuning of course).
3. Will we ever be able to truly "compile" a Perl application into a binary (without statically linking the entire Parrot system into the executable?)
4. Guys that I work with claim that Parrot will fail to "be the best" because it's not based on LLVM, and because everyone else (Google, with unladen swallow) is playing the "Not invented here" card, instead of helping out Parrot. Competition is good, but it seems that Parrot / Perl has NO friends out there, did somebody burn some bridges or something?

High precision timestamps in Perl5/Perl6?

Why do Perl 5 and Perl 6 chop off a digit of precision on a high precision time call?

use strict;
use Data::Dumper;
use Time::HiRes qw( gettimeofday );

print gettimeofday() . "\n";
printf("%f\n", scalar(gettimeofday()));

Produces:

1243793974.02361
1243793974.023704


And, in Perl 6...

say time();
printf("%f\n", time());

Produces:

1243794052.38889
1243794052.391031
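On the Perl 5 side at least, the difference looks like plain number formatting rather than lost precision. On typical builds, print converts a number to a string with 15 significant digits (the equivalent of "%.15g"), while "%f" always prints six digits after the decimal point. With ten digits in front of the point, 15 significant digits leaves only five behind it. (The two lines also come from separate calls, so the clock has moved on between them.) A sketch, assuming a Perl built with ordinary double precision numbers:

```perl
use strict;
use warnings;

my $t = 1243793974.023704;    # the value from the printf line above

my $as_string = "$t";                    # what print does implicitly
my $as_15g    = sprintf("%.15g", $t);    # 15 significant digits
my $as_f      = sprintf("%f", $t);       # 6 digits after the decimal point

print "default stringification: $as_string\n";
print "sprintf %.15g:           $as_15g\n";
print "sprintf %f:              $as_f\n";
```

If you want the microseconds without any float formatting in the way, calling gettimeofday() in list context returns the seconds and microseconds as two separate integers.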

Perl 6!

I feel bad bugging people for help with Perl 6, especially when you're simply trying to compile the damn thing. Chromatic goes on and on about how doing lots of releases is really important for something to be viewed as "alive" and "progressing". I dunno. I waited with glee for the Parrot 1.0 announcement to go out, and then fumbled until this last week to get Rakudo running on my mac. No big deal, I'm an early adopter. I believe in Perl, and I know that it is going to rock when it's finished, so dealing with the headaches to get it set up are worth the trouble. But, not everyone is like me.

Anyway, I just wanted to point out this one VERY cool thing (in my eyes) that was like the second or third test program I wrote.. Heck, I'd love to make this into a test for the test suite if it isn't covered... (it addresses a weakness that Perl 5 had, that was documented)..

my $foo = "1f9";
$foo++;
say $foo;
$foo--;
say $foo;

This spits out...

1g0
1f9

A Perl 5 equivalent will say...

2
1

But Perl 5 supports a simpler increment system (sort of)...

$foo = "f9";
$foo++;
print $foo . "\n";
$foo--;
print $foo . "\n";

This dumps out:

g0
-1

Hooray Perl 6!

(I found the test for this, it's S03-operators/autoincrement.t, it doesn't cover the "1f9" case, which doesn't work in Perl 5). I'll see if I can get commit bit access to this test next week!
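For reference, the Perl 5 rule (from perlop) is that the "magic" string increment only applies when the string matches /^[a-zA-Z]*[0-9]*\z/ and has only been used in string context; anything else is incremented as a number. Decrement has no magic at all. A small sketch of the cases, with helper names I made up:

```perl
use strict;
use warnings;

# Return the value after one ++ (or one --).
sub incremented {
    my ($value) = @_;
    no warnings 'numeric';    # "1f9" is not a number; silence that warning
    $value++;
    return $value;
}

sub decremented {
    my ($value) = @_;
    no warnings 'numeric';
    $value--;
    return $value;
}

# f9, Az, zz and a9 match the magic pattern; 1f9 does not, so it is
# treated as the number 1 and becomes 2.
for my $start (qw( f9 Az zz a9 1f9 )) {
    printf "%-4s ++ gives %s\n", $start, incremented($start);
}
printf "%-4s -- gives %s\n", 'g0', decremented('g0');
```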

My heretical love for compile time safety...

I have to admit, as a first blog post, this one may not be the best to introduce myself. I program in Perl, every day, mostly for work, but it's cool work nonetheless. I work in the financial industry, with nearly 13% of the US stock market coursing through my models. We use lots of Perl to get lots of things done. Having only really seen Perl used in a web environment, I was blown away to see what Perl is really capable of.

I have always been a "use strict" guy; like many of you who would read this, I have the t-shirt from thinkgeek with that on the front. My current position introduced me to a module called "fields". It took me a while to appreciate it, and it was really only after I started writing code where I wasn't using it that I realized how awesome it is.

The point of my post is, quite simply that Perl fills a niche that no other scripting language (that I have seen) is capable of doing. It provides compile time safety on variables, as well as attributes of objects. People have accused me of having flimsy testing suites, and my only reply to that is that my codebase is so incredibly big (and most of it inherited from previous engineers) that we MUST know about the possibilities for a mistake, and are willing to pay any price to find out before we deploy. This simple validation of any changes we make drastically reduces any errors, and provides a cushion for any code path not covered by a regression or unit test.

I'd really like to be able to have compile time safety of methods, as well as their arguments. The closest thing I've found for this, typesafety, doesn't work at all for me, on my mac or on a linux box. I'm told that Perl 5.10 (which we haven't upgraded to yet) has a new method resolution system, and I'm hoping there may be a way to override the whole thing and disallow subroutine generation at runtime so I can do compile time checks. It may be just one big pipe dream, but we'll see.

So there you go. Fields is awesome, and it looks like this whole Moose thing kind of glosses over this, so I imagine that Perl 6 will as well, (if I'm wrong, please point this out!). Hash::Util::lock_keys is not the same. I want to do "perl -wc" and see my errors, right now..

Example:



use strict;
use warnings;
use Hash::Util qw( lock_keys );

my %hsh = ();
lock_keys(%hsh, qw(x y z));
$hsh{'A'} = 1;



This passes "-wc" with no problems...
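The mistake only shows up once the assignment actually executes. A small sketch that traps the run-time failure with eval to make that visible (the same hash as above; the eval wrapper is only there for the demonstration):

```perl
use strict;
use warnings;
use Hash::Util qw( lock_keys );

my %hsh;
lock_keys(%hsh, qw( x y z ));

$hsh{x} = 1;                         # allowed key: fine

# Disallowed key: this compiles cleanly and dies only when it is executed.
my $ok    = eval { $hsh{A} = 1; 1 };
my $error = $@;

print $ok ? "stored\n" : "run-time failure: $error";
```

So a code path that is never exercised by a test can carry the typo all the way to production.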

We prefer a compile time failure to occur, like this:



package FieldTest;

use strict;
use warnings;
use fields qw(a b);

sub new {
    my FieldTest $self = shift;
    unless (ref $self) {
        $self = fields::new($self);
    }
    return $self;
}

package main;

use strict;
use warnings;

my FieldTest $tmp = new FieldTest();
$tmp->{'c'} = 'whatever';
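Running "perl -wc" on that file should stop with a 'No such class field "c"' error. To show that the check really happens at compile time, here is a sketch that compiles two snippets from strings and never runs them. The compile_error helper is made up for this demonstration; string eval is only used so that the failure can be captured instead of aborting the script.

```perl
package FieldTest;

use strict;
use warnings;
use fields qw( a b );

sub new {
    my FieldTest $self = shift;
    $self = fields::new($self) unless ref $self;
    return $self;
}

package main;

use strict;
use warnings;

# Hypothetical helper: compile $code inside an anonymous sub that is never
# called, and return the compile error ('' if it compiled cleanly).
sub compile_error {
    my ($code) = @_;
    my $compiled = eval "sub { $code }";
    return defined $compiled ? '' : $@;
}

my $good = compile_error(q{ my FieldTest $obj = FieldTest->new; $obj->{a} = 1; });
my $bad  = compile_error(q{ my FieldTest $obj = FieldTest->new; $obj->{c} = 1; });

print "declared field 'a':   ", ($good eq '' ? "compiles\n" : $good);
print "undeclared field 'c': ", ($bad  eq '' ? "compiles\n" : $bad);
```

Note that the check depends on the "my FieldTest $obj" type annotation; without it Perl has no class to check the key against at compile time.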