Posts Tagged ‘perl’

Tatsumaki, or how to write a nice webapp in less than two hours

Monday, December 21st, 2009

Until today, I had a script named “lifestream.pl”. Cron triggered it once an hour to fetch various feeds from the services I use (like github, identi.ca, …), run the result through a template, and dump the output into an HTML file.

Today I was reading Tatsumaki’s code and some examples (Social and Subfeedr). Tatsumaki is a “port” of Tornado (a non-blocking web server in Python), based on Plack and AnyEvent. I thought that using it to replace my old lifestream script would be a good way to test it. Two hours later, I have a complete webapp that works (and the code is available here).

The code is really simple. First, I define a handler for my HTTP requests. As I have only one thing to do (display entries), the handler is short:

package Lifestream::Handler;   
use Moose;                     
extends 'Tatsumaki::Handler';  
 
sub get {                      
    my $self = shift;          
    my %params = %{$self->request->params};
    $self->render( 'lifestream.html', {
        memes    => $self->application->memes($params{page}),
        services => $self->application->services
    });
}
1;

For every GET request, two methods are called on the application: memes and services. memes returns the list of memes to display on the page. services returns the list of services I use (to display them in a sidebar).
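
How memes turns the page parameter into a slice of entries is not shown in this post. A minimal sketch, assuming the entries sit in an array, newest first, with a page size of 20 (paginate and the page size are illustration names and values, not part of the lifestream code):

```perl
use strict;
use warnings;

# Hypothetical helper: return the entries for a 1-based page number.
sub paginate {
    my ( $entries, $page, $per_page ) = @_;
    $per_page ||= 20;

    # Treat a missing or non-numeric page as the first page.
    $page = 1 unless defined $page && $page =~ /^[1-9][0-9]*$/;

    my $first = ( $page - 1 ) * $per_page;
    return [] if $first > $#$entries;

    my $last = $first + $per_page - 1;
    $last = $#$entries if $last > $#$entries;
    return [ @{$entries}[ $first .. $last ] ];
}

my @entries = map { "entry $_" } 1 .. 45;
my $page3 = paginate( \@entries, 3 );    # the last five entries
```

Validating the page number matters here because it comes straight from the request parameters.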

Now, as I don’t want to keep my lifestream.pl script in cron any more, I will let Tatsumaki do the polling. For this, I add a service to my app, which is just a worker.

package Lifestream::Worker;    
use Moose;                     
extends 'Tatsumaki::Service';  
use Tatsumaki::HTTPClient;     
...
sub start {
    my $self = shift;
    my $t; $t = AE::timer 0, 1800, sub {
        scalar $t;
        $self->fetch_feeds;
    };
}
....
sub fetch_feeds {
    my ($self, $url) = @_;
    Tatsumaki::HTTPClient->new->get( $url, sub {
        # do the fetch and parsing stuff
    });
}

From now on, feeds will be checked every 30 minutes (the 1800 seconds passed to AE::timer). Tatsumaki::HTTPClient is an HTTP client based on AnyEvent::HTTP.
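
The odd-looking `scalar $t;` inside the timer callback is presumably there so that the closure captures $t. An AnyEvent watcher is cancelled as soon as the last reference to it goes away, and $t would otherwise be freed when start returns. The effect can be reproduced in plain Perl with a stand-in object (FakeWatcher is a hypothetical class for illustration, not part of AnyEvent):

```perl
use strict;
use warnings;

# Stand-in for a watcher: holds a callback and counts destructions.
package FakeWatcher;
our $destroyed = 0;
sub new { my ( $class, $cb ) = @_; return bless { cb => $cb }, $class }
sub DESTROY { $destroyed++ }

package main;

sub start_without_capture {
    my $t = FakeWatcher->new( sub { 'tick' } );
    return;    # $t goes out of scope here: the watcher is destroyed
}

sub start_with_capture {
    my $t;
    # The callback mentions $t, so watcher and callback keep each other alive.
    $t = FakeWatcher->new( sub { my $keep = $t; 'tick' } );
    return;
}

start_without_capture();
my $after_first = $FakeWatcher::destroyed;     # the first watcher is gone

start_with_capture();
my $after_second = $FakeWatcher::destroyed;    # no new destruction
```

The reference cycle is deliberate: it is what keeps the timer running for the lifetime of the process.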

Let’s write the app now:

package Lifestream;            
 
use Moose;
extends "Tatsumaki::Application";
 
use Lifestream::Handler;       
use Lifestream::Worker;        
...
sub app {
    my ( $class, %args ) = @_;
    my $self = $class->new( [ '/' => 'Lifestream::Handler', ] );
    $self->config( $args{config} ); 
    $self->add_service( Lifestream::Worker->new( config => $self->config ) );
    $self;
}
...
sub memes {
...
}
 
sub services {
....
}

The memes and services methods called from the handler are defined here. In the app method, I “attach” the “/” path to the handler, and I add the service.

And to launch the app:

my $app = Lifestream->app( config => LoadFile($config) );
require Tatsumaki::Server;      
Tatsumaki::Server->new(
    port => 9999,
    host => 0,
)->run($app);

And that’s it, I now have a nice webapp in only about 200 lines of code. I will keep playing with Tatsumaki as I have more ideas (and probably with Subfeedr too). Thanks to miyagawa for all this code.

MooseX::Net::API

Sunday, December 20th, 2009

Net::Twitter

I’ve been asked at $work to write an API client for backtype, as we plan to integrate it in one of our services. A couple of days before, I was reading the Net::Twitter source code, and I found the way semifor wrote it interesting.

Basically, what Net::Twitter does is this: each API method is declared with a twitter_api_method call, and the only code for the method is its API specification. Let’s look at the home timeline method:

twitter_api_method home_timeline => (
    description => <<'',
Returns the 20 most recent statuses, including retweets, posted by the
authenticating user and that user's friends. This is the equivalent of
/timeline/home on the Web.
 
    path      => 'statuses/home_timeline',
    method    => 'GET',
    params    => [qw/since_id max_id count page/],
    required  => [],
    returns   => 'ArrayRef[Status]',
);

The twitter_api_method method is exported with Moose::Exporter. It generates a sub called home_timeline that is added to the class.
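
To give an idea of how a declaration can turn into a method, here is a plain Perl sketch that installs a sub into the calling package from a specification. It is not Net::Twitter’s implementation (which relies on Moose); My::Sugar, My::Client and api_method are hypothetical names, and the generated method only describes the request instead of sending it:

```perl
use strict;
use warnings;

package My::Sugar;    # hypothetical, stands in for the exporting module

# Install a method named $name into the calling package.
sub api_method {
    my ( $name, %spec ) = @_;
    my $class = caller;
    no strict 'refs';
    *{"${class}::${name}"} = sub {
        my ( $self, %args ) = @_;
        # Keep only the parameters listed in the specification.
        my %allowed = map { $_ => 1 } @{ $spec{params} || [] };
        my %query = map { $_ => $args{$_} } grep { $allowed{$_} } keys %args;
        return { method => $spec{method}, path => $spec{path}, query => \%query };
    };
    return;
}

package My::Client;    # hypothetical client class
sub new { return bless {}, shift }

My::Sugar::api_method(
    home_timeline => (
        path   => 'statuses/home_timeline',
        method => 'GET',
        params => [qw/since_id max_id count page/],
    )
);

package main;

my $request = My::Client->new->home_timeline( count => 5, bogus => 1 );
# $request->{query} contains count, but not the undeclared bogus
```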

MooseX::Net::API

As I found this approach nice and simple, I thought about writing a little framework to easily write API clients this way. I will show how I wrote a client for the Backtype API using it (I’ve written some other clients for private APIs at work too).

Backtype API

First we define our class:

package Net::Backtweet;        
 
use Moose;
use MooseX::Net::API;

MooseX::Net::API exports two methods: net_api_declare and net_api_method. The first one is for all the parameters that are common to every method. For Backtype, I get this:

net_api_declare backtweet => (
    base_url    => 'http://backtweets.com',
    format      => 'json',
    format_mode => 'append',
);

This sets:

  • the base URL for the API
  • the format, here JSON
  • the format mode: some APIs use an extension appended to the name of the method to determine the format, and “append” does this.

Right now three formats are supported: xml, json and yaml. Two modes are supported: append and content-type.

Now the net_api_method method:

net_api_method backtweet_search => (
    path     => '/search',
    method   => 'GET',
    params   => [qw/q since key/],  
    required => [qw/q key/],
    expected => [qw/200/],
);

  • path: path for the method (required)
  • method: how to access this resource (GET, POST, PUT and DELETE are supported) (required)
  • params: list of parameters to access this resource (required)
  • required: which keys are required
  • expected: list of accepted HTTP codes
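
As an illustration of how these declarations could fit together, the sketch below checks the required parameters and builds the request URL in “append” mode. It is an assumption about the general idea, not MooseX::Net::API’s actual code; build_url is a hypothetical helper:

```perl
use strict;
use warnings;

# Hypothetical helper: check required parameters, then build the URL
# for a declared method.
sub build_url {
    my ( $api, $spec, %args ) = @_;

    for my $name ( @{ $spec->{required} } ) {
        die "missing required parameter: $name\n" unless defined $args{$name};
    }

    my $url = $api->{base_url} . $spec->{path};

    # "append" mode: the format becomes an extension of the path.
    $url .= '.' . $api->{format} if $api->{format_mode} eq 'append';

    my @pairs;
    for my $name ( grep { defined $args{$_} } @{ $spec->{params} } ) {
        # Percent-encode everything outside the unreserved set.
        ( my $value = $args{$name} ) =~ s/([^A-Za-z0-9\-._~])/sprintf '%%%02X', ord $1/ge;
        push @pairs, "$name=$value";
    }
    $url .= '?' . join( '&', @pairs ) if @pairs;
    return $url;
}

my %api  = ( base_url => 'http://backtweets.com', format => 'json', format_mode => 'append' );
my %spec = ( path => '/search', params => [qw/q since key/], required => [qw/q key/] );

my $url = build_url( \%api, \%spec, 'q' => 'http://lumberjaph.net', key => 'foo' );
print "$url\n";
```

A call without key dies before any request is attempted, which is the point of declaring required parameters.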

To use it:

my $backtype = Net::Backtweet->new();
my $res = $backtype->backtweet_search(q => "http://lumberjaph.net", key => "foo");
warn Dump $res->{tweets};

MooseX::Net::API implementation

Now, let’s see what the framework does. The net_api_declare method adds various attributes to the class:

  • api_base_url: base URL of the API
  • api_format: format for the query
  • api_format_mode: how the format is used (append or content-type)
  • api_authentication: whether the API requires authentication
  • api_username: the username for accessing the resource
  • api_password: the password

It will also apply two roles, for serialization and deserialization, unless you provide your own roles for this. You can provide your own methods for the useragent and authentication too (the module only does basic authentication).

For the net_api_method method, you can override the authentication setting (in case only some resources require authentication). You can also override the default generated code.

In case of an error, a MooseX::Net::API::Error will be thrown.

Conclusion

Right now, this module is not finished. I’m looking for suggestions (what should be added or done better, how I can improve things, …). I’m not aiming to handle every possible API, but at least most of the REST APIs available. I’ve uploaded a first version of MooseX::Net::API and Net::Backtype to CPAN, and the code is available on github.

For testing purposes, I’ve set up a dumb REST service here (the code is here). I will update this service to add more tests to MX::Net::API.

Riak, Perl and KiokuDB

Sunday, December 13th, 2009

As I was looking for a system to store documents at $work, one of my coworkers pointed me to Riak. I’m looking for a solution of this type to store various kinds of documents, from HTML pages to JSON. I need a system that is distributed, fault tolerant, and that works with Perl.

So Riak is a document-based database: it’s a key-value store, NoSQL, exposes a REST interface, and is written in Erlang. You can read more about it here or watch an introduction here. Like CouchDB, Riak provides a REST interface, so you don’t have to write any Erlang code.

One of the nice things with Riak is that it lets you define the N, R and W values for each operation. These values are:

  • N: the number of replicas of each value to store
  • R: the number of replicas required to perform a read operation
  • W: the number of replicas needed for a write operation
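
A general rule of thumb for quorum settings like these (a property of quorum systems in general, not something taken from the Riak documentation): when R + W > N, any set of R replicas and any set of W replicas must share at least one node, so a read touches at least one replica that took part in the latest successful write. A quick check with hypothetical values:

```perl
use strict;
use warnings;

# True when any R replicas and any W replicas out of N must share a node.
sub quorums_overlap {
    my ( $n, $r, $w ) = @_;
    die "R and W must be between 1 and N\n"
        if $r < 1 || $w < 1 || $r > $n || $w > $n;
    return $r + $w > $n ? 1 : 0;
}

printf "N=3 R=2 W=2: %s\n", quorums_overlap( 3, 2, 2 ) ? 'overlap' : 'no overlap';
printf "N=3 R=1 W=1: %s\n", quorums_overlap( 3, 1, 1 ) ? 'overlap' : 'no overlap';
```

Lower R and W values trade that overlap for faster responses.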

Riak comes with libraries for Python, Ruby, PHP and even JavaScript, but not for Perl. As all these libraries just communicate with Riak via the REST interface, I’ve started to write one using AnyEvent::HTTP, and also a backend for KiokuDB.

Installing and using Riak

If you are interested in Riak, you can install it easily. First, you will need the Erlang VM. On Debian, a simple

sudo aptitude install erlang

installs everything you need. The next step is to install Riak:

wget http://hg.basho.com/riak/get/riak-0.6.2.tar.gz
tar xzf riak-0.6.2.tar.gz
cd riak
make
export RIAK=`pwd`

Now, you can start to use it with

./start-fresh.sh config/riak-demo.erlenv

or if you want to test it in cluster mode, you can write a configuration like this:

{cluster_name, "default"}.
{ring_state_dir, "priv/ringstate"}.
{ring_creation_size, 16}.
{gossip_interval, 60000}.
{storage_backend, riak_fs_backend}.
{riak_fs_backend_root, "/opt/data/riak/"}.
{riak_cookie, riak_demo_cookie}.
{riak_heart_command, "(cd $RIAK; ./start-restart.sh $RIAK/config/riak-demo.erlenv)"}.
{riak_nodename, riakdemo}.
{riak_hostname, "192.168.0.11"}.
{riak_web_ip, "192.168.0.11"}.
{riak_web_port, 8098}.
{jiak_name, "jiak"}.
{riak_web_logdir, "/tmp/riak_log"}.

Copy this config to a second server and edit it to replace riak_hostname and riak_nodename. On the first server, start Riak as shown previously, then on the second, with

./start-join.sh config/riak-demo.erlenv 192.168.0.11

where the IP address is the address of the first node in your cluster.

Let’s check if everything works:

curl -X PUT -H "Content-type: application/json" \
    http://192.168.0.11:8098/jiak/blog/lumberjaph/ \
    -d "{\"bucket\":\"blog\",\"key\":\"lumberjaph\",\"object\":{\"title\":\"I'm a lumberjaph, and I'm ok\"},\"links\":[]}"
 
curl -i http://192.168.0.11:8098/jiak/blog/lumberjaph/

will output (after the HTTP headers):

{"object":{"title":"I'm a lumberjaph, and I'm ok"},"vclock":"a85hYGBgzGDKBVIsbGubKzKYEhnzWBlCTs08wpcFAA==","lastmod":"Sun, 13 Dec 2009 20:28:04 GMT","vtag":"5YSzQ7sEdI3lABkEUFcgXy","bucket":"blog","key":"lumberjaph","links":[]}
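
The same document can be produced and read from Perl without hand-escaping quotes. This sketch uses the core JSON::PP module and a shortened copy of the response (the vclock, vtag and lastmod fields are left out), and does not talk to a Riak server:

```perl
use strict;
use warnings;
use JSON::PP qw(encode_json decode_json);

# Build the document sent with PUT.
my $body = encode_json(
    {
        bucket => 'blog',
        key    => 'lumberjaph',
        object => { title => "I'm a lumberjaph, and I'm ok" },
        links  => [],
    }
);

# A shortened copy of the response shown above.
my $response = '{"object":{"title":"I\'m a lumberjaph, and I\'m ok"},'
    . '"bucket":"blog","key":"lumberjaph","links":[]}';

my $doc = decode_json($response);
print "$doc->{bucket}/$doc->{key}: $doc->{object}{title}\n";
```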

Using Riak with Perl and KiokuDB

I need to store various things in Riak: HTML pages, JSON data, and objects using KiokuDB. I’ve started to write a client for Riak with AnyEvent, so I can do simple operations at the moment (listing information about a bucket, defining a new bucket with a specific schema, storing, retrieving and deleting documents). To create a client:

my $client = AnyEvent::Riak->new(
    host => 'http://127.0.0.1:8098',
    path => 'jiak',
);

As Riak exposes these values to you, you can also set R, W and DW when creating the client:

my $client = AnyEvent::Riak->new(
    host => 'http://127.0.0.1:8098',
    path => 'jiak',            
    r    => 2,
    w    => 2,                 
    dw   => 2,
);

where:

  • the W and DW values mean that the request returns as soon as at least W nodes have received the request, and at least DW nodes have stored it in their storage backend.
  • with the R value, the request returns as soon as R nodes have responded with a value or an error.

You can also set these values when calling fetch, store and delete. By default, the value is set to 2.

So, if you want to store a value, retrieve it, then delete it, you can do:

my $store = $client->store(                                           
    { bucket => 'foo', key => 'bar', object => { baz => 1 }, } )->recv;    
my $fetch  = $client->fetch( 'foo', 'bar' )->recv;
my $delete = $client->delete( 'foo', 'bar' )->recv;

If there is an error, the condvar’s croak method from AnyEvent is used, so recv will die. You may prefer to do this:

use 5.010;
use JSON;
use Try::Tiny;
try {
    my $fetch = $client->fetch('foo', 'baz')->recv;
}
catch {
    my $err = decode_json $_;
    say "error: code => " . $err->[0] . " reason => " . $err->[1];
};

The error contains an array, with the first value the HTTP code, and the second value the reason of the error given by Riak.
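
If the thrown value is ever not the expected JSON array (a connection failure, for example), decoding it would itself die. A defensive variant might look like this; describe_error is a hypothetical helper, and the failures are simulated with die rather than coming from a Riak client:

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json encode_json);

# Hypothetical helper: turn whatever was thrown into a readable message.
sub describe_error {
    my ($error) = @_;
    my $decoded = eval { decode_json($error) };
    return "error: code => $decoded->[0] reason => $decoded->[1]"
        if ref($decoded) eq 'ARRAY';
    return "error: $error";    # not the expected JSON array
}

# Simulated failures; a real one would come from ->recv.
my $riak_style = eval { die encode_json( [ 404, 'not found' ] ) . "\n"; 1 } ? '' : $@;
my $other      = eval { die "connection refused\n"; 1 } ? '' : $@;

print describe_error($riak_style), "\n";
print describe_error($other);
```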

At the moment, the KiokuDB backend is not complete, but if you want to start to play with it, all you need to do is:

my $dir = KiokuDB->new(
    backend => KiokuDB::Backend::Riak->new(
        db => AnyEvent::Riak->new(      
            host => 'http://localhost:8098',
            path => 'jiak',
        ),
        bucket => 'kiokudb',            
    ),
);
 
$dir->txn_do(sub { $dir->insert($key => $object)});

sd : the peer to peer bug tracking system

Tuesday, November 17th, 2009

SD is a peer-to-peer bug tracking system built on top of Prophet. Prophet is “a grounded, semirelational, peer to peer replicated, disconnected, versioned, property database with self-healing conflict resolution”. SD can be used on its own or on top of an existing bug tracking system (like RT, redmine or github), and it plays nicely with git.

Why should you use SD? Well, at $work we are using redmine as our ticket tracker. I spend a good part of my time in a terminal, and checking the ticket system, adding a ticket, etc., using the browser is annoying. I prefer something that I can use in my terminal and edit with my $EDITOR. So if you recognize yourself in this description, you might want to take a look at SD.

In the contrib directory of the SD distribution, you will find a SD ticket syntax file for vim.

how to do some basic stuff with sd

We will start by initializing a database. By default

sd init

will create a .sd directory in your $HOME. If you want to create it in a specific path, you need to set SD_REPO in your environment:

SD_REPO=~/code/myproject/sd sd init

The init command creates an sqlite database and a config file. The config file is in the same format as the one used by git.

Now we can create a ticket:

SD_REPO=~/code/myproject/sd sd ticket create

This will open your $EDITOR; the parts you need to edit are indicated. After editing this file, you will get something like this:

Created ticket 11 (437b823c-8f69-46ff-864f-a5f74964a73f)
Created comment 12 (f7f9ee13-76df-49fe-b8b2-9b94f8c37989)

You can view the created ticket:

SD_REPO=~/code/myproject/sd sd ticket show 11

and the content of your ticket will be displayed.

You can list and filter your tickets:

SD_REPO=~/code/myproject/sd sd ticket list
SD_REPO=~/code/myproject/sd sd search --regex foo

You can edit the SD configuration using the config tool or by editing the file directly. SD looks for three files: /etc/sdrc, $HOME/.sdrc, and the config file in your replica (in our example, ~/code/myproject/sd/config).

For changing my email address, I can do it this way:

SD_REPO=~/code/myproject/sd sd config user.email-address franck@lumberjaph.net

or directly

SD_REPO=~/code/myproject/sd sd config edit

and update the user section.

sd with git

SD provides a script for git: git-sd.

Let’s start by creating a git repository:

mkdir ~/code/git/myuberproject
cd ~/code/git/myuberproject
git init

SD comes with a git hook named “git-post-commit-close-ticket” (in the contrib directory). We will copy this script to .git/hooks/post-commit.

Now we can initialize our sd database:

git-sd init

git-sd will try to find which email you have chosen for this project using git config, and use the same address for its configuration.

Let’s write some code for our new project, in hello.pl:

#!/usr/bin/env perl
use strict;
use warnings;
print "hello, world\n";

then add and commit it:

git add hello.pl
git commit -m "first commit" hello.pl

Now we can create a new ticket:

git-sd ticket create # create a ticket to replace print with say

We note the UUID of the ticket. In my example, the following output is produced:

Created ticket 11 (92878841-d764-4ac9-8aae-cd49e84c1ffe)
Created comment 12 (ddb1e56e-87cb-4054-a035-253be4bc5855)

so my UUID is 92878841-d764-4ac9-8aae-cd49e84c1ffe.

Now, I fix my bug:

#!/usr/bin/env perl
use strict;
use 5.010;
use warnings;
say "hello, world";

and commit it

git commit -m "Closes 92878841-d764-4ac9-8aae-cd49e84c1ffe" hello.pl

If I now do a

git-sd ticket show 92878841-d764-4ac9-8aae-cd49e84c1ffe

the ticket will be shown as closed.
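
The hook itself ships with SD and is not reproduced here. As a rough idea of what such a hook has to do, this sketch pulls ticket UUIDs out of a commit message. The regular expression and the closing_tickets name are assumptions for illustration, not the contrib script’s code:

```perl
use strict;
use warnings;

# Return the UUIDs mentioned after "Closes" in a commit message.
sub closing_tickets {
    my ($message) = @_;
    return $message
        =~ /\bcloses\s+([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\b/gi;
}

my @uuids = closing_tickets('Closes 92878841-d764-4ac9-8aae-cd49e84c1ffe');
print "would close: $_\n" for @uuids;
```

A message without a matching “Closes” line yields an empty list, so ordinary commits leave tickets untouched.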

sd with github

Let’s say you want to track issues from a project (I will use Plack for this example) that is hosted on github.

git clone git://github.com/miyagawa/Plack.git
git-sd clone --from "github:http://github.com/miyagawa/Plack"
# it's the same as
git-sd clone --from "github:miyagawa/Plack"
# or if you don't want to be prompted for username and password each time
git-sd clone --from github:http://githubusername:apitoken@github.com/miyagawa/Plack.git

It will ask for your github username and your API token, and clone the database.

Later, you can publish your sd database like this:

git-sd push --to "github:http://github.com/$user/$project"

Now you can code offline with git, and open/close tickets using SD :)

Modules I like : Devel::Declare

Monday, November 9th, 2009

For $work, I’ve been working on a job queue system, using Moose, Catalyst (for a REST API) and DBIx::Class to store the jobs and some metadata (yeah I know, there are not enough job queue systems already, the world really needs a new one …).

Basically, I’ve got a XXX::Worker class that all the workers extend. This class provides methods for fetching jobs, adding a new job, marking a job as failed, retrying, …

The main loop in the XXX::Worker class looks like this:

# $context is a hashref with some info the job or method may need
while(1) {
    my @jobs = $self->fetch_jobs();
    foreach my $job (@jobs) {
        my $method = $job->{funcname};
        $self->$method($context, $job);
    }
    $self->wait;
}

and a worker looks like this:

package MyWorker;
use Moose;
extends 'XXX::Worker';
 
sub foo {
    my ($self, $context, $job) = @_;
    # do something
    $self->job_success();
}
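
Dispatching on a method name taken from the job means a job with an unknown funcname would kill the loop. A self-contained sketch of the same dispatch with a guard, using plain Perl classes instead of Moose (Demo::Worker, Demo::MyWorker and run_once are hypothetical names):

```perl
use strict;
use warnings;

package Demo::Worker;    # hypothetical stand-in for XXX::Worker
sub new { return bless { done => [] }, shift }

# Run each job through the method named in its funcname field.
sub run_once {
    my ( $self, $context, @jobs ) = @_;
    for my $job (@jobs) {
        my $method = $self->can( $job->{funcname} );
        if ( !$method ) {
            warn "no handler for $job->{funcname}\n";
            next;
        }
        $self->$method( $context, $job );
    }
    return $self->{done};
}

package Demo::MyWorker;
our @ISA = ('Demo::Worker');

sub foo {
    my ( $self, $context, $job ) = @_;
    push @{ $self->{done} }, "foo handled job $job->{id}";
}

package main;

my $done = Demo::MyWorker->new->run_once(
    {},
    { id => 1, funcname => 'foo' },
    { id => 2, funcname => 'missing' },
);
print "$_\n" for @$done;
```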

But as I’m using Moose, I want to add more sugar to the syntax, so that writing a new worker is much easier.

Here comes Devel::Declare.

The syntax I want for my worker is this one:

work foo {
    $self->logger->info("start to work on job");
    # do something with $job
}
 
work bar {
    # do something with $job
}
 
success foo {
    $self->logger->info("woot job success");
}
 
fail bar {
    $self->logger->info("ho noez this one failed");
}

With ‘work‘ I write the code the worker will execute for a task; with ‘success‘, code that will be executed after a job is marked as successful; and with ‘fail‘, code for when the job fails.

I will show how to add the ‘work‘ keyword. I start by writing a new package, XXX::Meta:
 
package XXX::Meta;
 
use Moose;
use Moose::Exporter;
use Moose::Util::MetaRole;
 
use Devel::Declare;
 
use XXX::Meta::Class;
use XXX::Keyword::Work;
 
Moose::Exporter->setup_import_methods();
 
sub init_meta {
    my ( $me, %options ) = @_;
 
    my $for = $options{for_class};
 
    XXX::Keyword::Work->install_methodhandler( into => $for, );
 
    Moose::Util::MetaRole::apply_metaclass_roles(
        for_class       => $for,
        metaclass_roles => ['XXX::Meta::Class'],
    );
 
}
 
1;

The init_meta method is provided by Moose: (from the POD)

The init_meta method sets up the metaclass object for the class specified by for_class. This method injects a meta accessor into the class so you can get at this object. It also sets the class’s superclass to base_class, with Moose::Object as the default.

So I inject into the class that will use XXX::Meta a new metaclass, XXX::Meta::Class.

Let’s take a look at XXX::Meta::Class:

package XXX::Meta::Class;
 
use Moose::Role;
use Moose::Meta::Class;
use MooseX::Types::Moose qw(Str ArrayRef ClassName Object);
 
has work_metaclass  => (
    is      => 'ro',
    isa     => Object,
    builder => '_build_metaclass',
    lazy    => 1,
);
 
has 'local_work' => (
    traits     => ['Array'],
    is         => 'ro',
    isa        => ArrayRef [Str],
    required   => 1,
    default    => sub { [] },
    auto_deref => 1,
    handles    => { '_add_work' => 'push', }
);
 
sub _build_metaclass {
    my $self = shift;
    return Moose::Meta::Class->create_anon_class(
        superclasses => [ $self->method_metaclass ],
        cache        => 1,
    );
}
 
sub add_local_method {
    my ( $self, $method, $name, $code ) = @_;
 
    my $method_name = $method . "_" . $name;
    my $body        = $self->work_metaclass->name->wrap(
        $code,
        original_body => $code,
        name          => $method_name,
        package_name  => $self->name,
    );
 
    my $method_add = "_add_" . $method;
    $self->add_method( $method_name, $body );
    $self->$method_add($method_name);
}
 
1;

Here I add ‘local_work‘ to the ->meta provided by Moose; it is an array that contains the names of all my ‘work‘ methods. So each time I do something like

work foo {
}
 
work bar {
}

in my worker, I add this method to ->meta->local_work.

And the class for our keyword work:

package XXX::Keyword::Work;
 
use strict;
use warnings;
 
use Devel::Declare ();
use Sub::Name;
 
use base 'Devel::Declare::Context::Simple';
 
sub install_methodhandler {
    my $class = shift;
    my %args  = @_;
    {
        no strict 'refs';
        *{ $args{into} . '::work' } = sub (&) { };
    }
 
    my $ctx = $class->new(%args);
    Devel::Declare->setup_for(
        $args{into},
        {
            work => {
                const => sub { $ctx->parser(@_) }
            },
        }
    );
}
 
sub parser {
    my $self = shift;
    $self->init(@_);
 
    $self->skip_declarator;
    my $name = $self->strip_name;
    $self->strip_proto;
    $self->strip_attrs;
 
    my $inject = $self->scope_injector_call();
    $self->inject_if_block(
        $inject . " my (\$self, \$context, \$job) = \@_; " );
 
    my $pack = Devel::Declare::get_curstash_name;
    Devel::Declare::shadow_sub(
        "${pack}::work",
        sub (&) {
            my $work_method = shift;
            $pack->meta->add_local_method( 'work', $name, $work_method );
        }
    );
    return;
}
 
1;

install_methodhandler adds the work keyword, which takes a block of code. The declaration is handed to the parser, which adds more sugar. With inject_if_block, I inject the following line

my ($self, $context, $job) = @_;

as these will always be the three arguments of a work method.
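
Leaving Devel::Declare aside, the net effect of a ‘work foo { ... }‘ declaration is roughly the following plain Perl: a sub that starts with the injected line is installed as work_foo and its name is recorded. This is a sketch of the idea only; Demo::Registry and Demo::Worker are hypothetical stand-ins for the metaclass role and a worker class:

```perl
use strict;
use warnings;

package Demo::Registry;    # hypothetical stand-in for the metaclass role
my %local_work;            # class name => list of registered method names

sub add_local_method {
    my ( $class, $kind, $name, $code ) = @_;
    my $method_name = "${kind}_${name}";
    no strict 'refs';
    *{"${class}::${method_name}"} = $code;    # install the method
    push @{ $local_work{$class} }, $method_name;
    return $method_name;
}

sub local_work { return @{ $local_work{ $_[0] } || [] } }

package Demo::Worker;

# Roughly what "work foo { ... }" stands for after the rewrite.
Demo::Registry::add_local_method(
    __PACKAGE__, 'work', 'foo',
    sub {
        my ( $self, $context, $job ) = @_;    # the injected line
        return "working on job $job->{id}";
    }
);

package main;

my @registered = Demo::Registry::local_work('Demo::Worker');
my $result     = Demo::Worker->work_foo( {}, { id => 7 } );
```

What Devel::Declare adds on top is purely syntactic: the keyword, the bare name and the block replace the explicit call and the anonymous sub.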

Now, for each new worker, I write something like this:

package MyWorker;
use Moose;
extends 'XXX::Worker';
use XXX::Meta;
 
work foo {
}

The next step is to find the best way to reduce the first four lines to two.

(Some of this code is ripped from other modules that use Devel::Declare. The best way to learn what you can do with this module is to read the code of other modules that use it.)