Franck Cuny @franckcuny

Software and Operation engineer.

A simple feed aggregator with modern Perl - part 1

Following Matt's post about people not blogging enough about Perl, I've decided to try to post once a week about Perl. So I will start by a series of articles about what we call modern Perl. For this, I will write a simple feed agregator (using Moose, DBIx::Class, KiokuDB, some tests, and a basic frontend (with Catalyst). This article will be split in four parts:

  • the first one will explain how to create a schema using DBIx::Class
  • the second will be about the aggregator. I will use Moose* and KiokuDB
  • the third one will be about writing tests with Test::Class
  • the last one will focus on Catalyst

The code of these modules will be available on my git server at the same time each article is published.

I'm not showing you how to write the perfect feed aggregator. The purpose of this series of articles is only to show you how to write a simple aggregator using modern Perl.

The database schema

We will use a database to store a list of feeds and feed entries. As I don't like, no, wait, I hate SQL, I will use an ORM for accessing the database. For this, my choice is DBIx::Class, the best ORM available in Perl.

If you never have used an ORM before, ORM stands for Object Relational Mapping. It's a SQL to OO mapper that creates an abstract encapsulation of your databases operations. DBIx::Class' purpose is to represent "queries in your code as perl-ish as possible.

For a basic aggregator we need:

  • a table for the list of feeds
  • a table for the entries

We will create these two tables using DBIx::Class. For this, we first create a Schema module. I use Module::Setup, but you can use Module::Starter or whatever you want.

% module-setup MyModel
% cd MyModel
% vim lib/MyModel.pm
package MyModel;
use base qw/DBIx::Class::Schema/;
__PACKAGE__->load_classes();
1;

So, we have just created a schema class. The load_classes method loads all the classes that reside under the MyModel namespace. We now create the result class MyModel::Feed in lib/MyModel/Feed.pm:

package MyModel::Feed;
use base qw/DBIx::Class/;
__PACKAGE__->load_components(qw/Core/);
__PACKAGE__->table('feed');
__PACKAGE__->add_columns(qw/ feedid url /);
__PACKAGE__->set_primary_key('feedid');
__PACKAGE__->has_many(entries => 'MyModel::Entry', 'feedid');
1;

Pretty self explanatory: we declare a result class that uses the table feed, with two columns: feedid and url, feedid being the primary key. The has_many method declares a one-to-many relationship.

Now the result class MyModel::Entry in lib/MyModel/Entry.pm:

package MyModel::Entry;
use base qw/DBIx::Class/;
__PACKAGE__->load_components(qw/Core/);
__PACKAGE__->table('entry');
__PACKAGE__->add_columns(qw/ entryid permalink feedid/);
__PACKAGE__->set_primary_key('entryid');
__PACKAGE__->belongs_to(feed => 'MyModel::Feed', 'feedid');
1;

Here we declare feed as a foreign key, using the column name feedid.

You can do a more complex declaration of your schema. Let's say you want to declare the type of your fields, you can do this:

__PACKAGE__->add_columns(
    'permalink' => {
        'data_type'         => 'TEXT',
        'is_auto_increment' => 0,
        'default_value'     => undef,
        'is_foreign_key'    => 0,
        'name'              => 'url',
        'is_nullable'       => 1,
        'size'              => '65535'
    },
);

DBIx::Class also provides hooks for the deploy command. If you are using MySQL, you may need a InnoDB table. In your class, you can add this:

sub sqlt_deploy_hook {
    my ($self, $sqlt_table) = @_;
    $sqlt_table->extra(
        mysql_table_type => 'InnoDB',
        mysql_charset    => 'utf8'
    );
}

Next time you call deploy on this table, the hook will be sent to SQL::Translator::Schema, and force the type of your table to InnoDB, and the charset to utf8.

Now that we have a DBIx::Class schema, we need to deploy it. For this, I always do the same thing: create a bin/deploy_mymodel.pl script with the following code:

use strict;
use feature 'say';
use Getopt::Long;
use lib('lib');
use MyModel;

GetOptions(
    'dsn=s'    => \my $dsn,
    'user=s'   => \my $user,
    'passwd=s' => \my $passwd
) or die usage();

my $schema = MyModel->connect($dsn, $user, $passwd);
say 'deploying schema ...';
$schema->deploy;

say 'done';

sub usage {
    say
        'usage: deploy_mymodel.pl --dsn $dsn --user $user --passwd $passwd';
}

This script will deploy for you the schema (you need to create the database first if using with mysql).

Executing the following command perl bin/deploy_mymodel.pl --dsn dbi:SQLite:model.db will generate a model.db database so we can work and test it. Now that we got our (really) simple MyModel schema, we can start to hack on our aggregator.

The code is available on my git server.

while using DBIx::Class, you may want to take a look at the generated queries. For this, export DBIC_TRACE=1 in your environment, and the queries will be printed on STDERR.