NAME
VERSION
SHORT DESCRIPTION
LONG DESCRIPTION
- Care And Feeding of Template Data Structures
LICENSE
CONTACT

NAME

blipper.cgi - sample CGI for box-stuffer

VERSION

This file documents blipper.cgi version .9

SHORT DESCRIPTION

box-stuffer is a mechanism for storing message attributes from multiple MHonArc archives in a SQL database. blipper.cgi is a sample CGI to show off some applications of the box-stuffer data model, and to serve as a framework for further customization. At the moment it is somewhat more successful at the former.

The silly name stems from my desire that this sample not be used as-is. In fact, it can't be, as you need to at least customize a few variables located at the head of the script.

LONG DESCRIPTION

blipper tragically fell victim to sprawl. I originally intended to divorce the templates from the code, but modifying the templates still turns out to require some familiarity with Perl data structures in general, and these data structures in particular.

I would be immensely pleased to discuss simpler/better CGI strategies; I'm not wedded to this example by any means (though I'll probably keep sharpening it, rather than chart a new course, until someone comes along with a better way).

Templates are implemented by Text::Template. Quick perusal of that module's (excellent) documentation is recommended.

 http://www.plover.com/~mjd/perl/Template/Manual.html

In brief, Text::Template provides for templates which employ all the standard Perl looping constructs and have access to Perl variables. As far as blipper is concerned, all data intended for use by the box-stuffer templates is placed in the TPL package. All templates live at the end of the script.
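
As a rough illustration of the arrangement just described, here is a minimal sketch of filling a Text::Template template from variables in the TPL package. The variable names ($title, @hits) and the template text are invented for this example and do not come from blipper.cgi; only the Text::Template calls (new, fill_in with the PACKAGE option, and $OUT) are the module's own.

```perl
use strict;
use warnings;
use Text::Template;

# Data for the template lives in the TPL package.
# These particular variables are made up for the example.
{
    no warnings 'once';
    $TPL::title = 'Search results';
    @TPL::hits  = ('first hit', 'second hit');
}

# Code between braces is evaluated as Perl. A fragment is replaced by its
# value, or by whatever it appended to $OUT.
my $source = <<'END_TEMPLATE';
<h1>{ $title }</h1>
{ foreach my $hit (@hits) { $OUT .= "<p>$hit</p>\n"; } }
END_TEMPLATE

my $template = Text::Template->new(TYPE => 'STRING', SOURCE => $source)
    or die "Couldn't construct template: $Text::Template::ERROR";

# PACKAGE => 'TPL' makes $title and @hits resolve to $TPL::title and @TPL::hits.
my $page = $template->fill_in(PACKAGE => 'TPL');
die "Couldn't fill in template: $Text::Template::ERROR" unless defined $page;

print $page;
```

Keeping everything in one package means the template never needs to know which part of the script produced the data.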

There are currently four templates in the CGI. Aside from the default, which provides a search form, they're best introduced in the context of the four things that the CGI knows to look for in the database:

message subjects:
    ``default_template'', which displays results of subject searches
email addresses or author names:
    ``first_person_search_template'', which verifies possible matches on name or email address searches
    ``second_person_search_template'', which displays results of person searches
message-IDs:
    the browser is redirected to a URL, without a template

Care And Feeding of Template Data Structures

This section details the data structures made available to the templates.

The most basic, an array of hashes providing a straight dump of whatever data the database server returned, is always generated:

  @data

Each hash in @data corresponds to a single row returned from the database server, and each key represents a field. Rows are loaded in the order the SQL query returns them; the query also determines precisely which fields come back. The names of the hash keys are set by the call to stuff_data, the subroutine that makes the database's output accessible to the templates.

So the first two elements in @data might plausibly look like this:

  @data = (

    {
        dates      => "1999-08-02 12:22:12",
        names      => "Alice Andover",
        addresses  => 'alice@abba.org',    # single quotes, so @abba isn't interpolated
        subjects   => "Here's what I think about Abba",
        filenames  => "/usr/home/lists/archive/people-thinking/1999/08/",
        namespaces => "people-thinking"
    },

    {
        dates      => "1999-08-05 08:12:12",
        names      => "Bob Borland",
        addresses  => 'bob@babble.org',
        subjects   => "Here's where you can stick what you think about Abba",
        filenames  => "/usr/home/lists/archive/people-thinking/1999/08/",
        namespaces => "people-thinking"
    },

    # ... and so on, one hash per row

  );

Extracting data from @data is pretty straightforward: loop over the array within the template, and ask for whichever fields you need at the moment:

 foreach my $row (@data) {    # each $row is a hash reference, one per row
     print "$row->{names} wrote $row->{subjects} on $row->{dates}\n";
 }

@data's hash keys are determined entirely by the call to the ``stuff_data'' function within the CGI. For example:

 &stuff_data($db_results, "dates", "subjects", "filenames", "names");

This call is the last step before the template is executed. $db_results is a reference to an array holding every row the database returned. The remaining parameters name the fields in the order the SQL query requested them, and they become the hash keys in @data. If you misspell one of the important hash keys (like ``names''), the script will break.
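
To make the mapping from rows to hashes concrete, here is a minimal sketch of the kind of work stuff_data does. It is not the code from blipper.cgi: the subroutine name stuff_data_sketch is hypothetical, and it assumes that $db_results is a reference to an array of array references (one inner array per row), which is one common shape for database results in Perl.

```perl
use strict;
use warnings;

# Hypothetical stand-in for stuff_data: pairs each column of each row with
# the key name given in the same position, and stores the result in @TPL::data.
sub stuff_data_sketch {
    my ($db_results, @keys) = @_;
    @TPL::data = ();
    foreach my $row (@$db_results) {
        my %record;
        @record{@keys} = @$row;    # hash slice: key N gets column N
        push @TPL::data, \%record;
    }
    return scalar @TPL::data;      # number of rows stored
}

# Two made-up rows, in the column order a query might return them.
my $db_results = [
    [ "1999-08-02 12:22:12", "Here's what I think about Abba", "Alice Andover" ],
    [ "1999-08-05 08:12:12", "Re: Here's what I think about Abba", "Bob Borland" ],
];

my $count = stuff_data_sketch($db_results, "dates", "subjects", "names");
print "$count rows: $TPL::data[0]{names}, $TPL::data[1]{names}\n";
```

Because the key names are matched to columns purely by position, they have to be listed in the same order as the fields in the SQL query.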

The primary reason to stick with my hash key names is that @data on its own is a little unwieldy: there are easier ways for a template to approach this information than as one big pile. &stuff_data tries to help by generating some more convenient views of the data, but it can only make rational decisions if it knows what kind of data it's looking at. To this end, it relies on known key names.

These are the currently available supplementary hashes:

%unique_names
    All the author names represented in the data set, uniqued. Hash keys are the names (first and last concatenated in a single field), hash values are the number of messages in the data set from this name. Note that a single name could easily account for messages from multiple addresses, which may or may not belong to the same person.
    The creation of this hash depends on the existence of a hash key called ``names''.
%unique_addresses
    All the author email addresses represented in the data set, uniqued. Hash keys are addresses, hash values are the number of messages in the data set from this address.
    The creation of this hash depends on the existence of a hash key called ``addresses''.
%unique_individuals
    This hash holds uniqued concatenated names and email addresses, in this format: ``Name <email_address>''
    This is the hash you should trust over straight names or addresses when you're sorting by people. Very often in email archives spanning a long period, individuals will have changed addresses, and/or their names may have changed slightly.
    The creation of this hash depends on the existence of hash keys called ``names'' and ``addresses''.
%unique_namespaces
    Uniqued namespaces represented in the data set. Handy when you're returning multiple messages with a common message-ID; the namespace will likely be the sole differentiating factor.
    The creation of this hash depends on the existence of a hash key called ``namespaces''.
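
The supplementary hashes can be pictured as simple tallies over @data. The following is a sketch under assumptions, not the code from blipper.cgi: the sample rows are invented, and storing a message count as the value in %unique_individuals is a guess, since only the key format is described above.

```perl
use strict;
use warnings;

# Made-up rows; note the same name appearing with two different addresses.
my @data = (
    { names => "Alice Andover", addresses => 'alice@abba.org' },
    { names => "Alice Andover", addresses => 'alice@example.org' },
    { names => "Bob Borland",   addresses => 'bob@babble.org' },
);

my (%unique_names, %unique_addresses, %unique_individuals);
foreach my $row (@data) {
    # Each tally is built only when the key it depends on is present.
    $unique_names{ $row->{names} }++         if exists $row->{names};
    $unique_addresses{ $row->{addresses} }++ if exists $row->{addresses};
    if (exists $row->{names} && exists $row->{addresses}) {
        # Assumption: the value is a message count, as in the other hashes.
        $unique_individuals{"$row->{names} <$row->{addresses}>"}++;
    }
}

foreach my $name (sort keys %unique_names) {
    print "$name: $unique_names{$name} message(s)\n";
}
```

In this sample, one name maps to two addresses, which is the situation where %unique_individuals is more trustworthy than either of the other two hashes alone.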

Right now, these hashes are the best way to return intelligently labeled results: iterate over %unique_individuals, for example, and use the grep function to extract from @data all the hashes that match a given author. This is demonstrated in the sample.
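
For readers without the sample to hand, the grep approach might look roughly like this. It is a sketch with invented data, and the %by_individual hash is a hypothetical helper for this example rather than something the CGI provides.

```perl
use strict;
use warnings;

# Made-up rows and a matching %unique_individuals, as a template might see them.
my @data = (
    { names => "Alice Andover", addresses => 'alice@abba.org',
      subjects => "Here's what I think about Abba" },
    { names => "Bob Borland", addresses => 'bob@babble.org',
      subjects => "Here's where you can stick what you think about Abba" },
    { names => "Alice Andover", addresses => 'alice@abba.org',
      subjects => "One more thing about Abba" },
);
my %unique_individuals = (
    'Alice Andover <alice@abba.org>' => 2,
    'Bob Borland <bob@babble.org>'   => 1,
);

my %by_individual;    # hypothetical: individual => array ref of their rows
foreach my $who (sort keys %unique_individuals) {
    # Rebuild the ``Name <email_address>'' label for each row and compare.
    my @theirs = grep { "$_->{names} <$_->{addresses}>" eq $who } @data;
    $by_individual{$who} = \@theirs;

    print "$who\n";
    print "    $_->{subjects}\n" for @theirs;
}
```

Each pass of the outer loop rescans @data, which is fine for result sets of the size a search page is likely to display.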

LICENSE

blipper.cgi is made available under the GPL.

 http://www.opensource.org/licenses/gpl-license.html

CONTACT

This documentation is first-cut. Please direct any comments or suggestions to <lexical@bumppo.net>, or the box-stuffer-talk mailing list.

 http://lists.sourceforge.net/mailman/listinfo/box-stuffer-talk

 4/9/00
 Nat Irons
 lexical@bumppo.net