blipper.cgi - sample CGI for box-stuffer
This file documents blipper.cgi version .9
box-stuffer is a mechanism for storing message attributes from multiple MHonArc archives in a SQL database. blipper.cgi is a sample CGI to show off some applications of the box-stuffer data model, and to serve as a framework for further customization. At the moment it is somewhat more successful at the former.
The silly name stems from my desire that this sample not be used as-is. In fact, it can't be, as you need to at least customize a few variables located at the head of the script.
blipper tragically fell victim to sprawl. I originally intended to divorce the templates from the code, but modifying the templates still turns out to require some familiarity with Perl data structures in general, and these data structures in particular.
I would be immensely pleased to discuss simpler/better CGI strategies; I'm not wedded to this example by any means (though I'll probably keep sharpening it, rather than chart a new course, until someone comes along with a better way).
Templates are implemented by Text::Template. Quick perusal of that module's (excellent) documentation is recommended.
http://www.plover.com/~mjd/perl/Template/Manual.html
In brief, Text::Template provides for templates which employ all the standard Perl looping constructs and have access to Perl variables. As far as blipper is concerned, all data intended for use by the box-stuffer templates is placed in the TPL package. All templates live at the end of the script.
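As a rough illustration of the arrangement described above, the following sketch fills in a small template from data placed in the TPL package. The template text and the sample rows are invented for this example and are not taken from blipper.cgi; only the Text::Template calls (new with TYPE and SOURCE, fill_in with PACKAGE, and the $OUT variable) come from that module's documentation.

```perl
use strict;
use warnings;
use Text::Template;

# Data intended for templates lives in the TPL package.
# These rows are made-up sample data.
{
    package TPL;
    our @data = (
        { names => "Alice Andover", subjects => "Here's what I think about Abba" },
        { names => "Bob Borland",   subjects => "Here's where you can stick what you think about Abba" },
    );
}

# A tiny template: text outside the braces is copied verbatim, and the
# Perl fragment inside the braces appends its output to $OUT.
my $source = <<'END_TEMPLATE';
Search results:
{
    foreach my $row (@data) {
        $OUT .= "$row->{names} wrote $row->{subjects}\n";
    }
}
END_TEMPLATE

my $template = Text::Template->new(TYPE => 'STRING', SOURCE => $source)
    or die "Couldn't construct template: $Text::Template::ERROR";

# PACKAGE => 'TPL' makes @data inside the template refer to @TPL::data.
my $text = $template->fill_in(PACKAGE => 'TPL');
die "Couldn't fill in template: $Text::Template::ERROR" unless defined $text;

print $text;
```

Because the fragment is ordinary Perl evaluated in the TPL package, anything the CGI stores there before calling fill_in is visible to the template under its short name.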
There are currently four templates in the CGI. Aside from the default, which provides a search form, they're best introduced in the context of the four things that the CGI knows to look for in the database:
``default_template'', which displays results of subject searches
``first_person_search_template'', which verifies possible matches on name or email address searches
``second_person_search_template'', which displays results of person searches
The browser is redirected to a URL, without a template
This section details the various data structures made available to the templates.
The most basic, an array of hashes providing a straight dump of whatever data the database server returned, is always generated:
@data
Each hash in @data corresponds to a single row returned from the database server, where each key represents a field. The rows are loaded sequentially in the order returned by the SQL query, which of course also determines precisely which data is being returned. The names of the hash keys are determined by the call to stuff_data, the subroutine that makes the database's output accessible to the templates.
So the first two elements in @data might plausibly look like this:
@data = (
{ dates => "1999-08-02 12:22:12", names => "Alice Andover", addresses => 'alice@abba.org', subjects => "Here's what I think about Abba", filenames => "/usr/home/lists/archive/people-thinking/1999/08/", namespaces => "people-thinking" },
{ dates => "1999-08-05 08:12:12", names => "Bob Borland", addresses => 'bob@babble.org', subjects => "Here's where you can stick what you think about Abba", filenames => "/usr/home/lists/archive/people-thinking/1999/08/", namespaces => "people-thinking" },
# ...
);
Extracting data from @data is pretty straightforward: loop over the array within the template, and ask for whichever fields you need at the moment:
foreach my $row (@data) {    # returns hash references, one per row
    print "$row->{names} wrote $row->{subjects} on $row->{dates}\n";
}
@data's hash keys are determined entirely by the call to the ``stuff_data'' function within the CGI. For example:
&stuff_data($db_results, "dates", "subjects", "filenames", "names");
This call represents the last step before the template is executed. The $db_results scalar is an array reference containing all the data returned from the database, one element per row. The remaining parameters name the fields in the order the SQL query requested them, and are used as the names of the hash keys in @data. If you misspell the name of one of the important hash keys (like ``names''), the script will break.
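To make the relationship between the key names and the query's column order concrete, here is a minimal sketch of the kind of mapping described above. The subroutine rows_to_hashes is a hypothetical stand-in written for this example, not the actual stuff_data from blipper.cgi, and it assumes that $db_results is an array reference of array references, one per row.

```perl
use strict;
use warnings;

# Hypothetical stand-in for stuff_data: pairs each column of each row with
# the key name given at the same position in the argument list.
sub rows_to_hashes {
    my ($rows, @keys) = @_;
    my @data;
    foreach my $row (@$rows) {
        my %record;
        @record{@keys} = @$row;    # hash slice: first key gets first column, etc.
        push @data, \%record;
    }
    return @data;
}

# Assumed shape of the database results: an array ref of array refs.
my $db_results = [
    [ "1999-08-02 12:22:12", "Here's what I think about Abba", "Alice Andover" ],
    [ "1999-08-05 08:12:12", "Here's where you can stick what you think about Abba", "Bob Borland" ],
];

# The key names must be listed in the same order as the columns in the query.
my @data = rows_to_hashes($db_results, "dates", "subjects", "names");

print "$_->{names} wrote on $_->{dates}\n" for @data;
```

Listing the names in a different order than the query's columns would silently attach the wrong label to each field, which is one reason the order matters as much as the spelling.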
The primary reason to stick with my hash key names is that the @data array is a little unwieldy. There are easier ways for the templates to approach this information than in a big pile. &stuff_data tries to be helpful and generate some of these useful ways to look at the data, but it can only make rational decisions if it knows what kind of data it's looking at. To this end, it relies on known key names.
These are the currently available supplementary hashes:
All the author names represented in the data set, uniqued. Hash keys are the names (first and last concatenated in a single field), hash values are the number of messages in the data set from this name. Note that a single name could easily account for responses from multiple addresses, which may or may not belong to the same person.
The creation of this hash depends on the existence of a hash key called ``names''.
All the author email addresses represented in the data set, uniqued. Hash keys are addresses, hash values are the number of messages in the data set from this address.
The creation of this hash depends on the existence of a hash key called ``addresses''.
This hash holds uniqued concatenated names and email addresses, in this format: ``Name <email_address>''
This is the key you should trust over straight names or addresses when you're sorting by people. Very often in email archives spanning a long period, individuals will have changed addresses, and/or their names may have changed slightly.
The creation of this hash depends on the existence of hash keys called ``names'' and ``addresses''.
Uniqued namespaces represented in the data set. Handy when you're returning multiple messages with a common message-ID; the namespace will likely be the sole differentiating factor.
The creation of this hash depends on the existence of a hash key called ``namespaces''.
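The counting behaviour described for these hashes can be sketched as follows. The hash names %by_name, %by_address and %by_individual, along with the sample rows, are placeholders chosen for this example; they are not necessarily the names blipper.cgi uses, and this is not the script's own code.

```perl
use strict;
use warnings;

# Made-up sample rows: one name writing from two different addresses.
my @data = (
    { names => "Alice Andover", addresses => 'alice@abba.org',    namespaces => "people-thinking" },
    { names => "Alice Andover", addresses => 'alice@example.org', namespaces => "people-thinking" },
    { names => "Bob Borland",   addresses => 'bob@babble.org',    namespaces => "people-thinking" },
);

# Placeholder hash names. Keys are the uniqued values,
# values are the number of messages seen for each.
my (%by_name, %by_address, %by_individual);
foreach my $row (@data) {
    $by_name{ $row->{names} }++        if defined $row->{names};
    $by_address{ $row->{addresses} }++ if defined $row->{addresses};

    # The combined form needs both fields to be present.
    if (defined $row->{names} && defined $row->{addresses}) {
        $by_individual{"$row->{names} <$row->{addresses}>"}++;
    }
}

foreach my $name (sort keys %by_name) {
    print "$name: $by_name{$name} message(s)\n";
}
```

In this sample the name hash counts two messages for a single name, while the combined hash keeps the two addresses apart, which shows how the three views can disagree about who counts as one author.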
Right now, these keys are the best way to return intelligently labeled results -- iterate over %unique_individuals, for example, and use the grep function to extract from @data all the hashes that match a given author. This is demonstrated in the sample.
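The grouping technique just described might look roughly like this. It is a sketch, not the code from the sample: it assumes the keys of %unique_individuals take the ``Name <email_address>'' form, builds that hash by hand from made-up rows, and uses a hypothetical helper called label to produce the same string for each row.

```perl
use strict;
use warnings;

# Made-up sample rows.
my @data = (
    { dates => "1999-08-02 12:22:12", names => "Alice Andover", addresses => 'alice@abba.org', subjects => "Here's what I think about Abba" },
    { dates => "1999-08-05 08:12:12", names => "Bob Borland",   addresses => 'bob@babble.org', subjects => "Here's where you can stick what you think about Abba" },
    { dates => "1999-08-06 09:30:00", names => "Alice Andover", addresses => 'alice@abba.org', subjects => "More thoughts about Abba" },
);

# Hypothetical helper: the label assumed to be used as a key.
sub label {
    my ($row) = @_;
    return "$row->{names} <$row->{addresses}>";
}

# Built by hand here; in blipper the hash is supplied to the template.
my %unique_individuals;
$unique_individuals{ label($_) }++ for @data;

# For each individual, pull the matching rows out of @data with grep.
my %messages_for;
foreach my $person (sort keys %unique_individuals) {
    $messages_for{$person} = [ grep { label($_) eq $person } @data ];
}

foreach my $person (sort keys %messages_for) {
    print "$person ($unique_individuals{$person} message(s))\n";
    print "  $_->{dates}  $_->{subjects}\n" for @{ $messages_for{$person} };
}
```

Running grep once per individual rescans @data each time, which is fine for result sets of the size a search page returns; a single pass that pushes rows onto per-person lists would avoid the repeated scans for larger sets.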
blipper.cgi is made available under the GPL.
http://www.opensource.org/licenses/gpl-license.html
This documentation is first-cut. Please direct any comments or suggestions to <lexical@bumppo.net>, or the box-stuffer-talk mailing list.
http://lists.sourceforge.net/mailman/listinfo/box-stuffer-talk
4/9/00 Nat Irons lexical@bumppo.net