TLRR - Trials in the Late Roman Republic: 149 BC to 50 BC

Technical information

This page is a temporary place-holder. Fuller information will be provided as time allows.

Overview
Current status
XML work flow
gdoc XML version of TLRR 1
Enrichment of the XML
Other documents

Overview

This project will produce a second edition of Trials in the Late Roman Republic, using XML database technology to handle the complex structure of the information collected.

This page provides an initial overview of our approach to the technical issues involved. It is intended primarily for readers concerned with the application of information technology to humanities scholarship and curious about how XML and web technology can be used to help manage information like this. Readers interested primarily in Roman legal history need not concern themselves with the information given here.

TLRR is in every essential feature a database: it is a systematic collection of information on a set of similar entities, designed to allow direct comparison between entities. But the structure of the information is so variable, and the sources of our knowledge so scattered and fragmentary, that managing it with conventional database software is difficult. A relational database management system could hold the information, but only tediously: even reducing it to third normal form would cause trouble, since virtually every attempt to retrieve information on a given trial would involve a multi-way join. Fortunately, XML is designed to handle information with such variable structure, and XML technologies make it feasible to manage the information in suitable ways.

Current status

The current state of the project, from a technical point of view, is as follows. (At the time of writing, this is changing daily.)

Things done so far:

The Waterloo Script input used for TLRR1 has been converted into XML in a vocabulary based on the gdoc vocabulary of Waterloo GML. A stylesheet has been written, and the document is available in various forms.
An XSLT 2.0 stylesheet has been developed to translate the gdoc version of TLRR into a more tractable form of XML, in which the various fields of each trial description are distinguished using XML elements.
A document grammar for the output of this stylesheet has been developed concurrently. (For want of a better term, this is referred to here as 'fielded' XML.)
A search interface for the fielded data has been prepared; others should follow.
An XPath search interface for the fielded data has been prepared; this is intended for the use of those interested in the XML markup. (It has one peculiarity: if quoted strings are needed, Ancilla currently accepts only double quotes. Single quotes are currently escaped out of all recognition.)
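As a rough illustration of the kind of query an XPath interface permits, here is a minimal Python sketch. It does not use the project's search interface; it relies on the limited XPath subset in Python's standard xml.etree.ElementTree module. The trials wrapper element and the second, placeholder trial are assumptions made for the example, and the first trial abbreviates trial #73.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-trial collection. The first trial abbreviates trial #73;
# the second is a placeholder invented only to give the queries a non-match.
FIELDED = """
<trials>
  <trial id="XAH" tlrr1="73">
    <claim>sponsio</claim>
    <juror>C. Flavius Fimbria (87) cos. 104</juror>
    <outcome>juror refused to adjudicate</outcome>
  </trial>
  <trial id="PLACEHOLDER">
    <outcome>placeholder</outcome>
  </trial>
</trials>
"""

root = ET.fromstring(FIELDED)

# Trials that record a juror at all (child-element predicate).
ids_with_juror = [t.get("id") for t in root.findall('.//trial[juror]')]

# Look up a trial by its first-edition number; the string literal inside
# the XPath expression is written with double quotes.
trial_73 = root.find('.//trial[@tlrr1="73"]')
outcome_73 = trial_73.findtext("outcome")
```

Because each field is its own element, a query can test for the presence of a field as easily as for its content.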

Things to be done in the immediate future:

Distinct XML document types will be specified for different kinds of entities involved in TLRR: trials; persons; laws, crimes, causes of action. (Later refinements will add: ancient sources; modern sources.)
Other parts of the initial prototype version of the system remain to be prepared: XForms for editing information about entities, stylesheets for displaying it, etc.

Things to be done in the more distant future:

Later work on the project will explore alternative methods of implementing such a system. (TLRR lends itself to such exploration because it's complex enough to be interesting and realistic, constrained enough to be tractable, and the required system is small enough to be reimplemented in full without undue hardship.)

XML work flow

The project-internal interface for editing TLRR2 uses XML technologies throughout.

The basis is an XML version of the TLRR data, derived from the gdoc XML document described below.

Each person, law, and trial is represented by a different XML document.
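The one-document-per-entity layout might be produced along the following lines. This is a hedged Python sketch, not the project's procedure: the trials wrapper element, the placeholder second trial, and the file-naming scheme (the trial id plus .xml) are all assumptions, and only trials are handled.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import List

# Hypothetical collection; the second trial is a placeholder.
COLLECTION = """<trials>
  <trial id="XAH"><outcome>juror refused to adjudicate</outcome></trial>
  <trial id="PLACEHOLDER"><outcome>placeholder</outcome></trial>
</trials>"""


def split_trials(collection: ET.Element, out_dir: Path) -> List[Path]:
    """Write each trial element to its own file, named after its id (assumed scheme)."""
    paths = []
    for trial in collection.iter("trial"):
        path = out_dir / f"{trial.get('id')}.xml"
        ET.ElementTree(trial).write(str(path), encoding="utf-8", xml_declaration=True)
        paths.append(path)
    return paths


# Write the documents to a temporary directory and read each one back.
with tempfile.TemporaryDirectory() as tmp:
    paths = split_trials(ET.fromstring(COLLECTION), Path(tmp))
    written = {p.name: ET.parse(str(p)).getroot().get("id") for p in paths}
```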

XForms interfaces allow the co-authors to edit individual entries in the database independently of each other, from standard web browsers.

XForms saves each new document and each new version of a document to a Subversion repository; the documents are automatically propagated to a read-only working copy used by the Web server.

gdoc XML version of TLRR 1

The gdoc XML version of TLRR 1 available on this site was created from the Waterloo Script input files of the first edition as follows:

All input files were merged into a single Waterloo Script input file.
An XSLT 2.0 stylesheet translated the Waterloo GML tags and Waterloo Script commands in the input file into an XML version of the gdoc vocabulary defined by Waterloo GML. The stylesheet is divided into two modules: one handling generic Waterloo Script and GML, and one handling TLRR-specific extensions. The output vocabulary is as close as possible to the original Waterloo GML gdoc vocabulary, as augmented by user-defined macros, etc. Some augmentations were necessary, the handling of non-ASCII characters was altered, and notations specific to Waterloo Script (like the &'italic("pontifices") function for producing italic text) were translated into XML markup.
Some cleanup was then done by hand.
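To make the translation of Waterloo Script notations concrete, the following Python sketch rewrites the italic function call as XML markup. It is an illustration only: the actual conversion was done in XSLT 2.0, the function name translate_italic is hypothetical, and the choice of hp1 as the output element is an assumption based on the sample markup used elsewhere on this page.

```python
import re
from xml.sax.saxutils import escape

# Matches the Waterloo Script call &'italic("...") with a double-quoted argument.
ITALIC_CALL = re.compile(r"""&'italic\("([^"]*)"\)""")


def translate_italic(text: str, element: str = "hp1") -> str:
    """Replace each &'italic("...") call with <element>...</element>.

    The argument is XML-escaped so that characters such as & and < in the
    italicized text cannot break the output markup.
    """
    return ITALIC_CALL.sub(
        lambda m: f"<{element}>{escape(m.group(1))}</{element}>", text
    )
```

A call such as translate_italic on a line containing &'italic("pontifices") leaves the surrounding text untouched and wraps only the quoted argument.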

A hand-created DTD for the gdoc vocabulary (as instantiated in this document) is available.

Enrichment of the XML

An XSLT 2.0 stylesheet is being developed that takes the gdoc XML described above, recognizes the different areas in each description of a trial, and marks them with XML elements.

The gdoc version of trial #73, for example, is as follows (lines have been re-broken).

<trial id="XAH">
<?WScript .sr XAH = &chapter?>
date:  after 104,
<en>V. Max. refers to the juror
as <hp1>consularis</hp1>.  But if 
he does so only to distinguish him from 
the C. Flavius Fimbria active in the 80s, 
then the term <hp1>consularis</hp1> 
does not provide a 
<hp1>terminus post quem</hp1>.
</en>
before 91
<br/>
claim:
<hp1>sponsio</hp1>
(<hp1>ni vir bonus esset</hp1>)
<br/>
party:  M. Lutatius Pinthia (21) e.R.
<ix n="8" target="XAH"
  >Lutatius (+21), M. Pinthia</ix>
<br/>
<ix n="1" target="XAH"
  ><ital>sponsio</ital></ix>
juror:  C. Flavius Fimbria (87) cos. 104
<ix n="6" target="XAH"
  >Flavius (+87), C. Fimbria</ix>
<br/>
outcome:
juror
refused to adjudicate
<?WScript .sk?>
<p>
Cic.
<hp1>Off.</hp1>
3.77; V. Max. 7.2.4
</p>
<?WScript .sk?>
</trial>

In the target vocabulary, this should look something like this:

<trial id="XAH" tlrr1="73">
  <date>after 104,<en>V. Max. refers 
  to the juror as <hp1>consularis</hp1>.  
  But if he does so only to distinguish him 
  from the C. Flavius Fimbria active in 
  the 80s, then the term 
  <hp1>consularis</hp1> does not provide 
  a <hp1>terminus post quem</hp1>.
  </en>
  before 91</date>
  <claim>
    <hp1>sponsio</hp1>
    (<hp1>ni vir bonus esset</hp1>)
  </claim>
  <party label="party"
    >M. Lutatius Pinthia (21) e.R.</party>
  <juror>C. Flavius Fimbria (87) 
    cos. 104</juror>
  <outcome>juror refused to 
    adjudicate</outcome>
  <sources>
    <ancient>
      Cic. <hp1>Off.</hp1> 3.77; 
      V. Max. 7.2.4
    </ancient>
  </sources>
</trial>
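The core of this enrichment step is recognizing the field labels in the running text. The Python sketch below illustrates the idea on a shortened version of trial #73; it is not the project's XSLT 2.0 stylesheet. It assumes that every field opens with one of five labels followed by a colon and that br and ix elements can simply be dropped. The tlrr1 attribute and the sources element are left out.

```python
import copy
import re
import xml.etree.ElementTree as ET

# Shortened gdoc input for trial #73 (processing instructions omitted).
GDOC = """<trial id="XAH">
claim:
<hp1>sponsio</hp1>
(<hp1>ni vir bonus esset</hp1>)
<br/>
juror:  C. Flavius Fimbria (87) cos. 104
<ix n="6" target="XAH">Flavius (+87), C. Fimbria</ix>
<br/>
outcome:
juror
refused to adjudicate
</trial>"""

# Assumed label set; a label counts only at the start of a text chunk.
LABEL = re.compile(r"\s*(date|claim|party|juror|outcome):\s*")
DROPPED = {"br", "ix"}  # layout and index markup not carried over


def to_fielded(trial: ET.Element) -> ET.Element:
    """Wrap each labelled stretch of mixed content in an element named after its label."""
    out = ET.Element("trial", dict(trial.attrib))
    field = None  # the field element currently being filled

    def add_text(chunk):
        nonlocal field
        if not chunk:
            return
        match = LABEL.match(chunk)
        if match:  # a new field starts here
            field = ET.SubElement(out, match.group(1))
            chunk = chunk[match.end():]
        if field is None:
            return  # text before the first label is dropped
        if len(field):
            field[-1].tail = (field[-1].tail or "") + chunk
        else:
            field.text = (field.text or "") + chunk

    add_text(trial.text)
    for child in trial:
        if child.tag not in DROPPED and field is not None:
            kept = copy.deepcopy(child)
            kept.tail = None  # the tail is routed through add_text instead
            field.append(kept)
        add_text(child.tail)
    return out


def plain(elem: ET.Element) -> str:
    """Text content of an element with whitespace normalized."""
    return " ".join("".join(elem.itertext()).split())


fielded = to_fielded(ET.fromstring(GDOC))
```

Inline markup such as hp1 is copied into the field it occurs in, so the claim field keeps its two italic phrases while the index entry after the juror disappears.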

The development of the XSLT 2.0 transformation and vocabulary uses the time-honored technique of testing the transform first on a one-percent sample of the data, then on a ten-percent sample, before using it to convert the entire collection.

The one-percent and ten-percent samples are available, as are the results of recent test runs of the stylesheet (one-percent, ten-percent).

As may be seen, the test result on the one-percent sample currently differs from the form shown above mostly in cosmetic ways; the ten-percent sample is currently being studied and used to improve the transformation.
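The staged testing described in this section can be sketched in a few lines of Python. How the project actually drew its samples is not stated here; taking every hundredth and every tenth item is an assumption, and the names staged_run, transform, and is_ok are hypothetical.

```python
from typing import Any, Callable, List, Sequence, Tuple

# Stage label and sampling step: every 100th item, every 10th item, then all.
STAGES = (("1%", 100), ("10%", 10), ("all", 1))


def staged_run(
    items: Sequence[Any],
    transform: Callable[[Any], Any],
    is_ok: Callable[[Any], bool],
) -> Tuple[str, List[Any]]:
    """Run transform on growing samples; stop at the first stage with failures.

    Returns the label of the failing stage together with the inputs whose
    output did not pass is_ok, or ("done", []) if every stage passed.
    """
    for label, step in STAGES:
        sample = items[::step]
        failures = [item for item in sample if not is_ok(transform(item))]
        if failures:
            return label, failures
    return "done", []
```

Stopping at the first failing stage keeps each round of corrections small: problems visible in the smaller sample are fixed before the larger one is examined.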
