ALEG
Introduction to the Design

Introduction

This document introduces the design for the ALEG system. References to documents which describe the ALEG project and the design are included throughout this introduction and should be consulted to obtain an "in-depth" understanding of what the ALEG project team thought it was building before it actually started implementing the system.

As the system is implemented, updated information will appear in the Implementation section of the ALEG Development web site.

What is ALEG?

ALEG is designed to be a tool to aid inquiry and research, to further understanding of Australia's literature. ALEG's scope and functional requirements are described in the project's Stage 1 report.

Some requirements and usages are easy to understand: provide a simple way to find out basic information about a work or an author. However, the potential value of a research tool relates not only to the breadth of the resource on which it operates but also to how flexible it is: how the researcher can use it to answer questions the designers of the system did not anticipate.

A key part of ALEG's potential is the way it can help to reveal and elucidate the relationships between the 'entities' making up Australian literature: the authors, works, publishers, movements, genres, and cultural and political forces.

So, the research value of ALEG is not so much in the 'raw' data of who wrote what and when (vital as that is), but in how that 'raw' data can be viewed as coherent clumps - the relationships that are unveiled when the core data is analysed.

ALEG is not a library catalogue system. Although the "base" entities described by ALEG can be cast in terms used by traditional library systems such as "title" and "author", ALEG's reason for existence is not to duplicate the National Library of Australia's Kinetica facility. Rather, it is to make available a rich resource for people interested in Australian Literature by providing:

There have been several recent data modelling and architecture exercises which have greatly influenced the ALEG data model:

The IFLA FRBR

A recent (1998) development in the data modelling of library systems was completed by the International Federation of Library Associations and Institutions (IFLA). Known as the FRBR ("Functional Requirements for Bibliographic Records"), it "teases apart" the concepts of Work, Expression, Manifestation and Item and in so doing helps model and understand the relationships between titles. For example, it can clearly identify two manifestations as embodiments of the same expression, or two expressions as realisations of the same work, although one may be a language translation of another.
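
The four FRBR levels can be pictured as a chain of records, each pointing to the level above it. The following is only a minimal sketch in Python, not the ALEG data model: the class fields, the helper functions and all of the sample titles and publishers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Work:
    """An abstract intellectual creation (FRBR 'Work')."""
    title: str


@dataclass
class Expression:
    """A realisation of a Work, eg a particular text or translation."""
    work: Work
    language: str


@dataclass
class Manifestation:
    """An embodiment of an Expression, eg a particular edition."""
    expression: Expression
    publisher: str
    year: int


@dataclass
class Item:
    """A single copy of a Manifestation."""
    manifestation: Manifestation
    location: str


def same_expression(a: Manifestation, b: Manifestation) -> bool:
    """True when both manifestations embody the very same expression record."""
    return a.expression is b.expression


def same_work(a: Expression, b: Expression) -> bool:
    """True when both expressions realise the very same work record."""
    return a.work is b.work


# Hypothetical sample data: one work, an original text and a translation.
work = Work(title="An Example Novel")
english = Expression(work=work, language="en")
french = Expression(work=work, language="fr")
hardback = Manifestation(english, publisher="Publisher A", year=1990)
paperback = Manifestation(english, publisher="Publisher B", year=1995)
french_edition = Manifestation(french, publisher="Publisher C", year=1998)
library_copy = Item(paperback, location="Example Library")

print(same_expression(hardback, paperback))       # True
print(same_work(english, french))                 # True
print(same_expression(hardback, french_edition))  # False
```

Because each level refers only to the level above it, the question "are these two editions the same text?" becomes a comparison of references rather than a comparison of title strings.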

The INDECS project

The INDECS project (INteroperability of Data in E-Commerce Systems) was established to develop a metadata framework for representing intellectual property and the transactions involving it. Its Schema and Model documents have a strong bent towards enabling e-commerce rights-management transactions, but necessarily require a precise modelling of the intellectual works and the agents who contribute to them. They also use the basic Work, Expression, Manifestation and Item representations of FRBR, but introduce the concept of the "Event" that describes how these products came about - who did what, the context, the inputs to the process.

The Harmony ABC strawman proposal

Many of the entities and relationships which different systems in different application areas attempt to model are much the same. Rather than force each implementation to develop its own representations, and hence waste effort and complicate interoperability, the Harmony ABC proposal attempts to define a common framework which diverse systems will be able to use. The ABC proposal acknowledges FRBR and INDECS as major sources of inspiration.

The ISO Topic Map standard

Thesauri are recognised as useful tools which add structure to a subject list. The ISO Topic Map standard (ISO 13250) defines data structures which can be used as a standard thesaurus (with broader, narrower, see also and preferred types of relationships between topics), or as a standard "index" to a collection of works. But Topic Maps have some other characteristics which make them extremely powerful, including:

  • Topics can be assigned Topic Types (eg, "illustrator", "mountain", "country", "era"). Topic Types are themselves Topics.
  • Topics can be arbitrarily linked together by "Associations". Associations each have an "Association Type", which is itself a Topic.
  • Topics are linked to resources by "Occurrences". The resource being linked to plays a "Role" in the Occurrence. Roles are themselves Topics.

A set of Topics, their associations and occurrences form a "Topic Map". The Topic Map is quite separate from the underlying resources which it describes. Hence, multiple Topic Maps can be assembled and maintained quite independently of each other and of the underlying resources.
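
The three characteristics listed above can be sketched as a small in-memory structure. This is an illustrative Python sketch only; it does not follow the ISO 13250 interchange syntax, and the class names, method names, sample topics and the "urn:example" resource address are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Topic:
    name: str
    # Topic Types are themselves Topics.
    types: List["Topic"] = field(default_factory=list)


@dataclass(eq=False)
class Association:
    type: Topic            # the Association Type is itself a Topic
    members: List[Topic]


@dataclass(eq=False)
class Occurrence:
    topic: Topic
    role: Topic            # the Role the resource plays is itself a Topic
    resource: str          # address of a resource held outside the map


@dataclass(eq=False)
class TopicMap:
    topics: List[Topic] = field(default_factory=list)
    associations: List[Association] = field(default_factory=list)
    occurrences: List[Occurrence] = field(default_factory=list)

    def add_topic(self, name: str, *types: Topic) -> Topic:
        topic = Topic(name, list(types))
        self.topics.append(topic)
        return topic

    def topics_of_type(self, topic_type: Topic) -> List[Topic]:
        return [t for t in self.topics if topic_type in t.types]

    def resources_for(self, topic: Topic) -> List[str]:
        return [o.resource for o in self.occurrences if o.topic is topic]


# Hypothetical sample map.
tm = TopicMap()
illustrator = tm.add_topic("illustrator")
mountain = tm.add_topic("mountain")
depicts = tm.add_topic("depicts")
biography = tm.add_topic("biography")

artist = tm.add_topic("Example Artist", illustrator)
peak = tm.add_topic("Example Peak", mountain)

tm.associations.append(Association(type=depicts, members=[artist, peak]))
tm.occurrences.append(Occurrence(artist, role=biography, resource="urn:example:doc-1"))

print([t.name for t in tm.topics_of_type(illustrator)])  # ['Example Artist']
print(tm.resources_for(artist))                          # ['urn:example:doc-1']
```

The map stores only the address of each resource, never the resource itself, which is what allows several maps to describe the same resources independently.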

The ALEG data model

After reviewing the goals of the ALEG system and the data and relationships it needs to represent, seeking the opinions of experienced and respected figures in the Australian library community and reviewing the information discussing library data modelling, we have decided to base the ALEG data model on FRBR and INDECS models, adding:

and removing:

For a full description of the development and contents of the ALEG data model, refer to the data model documentation.

Data Maintenance

The ALEG Data Maintenance User Interface will be used by ALEG partners to maintain the ALEG database. The interface will be web-browser based, so special software will not need to be installed and upgraded.

It has been decided to base the ALEG Data Maintenance User Interface on Microsoft's Internet Explorer version 5 (IE5) because IE5:

  1. implements an advanced programming interface which allows the development of sophisticated applications
  2. is a stable product and widely installed, having been released in March 1999
  3. is available for current and recent popular operating systems (Windows 95/98/NT/2000, Macintosh) and is automatically bundled with most operating system installations
  4. has overwhelming market share (estimated at over 85% by Stat Market as at 18 June 2000)
  5. is free

Note - the web site for end-user access will be designed to support older browsers (Netscape 2 and above) and will address the W3C accessibility guidelines - IE5 will not be required to access the public ALEG web site.

The guiding principles of the user interface are simplicity, speed and ease of use. The user interface must allow the user to do as much as possible with as few keystrokes/mouse clicks as possible. It must present the most likely action as the default and do as much as possible to prevent common mistakes and maintain the integrity of the system. Wherever possible, the system should acquire data by asking the user to select from possible values rather than type a complete value into a text box. The interface must remember the context in which the user is working and provide appropriate "templates", default values and an easy-to-access list of recently assigned values to reduce searching.

Details of the proposed data maintenance user interface are described in the Data Maintenance section of the ALEG Design document.

General Public User Interface

The characteristics of the general user interface will be:

  1. No training required

  2. Both searching and browsing will be supported

  3. Clean layout, fast loading

  4. Support for W3C accessibility guidelines

  5. Clear and rapid navigation around the site

  6. Browser awareness, providing an optimal interface for the user's browser with support for older browsers (ie, ALEG will not assume or require any browser capabilities beyond HTML 2 and basic HTML TABLE support)

  7. User-configurable frames and JavaScript usage

  8. Immediate access to the search capability from every page

  9. Consistent ALEG branding

Details of the proposed user interface are described in the General Public User Interface section of the ALEG Design document.

Multiple Views

ALEG will enable entities (works, agents) to be grouped into collections. Collections can be extracted by their owners for their own formatting and publication, or can be exposed with their own identity on the ALEG site.

For example, assuming that a South Australian Women Writing collection is created, then it may be decided to:

Interoperability

  1. ALEG as a Z39.50 target

    ALEG will host a Z39.50 target implementing the Bath profile, initially just for Functional Area A (Basic Bibliographic Search and Retrieval) at Conformance Level 0, as described in section 5.A.0 of the Bath Profile, with support for the XML DTD, Simple Unstructured Text Record Syntax (SUTRS) and MARC21.

  2. Extracting information from ALEG

    The ALEG system will license partners to extract records from the database in XML format. The criteria for extraction (ie, how to specify what is to be extracted) have yet to be decided, but typical examples could be:

    1. All agents associated with a specific attribute or Topic (or set of attribute Topics) in a specific role (or sets of Roles) and the works associated with those agents in a specific relationship (or set of relationships).

      (For example, all agents with a gender of "Female" and either a birthplace of "South Australia" or some period of residency in "South Australia" and all the works they have "created" or "edited".)

    2. All works and/or agents associated with a nominated collection.

    3. All archival items associated with a nominated agent and/or work.

      (This would allow the production of the Lu Rees author files.)

  3. Publishing the ALEG Topic Map

    Initially Topic Maps will be used as a useful abstraction, design principle and implementation technique rather than as a public way to access the ALEG database. However, the development and implementation of the system will be undertaken cognisant of the great potential of making Topic Maps publicly available, exposing the base resources so that external Topic Maps can describe them, and merging Topic Maps from different sources.
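
Since the extract criteria are still undecided, the following Python sketch shows only one possible reading of the first example under "Extracting information from ALEG": agents with a gender of "Female" and a South Australian birthplace or residency, together with the works they created or edited. The record layout, function names, XML element names and all sample agents and works are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Iterable, List, Set, Tuple

Pair = Tuple[str, str]


@dataclass
class Agent:
    name: str
    # (role, topic) pairs, eg ("gender", "Female")
    attributes: Set[Pair] = field(default_factory=set)


@dataclass
class Work:
    title: str
    # (relationship, agent name) pairs, eg ("created", "Agent A")
    contributions: Set[Pair] = field(default_factory=set)


def select_agents(agents: Iterable[Agent], required: Set[Pair], any_of: Set[Pair]) -> List[Agent]:
    """Agents holding every pair in `required` and at least one pair in `any_of`."""
    return [a for a in agents if required <= a.attributes and bool(any_of & a.attributes)]


def works_for(works: Iterable[Work], agents: Iterable[Agent], relationships: Set[str]) -> List[Work]:
    """Works linked to any of `agents` through one of the named relationships."""
    names = {a.name for a in agents}
    return [w for w in works
            if any(rel in relationships and who in names for rel, who in w.contributions)]


def to_xml(agents: Iterable[Agent], works: Iterable[Work]) -> str:
    """Serialise an extract using invented element names."""
    root = ET.Element("extract")
    for a in agents:
        ET.SubElement(root, "agent", name=a.name)
    for w in works:
        ET.SubElement(root, "work", title=w.title)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample records.
agents = [
    Agent("Agent A", {("gender", "Female"), ("birthplace", "South Australia")}),
    Agent("Agent B", {("gender", "Female"), ("residency", "South Australia")}),
    Agent("Agent C", {("gender", "Male"), ("birthplace", "South Australia")}),
    Agent("Agent D", {("gender", "Female"), ("birthplace", "Victoria")}),
]
works = [
    Work("Work 1", {("created", "Agent A")}),
    Work("Work 2", {("edited", "Agent B")}),
    Work("Work 3", {("created", "Agent C")}),
    Work("Work 4", {("illustrated", "Agent A")}),
]

selected = select_agents(
    agents,
    required={("gender", "Female")},
    any_of={("birthplace", "South Australia"), ("residency", "South Australia")},
)
extracted = works_for(works, selected, {"created", "edited"})
xml_text = to_xml(selected, extracted)

print([a.name for a in selected])    # ['Agent A', 'Agent B']
print([w.title for w in extracted])  # ['Work 1', 'Work 2']
```

Expressing the criteria as sets of (role, topic) pairs keeps the "all of these, any of those" logic separate from the data, so other extracts (by collection, or by archival item) could reuse the same shape.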

Issues

Major issues which arose during the design stage are discussed in the Data Model Issues document. In summary they are:

  1. Providing access to holdings information
  2. Controlled vocabulary/authority file values for describing form of expression and format
  3. How best to deal with works which are websites
  4. Handling references to manuscript collections
  5. How to structure the thesaurus

Implementation Options

Several approaches to providing the required functionality are discussed in the Implementation Options document. The FRBR, INDECS and Topic Map data models on which ALEG is based are all relatively new and there are no known proprietary or off-the-shelf solutions which can support the required data model and functionality. Hence the proposed approach is based on widely used Open Source Software (Apache, Tomcat Servlet Engine, Xerces/XALAN XML tools, Yaz Z39.50 toolkit), the Oracle 8 database and text-search engine, the Java programming language and the client-side programming capabilities supported by the Microsoft Internet Explorer 5 web browser.

This approach can be implemented on a variety of server hardware / software platforms (Intel, Sparc and other processors; Solaris, Linux, other Unix and Windows NT operating systems), although the ADFA library has a strong preference for (and experience in) Solaris-based server systems.


ALEG Development Team
c/- k.fitch@adfa.edu.au
10 August 2000