ALEG

Data Model - Inventory

Introduction

This document attempts to identify the primary entities of the ALEG data model and their attributes.

The functional requirements of ALEG were broadly identified in the Stage 1 report. The role of the ALEG data model is to describe a set of data structures which will allow these requirements to be met, ideally with a design which is consistent with current thinking on the best way to represent literary and related resources.

There are many possible starting points when constructing a model for ALEG. The existing library catalogue world is heavily influenced by the MARC model of description. Museum and archive communities have developed various models of their own, including the CIDOC Conceptual Reference Model and ISAAR(CPF) (International Standard Archival Authority Record for Corporate Bodies, Persons and Families). The work done by INDECS focuses on how best to model agents (a grouping term for people and organisations) and the material which they create. The Harmony Project is an attempt to identify some common mechanisms for representing common entities and attributes across many spheres of interest.

After reviewing these and working through ALEG's specific requirements, we decided to base our model on the IFLA FRBR (warning - 144 pages of PDF!). IFLA is the International Federation of Library Associations and Institutions. FRBR is the Functional Requirements for Bibliographic Records. Aside from our own interpretation of FRBR represented by this and other documents, you are strongly encouraged to peruse the following resources:

The detail of the FRBR model is not repeated here. However, we describe below our specific use of, and extensions to, the FRBR model.

The following image, taken from the above-referenced D-Lib article, summarises the FRBR model as extended by Indecs:

The main points to be emphasised from this model are:

  1. Information Resources are represented by 4 entities: Work, Expression, Manifestation and Item

  2. Instances of each of these Information Resources can be linked to each other. For example, a particular novel (work) may have been influenced by a particular poem (another work); a short story (work) may be expressed in English (an expression) and translated into French (another expression).

  3. Instances of each of these Information Resources can be linked to subjects.

  4. Information Resources are "transformed" by the "actions" of "agents". The word "transformed" is very general; "transforming actions" include conceiving, writing and publishing works, translation, editing and illustrating. FRBR does not describe "actions"; they are an addition to the model introduced by Indecs and the Harmony ABC proposal.
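
As a rough illustration of points 1 and 2, the four Information Resource entities and the links between them could be sketched as nested records. This is only a sketch under assumptions: the class and field names (`related`, `location` and so on) are invented for illustration and are not part of FRBR or of the ALEG design.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single copy of a manifestation (the 'location' field is hypothetical)."""
    location: str


@dataclass
class Manifestation:
    """A published embodiment of an expression."""
    publisher: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A realisation of a work, for example in a particular language."""
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """An abstract creation such as a novel, poem or short story."""
    title: str
    expressions: List[Expression] = field(default_factory=list)
    # Links to other works as (relationship name, other work) pairs.
    # The relationship names are illustrative assumptions.
    related: list = field(default_factory=list)


# Point 2: one work influenced by another, and one work with two expressions.
poem = Work("A poem")
story = Work("A short story")
story.related.append(("influencedBy", poem))
story.expressions.append(Expression("English"))
story.expressions.append(Expression("French"))

languages = [e.language for e in story.expressions]
print(languages)
```

Keeping each level as its own record means a language, publisher or holding location is attached to the level it actually describes, rather than being repeated on every copy.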

If you haven't yet done so, please read the above commentaries on the FRBR model before proceeding!!

The Core ALEG data model

The Core ALEG data model extends the FRBR model in the following ways:

  1. As shown in the above diagram, an entity is inserted between Information Resources and Agents. The above diagram uses the term "Action" to describe this entity. The Harmony ABC project uses the term "Event". As described in both the Indecs and Harmony documentation, the benefit of adding this entity is that it provides a "place" to store information about the event which led to an information resource being produced. ALEG sometimes wants to record more than just "who did what" - sometimes we want to know "how, when, where, and why". Maybe one day we'll find it useful to link one event to other events. By representing an "event" as a first class object rather than as a series of attributes which are tightly bound to some other entity, this becomes possible.

    An analogy: for the first few years of their life, children could be represented as mere "attributes" of their parents, as the relationships they can take part in and the 'data' which describes them are pretty minimal (birth date and place, weight, hair colour, mother, father etc). But as children grow, the richness of their relationships and 'data' rapidly approaches that of their parents. Building a model of a family where children are 'second class objects' would be a big mistake.

    Another analogy: imagine you were building a system which could be used to generate a TV guide. One approach would be to represent the names of programs (eg, "Bellbird", "Four Corners") as a simple attribute - a string of text. This would work fine initially, but if later you wanted to add information about each program (producer, actors, budget, abstract, links to related programs) you'd have to redesign your approach. If, however, you'd represented the program as an "entity" in its own right, pointed to by the TV guide (rather than just a string of text included within), you'd then be able to add extra attributes to the program with minimal disruption to your TV guide system.

  2. Relationships exist between agents. For example one author may have been influenced by another.

  3. ALEG greatly expands the subject representation in the above diagram to that of a Topic Map. The Topic Map contains a range of Topics, grouped by Topic Type and linked together by Associations. For more information on Topic Maps, refer to the ALEG Data Model - Intro/Issues document.

  4. Agents, not just information resources, can be associated with Topics. For example, a Topic may be created to represent the Jindyworobaks and writers may be linked to that topic via a "member of" relationship, possibly further described with a date range and other notes.

  5. ALEG represents information about Awards given to Information Resources and Agents.

  6. ALEG will need to represent holdings information, so that when an ALEG user finds a work of interest the system can help them locate a physical instance of that work. Exactly how this will happen is currently an open issue.

  7. An important component of ALEG is biographical material.

  8. ALEG will be used by some partners to store records describing archival items, not just manuscripts but correspondence, exhibition and promotion material, galleys, etc. In the long term this material will probably be housed in a separate (but linked) national system, but in the short and medium term, ALEG must accommodate this material to facilitate the migration from some partners' current systems to ALEG.

  9. ALEG will probably need to record some information about its users, and maybe about how they use the system.

  10. ALEG will need to record changes made to its data by the contributing indexers. It will also need to support basic workflow: the progression of changes from 'entered in the system' to 'available to the public'.
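
To make extensions 3 and 4 more concrete, an Association between an Agent and a Topic could be held as a record of its own, carrying the optional date range and notes mentioned above. This is a minimal sketch under assumptions: the field names, the "memberOf" role string, the topic type and the placeholder writer are all invented for illustration and do not come from the ALEG design.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Topic:
    name: str
    topic_type: str  # Topics are grouped by Topic Type


@dataclass(frozen=True)
class Agent:
    name: str


@dataclass(frozen=True)
class Association:
    """Links an agent to a topic; the date range and note are optional."""
    subject: Agent
    role: str
    topic: Topic
    start_year: Optional[int] = None
    end_year: Optional[int] = None
    note: str = ""


def members_of(topic: Topic, associations: List[Association]) -> List[str]:
    """Names of agents linked to the topic through a 'memberOf' association."""
    return [
        a.subject.name
        for a in associations
        if a.topic == topic and a.role == "memberOf"
    ]


# The topic type and the writer below are placeholders, not real data.
jindyworobaks = Topic("Jindyworobaks", topic_type="literary movement")
writer = Agent("Example Writer")
associations = [
    Association(writer, "memberOf", jindyworobaks, note="date range not recorded"),
]

members = members_of(jindyworobaks, associations)
print(members)
```

Because the association is a record rather than a field on the agent, the date range and notes describe the membership itself, and the same shape could link an information resource to a topic.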



Basic Data Model

In this model:

Ancillary entities are not shown:

Some examples of how familiar entities are modelled may help.

In this example, "Voss", the novel by Patrick White is represented as a single work, two expressions (one expressed by Patrick White, the other expressed by Fernando Feroka) and two manifestations (one published by Longhams, the translation published by Garcia).

The Expression (as we shall see below) records the language, allowing the two expressions of "Voss" (the work) to be distinguished.

The above image represents the creation, translation and publication "Events" using a shorthand - a simple arrow. However, the system represents these events as entities, recording attributes of these events (time, place, input(s) and output(s)). For example, the translation event would be properly represented as follows:

Although explicit, this representation clutters up the diagrams, so we won't show events as entities explicitly, although they are really there in the model!
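
The same idea can be sketched in code: the translation and publication of "Voss" are held as Event records that point at their agents, inputs and outputs. This is only an illustration under assumptions; the class names, the `kind` strings and the labels are invented here, and time and place are left empty because the example above does not give them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entity:
    kind: str   # "Work", "Expression", "Manifestation" or "Agent"
    label: str


@dataclass
class Event:
    """An event as a first class object, not an attribute of its output."""
    kind: str
    agents: List[Entity]
    inputs: List[Entity] = field(default_factory=list)
    outputs: List[Entity] = field(default_factory=list)
    time: Optional[str] = None    # not given in the example, so left empty
    place: Optional[str] = None   # not given in the example, so left empty


feroka = Entity("Agent", "Fernando Feroka")
garcia = Entity("Agent", "Garcia")
original = Entity("Expression", "Voss, as expressed by Patrick White")
translation = Entity("Expression", "Voss, as expressed by Fernando Feroka")
translated_edition = Entity("Manifestation", "Voss, published by Garcia")

events = [
    Event("translation", agents=[feroka], inputs=[original], outputs=[translation]),
    Event("publication", agents=[garcia], inputs=[translation],
          outputs=[translated_edition]),
]


def events_producing(entity: Entity, all_events: List[Event]) -> List[str]:
    """Kinds of the events that list this exact entity among their outputs."""
    return [e.kind for e in all_events if any(o is entity for o in e.outputs)]


print(events_producing(translation, events))
```

Since each event is a record in its own right, attributes such as time and place, or a link to another event, can be added later without changing the expression or manifestation records.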

The following diagram shows how we'd represent "Voss", the opera, and a review of this opera. For clarity, the translation has been dropped, events are not shown as entities, "The Bulletin" (the Serial) is not shown, nor is the publisher of "The Bulletin":

As another example, consider "Sunday Lunch", a short story by Antigone Kefala which appeared in vol 2/3 of "Aspect" and Dale Spender's "Penguin Anthology of Australian Women Writing":

Here, "Sunday Lunch" appears in two manifestations, each of which is "partOf" another manifestation. Typically, many other manifestations would be linked with serial issue and anthology manifestations.

Note also that "Aspect" (the Serial) appears as a Work, linked to (what would be) many "Aspect" (Serial Issue) Works.
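
The "partOf" links in this example could be sketched as a reference from one manifestation to the manifestation that contains it. The class, field and function names below are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Manifestation:
    label: str
    # The containing manifestation, if any (the "partOf" relationship).
    part_of: Optional["Manifestation"] = None


aspect_issue = Manifestation('"Aspect" vol 2/3 (serial issue)')
anthology = Manifestation('"Penguin Anthology of Australian Women Writing"')

in_aspect = Manifestation('"Sunday Lunch" in "Aspect"', part_of=aspect_issue)
in_anthology = Manifestation('"Sunday Lunch" in the anthology', part_of=anthology)

all_manifestations = [aspect_issue, anthology, in_aspect, in_anthology]


def parts_of(container: Manifestation, candidates: List[Manifestation]) -> List[str]:
    """Labels of the manifestations that are partOf the given container."""
    return [m.label for m in candidates if m.part_of is container]


print(parts_of(aspect_issue, all_manifestations))
print(parts_of(anthology, all_manifestations))
```

In a full system each container would typically hold many parts, so the lookup from a serial issue or anthology to its contents would return a longer list.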


Core Entities

Work

Attributes of work:

Relationships of work:

Expression

Attributes of expression:

Relationships of expression:

Manifestation

Attributes of manifestation:

Relationships of manifestation:

Agent

Attributes of agent:

Relationships of agent:

Holding

Holdings will only be stored on ALEG if they cannot be automatically obtained by searching external systems (especially Kinetica). It is not the intention of ALEG to maintain holdings information if it can possibly be avoided!

Attributes of holding:

Archive Item

Archive Items are only held by ALEG to ease the transition of some partners onto ALEG, and will hopefully eventually reside in a national archive description format and database. For details of the proposed interim archival item facility, refer to the Archive Items proposal.

Award

Creators can receive an award either for a specific work or for their body of work.

Attributes of award:


Attributes common to all core record types


Infrastructure Record Types

RegisteredCustomers
Organisations/people who are registered as customers. Not all users are RegisteredCustomers, as some material is freely available to anyone.


Notes

The system needs to represent uniformly and efficiently these common conditions:


Issues


The AUSTLIT Indexers (Tessa Wooldridge, Jenny Huntley, Lesley Banson & Jane Rankine) produced an extensive discussion document on this inventory (Word document) on 6 June 2000.

Kent Fitch, on behalf of Marie-Louise Ayers, Annette McGuiness and Kerry Kilner
k.fitch@adfa.edu.au
Initial Draft: 22 May 2000
Revised: 26 May 2000
Revised: 9 June 2000
Revised: 27 June 2000
At this stage, a decision was made to move from the Work/Instantiation model to Work/Expression/Manifestation. The old Work/Instantiation version of this document has been archived here.
Revised: 4 July 2000
Revised: 27 July 2000