Background
ALEG is not a library catalogue system. Although the "base" entities
described by ALEG can be cast in terms used by traditional library systems
such as "title" and "author", ALEG's reason for existence is not to
duplicate the National Library of Australia's
Kinetica facility. Rather,
it is to make available a rich resource for people interested in Australian
Literature by providing:
biographical information on creators
extensive subject description of works, including relationships between
works, creators and general topics
information about criticisms and reviews of work, including subjective rankings
contextual (guided) access to full-content material where possible
There have been several recent data modelling and architecture exercises which have
greatly influenced the ALEG data model:
The IFLA FRBR
The INDECS project
The Harmony ABC strawman proposal
The ISO Topic Map standard
The IFLA FRBR
A recent (1998) development in the data modelling of library systems
was completed by the International Federation of Library Associations
and Institutions (IFLA). Known as the
FRBR
("Functional Requirements for Bibliographic Records"), it
"teases apart" the concepts of Work, Expression, Manifestation and Item
and in so doing helps model and understand the relationships between titles.
For example, it can clearly identify two manifestations as embodiments of
the same expression, or two expressions as realisations of the same
work, although one may be a language translation of another.
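The distinctions FRBR draws can be sketched as a small set of linked records. This is only an illustration: the class names follow the FRBR terms, but the fields, the sample title, publishers and years are all made up, and the Item level is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    title: str


@dataclass(frozen=True)
class Expression:
    work: Work       # the work this expression realises
    language: str


@dataclass(frozen=True)
class Manifestation:
    expression: Expression   # the expression this manifestation embodies
    publisher: str
    year: int


def same_expression(a: Manifestation, b: Manifestation) -> bool:
    """True when two manifestations embody the same expression."""
    return a.expression == b.expression


def same_work(a: Expression, b: Expression) -> bool:
    """True when two expressions realise the same work."""
    return a.work == b.work


# Hypothetical sample data, for illustration only.
novel = Work("An Example Novel")
english = Expression(novel, "English")
german = Expression(novel, "German")   # a translation is a new expression

first_edition = Manifestation(english, "Publisher A", 1957)
reprint = Manifestation(english, "Publisher B", 1960)
translation = Manifestation(german, "Publisher C", 1958)

print(same_expression(first_edition, reprint))      # both embody the English text
print(same_expression(first_edition, translation))  # different expressions
print(same_work(english, german))                   # but the same work
```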
The INDECS project
The INDECS project
(INteroperability of Data in E-Commerce Systems)
was established to develop a metadata framework for
representing intellectual property and the transactions
involving it. Their Schema
and Model documents
have a strong bent towards enabling e-commerce rights management related
transactions, but necessarily require a precise modelling of the
intellectual works and the agents who contribute to them. They also
use the basic Work, Expression, Manifestation and Item representations
of FRBR, but introduce the concept of the "Event" that describes how
these products came about - who did what, the context, the inputs to
the process.
The Harmony ABC strawman proposal
Regardless of problem domain, many entities and relationships
which different systems attempt to model are pretty much the same.
Rather than force each implementation to develop their own
representations, and hence waste effort and complicate
interoperability, the
Harmony
ABC proposal attempts to define a common framework which
diverse systems will be able to use. The ABC proposal acknowledges
that FRBR and INDECS were major sources of inspiration for their
work.
The ISO Topic Map standard
Thesauri are recognised as useful tools which add
structure to a subject list. The ISO Topic Map
standard
(ISO
13250) defines data structures which can be used as
a standard thesaurus (with broader, narrower, see also and preferred types
of relationships between topics), or as a standard "index" to a collection
of works. But Topic Maps have some other characteristics which make
them extremely powerful, including:
Topics can be assigned Topic Types (eg, "illustrator", "mountain",
"country", "era"). Topic Types are themselves Topics.
Topics can be arbitrarily linked together by "Associations". Associations
each have an "Association Type", which is itself a Topic.
Topics are linked to resources by "Occurrences". The resource
being linked to plays a "Role" in the Occurrence. Roles are
themselves Topics.
A set of Topics, their associations and occurrences form a "Topic Map".
The Topic Map is quite separate from the underlying resources which
it describes. Hence, multiple Topic Maps can be assembled and
maintained quite independently of each other, and the underlying resource.
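A rough sketch of these structures follows. It is not the ISO 13250 interchange syntax, just one possible in-memory shape. The class and field names, the sample topics and the `urn:example:` resource address are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Topic:
    name: str
    topic_type: Optional["Topic"] = None   # a Topic Type is itself a Topic


@dataclass
class Association:
    association_type: Topic   # itself a Topic
    members: tuple            # the Topics being linked


@dataclass
class Occurrence:
    topic: Topic
    role: Topic               # the Role is itself a Topic
    resource: str             # address of a resource held outside the map


@dataclass
class TopicMap:
    topics: list = field(default_factory=list)
    associations: list = field(default_factory=list)
    occurrences: list = field(default_factory=list)

    def resources_for(self, topic: Topic) -> list:
        """Addresses of every resource linked to the given topic."""
        return [o.resource for o in self.occurrences if o.topic is topic]


# Hypothetical sample data.
mountain = Topic("mountain")
country = Topic("country")
kosciuszko = Topic("Mount Kosciuszko", topic_type=mountain)
australia = Topic("Australia", topic_type=country)
located_in = Topic("located in")
mention = Topic("mention")

topic_map = TopicMap(
    topics=[mountain, country, kosciuszko, australia, located_in, mention],
    associations=[Association(located_in, (kosciuszko, australia))],
    occurrences=[Occurrence(kosciuszko, mention, "urn:example:work/123")],
)
print(topic_map.resources_for(kosciuszko))
```

Because the map holds only addresses of resources, a second `TopicMap` over the same resources could be built and maintained without touching this one.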
The ALEG data model
After reviewing the goals of the
ALEG system and the data and relationships it needs to represent,
seeking the opinions of experienced and respected figures in the Australian
library community and reviewing the information discussing library
data modelling, we have decided to base the ALEG data model on FRBR and INDECS
models, adding:
an entity to represent awards
an entity to represent holdings information where a manifestation of
a work has been sighted
Topic Maps to represent subjects, including relationship types (for
example, "creator", narrower term "illustrator")
Encoded Archival Description (EAD) to represent archive items (initially populated by
item descriptions from the Lu Rees collection)
and removing:
the FRBR Item entity
the transaction aspects of INDECS
For a full description of the development and contents of the
ALEG data model, refer to the data
model documentation.
Maintenance User Interface
The ALEG Data Maintenance User Interface will be used by ALEG partners
to maintain the ALEG data base. The interface will be web browser
based, hence special software will not need to be installed and
upgraded.
There are many web browsers each with varying capabilities, adherence
to standards, popularity and cost. It has been decided to base the
ALEG Data Maintenance User Interface on Microsoft's Internet Explorer version 5
(IE5) because IE5:
implements an advanced programming interface which allows the
development of sophisticated applications
is a stable product and widely installed, having been released in March 1999
is available for current and recent popular operating systems
(Windows 95/98/NT/2000, Macintosh)
and is automatically bundled with most operating system installations
has overwhelming market share (estimated at
over 85% by Stat Market as
at 18 June 2000)
is free
[IE version 5.5 has just been released (12 July). Claimed new user functions
are pretty minimal - print preview, better performance. However,
support for programming has been significantly improved, and IE5.5
is probably a significantly better platform for undertaking
user interface development, although many of the new features
are apparently not aligned with the W3C standards in-progress.]
The guiding principles of the user interface are simplicity, speed and
ease of use. The user interface must allow the user to do as much
as possible with as few keystrokes/mouse clicks as possible. It must
present the most likely action as the default and do as much as possible
to prevent common mistakes and maintain the integrity of the system.
Wherever possible, the system should acquire data by requiring that
the user makes a
selection from possible values rather than having to type a complete
value into a text box. The interface must remember the context in
which the user is working and provide appropriate "templates", default
values and an easy-to-access list of recently assigned values to
reduce searching.
A simple mockup of a web page to allow editing of a
work/expression/manifestation is available
here, but it
will only be readable if you are using the Microsoft Internet Explorer 5
(IE5) browser. (If you are not using IE5, screen snapshots which
can be viewed with any browser are available here.)
Some of the required specific attributes of the editing facility are:
It must be fast. As ALEG partners will be connecting over a Wide Area
Network, it must minimise round-trips between the browser and web server
and cache data in the client.
Select rather than enter. The system should present users with
valid selections from lists or as checkboxes/radio buttons wherever
possible rather than requiring entry of text. For example, the
relationship of an agent with a work should be chosen by the user
from a list which shows only the possible relationships (the system
must recognise, for instance, that "publisher" is a valid relationship
for a manifestation, but not for a work or expression).
Search/browse and select. For many selections, the "list" of
possibilities is too large to practically show (too slow to load to the
client, too cumbersome to navigate). In these cases, such as an agent name
list, the system should allow the user to type as many or few characters
as they wish to restrict the search. For example, a user may enter
"white" to retrieve the 103 "white" names into a select list. Each
entry in the select list must show enough information to allow the
user to differentiate them (eg, full name, dates and locations where
available) and allow the user to find out more about a particular
item in a popup window (eg, for a name, show related names, brief
work list, link to a biography if available).
Local error detection. Ideally, errors should be prevented
by the user interface, but otherwise editing should be performed in the client
so that errors can be detected as soon as practicable and without a round-trip
delay to the server. Some types of errors cannot be detected by the system (a typo
in a new title), but wherever possible, the system should detect
errors as soon as possible and with the least delay to the user.
Context sensitive operation. The system must present a "template"
which adapts to recognize the type of data being entered. For example,
if the user indicates that the type of work being entered is a periodical
issue, then the system should react accordingly:
Prompt for the periodical work of which this new work is a specific issue.
Prompt for the issue-specific details (number, date).
Make it easy for the user to lookup details entered against the
periodical work. For example, editors will typically be entered
against the periodical and "inherited" by each issue in the tenure
date range of the editors. When deciding whether to add extra editors
against a particular issue, the user must be able to quickly and easily
check which editors are already in effect, and decide to update
editor details against either or both of the periodical work and the
new issue in hand.
Because the user will often want to now record manifestations of
existing or new works as being "partOf" this periodical issue manifestation,
the system should remember this periodical issue as a likely
"container" when the user wants to record new manifestations of works.
Sensible defaults. Where the context of the application
permits, the system should choose sensible defaults. For example, when creating
a relationship link (or creation event!) between a work and an agent, the
role of the agent should default to "creator". The language of an
expression should default to "English".
Notes at various levels. The system must allow the user to
record notes at various levels, such as at the work level, each
expression and manifestation level, and each agent, topic (thesaurus entry),
relationship and event
level. The notes can be of 3 specific types:
a general public note, designed to be shown as part of the detailed record
view - for example, "Although nominally the new-talent editor, he is known to
have spent most of the 1960's completely sozzled and was about as welcome
at an editorial meeting as a pokie at a parson's picnic". But beware of
making actionable statements.
a public source note, describing where this information was sourced
a system-internal note, not designed for 'public consumption', but
possibly clarifying some fact or making some observation
Each of the 3 types can occur multiple times and is 'stamped' with the
userid and date time of creation.
Update History. The system must make it easy for the user
to see the update history of the information in front of them - who
updated what and when.
Mass updates. The system must support the common mass
updates which must be applied to multiple records. In some ways
this may be relatively easier in this system as information is
stored in as few places as possible and referenced (pointed to)
rather than being copied. Hence, if the volume/issue information
for a periodical issue needs to be changed, it just needs to be
changed in one place, as all the 'partOf' manifestation records
of the manifestations that periodical issue contains point to
the periodical issue, rather than duplicating information about it.
An anticipated common mass update will result from ongoing
thesaurus/topic revision, where one term replaces another.
[Any other common cases which need to be dealt with initially?]
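The term-replacement case can be sketched as follows, assuming subjects are stored as references to topic identifiers rather than as copied labels. The `Catalogue` class, its method names and the sample identifiers and labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Catalogue:
    # topic id -> preferred label, stored in exactly one place
    topics: dict = field(default_factory=dict)
    # work id -> set of topic ids (references, not copies of the labels)
    subjects: dict = field(default_factory=dict)

    def rename_topic(self, topic_id, new_label):
        """Change a label once; every work pointing at the topic sees it."""
        self.topics[topic_id] = new_label

    def replace_topic(self, old_id, new_id):
        """Re-point every reference from old_id to new_id.

        Returns the number of works that were changed.
        """
        changed = 0
        for topic_ids in self.subjects.values():
            if old_id in topic_ids:
                topic_ids.discard(old_id)
                topic_ids.add(new_id)
                changed += 1
        self.topics.pop(old_id, None)   # the superseded term is retired
        return changed

    def labels_for(self, work_id):
        return sorted(self.topics[t] for t in self.subjects[work_id])


# Hypothetical sample data.
catalogue = Catalogue(
    topics={"t1": "bush poetry", "t2": "bush verse", "t3": "drama"},
    subjects={"w1": {"t1", "t3"}, "w2": {"t2"}, "w3": {"t1", "t2"}},
)
works_changed = catalogue.replace_topic("t2", "t1")
print(works_changed, catalogue.labels_for("w3"))
```

A real implementation would also need to write the change to the update history, which this sketch ignores.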
An update scenario - adding a new manifestation
Click a "+" against one of the existing manifestations
to create another one by cloning the selected manifestation.
A new manifestation appears on the screen. (The mockup page
lets you see how this might happen - what follows next has not been
implemented in the mockup.)
Probably amend the title, edition and ISBN/ISSN field if they
are not applicable for this manifestation
Edit the publication event. This will open a new popup
window looking something like this:
Because you're telling
the system that you want to define an event with a
manifestation the system will
default the type of event to 'publication' (but you
can change it if you want it to be something else, such as
'printer').
Because the event type is 'publication', the system will
present you with a way to select publishers as being the
agent in this event.
Now, the system will know all the agents that have ever
been assigned as publishers, and it will know the last
10 or 20 publishers which you personally have assigned.
So, it would probably be a good idea if it:
let you select very easily from the last 5 or 10 or 20 (?) publishers,
maybe by showing them in a list
let you search very easily from all the publishers in the
system, maybe by entering the first 2 or 3 characters
of their name, to which the system would respond by
showing all the publishers starting with those characters
and allowing you to select one (this would be then added to
your list of recently assigned publishers, making it even
faster to select it next time...)
if you need to assign an agent which isn't normally a
publisher, it should let you search all agents
if you need to create a new agent (new publisher), it
should let you do that easily and without losing the
context of the manifestation you are working on - that
is, it should do it in popup-windows and not force
you to move off or close the manifestation window
it should allow you to specify special common cases
in some simple way, such as indicating something
was published by the author, or publisher is unknown.
let you assign a publication place and year. Again, places
will be selectable from the spatial topics, favouring those
that have ever been used as publishing places, and remembering
those which you have assigned recently.
let you assign source, public and system notes and
a workflow note to be emailed to a co-worker
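The "recently assigned" list and the prefix search described above might behave something like the sketch below. The `PublisherPicker` class, its method names and the publisher names are invented for illustration. In the real interface this logic would more likely live in browser script backed by a server lookup.

```python
from collections import OrderedDict


class PublisherPicker:
    def __init__(self, all_publishers, recent_limit=10):
        # Every agent ever assigned as a publisher, sorted for display.
        self._all = sorted(all_publishers, key=str.casefold)
        # Publishers this user has assigned, most recent last.
        self._recent = OrderedDict()
        self._limit = recent_limit

    def search(self, prefix):
        """All publishers whose name starts with the typed characters."""
        wanted = prefix.casefold()
        return [n for n in self._all if n.casefold().startswith(wanted)]

    def assign(self, name):
        """Record an assignment and move the name to the front of 'recent'."""
        self._recent.pop(name, None)
        self._recent[name] = True
        while len(self._recent) > self._limit:
            self._recent.popitem(last=False)   # drop the oldest entry

    def recent(self):
        """Recently assigned publishers, most recent first."""
        return list(reversed(self._recent))


# Hypothetical publisher names.
picker = PublisherPicker(
    ["Bloomfield Books", "Acacia House", "Bloom Press"], recent_limit=2
)
matches = picker.search("bloom")
for name in ["Acacia House", "Bloom Press", "Bloomfield Books", "Bloom Press"]:
    picker.assign(name)
print(matches, picker.recent())
```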
The above screen-shot is quite large and busy. How could
it be simplified?
Don't show the most recently assigned publishers. Because
manifestations will be processed in a "random" order (with
regards to publishers), remembering the last 10 or 20
publishers doesn't help much. Instead, the indexer
must search for the publisher each time by entering
the start of the publisher's name
Move the publisher search to a popup. For example, the
"base" event screen might start like this:
The user wants to select a publisher starting with "Bloom"
so they enter the letters "Bloom" in the agent field
like this:
and then press the search button which pops up a window looking
like this:
This might look better, but is it "better" in practice
for an experienced user? This is hard to say - some people
don't mind popup windows, some people find them
distracting. Some people are most interested in minimizing
keystrokes, others prefer step by step approaches even
if it means more typing....
The same comments apply to the place of publication -
defer selection to a popup window?
Alternatively, as the publisher has a very strong
correlation with place of publication, the selection
of publisher could populate a selection list of likely
places (and allow for new places to be defined for the
first time).
If the Public and System notes and workflow notes are
only infrequently used, then they could be 'hidden' and
only shown when actually used, or when a 'show notes'
button was pressed (which would possibly show them
in a popup, ready to be manipulated).
Many of these issues will arise when the prototype
system is actually tested. The goal will be to refine
the prototype to produce a design which will
allow the people using it to be as productive as possible
when they are experienced users of the system.
That is, whilst it is important for the system to
be easily learnt, it would be a mistake to orientate the system
solely towards inexperienced users at the expense of
the productivity of experienced users.
Management and Coordination
Although the database is centralised, the operation of ALEG
is distributed across Australia. The database will be maintained
by different groups and the system must assist communication
between those groups and make appropriate adaptations to the
differences in the ways those groups will want to work.
The system will support distributed operation in these ways:
Different update/authorisation levels.
The system will define Roles which can be selectively assigned to the
staff of ALEG partners to allow them to update
information. Suggested roles are:
My personal experience in this area is that
trust is rarely, if ever, abused. That is, with a skilled
and dedicated team of people, they'll never
do the 'wrong thing' anyway, and time spent
putting lots of programming and administrative
effort into defining and maintaining permissions is better
spent putting in place a system which makes recovery from honest
mistakes as painless as possible...
Audit trail.
The system will provide an audit trail showing who did what and when.
Ideally, it will allow some limited 'undo' when someone mistakenly deletes
a topic, or merges two terms.
Workflow support.
The system will allow records to be flagged as incomplete or
needing intellectual input from a nominated person or persons
before they can be completed. The system will allow maintainers
to see what work is waiting for whose input. It will also support
the maintenance of internal notes being attached to an entity or
relationship to show the history of discussions on an issue.
Identification of entities as part of a collection.
Entities can be identified as belonging to a
specialist collection. It is proposed that the topic map architecture
be used to assign one or more collection topics to an entity,
making possible the identification and extraction of those entities as part
of a collection.
Non-public entities.
The system will support marking entities and relationships as
being in a non-public state, where the information they contain
can only be seen by ALEG partner staff.
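As an illustration of the audit trail with limited undo, here is a minimal sketch. It assumes a simple key/value store in which stored values are never `None`. The class names, keys and userids are hypothetical, and a real implementation would sit on top of the database.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    userid: str
    when: datetime
    action: str      # "set" or "delete"
    key: str
    before: object   # value before the change, or None if there was none


class AuditedStore:
    def __init__(self):
        self.records = {}
        self.trail = []

    def _log(self, userid, action, key):
        # Capture the value as it was before the change is applied.
        self.trail.append(AuditEntry(
            userid, datetime.now(timezone.utc), action, key,
            self.records.get(key)))

    def set(self, userid, key, value):
        self._log(userid, "set", key)
        self.records[key] = value

    def delete(self, userid, key):
        self._log(userid, "delete", key)
        self.records.pop(key, None)

    def undo_last(self, userid):
        """Reverse the most recent change; the reversal is itself logged."""
        if not self.trail:
            return False
        last = self.trail[-1]
        if last.before is None:
            self.delete(userid, last.key)
        else:
            self.set(userid, last.key, last.before)
        return True


store = AuditedStore()
store.set("alice", "topic:42", "bush poetry")
store.delete("bob", "topic:42")        # a mistaken deletion
undone = store.undo_last("alice")      # restores the topic
print(undone, store.records)
```

Keeping the undo in the trail as an ordinary change means the history always shows who did what and when, including who reversed it.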
No training required
Doubtless, a few obligatory help pages will be constructed and
a tip-of-the-day may lurk unobtrusively at the bottom
of the home page, but no one will be expected to refer to these
resources in order to use the public ALEG system. (The maintenance
system will require documentation and training however, as that
user interface will be much more complex and powerful.)
Searching and browsing
The system will offer the user two intertwined access modes, searching
and browsing. More on this below.
Clean layout, fast loading
Sorry, no, changed our mind - messy, impenetrable and sluggish as treacle
Navigation
Page headers and footers will show the user
the context of the current page and allow them to move
rapidly up the content hierarchy, to the previous page
and to the home page, heeding the design advice
of Jakob Nielsen on
navigation and
on providing an
unsurprising user interface.
Browser aware
Users with more recent browsers will be able to receive an enhanced
browsing experience.
Different browsers have different capabilities. Although
Internet Explorer 5 arguably has the greatest capabilities and
greatest speed, is free and available for most operating systems,
some users choose not to install it, or can't for whatever
reason.
System designers like to be able to offer the "best"
user interface, and not just the best one achievable with circa-1995
lowest-common-denominator web browser technology. With Internet
Explorer now bundled with Windows, and having massive market share,
it seems reasonable to target the system to IE4 and IE5 users.
But Netscape is more widely used in academia than in the general
population, and in any case, a system can't ignore 15%-20% of its
potential users.
The approach taken with the CSIRO
web site was to generate the content in XML and to then translate
it into different versions of HTML/DHTML depending on the
client capabilities. This approach has some drawbacks:
the site
can look slightly different to different users
web caches are
nullified because the content cannot be reasonably cached as
used by users with different browsers
some things are just fundamentally impossible in some browsers
and so it is more than a matter of styling to provide the same
information
but it is nevertheless better than the alternatives of not optimising
presentation or ignoring some users.
The layers programming model used by Netscape Navigator 4.x is
defunct - it will not be supported by future versions of
the browser. The new version of Navigator, the open-source
Mozilla browser,
has been long-delayed but is now in alpha
test. Although there were initial hopes that IE and Navigator
would converge to implement a common standard, those
aspirations are fading,
forcing projects like ALEG to choose between a lowest-common-denominator
approach or supporting multiple browsers for a long, long time.
Hence, given market share and programming realities, the ALEG user
interface will have as a primary target
users of IE4/5 but produce a suitable (if not optimised) version
for other browsers and will support any browser capable of
rendering nested tables (Netscape 2 and above).
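One way the server might choose between the two versions is to inspect the browser's User-Agent header, as sketched below. The function name, the tier names and the sample header strings are assumptions. Sniffing the User-Agent is only one possible approach and is known to be fragile.

```python
import re


def rendering_tier(user_agent: str) -> str:
    """Return 'enhanced' for IE 4 or later, otherwise 'basic'.

    'basic' stands for the nested-table version intended for any
    other browser.
    """
    match = re.search(r"MSIE (\d+)", user_agent)
    if match and int(match.group(1)) >= 4:
        return "enhanced"
    return "basic"


# Sample header values in the style sent by browsers of the period.
ie5 = "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)"
ie3 = "Mozilla/2.0 (compatible; MSIE 3.02; Windows 95)"
netscape4 = "Mozilla/4.7 [en] (Win98; I)"

for header in (ie5, ie3, netscape4):
    print(rendering_tier(header), "<-", header)
```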
User configurable
ALEG will offer a default user interface, rendered using slightly
different techniques dependent on browser capability as described
above. However, some users may wish to configure the user
interface, for example:
to use or not use frames
to use a version which does not require any
client-side javascript capability
Search on every page
A search box and button will be prominently visible on every page.
Sometimes a search may be performed on the whole site, or within
some clearly identified context (this technique is used by the popular
directories, eg this
page offers a search on the whole Google Directory, or just within the
romance section; this
page is Yahoo!'s less explicit equivalent).
ALEG branding
The web site is the only contact most users will have with the
ALEG project. The ALEG user interface will consistently and
simply promote an ALEG identity through a simple logo (yet to
be designed) and style (colour scheme, layout).
Searching and browsing
Users can discover the contents of the ALEG data base by
searching and browsing.
Searching refers to the entry of some text by the user
describing what they are looking for. Some search systems
only allow the user to specify text. The system then returns the
results which it evaluates as best matching what the user
is looking for, ranked in the order which the system evaluates
as most likely to meet the user's expectations. Popular
examples of this apparently simple search strategy are
Google and the main
search at the top of the Amazon.com
site.
"Apparently simple", because the system has to work out
what the user means (which is anything but simple).
For example, enter "fence" into the Amazon.com
search engine. Now try "Harry Potter". The Amazon search engine groups
likely results into categories: books, music, DVD's, videos,
electronics, software etc.
Google relies on a
recursive-citation-ranking algorithm
(paper in PostScript format)
to order the pages matching a search criteria, but has recently
augmented the search results with a category result where the
search term contains a known topic (eg,
search
Google for "Harry Potter").
Another approach is to get the user to "help" the search
engine by telling it more about the context of the search
phrase they're entering, and possibly supplying filters
to reduce the returned results.
For example, Amazon offers a detailed "power search" if you
look
hard enough.
For many sites, such as the British
Library OPAC, getting the user to supply lots of information
up front in return for providing an accurate search is the modus
operandi.
Directory browsing takes the different approach of
classifying all material (often in many ways)
and presenting the user with a hierarchical classification
directory. Popular examples include
Yahoo! and the
Open Directory Project
(also hosted as the Google
Directory).
What approach should ALEG take? The decision doesn't
have to be made now, and can be changed during
or after implementation. Here are some discussion
notes:
If you don't provide reasonable query analysis and
result ranking, you're going to need to let the user
specify scope and filters. Building systems which provide
good query analysis and result ranking takes time, so making
the user do it themselves could be a viable cop-out.
(Them's fightin words...)
Even if you provide excellent general ranking, there will
always be occasions where a "hand-crafted" search by the
user could have returned a more precise match.
ALEG will have topics galore, but will we build
enough hierarchy into the topic maps/thesauri to support
an appropriate fan-out of topics at each level
(not too many, not too few)?
Searching within a scope narrowed down by a directory
browse works very well (for the patient at least).
Step-wise refinement and query set manipulation was
popular with search engines of 5 and more years ago. Is
it still a useful technique, or is it just as easy to
do the search again with another search term appended?
Some approaches...
Simple search
Simple search specifying very broad scope
Getting context from the user more explicitly
Getting lots of context from the user
What comes back?
The hyperlinked items in the above displays take the user to
a page containing the described information (Patrick White's
biographical data, the page describing "Voss" as known
to ALEG, etc). The crudely drawn + and - buttons just
open and close parts of the information hierarchy shown to
the user. (Doing this is relatively straightforward with
IE5, but so difficult as to be impractical under Netscape.
Hence, users with older browsers would not see this
dynamic display of information, but would either get a long
list or hyperlinks which would result in another page being
sent from the server when clicked. The directory approach
of Yahoo! would suggest that work forms may be sub-directories,
reviews of those works sub-sub directories, etc.)
What happens when the user clicks on the "Reviews of Works"
button, with 1134 results? Well, 1134 is too big a list to
show, so these would have to be grouped somehow, probably
based on the works being reviewed.
That seems straightforward,
but may not be a general solution. For example,
"Patrick White as subject" describes 256 works. How should
they be grouped?
The user should also be able to search within their
current scope. So, if the user has somehow positioned
to "Voss", they should be able to search everything "under"
that position for some other term, say, "David Malouf",
and presumably find the relationship between "Voss the novel"
and "Voss the opera".
It is clear how to do this
with the directory paradigm. For example, imagine you
were on the Open Directory Project's
Nevil
Shute page, and you entered "Alice Springs" in the
search box and made sure the option "Search only in Shute, Nevil"
was displayed (give it a go!) - you'd probably be pretty
happy with the results. But how to do it with a + and - tree
directory (like Windows Explorer) isn't quite so obvious.
The Windows paradigm would have users right-mouse clicking the
item and choosing "Find" from the popup menu which would open a dialog
box and put the search results in a new window.
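Whatever the presentation, the underlying operation is a search restricted to everything "under" the current position. A minimal sketch, assuming the information hierarchy is available as a simple tree (the `Node` class and all the labels below are made up for illustration):

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)

    def add(self, child):
        self.children.append(child)
        return child


def search_within(scope, term):
    """Paths (lists of labels) of every node below scope matching term."""
    wanted = term.casefold()
    hits = []

    def walk(node, path):
        for child in node.children:
            child_path = path + [child.label]
            if wanted in child.label.casefold():
                hits.append(child_path)
            walk(child, child_path)

    walk(scope, [scope.label])
    return hits


# Hypothetical hierarchy.
root = Node("ALEG")
voss = root.add(Node("Voss"))
voss.add(Node("Voss (novel)"))
opera = voss.add(Node("Voss (opera)"))
opera.add(Node("Libretto: David Malouf"))
root.add(Node("David Malouf"))   # the agent's own entry, outside "Voss"

scoped_hits = search_within(voss, "david malouf")
site_hits = search_within(root, "david malouf")
print(scoped_hits)
print(len(site_hits))
```

The scoped search returns only the hit beneath "Voss", together with its path, which is what lets the user see how the two are related.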
Locating holdings
For some manifestations of works, ALEG will record information about where
it has been sighted. This information will be shown (decoded) to the user.
However, a goal of ALEG is to offer detailed and relevance-ranked holdings
information to the user, highlighting "local" and "nearby" holdings at the
top of the holdings list.
It was never imagined that ALEG would store this information, as it is
already available in NLA's Kinetica. However, how best to acquire this
information and show it to the user is still undecided. The issues include:
Access to Kinetica holdings information costs money. ALEG would
probably have to pass these charges on to the user, hence forcing
subscription access, at least for users incurring costs.
ALEG could issue Z39.50 queries to leading libraries (University and
State, perhaps others) rather than issue the request to Kinetica. This
is obviously more expensive in total resource terms than issuing a single
request to Kinetica (but will not incur financial charges), and could be
problematic due to diverse Z39.50 target
implementations and Z39.50 target availability.
Users will sometimes be interested in the availability of a work
or expression, rather than a specific manifestation. This will mean
that ALEG normally wouldn't be able to issue a query on a unique
identifier (such as an ISBN (!!)), instead trying for a match on
author and title. Hence, the results may be far from ideal.
A useful service would be ranking the holdings to show those
'nearest' to the user at the top of the list. This is impossible
unless we know quite a bit about the user's location. Maybe we can
get this information from the user's profile (registered user) or maybe we
can deduce it from the IP address (a hit and miss affair, especially
with multiple campuses, ISP's, wireless networks).
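If the user's location were known, even roughly, the ranking itself could be quite simple. The sketch below assumes each holding carries a city and state and that the user's city and state come from a profile. The `Holding` fields, the three-tier scheme and the library names are assumptions, not a decided design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Holding:
    library: str
    city: str
    state: str


def rank_holdings(holdings, user_city=None, user_state=None):
    """Order holdings as local (same city), nearby (same state), other.

    With no known location every holding falls in the last tier and
    the list is simply alphabetical by library.
    """
    def tier(holding):
        if user_city and holding.city == user_city:
            return 0
        if user_state and holding.state == user_state:
            return 1
        return 2

    return sorted(holdings, key=lambda h: (tier(h), h.library))


# Hypothetical holdings.
holdings = [
    Holding("Library A", "Perth", "WA"),
    Holding("Library B", "Canberra", "ACT"),
    Holding("Library C", "Queanbeyan", "NSW"),
    Holding("Library D", "Sydney", "NSW"),
]
ranked = rank_holdings(holdings, user_city="Sydney", user_state="NSW")
print([h.library for h in ranked])
```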
For these reasons, we are not hopeful that holdings information
will be one of the great features and attractions of ALEG
by January 2001, although we will work on approaches to these
problems over the next few months.
Accessing archive items and manuscript collections
As discussed in the data model (needs updating), archive items will be held in
Encoded Archival Description (EAD) format
as part of the ALEG database, pending a more permanent home. Initially,
these resources will not be searchable but will be accessible only
from the agent associated with the archive item collection, and occasionally, from
a work associated with some items in the archive item collection. Where
archival information is available it will appear as a hyperlink from
the agent or work page, and the user will receive a simple formatted list of
archival item records, grouped by record type. This will be achieved by
formatting the EAD with a simple XSL stylesheet
into HTML.
Information about manuscript collections is statically recorded
in ALEG. That is, ALEG does not dynamically search RAAM
and present what it finds. Rather, the indexer decides whether to
link a manuscript collection to an agent (or possibly a work)
and this information is presented to the user, typically as a note and
an optional hyperlink to a specific resource in RAAM or elsewhere.
A task for a future enhancement, maybe in concert with the
redevelopment of RAAM, could be to have a dynamic link, or maybe
an alert when new material is added to RAAM.
Access restrictions
Whether ALEG is completely or partially a walled garden has yet to
be decided, and probably won't be fully decided by the time implementation
must start (if ever!).
However, to allow for possible access restriction requirements,
the system:
must support registration of users. The
data model inventory
describes a simple data structure to represent "customers".
must support identification of users ("logging on") at least
to some parts of the system.
must support the logging of what users do where that action may
give rise to some charges (searching on Kinetica??).
must support restricting access to some information to registered
users, or some group of anonymous users based on connection IP address
(for example, users from ALEG partners may have complete anonymous
access to the system). The information to be restricted has yet to
be decided...
need not implement an interface with an external accounting system
for the automatic generation of invoices and reconciliation of
payments, as this will probably be manual or semi-automated and is
outside the scope of this system for the time being.
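The requirements above can be sketched as a single access decision plus a log of chargeable actions. This is a minimal illustration only: the partner network ranges are reserved documentation addresses, and the names Request, may_view and log_chargeable are hypothetical rather than part of any ALEG design. Which information is actually restricted remains undecided, so the sketch simply takes that as a flag.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional, Tuple

# Hypothetical partner networks (reserved documentation ranges, not real sites).
PARTNER_NETWORKS = [ip_network("192.0.2.0/24"), ip_network("198.51.100.0/24")]


@dataclass
class Request:
    client_ip: str
    user_id: Optional[str] = None  # set once a registered user has logged on


def is_partner_address(client_ip: str) -> bool:
    """True if the connection comes from a known partner network."""
    address = ip_address(client_ip)
    return any(address in network for network in PARTNER_NETWORKS)


def may_view(request: Request, restricted: bool) -> bool:
    """Decide whether a request may see a piece of information."""
    if not restricted:
        return True                      # freely available material
    if request.user_id is not None:
        return True                      # registered, logged-on user
    return is_partner_address(request.client_ip)  # anonymous partner access


# In-memory stand-in for the log of actions that may give rise to charges.
charge_log: List[Tuple[str, str]] = []


def log_chargeable(request: Request, action: str) -> None:
    """Record who performed a potentially chargeable action."""
    charge_log.append((request.user_id or request.client_ip, action))
```

For example, may_view(Request("192.0.2.17"), restricted=True) allows an anonymous user on a partner network, while the same call from an address outside those networks is refused until the user logs on.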
Multiple Views
As mentioned above, ALEG will enable
entities to be grouped into collections. Collections
can be extracted by their owners for their own formatting and
publication, or can be exposed with their own identity on
the ALEG site.
For example, assuming that a South Australian Women Writing
collection is created, then it may be decided to:
Put an icon/link on the ALEG home page in the "specialist collection
information" part of that page
Link to a page containing an appropriate banner,
acknowledgements, background etc and an alphabetical hyperlinked list
of South Australian Women Writers
Link each writer to a page containing bio info and hyperlinked list of
works
Link each work to a page showing work details
Enable a simple author/title search over the collection.
Provide an option to link back to ALEG for more complex, complete searching.
Branding.
Each agent and work, expression, manifestation entity can be
identified with one or more partner institutions as being the
contributors to that entity. (Agent biographies can also be
separately so identified.) As part of that identification,
the contributing institution can tick a checkbox which
says: "this institution should be explicitly acknowledged
as a contributor to this entity when it is displayed in detail".
When required, the ALEG interface will then display a subtle
hyperlinked acknowledgement (text and/or icon) which the user
can follow to an institution-specific page on ALEG which
would typically contain a blurb about that institution's
contribution to ALEG and further links. This page would be
of the institution's own design.
Source.
ALEG will have a provision for a public 'source note' to be recorded
on each entity and relationship. Where a public source note is
available it will be shown in the detail view of an entity/relationship.
The initials of the ALEG maintainer creating the source note will
be shown and hyperlinked to a maintainer-specified page on ALEG
showing a happy snap, resume, favorite cocktail recipe, whatever.
What a fine way to personalise ALEG and give a human voice to
all that information!
Logging
The system will generate two types of logs:
Because the ALEG interface will be running under a web server, a standard
web log (in
common
log format) will be produced. Freely available and flexible log
analysis tools such as
Analog can read such a
log to produce basic access statistics by system function based on
IP address/network, and, if user authentication is required, by userid.
"Interesting" parts of the system will log
search terms and search options for later analysis to help understand
how people are using the system and identify areas requiring
improvement (such as new thesaurus entries for "see also" terms).
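As an illustration of the first kind of log, the sketch below parses common log format lines and counts requests by host and by authenticated userid, which is the sort of basic statistic a log analysis tool produces. The sample lines, request paths and the function name summarise are invented for this example, and a dedicated analysis tool would do considerably more.

```python
import re
from collections import Counter

# Common log format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)$'
)


def summarise(lines):
    """Count requests per host and per authenticated userid."""
    by_host, by_user = Counter(), Counter()
    for line in lines:
        match = CLF_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that are not valid common log format
        by_host[match["host"]] += 1
        if match["user"] != "-":  # "-" means the request was not authenticated
            by_user[match["user"]] += 1
    return by_host, by_user


# Hypothetical sample lines; addresses, userid and paths are made up.
SAMPLE_LOG = [
    '192.0.2.17 - - [10/Oct/2000:13:55:36 +1000] "GET /search?title=example HTTP/1.0" 200 2326',
    '192.0.2.17 - jsmith [10/Oct/2000:13:56:01 +1000] "GET /agent/42 HTTP/1.0" 200 512',
    '198.51.100.4 - - [10/Oct/2000:14:02:11 +1000] "GET /work/7 HTTP/1.0" 404 -',
]

if __name__ == "__main__":
    hosts, users = summarise(SAMPLE_LOG)
    print(hosts.most_common())
    print(users.most_common())
```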
General Principles
The primary rationale for establishing the Australian Literature
Electronic Gateway is:
To enhance research and learning in Australian literature
In addition, the project has a number of subsidiary rationales:
To integrate and maximise the utility of existing infrastructure and content
To provide a publishing vehicle for the results of long term research
To capitalise on the development of new technologies which support these aims
To fulfil and deliver on obligations to the ARC and partners
To deliver a public good in the public interest
The delivery of a public good in the public interest itself involves:
Developing long term sustainability of the public good
Commitment to a co-operative, collaborative, non-profit ethos
Ensuring that financial decisions, including achieving optimal cost recovery,
support but do not drive the Gateway's strategic goals and primary rationales
Intellectual Property
The Gateway will encompass three distinct sets of Intellectual Property in support
of its primary and subsidiary rationales:
The IP residing in the system design
The IP residing in records and other information provided by partners
The IP residing in records and other information sourced from third parties
The Gateway system will support identification of the IP residing in partner and third
party records and other information (see Branding and Source Identification), and an IP register will be maintained by the project
managers. All IP will be protected and formalised using the Blake Dawson Waldron Draft Gateway Agreement
as a basis for formalisation of ownership, relationships and licensed uses of IP.
Public Access
The Gateway system will support partial or complete access restrictions (see Access Restrictions) to
all Intellectual Property (such as records), and will support flexibility in setting such
restrictions. The project's aim will be to maximise the amount of quality information available
free of charge, which will also substantially populate generalist directories such as the Open Directory Project,
whilst retaining sufficient 'value added' attractions behind the 'walled garden' to achieve
optimal cost recovery.
See the Business Model
for fuller discussion of
business issues in relation to the Service, Partner and Data Management modes.
ALEG as a Z39.50 target
ALEG will host a Z39.50 target implementing the
Bath
profile, initially just for Functional Area A (Basic Bibliographic Search
and Retrieval) at Conformance Level 0, as described
in
section 5.A.0 of the Bath Profile, with support for the
XML
DTD
, Simple Unstructured Text Record Syntax (SUTRS) and MARC21 record syntaxes.
Extracting information from ALEG
The ALEG system will license partners to
extract records from the database in XML format. The criteria
for extraction (i.e., how to specify what is to be extracted) have yet to
be decided, but typical examples could be:
All agents associated with a specific attribute or Topic (or set of
attribute Topics) in
a specific role (or sets of Roles) and the works associated with those
agents in a specific relationship (or set of relationships).
(For example, all agents with a gender of "Female" and either a birthplace
of "South Australia" or some period of residency in "South Australia" and
all the works they have "created" or "edited".)
All works and/or agents associated with a nominated
collection.
All archival items associated with a nominated agent and/or work. (This
would allow the production of the Lu Rees author files.)
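The first example criterion above can be sketched as follows. Since the extract criteria and the XML format are undecided, everything here is an assumption made for illustration: the Agent and Work classes, the select_agents and extract_xml functions, the element and attribute names in the output, and the placeholder data.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Work:
    title: str
    relationship: str  # how the agent relates to the work, e.g. "created"


@dataclass
class Agent:
    name: str
    gender: str
    birthplace: str
    residencies: List[str] = field(default_factory=list)
    works: List[Work] = field(default_factory=list)


def select_agents(agents, gender, place):
    """Agents of the given gender who were born in, or resided in, the place."""
    return [
        a for a in agents
        if a.gender == gender and (a.birthplace == place or place in a.residencies)
    ]


def extract_xml(agents, relationships):
    """Serialise agents and their works in the nominated relationships."""
    root = ET.Element("extract")
    for agent in agents:
        agent_el = ET.SubElement(root, "agent", name=agent.name)
        for work in agent.works:
            if work.relationship in relationships:
                work_el = ET.SubElement(agent_el, "work", relationship=work.relationship)
                work_el.text = work.title
    return ET.tostring(root, encoding="unicode")


# Placeholder data only; not real ALEG records.
AGENTS = [
    Agent("Agent A", "Female", "South Australia",
          works=[Work("Work One", "created"), Work("Work Two", "reviewed")]),
    Agent("Agent B", "Female", "Victoria", residencies=["South Australia"],
          works=[Work("Work Three", "edited")]),
    Agent("Agent C", "Male", "South Australia",
          works=[Work("Work Four", "created")]),
]

if __name__ == "__main__":
    selected = select_agents(AGENTS, "Female", "South Australia")
    print(extract_xml(selected, {"created", "edited"}))
```

Keeping the selection step separate from the serialisation step means the other criteria (by collection, or archival items by agent or work) could reuse the same output code with a different selector.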
Publishing the ALEG Topic Map
Initially, Topic Maps will be used as a useful abstraction,
design principle and implementation technique rather than as
a public way to access the ALEG database. However, the development
and implementation of the system will be undertaken cognisant of
the great potential of making Topic Maps publicly available,
exposing the base resources to be described to external Topic
Maps and merging Topic Maps from different sources.