Taking RDF and Topic Maps seriously
- what happens when you drink the Kool Aid
Presentation to AusWeb02 conference, July 2002 to accompany
the paper.
Kent Fitch, AustLit Project,
Academy Library,
UNSW@ADFA,
Australian Defence Force Academy, Canberra, ACT, 2600. Email:
k.fitch@adfa.edu.au
Overview of AustLit
- a database of bibliographic citations and a developing body of
full text for almost 400 000 Australian creative and critical works,
and of biographical and organisational information on more than 50 000 Australian authors
and literary organisations.
- for more information:
Metadata
Godfrey Rust, Technical Coordinator, INDECS project
Metadata 2010, presentation to British Library Seminar, Sept 99:
Powerpoint
presentation available from the Internet Archive
Metadata not descriptions
- identifiers, not words
- relationships, not labels
- events, not things
Datamodels
Events as "first class objects"
AustLit chose:
- IFLA's FRBR plus
- Events as "first class objects" (INDECS, Harmony projects)
- Everything as a topic
Topic Maps
Imagery from Steve Pepper's The TAO
of Topic Maps
Topic Maps -v- RDF: a non-issue for us; we use the idea of "Topics" and "Associations"
from Topic Maps as they reinforce the simple entity but rich relationships
feel of the Harmony model, where "it" is ALL in the relationships between things.
Diversion: Emancipating Instances from the Tyranny of Classes - Jeffrey Parsons/Yair Wand
When does a human play these roles:
Author? Publisher? Biographer? Artist? Sponsor? Subject? Parent?
When does an organisation play these roles:
Person? Author? Publisher? Biographer? Artist? Sponsor? Subject? Parent?
Q: What is a publisher?
A1: Any thing that plays a role of publisher
A2: An instance of class-type "publisher"
Parsons/Wand propose a 2 layer system:
- things with properties
- classes of interest, defined by things which have sets of properties
For example:
- Authors are defined as creators of works
- Publishers are defined as embodiers of expressions into manifestations
- Biographers are defined as authors of works which have a work-type property of
"biography" (or even "purer", as authors of works which have a person as a
biographical subject, where a person is defined as ...{a thing with a certain set of
properties})
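A minimal sketch of the two-layer idea in Python, offered as an illustration rather than as anything AustLit or Parsons/Wand actually implemented. Layer 1 is a thing with an identifier and properties; layer 2 is a set of class definitions expressed as tests over those properties. The names `Thing`, `CLASSES`, `classes_of` and the property keys (`creator`, `work_type`, `embodied_by`) are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Thing:
    """Layer 1: an identifier plus properties; no declared type."""
    id: str
    props: dict = field(default_factory=dict)


def is_author(thing, things):
    # An author is any thing recorded as the creator of some work.
    return any(t.props.get("creator") == thing.id for t in things)


def is_publisher(thing, things):
    # A publisher is any thing recorded as embodying an expression
    # into some manifestation.
    return any(t.props.get("embodied_by") == thing.id for t in things)


def is_biographer(thing, things):
    # A biographer is the creator of a work whose work-type is "biography".
    return any(
        t.props.get("creator") == thing.id
        and t.props.get("work_type") == "biography"
        for t in things
    )


# Layer 2: classes of interest, defined only in terms of properties.
CLASSES = {
    "author": is_author,
    "publisher": is_publisher,
    "biographer": is_biographer,
}


def classes_of(thing, things):
    """Derive the classes a thing currently belongs to."""
    return sorted(name for name, test in CLASSES.items() if test(thing, things))


# Hypothetical sample data: a person, an organisation and what they did.
things = [
    Thing("person-1"),
    Thing("org-1"),
    Thing("work-1", {"creator": "person-1", "work_type": "biography"}),
    Thing("work-2", {"creator": "org-1", "work_type": "novel"}),
    Thing("manifestation-1", {"embodies": "work-2", "embodied_by": "org-1"}),
]
```

In this sketch neither `person-1` nor `org-1` is ever declared to be a person, an author or a publisher. The organisation comes out as both author and publisher purely because of the works and manifestations that refer to it, and it would stop being a publisher if the manifestation record were removed.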
What does this suggest for Topic Maps? Perhaps that "topic type" is
unnecessary and even an impediment to information modelling:
- why have to decide on the "type" of a topic?
- isn't it better to have the "type" implied by its properties?
AustLit and the "Semantic web"
We'd like to be able to "automatically"/"dynamically":
- link to library holdings of works/expressions/manifestations (eg, using NLA's Kinetica)
- link to full text holdings of works/expressions/manifestations (eg, SETIS, Project Gutenberg,
copyright holders' pay-per-view databases...)
- link to manuscript archives (NLA RAAM, Guide to Australian Literary Manuscripts, ...)
- link to related databases (AusStage, publisher catalogues, ...)
- link to current content of web-published newspapers and journals
- allow our data to be extracted/merged/reused (eg, with information from an Indigenous writers
database)
But how could this ever happen?
Only with:
- identifiers, not words (ie, shared, immutable identifiers for everything)
- relationships, not labels (ie, explicitly identified roles and relationships)
- events, not things (ie, explicitly identified events of sufficient granularity to
identify inputs and outputs and rights holders)
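The three requirements above can be sketched together in Python. This is an illustrative assumption, not AustLit's schema: each event has its own identifier, names its agents by identifier together with an explicit role, and lists the identifiers of its inputs and outputs. The `Event` class, the helper functions and the `urn:example:` identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    id: str          # identifier for the event itself
    kind: str        # e.g. "creation", "publication"
    agents: tuple    # (role, agent identifier) pairs: explicit relationships
    inputs: tuple = ()
    outputs: tuple = ()


# Hypothetical events: an agent creates a work; another agent publishes
# an expression of it as a manifestation.
EVENTS = (
    Event(
        "urn:example:event:1",
        "creation",
        agents=(("creator", "urn:example:agent:1"),),
        outputs=("urn:example:work:1",),
    ),
    Event(
        "urn:example:event:2",
        "publication",
        agents=(("publisher", "urn:example:agent:2"),),
        inputs=("urn:example:expression:1",),
        outputs=("urn:example:manifestation:1",),
    ),
)


def roles_played(agent_id, events):
    """Roles are read off the events, not stored as labels on the agent."""
    return sorted({role for e in events for role, a in e.agents if a == agent_id})


def events_producing(thing_id, events):
    """Find the events whose outputs include the given identifier."""
    return [e.id for e in events if thing_id in e.outputs]
```

Because everything is referred to by identifier rather than by name, two databases that agree on the identifiers could in principle merge their event lists by simple concatenation, which is the kind of reuse the wish list above is after.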
Good news
- FRBR helps a lot
- Harmony helps a lot
- Core philosophies behind Topic Maps help a lot (unambiguous identification of
topics, reification of associations)
- Parsons/Wand helps a lot (provides class independence and prevents premature classification)
Bad news (sharing is hard)
- shared immutable identifiers are hard (expensive to coordinate and maintain)
- metadata and shared vocabularies are hard:
About the Presenter
Kent Fitch has worked as a programmer for over 20 years. Trained in Unix at UNSW in the
1970s, he has worked in applications, database, network and systems programming
using a wide variety of tools. Since 1983 he has been a principal of the three-person
Canberra software development company, Project Computing Pty Ltd. He has developed
many successful commercial systems and communications packages and custom software
for many clients. He has been developing software for web sites since 1993 and
currently specialises in Java programming, applications of XML and RDF/Topic Maps
and web based user interfaces. Aside from AustLit, Kent contributes to open source projects
and is currently working on a web site archiver.