Sustainable Data From Digital Research: Humanities Perspectives on Digital Scholarship (first known date: 2011)


'Academic fieldwork data collections are often unique and unrepeatable records of highly significant events collected at considerable expense of researcher time, effort and resources. While fieldworkers have been quick to take advantage of digital technologies to enable them to collect and organise their data, standards and workflows are only now beginning to emerge to assist researchers to submit their data for archiving and access. This collection of refereed papers from the conference of the same name held at the University of Sydney in December 2006 provides a record of recent research practice by fieldworkers in linguistics, botany and anthropology, and by archive and repository managers.' (Publication summary)


  • Contents indexed selectively.


* Contents derived from the Melbourne, Victoria: The University of Melbourne, 2011 version. Please note that other versions/publications may contain different contents. See the Publication Details.
Introduction, Nick Thieberger, single work criticism (p. iv-1)
The ‘Language Archiving Technology’ Solutions for Sustainable Data from Digital Research, Sebastian Drude, Daan Broeder, Paul Trilsbeek, single work criticism

'Since the late 1990s, the technical group at the Max-Planck-Institute for Psycholinguistics has worked on solutions for several of the questions addressed in this paradisec-meeting, in particular, how to guarantee long-time-availability of digital research data for future research. The support for the well-known DOBES (Documentation of Endangered Languages) programme has greatly inspired and advanced this work, and led to the ongoing development of a whole suite of tools for annotating, cataloguing and archiving multi-media data. At the core of the LAT tools is the IMDI metadata schema, now being integrated into a larger network of digital resources in the European CLARIN project. The multi-media annotator ELAN (with its web-based cousin ANNEX) is now well known not only among documentary linguists. Other tools such as the lexical database tool LEXUS, the related knowledge-space builder VICOS and others are not yet widely used. With further development and integration with other tools they also have the potential for being useful tools for representing non-time-related linguistic data. We aim to present an overview of the solutions, both achieved and in development, for creating and exploiting sustainable digital data, in particular in the area of documenting languages and cultures, and their interfaces with related other developments.' (Publication abstract)

(p. 1-23)
Going beyond Archiving – A Collaborative Tool for Typological Research, Alexander Borkowski, Andrea Schalley, single work criticism

'The work described in this paper aims to outline some of the design aspects for a collaborative tool for typological research. This tool is designed to allow for the collation, from multiple contributors, of linguistic examples and their analysis with regards to an open set of variation dimensions of both onomasiological and semasiological nature. The resulting knowledge base combines linguistically relevant categories of human conceptualisation (e.g. in-group, such as ethnic or family group, categories) together with their linguistic coding (e.g. in gender affixes, verbal agreement), all based on actual linguistic examples from diverse natural languages as its underlying data-driven foundation. The system is based on Semantic Web technology and hence can be queried in a flexible way that allows for combining any variation dimensions within a query (e.g. it allows to answer questions such as which languages exhibit joint attention marking by way of verbal suffixing). We will focus on design aspects relating to sustainable data. How can sustainable data for such a project be delimited? Surely, this encompasses commonly accepted aspects such as standards conformity, longevity, and accessibility, which we will address in the paper. Additionally and in particular, however, we will argue that user orientation and involvement is a critical factor. Following on from this, the tool is designed in a way that it (i) does not require linguistic users to be trained extensively in system usage, (ii) allows linguists to deploy their standard methods of data entry (e.g. interlinear glossing), and (iii) provides contributors with immediate integration of their own with previously entered data and access to the resulting analysis (i.e. querying) and research potential. The paper will roughly be structured as follows: We will describe the background and aims of the project, and contextualise it in relation to other similar projects. 
We will then concentrate on how sustainability is addressed, discussing a number of different facets of sustainability. This includes data storage formats, user interface and workflow modelling, knowledge base design, and system features (in particular system output). We will also outline some problems that have arisen so far and close with an outlook on future development.' (Publication abstract)

(p. 24-46)
Culture Documentation and Linguistic Elicitation, Anthony Jukes, single work criticism
'This paper will show how well-filmed short videos of endangered cultural practices can be used for eliciting procedural/cultural narratives as linguistic data, as well as providing visually appealing material for ethnography, culture documentation, and cultural/eco tourism. By recording narrations as a separate soundtrack (cued by the visual stimulus) researchers are able to collect explanations by different speakers representing different age groups, genders, dialects, or in different languages from different regions or even different countries. Taking traditional usage of the sugar palm in Sulawesi, Indonesia as a test case, I demonstrate data collected in a representative sample of languages, and discuss the technical challenges of a truly multilingual multimedia corpus.' (Publication abstract)
(p. 47-62)
Looking at Language: Appropriate Design for Sign Language Resources in Remote Australian Indigenous Communities, Jenny Green, Gail Woods, Ben Foley, single work criticism

'Sign languages, or iltyem-iltyem angkety, are in daily use in Arandic speaking communities of Central Australia. They are a form of communication used alongside other semiotic systems, including speech, gesture and drawing practices. Whereas sign languages used in deaf communities operate without any connection to speech, these 'alternate' handsign languages are used in various contexts by people who also use spoken language. They are culturally valued and highly endangered, yet there has been little or no systematic documentation of Arandic sign since Kendon (1988). In this paper we describe a pilot program to record Arandic sign languages, conducted by a community language team, funded by the Maintenance of Indigenous Languages and Records (MILR) program and by the Endangered Languages Documentation Program (ELDP), and auspiced by the Batchelor Institute (BIITE). Research into various aspects of multimodal communication brings with it many theoretical and practical challenges. New technologies and the ever-expanding potentials of data annotation systems create a plethora of choices and huge volumes of recorded material. Whereas the use of film in language documentation has recently become de rigueur, at least in some circles, it is often only as an adjunct to studies of spoken language. When the visual is foregrounded, as it is in sign and gesture research, additional layers of complexity are added that impact on all aspects of the documentation process. How, for example, do we balance the desire for naturalistic visual data with the need for visually 'clean' images? What lessons can linguists learn from ethnocinematographers (Dimmendaal 2010)? What kinds of resources will benefit the community and a range of users (scholarly, archival, educational etc), as well as satisfying community aspirations for medium and long-term engagement with their audio-visual language materials? 
How do we ensure that our methodologies are robust enough to allow comparisons between primary sign language corpora and alternate sign language ones?

'We discuss these issues and various others encountered in our research, including our field methodologies, annotation of film data, community consultations and ethical considerations, and issues that have arisen in designing an interactive sign language website for use as a teaching/learning resource in Arandic schools. Although the creation and management of digital archives for primary sign languages have been documented before (see Johnston & Schembri 2006), 'alternate' sign languages have received little attention.' (Publication abstract)

(p. 63-86)
Bringing Research and Researchers to Light: Current and Emerging Challenges for a Discipline-based Knowledge Resource, Kerry Kilner, Roger Osborne, single work criticism

'Australian literary studies have, in the past decade, been greatly assisted by AustLit: The Australian Literature Resource, a multi-institutional collaboration between researchers, librarians and software designers from ten universities and the National Library of Australia. Under the leadership of The University of Queensland, this collaboration has produced a web-based research environment that supports a wide range of projects and publications across a diverse array of fields in Australian literary and narrative cultures while also becoming a key resource for teaching and general information. AustLit has consistently worked to integrate the research output of associated projects and is currently planning to expand its position in the community with a new open access and open contribution model. A major innovation in data management and maintenance, the AustLit Research Community structure supports the study of Australian literary and story-making cultures by providing a web-based environment where segments of these cultures can be explored and presented as distinct topics within a larger knowledge framework. Scholars are able to build datasets, annotate, analyse and present that data in a range of ways, and publish scholarly interpretations of their findings in the form of peer reviewed articles. The incorporation of these research-rich datasets into AustLit contributes to an overarching goal of building a comprehensive database of information about Australian writers, writing and print culture more broadly. With a recent decision to move from the current access model as a subscription service, available to relatively few users, to an open access and open contributions model incorporating content produced by a network of volunteers, AustLit is now facing a significant new challenge. 
The Aus-e-Lit Project has delivered innovative tools and services that will enable AustLit users to engage more directly with AustLit data and to contribute to a Research Commons with collaborative annotations and richly described collections of internet resources. This paper will report on the implications that these innovations bring to current and future research practices. It will consider the successes and challenges that AustLit faces with its aim to be the definitive virtual research environment and information resource for Australian literary, print, and narrative culture, not only for scholars in the field but for students of all levels and the general public.' (Publication abstract)

(p. 153-170)
Sharing Humanities Data for E-research: Conceptual and Technical Issues, Toby Burrows, single work criticism
'The humanities, as defined by the Australian Academy of the Humanities, encompass the following disciplines: Archaeology; Asian Studies; Classical Studies; English; European Languages and Cultures; History; Linguistics; Philosophy, Religion and the History of Ideas; Cultural and Communication Studies; the Arts. Researchers in some of these fields employ quantitative and qualitative methodologies similar to those used in the sciences and social sciences, but most research in the humanities is perceived as distinctive and different from research in other fields, both in its methodologies and in its approach to data. Archiving and sharing humanities data for reuse by other researchers is crucial in the development and application of e-research in the humanities. There has been considerable debate about the applicability of e-research in the humanities, particularly around the relevance of programmes to digitize source materials on a large scale. Conceptualized and designed properly, however, a humanities data archive can provide the platform on which data-intensive e-research can be based, and to which e-research processes and tools can be applied. This paper looks at the distinctive characteristics of humanities data, and examines how various models of the humanities research process help in understanding the meaning of 'data' in the humanities. It reviews existing services and approaches to building data archives and e-research services for the humanities, and the assumptions they make about the nature of data. It also analyses some conceptual and technical frameworks which could serve as the basis for future developments, focusing particularly on the place of Linked Open Data in building large-scale humanities e-research environments.' (Publication abstract)
(p. 171-185)
Excellence in Research for Australia and Sustainable Data, Simon Musgrave, John Hajek, single work criticism
'We are now entering the second round of assessment under the Excellence in Research for Australia (ERA) model. A number of writers have drawn attention to the problems inherent in this model for the humanities and social sciences (Cooper and Poletti 2011, Dobson 2011, Genoni and Haddow 2009) but what has not received attention (as far as we are aware) is the impact this model is likely to have on the type of work being highlighted and encouraged in this meeting. ERA relies to a large extent on journal rankings and as a result of its first iteration, many academics are experiencing pressure to direct their publications to highly-ranked journals. We suggest that this pressure is already disadvantageous to innovative work utilising digital data. Prestigious journals are, by their nature, conservative institutions and, at least in the humanities, are unlikely to encourage new models for disseminating results. Whilst journals such as Science and Nature routinely host supporting materials for papers on their websites, such practices are uncommon in the humanities. We are aware of a single journal in our field (Language Documentation and Conservation) which is highly ranked in the ERA process (it has an A ranking) and also encourages such publication. New journals which are published online and other alternative modes of disseminating scholarly work will inevitably have to wait some time to achieve any recognition on the ranking lists; indeed as the lists aim to maintain a proportion of journals at each level, it will be very hard for new publications to achieve a high ranking as that must be at the expense of an established publication. Thus the ERA model will tend to discourage innovative modes of publication. Additionally, the model gives no recognition to the idea which underlies much of the work presented in this forum, that making data widely available to colleagues is an inherently worthy activity. 
The experience of the British assessment exercise on which ERA is based was that researchers were placed under considerable pressure to ensure that their limited research time was geared to producing outputs which would be visible and valuable for assessment. Producing, curating and sharing sustainable data are activities which will struggle to meet these criteria.' (Publication abstract)
(p. 186-205)


Last amended 8 Feb 2016 07:36:26