Incorporating Semantics and Metadata as Part of the Article Authoring Process

authors Fernicola, Pablo F.
year 2009
source ELPUB2009. Rethinking Electronic Publishing: Innovation in Communication Paradigms and Technologies - Proceedings of the 13th International Conference on Electronic Publishing held in Milano, Italy, 10-12 June 2009 / Edited by: Susanna Mornati and Turid Hedlund. ISBN 978-88-6134-326-6, 2009, pp. 367-378
summary The ongoing shift from print to digital, both in how publications are delivered and in how content is consumed, presents an opportunity to streamline the publishing workflow and to optimize the authoring process for digital content as the primary output. This includes capturing semantics and metadata as part of authoring and preserving that data in the archival copy of the document. In addition to the shift in how content is delivered and consumed, a significant development in the last few years has been the release of new versions of word processors with native file formats based on XML. The use of XML in the authoring file format, combined with extensibility in its content model, will enable authors to express a greater level of content semantics and metadata directly. The level of interoperability enabled by XML-based word processing file formats will make it possible to preserve the semantics and metadata as documents go through the submission and review process, pass through the publishing workflow, and are ultimately archived, likely also in an XML-based format. This article describes the design considerations of the Article Authoring Add-in for Word 2007 and its possible benefits to the scholarly publishing community, in particular for workflows focused on the production of documents for digital delivery and consumption, as well as for the XML-based archival of publications. The second Beta release of the add-in is available as a free download (http://research.microsoft.com/authoring), and it is currently being evaluated by the scholarly publishing community, with the involvement of publishers, archives, information repositories, and early adopters.
In addition to facilitating the creation of structured documents, and enabling semantics and metadata to be more easily captured during authoring, the add-in makes it possible to open and save files in Word 2007 using the XML format defined by the National Center for Biotechnology Information of the National Library of Medicine. The add-in extends the file format used by Word 2007, as well as its user interface, to tailor the authoring experience for the different audiences involved in the publishing workflow. As the add-in is adopted across multiple publications, authors will benefit from a consistent baseline experience, which simplifies the authoring process and enables authors to shift their emphasis from presentation towards the expression of semantics.
keywords Semantics; Metadata; NLM XML; Scholarly Publishing; Word
series ELPUB:2009
type normal paper
email pablofe@microsoft.com
content file.pdf (512,984 bytes)
urn:nbn urn:nbn:se:elpub-152_elpub2009
last changed 2009/09/08 17:27
