Dana Pearson, MLIS
library metadata and XML services
metadata and XML services for libraries
I offer support for libraries seeking to expand the scope of their discovery systems by including freely available online resources and other bibliographic services.
I have worked with a variety of metadata formats, from Dublin Core to RDF, in metadata harvesting projects for sources such as HathiTrust, PubMed Central (articles) and Project Gutenberg. I have also developed a crosswalk that can be used with MarcEdit to transform ONIX for Books into MARC.
Libraries are transforming such metadata into MARC, but doing so requires programming expertise that most libraries lack. My goal is to provide that expertise at fees that are affordable for most libraries.
- creating metadata crosswalks
- harvesting metadata for freely available online resources
- metadata analysis - elements, attributes and content
- correcting character encoding errors
- mapping metadata content to standards for dates, languages, etc.
- merging content from other XML files
- transforming metadata in spreadsheets or delimiter-separated values into MARC or other metadata formats
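As an illustration of the last item, here is a minimal Python sketch that turns comma-separated rows into a MARCXML collection. The column names (title, author, url), the column-to-field mapping and the indicator values are assumptions chosen for the example, not a recommended crosswalk, and a usable record would also need a leader and an 008 field.

```python
import csv
import io
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"

# Hypothetical mapping: (column name, MARC tag, ind1, ind2, subfield code).
# Indicator values are illustrative only.
MAPPING = [
    ("author", "100", "1", " ", "a"),
    ("title", "245", "1", "0", "a"),
    ("url", "856", "4", "0", "u"),
]

def row_to_record(row):
    """Build a minimal MARCXML record element from one spreadsheet row."""
    record = ET.Element(f"{{{MARC_NS}}}record")
    for column, tag, ind1, ind2, code in MAPPING:
        value = (row.get(column) or "").strip()
        if not value:
            continue  # skip empty cells rather than emit empty fields
        field = ET.SubElement(
            record,
            f"{{{MARC_NS}}}datafield",
            {"tag": tag, "ind1": ind1, "ind2": ind2},
        )
        subfield = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
        subfield.text = value
    return record

def csv_to_marcxml(text):
    """Convert CSV text with a header row into a MARCXML collection element."""
    ET.register_namespace("marc", MARC_NS)
    collection = ET.Element(f"{{{MARC_NS}}}collection")
    for row in csv.DictReader(io.StringIO(text)):
        collection.append(row_to_record(row))
    return collection
```

Calling ET.tostring(csv_to_marcxml(text), encoding="unicode") serializes the result for further processing.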
MARC bibliographic services
I provide global editing services for MARC bibliographic and authority records. MARC records, especially eBook records from publishers or distributors, often require editing beyond the additions or modifications needed for local cataloging practices, such as prepending a proxy server address to URLs.
- global editing of bibliographic and authority records
- merging external content into MARC records with XSLT stylesheets
- moving content to new fields, subfields
- creating new fields, subfields based on content in multiple fields
- resolving character encoding problems, especially for MARC-8 systems
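The proxy server edit mentioned above is a typical global change. The following is a rough Python sketch of it, assuming the records are available as MARCXML and that the URLs live in 856 subfield u. The proxy prefix is a made-up placeholder, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Placeholder prefix for illustration; a real value depends on the local proxy.
PROXY_PREFIX = "https://proxy.example.edu/login?url="

def add_proxy_prefix(root, prefix=PROXY_PREFIX):
    """Prepend the proxy prefix to every 856 $u that does not already have it.

    Returns the number of subfields changed.
    """
    changed = 0
    path = ".//marc:datafield[@tag='856']/marc:subfield[@code='u']"
    for subfield in root.iterfind(path, NS):
        url = (subfield.text or "").strip()
        if url and not url.startswith(prefix):
            subfield.text = prefix + url
            changed += 1
    return changed
```

Because URLs that already carry the prefix are skipped, running the function twice over the same file does not double the prefix.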
my toolbox: XSLT, MarcEdit, regular expressions
Most metadata standards use the Extensible Markup Language (XML), which is better described as a meta-language for creating markup languages.
Extensible Stylesheet Language Transformations (XSLT) is a language for transforming XML documents into other XML documents, e.g., ONIX for Books to MARC. The Library of Congress has created an XML framework for MARC (ISO 2709), known as MARCXML.
Many librarians have learned XSLT, especially those working with digital libraries or repositories and those harvesting metadata for library systems.
I develop stylesheets in Stylus Studio, an XML integrated development environment (IDE). The stylesheets can be used with MarcEdit's XML conversion utility.
MarcEdit, developed by Terry Reese, Head of Digital Initiatives, The Ohio State University, is a free, multi-faceted MARC editor with powerful global editing features. One particularly notable feature is support for regular expressions.
I use regular expressions to extend MarcEdit's basic find and replace utilities for matching, replacing or moving requirements within and across fields and subfields.
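To give a sense of what such an expression looks like, here is a small sketch using Python's re module rather than MarcEdit itself. It assumes a record line written in the style of MarcEdit's mnemonic text format, and it moves a year left at the end of 260 subfield b into its own subfield c. The pattern and function name are illustrative, and MarcEdit's own regular expression syntax may differ in details such as how captured groups are referenced in the replacement.

```python
import re

# Match a 260 line whose $b ends in ", YYYY" (optionally followed by a period)
# and which has no later subfield. The two dots stand for the indicators.
PATTERN = re.compile(r"^(=260  ..\$a[^$]*\$b[^$]*?),\s*(\d{4})\.?$")

def split_year_into_subfield_c(line):
    """Move a trailing four-digit year out of $b into a new $c."""
    return PATTERN.sub(r"\1,$c\2.", line)

before = r"=260  \\$aNew York :$bPenguin, 2005."
after = split_year_into_subfield_c(before)
# after is now: =260  \\$aNew York :$bPenguin,$c2005.
```

Lines that already have a subfield c, or that belong to other fields, do not match the pattern and are returned unchanged.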
Creating content in control fields or subfields based on existing content in another field, or creating entire fields based on content found in different fields, are examples of more complex global editing requirements. Both can be accomplished easily with a stylesheet and MarcEdit's XML conversion utility.
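The logic behind that kind of edit can be sketched outside XSLT as well. The Python example below is an illustration only, not the stylesheet approach described above. It assumes a MARCXML record and copies the first four-digit year found in 260 subfield c into positions 07-10 (Date 1) of the 008 control field. The function name is hypothetical, and records whose 008 is missing or not 40 characters long are left alone.

```python
import re
import xml.etree.ElementTree as ET

NS = {"marc": "http://www.loc.gov/MARC21/slim"}

def set_date1_from_260(record):
    """Copy the first four-digit year in 260 $c into 008 positions 07-10.

    Returns True if the 008 field was updated, False otherwise.
    """
    field_008 = record.find("marc:controlfield[@tag='008']", NS)
    date_subfield = record.find(
        "marc:datafield[@tag='260']/marc:subfield[@code='c']", NS
    )
    if field_008 is None or date_subfield is None:
        return False
    match = re.search(r"\d{4}", date_subfield.text or "")
    value = field_008.text or ""
    if match is None or len(value) != 40:
        return False  # nothing usable to copy, or a malformed 008
    # Positions 07-10 are zero-based offsets 7..10, i.e. the slice [7:11].
    field_008.text = value[:7] + match.group(0) + value[11:]
    return True
```

A fuller version would also set the type-of-date code in 008 position 06 and consider field 264, which this sketch does not attempt.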