This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
This specification defines a set of classes and properties used to represent documents and their structure.
This document is based on the current practices used by The Muninn Project to represent documents and digitally scanned images within the catalog.
This project was created as a result of some of the data and modeling problems encountered in the Muninn WW1 Project. The ontology records the bibliographic, provenance and digitization information of archival documents. It provides classes that are subclasses of FOAF classes for better compatibility while adding functionality needed to manage documents.
The major differences from the core FOAF classes are the properties that support document pages, the digital representation of each page, digital rights and forms.
The initial design objectives for the ontology were:
The ontology is a limited representation of the full document structure in that it represents the physical parts of the document and not its logical organization. The properties for a page reflect this in its previous_page and next_page organisation.
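The previous_page / next_page chaining can be pictured as a doubly linked list of pages. The following Python fragment is a minimal sketch of that idea, not part of the ontology: the dictionary layout and the page identifiers are hypothetical, and only the property names first_page, next_page and previous_page are taken from this specification.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory view of three Page resources. The keys mirror the
# ontology's previous_page / next_page properties; the identifiers are made up.
pages: Dict[str, Dict[str, Optional[str]]] = {
    "page/1": {"previous_page": None, "next_page": "page/2"},
    "page/2": {"previous_page": "page/1", "next_page": "page/3"},
    "page/3": {"previous_page": "page/2", "next_page": None},
}


def reading_order(first_page: str,
                  pages: Dict[str, Dict[str, Optional[str]]]) -> List[str]:
    """Follow next_page links from first_page and return the pages in order."""
    order: List[str] = []
    seen = set()
    current: Optional[str] = first_page
    while current is not None:
        if current in seen:
            # Guard against malformed data where next_page links form a loop.
            raise ValueError(f"cycle detected at {current}")
        seen.add(current)
        order.append(current)
        current = pages[current]["next_page"]
    return order


print(reading_order("page/1", pages))
```

Because only the physical order is recorded, a consumer that needs logical structure (chapters, sections) would have to derive it from another source.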
Class organization revolves around a Document class that is the union of the FOAF Document class and the Creative Commons Works class for maximal compatibility. A Page class provides a container for each page of the document, whose digitized image is in turn provided by an Image class subclassed from the FOAF Image class. The Form class is another subclass of Document (as some forms tend to be full-blown documents themselves) but with additional properties intended to document their use and issuing organization.
The ontology makes extensive use of subclassing because the different layers of a document, and their different representations, require multiple layers of metadata. For example, while some works are clearly in the public domain, some archival organizations claim copyright on the photograph of the document itself. This allows the representation of situations where the contents of the document have been transcribed or OCR'd from a source image that is not publicly available.
The markup described within this ontology is a document which models a physical document whose physical pages are represented by a set of images. To avoid confusion we use the term modeled document to reference the complete physical document while a representation references a single digital image of a part of the physical document.
A collection class represents any sort of container of documents, both as a model or an actual collection.
Handling documents from multiple sources sometimes requires different levels of access control for different representations of the document. It is common for a historical document to be owned by a museum and for its contents to be out of copyright, while the images of the document are copyrighted by the photographer.
This ontology provides the combined properties of FOAF, CC Works and some properties borrowed from Dublin Core. Together they provide a powerful description markup for the different parts of the document. The initial properties on the document are:
These properties are the cornerstone for the definition of the ownership of (and authority over) the document. The date information is important in that it defines the beginning of the countdown clock after which copyright can expire.
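As an illustration of the countdown clock, the Python sketch below counts a term from a date_copyrighted value. The only figure taken from this specification is the description of the US_Copyright_Act_of_1909 instance (28 years, renewable once); the function name and the handling of 29 February are assumptions made for the example, and real copyright terms depend on many more factors.

```python
from datetime import date

# Term taken from the US_Copyright_Act_of_1909 entry in this specification.
TERM_YEARS = 28


def copyright_expiry(date_copyrighted: date, renewed: bool = False) -> date:
    """Return the date on which the term ends, counted from date_copyrighted.

    Hypothetical helper: one term of 28 years, doubled if renewed once.
    """
    years = TERM_YEARS * (2 if renewed else 1)
    target_year = date_copyrighted.year + years
    try:
        return date_copyrighted.replace(year=target_year)
    except ValueError:
        # 29 February with a non-leap target year: fall back to 28 February.
        return date_copyrighted.replace(year=target_year, day=28)


print(copyright_expiry(date(1915, 3, 1)))
print(copyright_expiry(date(1915, 3, 1), renewed=True))
```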
Within the Creative Commons Works class there exists a license property that references a License class. While this allows someone to instantiate any type of license, the design is not ideal in that the cc:deprecatedOn property references the license and not the application of a license to the work.
Currently the known instances of the License Class are:
The Creative Commons License Class contains the following properties.
The CC Work class makes use of these last two properties which are inverse of each other. The ontology also has an additional property derived from Dublin Core accessRights which combines both properties into a single directive.
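The way a single accessRights directive could drive a web server is sketched below in Python. The directive URIs and their described behaviour come from the instance descriptions in this specification (ForwardToOriginal, ForwardToPublisher, Restricted) and from the Distribution value used in the image example; the function name, the returned action labels and the refusal of unknown directives are assumptions, not the Muninn implementation.

```python
from typing import Optional, Tuple

CC_DISTRIBUTION = "http://creativecommons.org/ns#Distribution"
PERMISSION = "http://rdf.muninn-project.org/ww1/2011/11/11/Permission/"


def resolve_access(access_rights: str,
                   authenticated: bool,
                   source: Optional[str] = None,
                   publisher: Optional[str] = None) -> Tuple[str, Optional[str]]:
    """Map an accessRights value to a hypothetical (action, target) pair."""
    if access_rights == CC_DISTRIBUTION:
        # The server may hand a copy to anyone.
        return ("serve", None)
    if access_rights == PERMISSION + "ForwardToOriginal":
        # Forward the client to the source URL.
        return ("redirect", source)
    if access_rights == PERMISSION + "ForwardToPublisher":
        # Forward to the publisher, or to the source organization if unknown.
        return ("redirect", publisher or source)
    if access_rights == PERMISSION + "Restricted":
        # Content is shared only with authenticated clients.
        return ("serve", None) if authenticated else ("deny", None)
    # Assumption: directives this sketch does not know are refused.
    return ("deny", None)


print(resolve_access(PERMISSION + "Restricted", authenticated=False))
```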
Currently the access control and rights properties of the ontology are only used to mark up the information. However, a future direction is to use this information with a reasoner to infer the appropriate distribution and access rights based on the date, location and original licensing of the document.
Classes: Collection, Document, Image, Page, Text_Snippet
Properties: accessRights, authoredBy, back_page, contains, contains_document, custodian, date_copyrighted, date_created, date_digitized, date_published, date_retrieved, depiction, description, document_contained_in, editor, filled_out_form, first_page, format, front_page, hasAuthored, location, next_page, pages, previous_page, publisher, raw_text, size_bytes, source, title, url, x_pixels, y_pixels
Instances: Australia_Copyright_Act_1905, Australia_Copyright_Act_1912, Australia_Copyright_Act_1968, Australian_Crown_Copyright, British_Copyright_Act_1842, British_Copyright_Act_1911, British_Copyright_Act_1956, British_Copyright_Act_1988, British_Crown_Copyright, British_India_Crown_Copyright, British_Indian_Copyright_Act_1914, Canadian_Copyright_Act_1922, Canadian_Copyright_Act_1985, Canadian_Crown_Copyright, ForwardToOriginal, ForwardToPublisher, German_Empire_Copyright, Indian_Copyright_Act_1957, New_Zealand_Copyright_Act_1994, New_Zealand_Crown_Copyright, Newfoundland_And_Labrador_Crown_Copyright, Restricted, US_Copyright_Act_1998, US_Copyright_Act_of_1909
A few examples are presented here.
Here is a very basic representation of an image:
<documents:Image rdf:about="http://rdf.muninn-project.org/ww1/2011/11/11/image/34">
  <documents:size_bytes>39475</documents:size_bytes>
  <documents:x_pixels>256</documents:x_pixels>
  <documents:y_pixels>256</documents:y_pixels>
  <documents:format>image/png</documents:format>
  <documents:thumbnail>
    <documents:Image rdf:about="http://rdf.muninn-project.org/ww1/2011/11/11/image/34/thumbnail"/>
  </documents:thumbnail>
  <!-- Wikimedia isn't on dbpedia -->
  <documents:source rdf:resource="http://commons.wikimedia.org/wiki/File:Crystal_Project_My_documents.png"/>
  <documents:date_retrieved rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2012-02-12</documents:date_retrieved>
  <documents:creator rdf:resource="http://dbpedia.org/page/Everaldo_Coelho"/>
  <documents:license rdf:resource="http://rdf.muninn-project.org/ontologies/documents#lgpl"/>
  <documents:date_published rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2007-06-16</documents:date_published>
  <documents:description>An icon from the Crystal Project icon theme.</documents:description>
  <!-- Webserver will serve copy of picture to anyone. -->
  <documents:accessRights rdf:resource="http://creativecommons.org/ns#Distribution"/>
</documents:Image>
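A consumer could read such an image description with any XML or RDF toolkit. The Python sketch below uses only the standard library and a trimmed copy of the example. It assumes that the fragment is wrapped in an rdf:RDF element with the usual namespace declarations and that the resource is named with rdf:about; the function name image_summary and the returned keys are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DOCS = "http://rdf.muninn-project.org/ontologies/documents#"

# Trimmed copy of the image example, wrapped so that it parses on its own.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:documents="{DOCS}">
  <documents:Image rdf:about="http://rdf.muninn-project.org/ww1/2011/11/11/image/34">
    <documents:size_bytes>39475</documents:size_bytes>
    <documents:x_pixels>256</documents:x_pixels>
    <documents:y_pixels>256</documents:y_pixels>
    <documents:format>image/png</documents:format>
  </documents:Image>
</rdf:RDF>"""


def image_summary(rdf_xml: str) -> dict:
    """Extract the basic image properties from an RDF/XML string."""
    root = ET.fromstring(rdf_xml)
    image = root.find(f"{{{DOCS}}}Image")
    if image is None:
        raise ValueError("no documents:Image element found")

    def text(name: str) -> str:
        # Look up a child element in the documents namespace.
        return image.findtext(f"{{{DOCS}}}{name}", default="")

    return {
        "uri": image.get(f"{{{RDF}}}about"),
        "format": text("format"),
        "width": int(text("x_pixels")),
        "height": int(text("y_pixels")),
        "size_bytes": int(text("size_bytes")),
    }


print(image_summary(SAMPLE))
```

A plain XML parser is enough for this fixed layout; data serialized differently (for example with nested or abbreviated RDF/XML forms) would call for a real RDF parser.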
Here is a very basic representation of a Canadian Expeditionary Force attestation form from the Great War:
<documents:Document rdf:about="http://rdf.muninn-project.org/ww1/2011/11/11/image/34">
  <documents:size_bytes>39475</documents:size_bytes>
  <documents:x_pixels>256</documents:x_pixels>
  <documents:y_pixels>256</documents:y_pixels>
  <documents:format>image/png</documents:format>
  <documents:thumbnail>
    <documents:Image rdf:about="http://rdf.muninn-project.org/ww1/2011/11/11/image/34/thumbnail"/>
  </documents:thumbnail>
  <!-- Wikimedia isn't on dbpedia -->
  <documents:source rdf:resource="http://commons.wikimedia.org/wiki/File:Crystal_Project_My_documents.png"/>
  <documents:date_retrieved rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2012-02-12</documents:date_retrieved>
  <documents:creator rdf:resource="http://dbpedia.org/page/Everaldo_Coelho"/>
  <documents:license rdf:resource="http://rdf.muninn-project.org/ontologies/documents#lgpl"/>
  <documents:date_published rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2007-06-16</documents:date_published>
  <documents:description>An icon from the Crystal Project icon theme.</documents:description>
  <!-- Webserver will serve copy of picture to anyone. -->
  <documents:accessRights rdf:resource="http://creativecommons.org/ns#Distribution"/>
</documents:Document>
The Muninn RDF makes extensive use of content negotiation through HTTP headers to provide the end user with exactly the type of content that is required. A few extensions are available to force certain behaviours on the part of the server.
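On the client side, content negotiation amounts to sending an Accept header with the request. The Python sketch below builds such requests with the standard library without sending them. The helper name negotiated_request is hypothetical, and which MIME types the server actually honours is an assumption.

```python
from urllib.request import Request

IMAGE_URI = "http://rdf.muninn-project.org/ww1/2011/11/11/image/34"


def negotiated_request(uri: str, mime_type: str) -> Request:
    """Build (but do not send) a GET request asking for one representation."""
    return Request(uri, headers={"Accept": mime_type})


# Same URI, two different requested representations.
rdf_request = negotiated_request(IMAGE_URI, "application/rdf+xml")
png_request = negotiated_request(IMAGE_URI, "image/png")

print(rdf_request.get_header("Accept"))
print(png_request.get_header("Accept"))
```

Passing either object to urllib.request.urlopen would perform the actual request.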
While a SPARQL server will always return the appropriate content to a SPARQL client, it is sometimes convenient to force the server to send only the RDF metadata contents.
http://rdf.muninn-project.org/ww1/2011/11/11/image/34/about.rdf
Similarly, it is sometimes convenient to force the server to send only the image itself.
http://rdf.muninn-project.org/ww1/2011/11/11/image/34/image
A 130x100 thumbnail image can be requested in the original format by requesting this specific name.
http://rdf.muninn-project.org/ww1/2011/11/11/image/34/thumbnail
The image can be requested in a specific graphics format by using the appropriate extension (e.g. image.jpg for JPG, image.png for PNG, etc.). This conversion is done automatically if the HTTP headers request a specific format.
http://rdf.muninn-project.org/ww1/2011/11/11/image/34/image.ext
A 130x100 thumbnail image can be requested in a specific graphics format by using the appropriate extension (e.g. thumbnail.jpg for JPG, thumbnail.png for PNG, etc.). This conversion is done automatically if the HTTP headers request a specific format.
http://rdf.muninn-project.org/ww1/2011/11/11/image/34/thumbnail.ext
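The URL patterns listed above can be assembled mechanically from the resource URI. The Python sketch below shows one way to do so. The function name representation_url and its kind labels are hypothetical, and only the path suffixes (about.rdf, image, thumbnail and the optional extension) come from this specification.

```python
from typing import Optional


def representation_url(base: str, kind: str, ext: Optional[str] = None) -> str:
    """Return the URL that forces one representation of an image resource.

    kind is "metadata", "image" or "thumbnail"; ext is an optional graphics
    extension such as "jpg" or "png" and is ignored for metadata.
    """
    base = base.rstrip("/")
    if kind == "metadata":
        return f"{base}/about.rdf"
    if kind not in ("image", "thumbnail"):
        raise ValueError(f"unknown representation kind: {kind}")
    suffix = f".{ext}" if ext else ""
    return f"{base}/{kind}{suffix}"


BASE = "http://rdf.muninn-project.org/ww1/2011/11/11/image/34"
print(representation_url(BASE, "metadata"))
print(representation_url(BASE, "thumbnail", "png"))
```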
URI: http://rdf.muninn-project.org/ontologies/documents#Collection
Collection - A model of a physical Collection of documents or a Collection of digital documents.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#Document
A Digital Document - A digital document made up of 1 or more pages.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#Image
A Digital Image - A digital image
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#Page
Page - A page of a document, may be double-sided.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#Text_Snippet
Text Snippet - A construction of string literals. Not meant to represent a full document.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#accessRights
Access Rights - Meant as a side support to the rights property
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#authoredBy
Original author(s) of Document. Might have more than one. -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#back_page
Back Page - The side of the physical piece of paper that should be read last.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#contains
A page of the document - no ordering on property. -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#contains_document
Contains - Links a Document to this Collection.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#custodian
Custodian - The entity responsible for the documents when the original publisher does not control the works.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#date_copyrighted
Copyrighted Date - Copyright date is synonymous with creation date in most cases. Used primarily to support copyright status and distribution permissions.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#date_created
Creation Date - Date this *record* was created. (unstable / non-standard)
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#date_digitized
Digitization Date - Date that this document was digitized from a physical representation.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#date_published
Publication Date - Date of publication of the document being modeled.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#date_retrieved
Date Retrieved - Date that the modeled document was retrieved from another source (e.g. downloaded from a web server).
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#depiction
Imaged copy of the page. -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#description
Description -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#document_contained_in
Contained in - Links a Collection to this Document.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#editor
Editor - Original editor(s) of Document. Might have more than one.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#filled_out_form
Form - This document is a filled out form.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#first_page
First Page - Convenience property to find the first page to read.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#format
Mime-Type -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#front_page
Front Page - The side of the physical piece of paper that should be read first.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#hasAuthored
Original author(s) of Document. Might have more than one. -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#location
Location -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#next_page
Next Page - The next physical page that a human reader would read, even if blank. This implies that this property is pointing to a page that has no front_page property.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#pages
Pages - Total number of pages (single or double sided) in the document. This count may or may not match the number of content properties if one-sided and double-sided documents are present.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#previous_page
Previous Page - The previous physical page that a human reader would have just read.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#publisher
Original publisher of the modeled document, which may be different from the publisher of the digital copy of the document's images. -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#raw_text
Raw Text -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#size_bytes
Size of image in 8-bit bytes. -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#source
Source - Original source of the document. Most likely a URL or organization.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#title
Title -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#url
URL - Shortcut to the full URL of the image resource. Use this to avoid content negotiation.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#x_pixels
Width of image in Pixels. -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ontologies/documents#y_pixels
Height of image in Pixels. -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/Australia_Copyright_Act_1905
Australia Copyright Act 1905 -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/Australia_Copyright_Act_1912
Australian Copyright Act, 1912 - Essentially the British Copyright Act (1911), but under Australian Dominion.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/Australia_Copyright_Act_1968
Australia Copyright Act 1968 -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/Australian_Crown_Copyright
Australian Crown Copyright - Copyright is owned by the crown, in the context of the state of Australia.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/British_Copyright_Act_1842
British Copyright Act 1842 -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/British_Copyright_Act_1911
British Copyright Act, 1911 -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/British_Copyright_Act_1956
British Copyright Act 1956 -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/British_Copyright_Act_1988
British Copyright Act 1988 -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/British_Crown_Copyright
Crown Copyright (Britain) - Copyright is owned by the crown, in the context of the state of Great Britain (or the British Empire).
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/British_India_Crown_Copyright
Crown Copyright (British India) - Copyright is owned by the crown, in the context of the state of British India.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/British_Indian_Copyright_Act_1914
British Indian Copyright Act 1914 -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/Canadian_Copyright_Act_1922
Canadian Copyright Act 1922 -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/Canadian_Copyright_Act_1985
Canadian Copyright Act 1985 -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/Canadian_Crown_Copyright
Crown Copyright (Canada) - Copyright is owned by the crown, in the context of the state of Canada.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Permission/ForwardToOriginal
Forward to original copy - Forward to source URL
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Permission/ForwardToPublisher
Forward To Publisher - Webserver will forward requests for content to the publisher or source organization.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/German_Empire_Copyright
German Empire Copyright -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/Indian_Copyright_Act_1957
Indian Copyright Act 1957 -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/New_Zealand_Copyright_Act_1994
New Zealand Copyright Act 1994 -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/New_Zealand_Crown_Copyright
Crown Copyright (New Zealand) - Copyright is owned by the crown, in the context of the state of New Zealand.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/Newfoundland_And_Labrador_Crown_Copyright
Crown Copyright (Newfoundland and Labrador) - Copyright is owned by the crown, in the context of the state of Newfoundland and Labrador.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Permission/Restricted
Restricted Access - Webserver will not share content, unless authenticated.
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/US_Copyright_Act_1998
US Copyright Act 1998 -
No detailed documentation for this term.
URI: http://rdf.muninn-project.org/ww1/2011/11/11/Jurisdiction/US_Copyright_Act_of_1909
US Copyright Act of 1909 - Copyright lasts for 28 years, renewable once.
No detailed documentation for this term.
The ontology is a limited representation of an archival document.