Tag Archives: Archaeology Data Service

Colonisation of Britain

Archaeogeomancy: Digital Heritage Specialists – archaeological geomatics – the majick of spatial data in archaeology – archaeological information systems for the digital age:

Colonisation of Britain Linked Data

The Colonisation of Britain project was undertaken by Wessex Archaeology, commissioned by English Heritage, and involved the digitisation of the archive material of Roger Jacobi.

The project aimed to provide a comprehensive survey of the Upper Palaeolithic resource, with a pilot study of the Mesolithic in England involving three counties. A comprehensive survey of these periods (c. 38,000 to c. 6,500 BP) was considered of particular importance because many of the sites and find spots represent evidence for the recolonisation of Britain after the Last Glacial Maximum by hominid groups. Unlike earlier recolonisations, this event is part of the most northerly early migration of social groups of anatomically modern humans.

The major primary source of data for the survey is the invaluable and extensive archive compiled over many years by the late Dr Roger Jacobi. Securing this archive was identified as a priority in itself. The project included the digitisation of the Jacobi Archive, both as an image dataset and as a structured relational database, enhanced with additional information from SMR/HERs.

One of the project outputs is a Linked Data version of the data, and Archaeogeomancy were pleased to be commissioned to undertake this component.


The work was undertaken using the Stellar toolkit, produced by the Hypermedia Research Unit at the University of South Wales. This is a freely available toolkit which facilitates the creation of Linked Data resources from tabular data, either in the form of delimited text files or relational databases. The source material was the set of project outputs available from Wessex Archaeology, which will soon also be available from the Archaeology Data Service alongside the Linked Data.

Working with the data using the Stellar toolkit involves two main processes. First, the data is mapped to one or both of the ontologies supported by the toolkit: the CIDOC CRM, which supports the full range of cultural heritage information, and the CRM-EH, an extension to the CIDOC CRM for archaeological fieldwork data. For the Colonisation of Britain project, the data was mapped to the main CIDOC CRM ontology because it does not relate to archaeological investigations but rather describes collections of objects from the Jacobi archive.

The second stage is to create templates representing this mapping for use with the Stellar application, which combines the templates and the source data to produce CIDOC CRM compliant Linked Data in the form of RDF files. The format of the templates is defined by the String Template system, “a java template engine (with ports for C#, Python) for generating source code, web pages, emails, or any other formatted text output”. The user-defined templates created for this project reference the core Stellar templates written by Ceri Binding (University of South Wales), which do most of the heavy lifting.
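The template-driven step can be illustrated with a minimal sketch. It uses Python's built-in `string.Template` as a stand-in and is not the Stellar application or its String Template files; the base URI, the column names `card_id` and `site_name`, and the sample row are all assumptions made for illustration.

```python
import csv
import io
from string import Template

# Hypothetical base URI; the project's real URIs are not given in this post.
BASE = "http://example.org/colonisation/"

# One N-Triples statement per line; $-placeholders are filled from each row.
COLLECTION_TEMPLATE = Template(
    "<${base}collection/${id}> "
    "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
    "<http://erlangen-crm.org/110404/E78_Collection> .\n"
    "<${base}collection/${id}> "
    "<http://www.w3.org/2000/01/rdf-schema#label> "
    "\"${label}\" .\n"
)


def escape_literal(text):
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def rows_to_ntriples(delimited_text):
    """Apply the template to every row of a delimited text file."""
    reader = csv.DictReader(io.StringIO(delimited_text))
    return "".join(
        COLLECTION_TEMPLATE.substitute(
            base=BASE,
            id=row["card_id"],  # assumed column name
            label=escape_literal(row["site_name"]),  # assumed column name
        )
        for row in reader
    )


# Assumed sample data: the record number appears in this post, the name does not.
sample = "card_id,site_name\n00748,Example Site\n"
print(rows_to_ntriples(sample))
```

The point of the sketch is the separation of concerns: the mapping lives in the template, so the same source table can be re-run against a revised mapping without touching the data.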

A further stage was also undertaken to align geospatial elements of the source data with Ordnance Survey data. Where placenames occurred in the source data, these were converted to Ordnance Survey OpenSpace URIs using the Open Refine platform drawing on the OS Reconciliation API (see below for details). This included parish and county names which were linked to the OS BoundaryLine dataset. A further piece of work will be undertaken which will take this one step further and add GeoSPARQL nodes to the Linked Data resource, enabling the data to be more fully included in Linked Geospatial Data graphs.


This mapping describes the data as it is stored in the digitised version of the Jacobi archive. This means that some concepts, for example places of origin, are not fully resolved, as it cannot be stated with certainty that any two identical place names refer to the same place. It is, of course, still possible to query the semantic data using longer chains. For example, whilst two occurrences of the same place name are represented in the output Linked Data as two distinct places, they can be linked by virtue of having the same place name appellation.
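As a rough illustration of following the longer chain, the sketch below groups distinct place nodes by their appellation string. The node identifiers and names are invented for the example, and the case-insensitive comparison is an assumption rather than something the project prescribes.

```python
from collections import defaultdict

# Hypothetical (place node, appellation) pairs: each index card mints its own
# place node, even when the place name is identical.
place_appellations = [
    ("place/card-1", "Example Cave"),
    ("place/card-2", "Example Cave"),
    ("place/card-3", "Another Site"),
]


def group_places_by_name(pairs):
    """Return appellations shared by more than one place node.

    Names are compared after trimming whitespace and lower-casing, which is
    an assumption of this sketch.
    """
    groups = defaultdict(list)
    for place, name in pairs:
        groups[name.strip().lower()].append(place)
    return {name: places for name, places in groups.items() if len(places) > 1}


print(group_places_by_name(place_appellations))
```

A shared appellation only shows that two cards use the same name; whether they describe the same place remains a judgement for the user of the data.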

The heart of the Linked Data resource is each collection of artefacts described by the original card index. This is modelled as the CIDOC CRM concept E78 Collection. Each artefact collection is described by an index card (E31 Document) which documents the collection itself as well as the ‘site’ ie the place of origin (E53 Place). Where specific spatial coordinates exist, these have been included as appellations of the spatial nodes (E47 Spatial Coordinates). Spatial metadata such as precision is represented as classifications of the spatial nodes (using E54 Dimensions, E58 Measurement Units and E55 Type).

The artefact collection can form part of a larger collection, for example a named collection (E78 Collection) curated by a museum (E40 Legal Body), and is classified using the classificatory schemes used by both Jacobi and Wessex Archaeology through their enhancement (all using E55 Type). Additional information is stored as notes associated with particular concepts as appropriate. The index cards describe collections by material type so the resultant Linked Data does likewise, with each collection having a type of material (E57 Material).

Finally, assertions made through the project by the specialists were the product of a specific activity (E13 Attribute Assignment). This allows each collection to be associated with an archaeological period (eg Mesolithic) represented here as a purely temporal concept (E49 Time Appellation) and this linkage can be extended as required to link to eg the Heritage Data period resources, represented using SKOS.
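The shape described above can be sketched as a small set of triples for one record. The CRM classes are those named in the text; the property names (P70, P53, P45, P140, P141) are standard CIDOC CRM properties chosen for this illustration and are an assumption, not a copy of the project's actual mapping. The `ex:` identifiers are hypothetical.

```python
def describe_collection(card_id, material, period):
    """Build illustrative (subject, predicate, object) triples for one card.

    Prefixed names are plain strings: 'crm:' stands for the CIDOC CRM
    namespace and 'ex:' for a hypothetical project namespace.
    """
    collection = f"ex:collection/{card_id}"
    card = f"ex:card/{card_id}"
    place = f"ex:place/{card_id}"
    assignment = f"ex:assignment/{card_id}"
    return [
        # The collection and the index card that documents it.
        (collection, "rdf:type", "crm:E78_Collection"),
        (card, "rdf:type", "crm:E31_Document"),
        (card, "crm:P70_documents", collection),
        # The place of origin, also documented by the card.
        (place, "rdf:type", "crm:E53_Place"),
        (card, "crm:P70_documents", place),
        (collection, "crm:P53_has_former_or_current_location", place),
        # Material type of the collection.
        (collection, "crm:P45_consists_of", f"ex:material/{material}"),
        # The specialists' period assertion, modelled as an activity.
        (assignment, "rdf:type", "crm:E13_Attribute_Assignment"),
        (assignment, "crm:P140_assigned_attribute_to", collection),
        (assignment, "crm:P141_assigned", f"ex:period/{period}"),
    ]


for triple in describe_collection("00748", "flint", "mesolithic"):
    print(" ".join(triple))
```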

Open Refine

The Open Refine platform is a very useful way of converting spatial data stored as text strings (eg names of counties, parishes, etc) into URIs suitable for inclusion in Linked Data resources. For details on how to use this, see the official Ordnance Survey documentation and also this very helpful blog by John Goodwin.
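For readers curious about what Open Refine sends behind the scenes, the sketch below builds a batch request in the general shape used by the reconciliation service protocol (a `queries` parameter holding a JSON object keyed `q0`, `q1`, and so on). The endpoint URL is a placeholder and the place names are example inputs; this is not the Ordnance Survey service's documented address.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def build_reconciliation_url(endpoint, names, limit=3):
    """Encode a batch of place names as a reconciliation request URL."""
    queries = {
        f"q{i}": {"query": name, "limit": limit} for i, name in enumerate(names)
    }
    return endpoint + "?" + urlencode({"queries": json.dumps(queries)})


def decode_queries(url):
    """Recover the JSON batch from a request URL (useful for checking)."""
    query_string = urlsplit(url).query
    return json.loads(parse_qs(query_string)["queries"][0])


# Placeholder endpoint, for illustration only.
url = build_reconciliation_url("https://reconcile.example.org/os", ["Kent", "Devon"])
print(decode_queries(url))
```

The service would respond with candidate URIs per query key; choosing among the candidates is the step Open Refine presents interactively.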


The Linked Data can be visualised in a variety of ways. The image presented here was created using Gruff to demonstrate the shape of the Linked Data. It illustrates the overall shape of the resource using one specific record; the collection, associated places, classifications, curatorial organisations and the project itself can all be seen. Also noteworthy are the nodes in the bottom right of the image, which relate to the Ordnance Survey TOIDs referenced in this case.

Example showing the Linked Data for record 00748


The Stellar templates used for the project are included here for information. They are dependent on the CRM-EH and CRM templates distributed with the Stellar Toolkit. Templates are distributed under a Creative Commons license, so do make use of them for any purpose; the only restriction on usage is that proper attribution be made.

For more information on writing custom templates, see the Stellar Tools documentation, the String Template documentation or get in touch.


Thanks are due to Chris Brayne and Matt Leivers (Wessex Archaeology) for commissioning the project and doing all the really hard work successfully undertaking the main Colonisation of Britain project. A big thanks to Ceri Binding (University of South Wales) for support on the Stellar Toolkit and thanks also to Michael Charno (Archaeology Data Service) for liaising over ADS handover and specific requirements.

The post Colonisation of Britain appeared first on Archaeogeomancy: Digital Heritage Specialists.

GSTAR: investigation of methods for working with geosemantic data, integrating geospatial data with semantic data

Mapping rubble by Brian Hoffman

The first investigation in the GeoSemantic Technologies for Archaeological Research (GSTAR) research project is nearing completion. It assesses approaches to integrating geospatial archaeological data into a semantic framework to provide geosemantic capabilities.

The investigation draws on archaeological excavation data lodged with the Archaeology Data Service (ADS) and made available as Linked Data (LD) through the ADS’s Linked Data platform. The data relates to the Cobham Golf Course site and was produced by Oxford Archaeology (OA) as part of the Channel Tunnel Rail Link (CTRL) project then turned into a Linked Data resource through the Semantic Technologies Enhancing Links and Linked data for Archaeological Resources (STELLAR) project, undertaken by the Hypermedia Research Unit at the University of South Wales (USW).

Mapping a feature by Wessex Archaeology

The GSTAR literature review identified two strands of integration approaches within published literature. The first, emerging from the semantic web and Linked Data communities, involves the direct inclusion of geospatial data within semantic resources, leveraging World Wide Web Consortium (W3C) standards for Resource Description Framework (RDF) and Open Geospatial Consortium (OGC) standards for Well Known Text (WKT, part of the Simple Features specification) and GeoSPARQL. The second, emerging from the Geographic Information Science (GISc) community, involves the use of Web Feature Services (WFS) within broader Spatial Data Infrastructures (SDI) running in parallel with, and linked to, semantic resources.

This initial GSTAR investigation looked at both these strands with a view to assessing suitable modes for subsequent use in the next phases of the GSTAR project. A WISSKI installation has also been set up to allow for the minting of any additional URIs needed.

GeoSPARQL route

This involved creating geosemantic data aligned with the CRM-EH extension to the CIDOC CRM ontology, stored within the Oracle Spatial & Graph platform and accessed via GeoSPARQL using an Oracle WebLogic web server and the Jena Framework.

[code language="xml"]
<owl:Class rdf:about="http://purl.org/crmeh#EHE0022_ContextDepiction">
  <rdfs:isDefinedBy rdf:resource="http://purl.org/crmeh#CRMEH"/>
  <rdfs:subClassOf rdf:resource="http://erlangen-crm.org/110404/E47_Spatial_Coordinates"/>
  <rdfs:label>Context Depiction</rdfs:label>
  <rdfs:comment>The Spatial co-ordinates of a Context, defining the actual spatial extent of the context. Usually recorded at the time of excavation or other investigative work</rdfs:comment>
</owl:Class>
[/code]

The OWL definition of the EHE0022 class used to describe depictions

Further triples were also added to describe the depiction using the GeoSPARQL ogc:hasGeometry and ogc:asWKT properties.

[code language="xml"]
<owl:ObjectProperty rdf:about="#hasGeometry">
  <rdfs:isDefinedBy rdf:resource=""/>
  <rdfs:isDefinedBy rdf:resource="http://www.opengis.net/spec/geosparql/1.0"/>
  <skos:prefLabel xml:lang="en">hasGeometry</skos:prefLabel>
  <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2011-06-16</dc:date>
  <dc:contributor>Matthew Perry</dc:contributor>
  <dc:description xml:lang="en">
    A spatial representation for a given feature.
  </dc:description>
  <rdfs:range rdf:resource="#Geometry"/>
  <rdfs:comment xml:lang="en">
    A spatial representation for a given feature.
  </rdfs:comment>
  <rdfs:domain rdf:resource="#Feature"/>
  <rdfs:label xml:lang="en">hasGeometry</rdfs:label>
  <dc:creator>OGC GeoSPARQL 1.0 Standard Working Group</dc:creator>
  <skos:definition xml:lang="en">
    A spatial representation for a given feature.
  </skos:definition>
</owl:ObjectProperty>
[/code]

The OWL definition of the hasGeometry property
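To make the GeoSPARQL route more concrete, here is a minimal sketch of the kind of triples that could describe one depiction, written out as N-Triples. The depiction URI and coordinates are hypothetical, and the coordinates are treated as plain x y pairs with no coordinate reference system declared, which real data would need to address.

```python
GEO = "http://www.opengis.net/ont/geosparql#"
CRMEH = "http://purl.org/crmeh#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def polygon_wkt(coords):
    """Serialise (x, y) pairs as a WKT polygon, closing the ring if needed."""
    ring = list(coords)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # WKT polygon rings must end where they start
    points = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({points}))"


def depiction_triples(depiction_uri, coords):
    """Return N-Triples linking a context depiction to a WKT geometry."""
    geometry_uri = depiction_uri + "/geometry"  # hypothetical URI pattern
    wkt = polygon_wkt(coords)
    return (
        f"<{depiction_uri}> <{RDF_TYPE}> <{CRMEH}EHE0022_ContextDepiction> .\n"
        f"<{depiction_uri}> <{GEO}hasGeometry> <{geometry_uri}> .\n"
        f"<{geometry_uri}> <{GEO}asWKT> \"{wkt}\"^^<{GEO}wktLiteral> .\n"
    )


# Hypothetical depiction and coordinates, for illustration only.
print(depiction_triples("http://example.org/depiction/1", [(0, 0), (1, 0), (1, 1)]))
```

Keeping the geometry as its own node, rather than attaching the WKT literal directly to the depiction, follows the feature and geometry split that GeoSPARQL's `hasGeometry` and `asWKT` properties assume.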

GIS Server route

A second approach used the same base platform and data but accessed the geospatial component via WFS provided by GeoServer, drawing on the Oracle database.
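A WFS request of the kind a client might send to GeoServer can be sketched as follows. The parameter names are those of the WFS 2.0.0 key-value encoding; the endpoint, the layer name `gstar:contexts` and the bounding box are hypothetical, as the post does not give the actual service details.

```python
from urllib.parse import urlencode


def wfs_getfeature_url(endpoint, type_name, bbox=None):
    """Build a WFS 2.0.0 GetFeature request URL.

    bbox, if given, is (min_x, min_y, max_x, max_y) in the layer's native
    coordinate reference system (an assumption of this sketch).
    """
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,
        "outputFormat": "application/json",
    }
    if bbox is not None:
        params["bbox"] = ",".join(str(value) for value in bbox)
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and layer name, for illustration only.
print(wfs_getfeature_url("https://example.org/geoserver/wfs", "gstar:contexts",
                         bbox=(0, 0, 10, 10)))
```

In this route the semantic resource would hold only an identifier for each feature, with the geometry fetched from the WFS on demand.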

Next steps

The results of this stage and the GSTAR project in general will be presented at the Computer Applications and Quantitative Methods in Archaeology (CAA) conference to be held in Paris, France during April 2014.

Thanks are due to the University of South Wales for funding the GSTAR project and to the people and organisations responsible for the tools, technologies and data being used. Thanks also to my PhD supervisor (Prof. Douglas Tudhope, USW), advisors (Dr Mark Ware and Dr Alex Lohfink, USW) and fellow researchers Ceri Binding (USW), Dr Andreas Vlachidis (USW) and Michael Charno (ADS) for their input.

The post GSTAR: investigation of methods for working with geosemantic data, integrating geospatial data with semantic data appeared first on Archaeogeomancy: Digital Heritage Specialists.