Tag Archives: GeoSPARQL

Extending CRMEH with GeoSPARQL


One of the outputs from the Pilot Study was an approach to working with geospatial data within the broader framework provided by the CIDOC CRM ontology and the CRMEH archaeological extension. Whilst there is ongoing work by myself and others to add archaeological and spatio-temporal components directly to the CIDOC CRM, a lightweight approach has been developed and deployed to suit the needs of the GSTAR project: CRMEH already adds archaeological excavation capabilities, and the spatial extension presented here adds a range of geospatial capabilities through a mapping to GeoSPARQL.

Parental Advisory by Michel Dumontier

Research Context

For the purposes of the GSTAR project, there is a need to incorporate rich geospatial data into semantic resources: depictions of archaeological features, sites and monuments; boundaries of activities and events; and locations where objects were discovered. Whilst the CRMEH was developed with spatial information at its core, this has not, to date, been formally expressed. This is now possible using GeoSPARQL.

During the early stages of the GSTAR project, related work became apparent, notably two extensions to the CIDOC CRM (of which CRMEH is itself an extension) pertaining to spatio-temporal information (CRMgeo) and archaeological excavation information (CRMarchaeo). These will ultimately offer greater research potential, but the mapping presented here can be seen as an example of a simple, lightweight solution targeting the ‘low hanging fruit’ so often talked about with respect to ontologies and Linked Data: a mapping which meets the needs of the GSTAR project, retains compatibility with the CIDOC CRM and GeoSPARQL standards, and provides core geospatial functionality for CRMEH, albeit without the reasoning power (and associated complexity) of the two aforementioned extensions.

Application within GSTAR

The semantic resource will be used throughout the Case Studies planned for the GSTAR project to investigate the use of geosemantic tools for archaeological research. Two of these focus on the integration aspect, looking at what I have defined as ‘horizontal’ and ‘vertical’ integration using the spatial components of source data. Horizontal integration refers to linkages between inventories, ie from site finds inventories to museum object inventories to sites and monuments inventories. Vertical integration refers to linkages between primary and derived data, from fieldwork databases containing records of features and finds up to inventories of higher order data objects containing records derived from these primary observations.

Related Work I – CRMgeo

Work is ongoing to produce a spatio-temporal model through integration of GeoSPARQL and CIDOC CRM: the CRMgeo extension, currently in draft form. This promises to be an incredibly powerful resource capable of advanced spatio-temporal description and reasoning.

The work is described as follows:

CRMgeo is an extension for the CIDOC CRM to provide an “articulation” (linkage) between the standards of the geospatial and the cultural heritage communities, in particular between GeoSPARQL and CIDOC CRM. The model was developed from the analysis of the epistemological processes of defining, using and determining places. This means that we analyzed how a question, such as “is this the place of the Varus Battle” or “is this the place where Lord Nelson died”, can be verified or falsified, including geometric specifications. Consequently, we arrived at a detailed model which seems to give a complete account of all practical components necessary to verify such a question, in agreement with the laws of physics, the practice of geometric measurement and archaeological reasoning. This model indeed appears to have the capability to link both ontologies and shows how to correctly reconcile data at any scale and time – not by inventing precision or truth that cannot be acquired, but by quantifying or delimiting the inherent indeterminacies, as is good practice in the natural sciences. This model aims at being a comprehensive theory from which mutually compatible simplifications can be derived for implementations in more constrained environments, such as those lacking moving frames.

Related Work II – CRMarchaeo

Similarly, work is ongoing to produce an archaeological excavation model: the CRMarchaeo extension, currently in draft form. This promises to support description of and reasoning about archaeological excavation information from a range of recording methodologies.

This project is described as follows:

CRMarchaeo is an ontology and RDF Schema to encode metadata about the archaeological excavation process.

The goal of this model is to provide the means to document excavations in such a way that the following functionality is supported:

  1. Maximize interpretation capability after excavation or to continue excavation: the reason for excavation (goals), ie what is the archaeological question?
  2. Possibility of knowledge revision after excavation
  3. Comparing previous excavations on same site (space)
  4. All kinds of comprehensive statistical studies (“collective behavior”)

My contribution to CRMarchaeo is running in parallel to my work on the CRMEH. Whilst ultimately there will need to be some decisions as to which extension to use for new projects and resources, there is currently a fair amount of data out in the wild which uses CRMEH, and at least until CRMarchaeo is finalised (and probably longer) there will be some co-existence of these two complementary models. After all, the two models are very much related, and oversight has been maintained to ensure a good degree of compatibility between them.

A lightweight mapping

A decision was made to create a lightweight mapping of CRMEH to GeoSPARQL rather than implement a combination of CRMarchaeo and CRMgeo for three main reasons:

  • Firstly, these extensions are centred on the core CIDOC CRM ontology rather than the CRMEH extension. As CRMEH is being used for the GSTAR project, their use would have required a mapping process anyway to ensure compliance.
  • Secondly, both of these ‘emerging’ standards are currently in draft form, in the process of being finalised and formally adopted. As such, they are not yet fixed and remain subject to review, improvement and change; some components in particular still require further work.
  • Finally, the degree to which the advanced features offered by these extensions could be made use of through the GSTAR project is uncertain. A lightweight mapping can be seen as an 80% or 90% solution, covering most eventualities and avoiding the overheads associated with the rather more complex extensions, whilst retaining overall compatibility.

Mapping rationale

The key spatial components needed are already present in CRMEH. There are two main components covering excavation data: the Context (aka Stratigraphic Unit; the atomic unit of archaeological recording) and the ContextDepiction (a depiction of the Stratigraphic Unit, typically a polygon shown in plan view). A Context is related to a ContextDepiction through the property Depicts / Is Depicted By with a Context being depicted by one or more depictions.

These extend from the core CIDOC CRM: the CRMEH class Context (EHE0007) is a subclass of Place (E53), whilst ContextDepiction (EHE0022) is a subclass of Spatial Coordinates (E47), itself a subclass of Place Appellation (E44). In GeoSPARQL, there are likewise two classes used to describe spatial information, with Features having some representation in the form of Geometry. There is a good alignment here between the CRMEH classes (or indeed the parent CIDOC CRM classes) and the GeoSPARQL classes, allowing the ontologies to be linked as described in the GeoSPARQL User Guide written by Dave Kolas and Robert Battle.

This is illustrated in the following diagram:

Alignment of CRMEH classes and properties with GeoSPARQL classes and properties

As shown in the diagram, the rdfs:subClassOf and rdfs:subPropertyOf relationships, together with rdf:type assertions, can be used to link the two ontologies. This maps the necessary classes and also allows instances of ContextDepiction to behave as Simple Features geometries as used within GeoSPARQL.

From mapping to RDF

This mapping allows Contexts to be depicted by one or more pieces of geometry, each instance of a ContextDepiction making use of an OGC Simple Features type (Point, LineString, Polygon, etc) and represented using one of the standard formats, in this case WKT.

The mapping can also be applied at the broader CIDOC CRM level and inherited by the CRMEH (and other) classes and properties where this is advantageous.

With respect to data, resources can be created very simply by adding the class inheritance relationships once to a given resource and then creating the appropriate assertions for each ContextDepiction. In practice, this means GIS data can be converted very easily using a variety of tools (eg StringTemplate, Java(script), Python, VB or even Microsoft Excel) to produce suitable RDF in whatever flavour (N-Triples, Turtle, etc) is required, ready for ingestion into a triple store.

An example of this for a single ContextDepiction is shown below:


@prefix crmeh: <http://purl.org/crmeh#> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix gstar: <http://ld.gstar.archaeogeomancy.net/content/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sf: <http://www.opengis.net/ont/sf#> .
crmeh:EHE0007_Context rdfs:subClassOf geo:Feature .
crmeh:EHP4i_is_depicted_by rdfs:subPropertyOf geo:hasGeometry .
gstar:crmeh_EHE0022_123 a crmeh:EHE0022_ContextDepiction .
gstar:crmeh_EHE0022_123 a sf:Polygon .
gstar:crmeh_EHE0022_123 rdfs:label "Polygon Depiction of Context 123" .
gstar:crmeh_EHE0022_123 crmeh:EHP4_depicts gstar:crmeh_EHE0007_123 .
gstar:crmeh_EHE0022_123 geo:asWKT "<http://www.opengis.net/def/crs/EPSG/0/27700> POLYGON ((569170 169503,569170 169503,569170 169503,569170 169503,569170 169503,569170 169503,569170 169503,569170 169503,569171 169503,569170 169503,569170 169503,569170 169503,569170 169503,569170 169503,569170 169503))"^^geo:wktLiteral .

In the example above, the namespace declarations come first, the subclass/subproperty relationships need only be stated once per resource, and the final block of triples is repeated for each ContextDepiction. NB the GSTAR namespace houses a WISSKI installation, currently not configured and acting as a placeholder only; the URIs do not resolve.

A working system

The system used for the Pilot Study took GIS data from the ADS archives, processed it as above and loaded it alongside the CRMEH RDF encoding and the Erlangen CIDOC CRM RDF encoding. This was then successfully tested using a range of SPARQL and GeoSPARQL queries exercising the OGC spatial operators, such as the example below.
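
By way of illustration, a GeoSPARQL query over data structured as in the example above might look like the following. This is a minimal sketch of my own rather than a query taken from the Pilot Study itself: the search polygon is purely illustrative, and the query assumes the subclass/subproperty mappings described earlier have been loaded.

PREFIX crmeh: <http://purl.org/crmeh#>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>

# Find Contexts whose depictions intersect an illustrative search area
SELECT ?context ?depiction
WHERE {
  ?context crmeh:EHP4i_is_depicted_by ?depiction .
  ?depiction geo:asWKT ?wkt .
  FILTER(geof:sfIntersects(?wkt,
    "<http://www.opengis.net/def/crs/EPSG/0/27700> POLYGON ((569000 169000, 569500 169000, 569500 170000, 569000 170000, 569000 169000))"^^geo:wktLiteral))
}

Because EHP4i is mapped as a subproperty of geo:hasGeometry, a store with RDFS inferencing enabled could answer the same query phrased purely in terms of the GeoSPARQL vocabulary.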

Conclusions

For a fuller account of this, please see the Transfer Report when it is published more widely, or wait a while longer for the thesis (due 2016).

In summary, whilst emerging standards building on the CIDOC CRM to cover geospatial and archaeological excavation information are forthcoming, a simple, lightweight approach can also be deployed for this use case, giving a good range of functionality without the complexity, albeit sacrificing some semantic richness.

The mapping described here could also be applied directly to the parent CIDOC CRM classes/properties (rather than the child CRMEH classes/properties used here) to give a more generic linkage to GeoSPARQL suitable for use in a broader range of cases.


From MPhil to PhD; GSTAR update


After a longer than anticipated gestation, my Transfer Report has left my hands and is working its way through the administrative system to be externally examined. Fingers crossed, this is one of my last posts as an MPhil student and I will soon (post viva) be a PhD student proper.

Time for some celebratory fireworks!

The Transfer Report included a condensed form of the literature review and also a detailed report on the Pilot Study. This Pilot Study was designed to lay sound foundations for the PhD research and involved implementing a system using geosemantic technologies, primarily to investigate ways in which semantic and geospatial data can work together but also to help me get to grips with the subject area and technologies available.

The full report will be made available in due course, once it has been examined (viva scheduled for end of November) and any corrections completed, but for now here is an update on some of the key findings of the Pilot Study and conclusions drawn.

Conclusion one: Oracle is really complicated

Sean D. Tucker and the Oracle Challenger by Nathan Rupert

I started off with the idea that using Oracle for the research would be a really good idea. It is available under licence for research purposes (the OTN Developer License) and is the don of the database world. Furthermore, it does everything I required, all in one platform; no need, I thought, to string together bits of open source software with their undocumented ‘features’ and sparse documentation. After all, Oracle is a commercial, enterprise level system which supports all the relevant standards for geospatial, semantic and geosemantic data. It is capable of functioning as a triple store and a geospatial database, and integrates with the Jena framework by means of a dedicated connector. The latest version (12c) has also been significantly redesigned and improved with respect to the Spatial and Graph components.

This is true, but being an enterprise level application, it also comes with considerable baggage. Notably, it is really, really complicated and much of this complexity is totally unnecessary for the likes of me undertaking a research project.

Now, I don’t want to be unnecessarily critical of the platform, but there are some real issues with using it for a research project such as GSTAR. Installation and configuration, for starters, are necessarily complex, as it supports some seriously powerful tools: security, distributed/pluggable databases, user/group roles and permissions, not to mention Extended Data Types (essential for handling big data such as WKT geometries) and the indexing thereof. For a research project, components such as the enterprise level security are quite simply a hindrance rather than a help. More critically, I found working with the Jena Connector and GeoSPARQL to be fraught: the (copious) documentation for the new version is a bit lacking, and forums and blogs were of enormous help in fixing problems where the documentation wasn’t as helpful as it might have been. No doubt this will bed down given time, but being at the bleeding edge of such technology was not an ideal place to be.

Given I’m no longer using the Spatial and Graph components, there is no longer any reason to use Oracle as the spatial database either. Indeed, I won’t be using a spatial database as such, with all data instead being prepared as Linked Geospatial Data within the triple store.

So, it’s been an experience but goodbye Oracle. Thanks for all the fish.

Conclusion two: Open Source software can be really good

‘If you want a culture of collaboration, you need to accept the LOLCats too’ by opensource.com

Still smarting from my Oracle experiences, and quite a long way down the road with less to show for my troubles than I had hoped, I returned to my initial review of triple stores, looking for a suitable alternative. My requirements are quite specific: the platform needs to support big data, be responsive, support inferencing/reasoning and, crucially, provide good support for GeoSPARQL. I recalled various papers from my literature review extolling the virtues of Parliament, with other folk having used it on similar research projects. It also has a thoroughbred pedigree, originating from research initially undertaken through the DARPA Agent Markup Language (DAML) Program, and is now used as the base for applications in a range of tough, testing environments by Raytheon BBN Technologies.

My concern regarding documentation, having worked with various Free and Open Source Software (FOSS) platforms over the years, was still niggling, but it had to be worth some testing. After all, quantity does not necessarily equate with quality, as the Oracle experience demonstrated. And there certainly isn’t quantity: the manual (a single, rather short document) is smaller than the documentation for the Oracle Extended Data Types functionality alone! The key difference is that Parliament is a one trick pony, and it does that trick very well; it does not try to be all things to all users. Installation and configuration were as easy as pie, with the user guide providing all the key information without excess baggage. True, some of the later sections of the user guide are yet to be written (almost a prerequisite for a FOSS application, a bit like web 2.0 apps where permanent beta status is a badge of honour), but these cover highly specific aspects of deployment irrelevant to my research.

So within a day, I had gone from revisiting my systems review to having a working system.

Conclusion three: geosemantic applications using GeoSPARQL can really fly!

‘LOL Potential – Now LOL at Warp speed’ by Jen

One aspect of the Pilot Study was an investigation into different ways of integrating semantic and geospatial data. Without going into too much detail (I’ll post a version of the Transfer Report once it has been examined, I’ve had my viva and everything is finalised), I had a suspicion that working with geospatial data using semantic tools and verbose, text based formats such as GML and WKT would be lacking in the performance department, especially given some of the criticisms levelled at the performance of some SPARQL implementations when compared with the highly tuned geospatial tools found in GIS and dedicated geospatial platforms. So I wrote a Java application to test this hypothesis, comparing a ‘hybrid’ SPARQL+WFS system with a pure geosemantic system based around GeoSPARQL. The results showed very little difference in performance between the two approaches, potentially because any benefits of the optimised geospatial components were outweighed by the overheads of the additional middleware needed to process geosemantic queries for the GIS and then convert the WFS outputs to RDF. Given this lack of any significant benefit, combined with the need for a more complex systems architecture, I have opted for a ‘pure’ geosemantic basis for the next stages, based around Parliament, Jena, Joseki/Fuseki and GeoSPARQL, cutting out the need for any RDBMS, GIS and associated web servers.
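
For a flavour of the sort of operation exercised in these comparisons, the following minimal GeoSPARQL sketch (the point, threshold and property paths are illustrative only, not taken from the test suite itself) finds features within a given distance of a location; this is exactly the kind of spatial query the hybrid system instead delegated to the GIS via WFS:

PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX uom: <http://www.opengis.net/def/uom/OGC/1.0/>

# Features within 500m of an illustrative point of interest
SELECT ?feature
WHERE {
  ?feature geo:hasGeometry ?geom .
  ?geom geo:asWKT ?wkt .
  FILTER(geof:distance(?wkt,
    "<http://www.opengis.net/def/crs/EPSG/0/27700> POINT (569170 169503)"^^geo:wktLiteral,
    uom:metre) < 500)
}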

Where next…?

The Grinder by Kalle Gustafsson

So, a big chunk of my research project is now complete and all being well, I should have my transfer from MPhil to PhD all signed and sealed in the near future. The Pilot Study has provided the groundwork for the next phases of work as detailed above and work on the next Case Studies is already well underway. I have the first tranche of data from Wiltshire Historic Environment Record in hand which is currently being processed to produce a geosemantic resource in Parliament; other data from archaeological units and museums is being sourced with the aim of completing this integration and preparation phase by Christmas.

Historic Environment Record data

The HER data is being prepared using the CIDOC CRM ontology with the CRMEH extension, supported by a lightweight GeoSPARQL integration to provide the necessary geosemantic framework (more on the CRMEH-GeoSPARQL integration here). The production of Linked Data to feed Parliament is once again being accomplished using the workflow developed through the Pilot Study, based around the STELLAR toolkit and the StringTemplate engine.

Case Studies and further investigations

The Case Studies will then look at the integration of these datasets using inferencing/reasoning on the spatial and other facets, moving from fieldwork data up to heritage asset inventories and across to museum collections. Specifically, they will examine how such linked resources can be used to undertake archaeological research based on current archaeological research questions, including the use of RDF mapping libraries and query mediation using (spatial) ontologies.

Thesis

I now have a draft chapter outline agreed for my thesis and already have tens of thousands of words to edit into it pertaining to the Literature Review, Pilot Study, introductory and methodology chapters. In other words, full steam ahead!


Colonisation of Britain


Colonisation of Britain Linked Data

The Colonisation of Britain project was undertaken by Wessex Archaeology, commissioned by English Heritage, and involved the digitisation of the archive material of Roger Jacobi.

The project aimed to provide a comprehensive survey of the Upper Palaeolithic resource, with a pilot study of the Mesolithic in England involving three counties. A comprehensive survey of these periods (c. 38,000 – c. 6,500 BP) was considered of particular importance because many of the sites and find spots represent evidence for the recolonisation of Britain after the Last Glacial Maximum by hominid groups. Unlike earlier recolonisations, this event is part of the most northerly early migration of social groups of anatomically modern humans.

The major primary source of data for the survey is the invaluable and extensive archive compiled over many years by the late Dr Roger Jacobi. Securing this archive was identified as a priority in itself. The project included the digitisation of the Jacobi Archive, both as an image dataset and as a structured relational database, enhanced with additional information from SMR/HERs.

One of the outputs from the project is a Linked Data version of the outputs and Archaeogeomancy were pleased to be commissioned to undertake this component.

Method

The work was undertaken using the Stellar toolkit, produced by the Hypermedia Research Unit at the University of South Wales. This is a freely available toolkit which facilitates the creation of Linked Data resources from tabular data, either in the form of delimited text files or relational databases. The source material comprised the outputs available from Wessex Archaeology, soon to be lodged with the Archaeology Data Service, where the Linked Data will also be made available.

The method for working with the data using the Stellar toolkit involves two main processes. Firstly, the data is mapped to one or both of the ontologies supported by the toolkit: the CRM-EH (for archaeological fieldwork data), an extension of the CIDOC CRM, which itself supports the full range of cultural heritage information. For the Colonisation of Britain project, the mapping was undertaken to the main CIDOC CRM ontology, as the data does not relate to archaeological investigations but rather describes collections of objects from the Jacobi archive.

The second stage is to create templates representing this mapping for use with the Stellar application, which combines the templates with the source data to produce CIDOC CRM compliant Linked Data in the form of RDF files. The format of the templates is defined by the StringTemplate system, “a java template engine (with ports for C#, Python) for generating source code, web pages, emails, or any other formatted text output“. The user-defined templates created for this project reference the core Stellar templates written by Ceri Binding (University of South Wales), which do most of the heavy lifting.
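
To give a flavour of what such a template can look like, here is a minimal, hypothetical StringTemplate group: the template and attribute names are invented for illustration, the delimiters are configured to $…$, and the ecrm/rdfs namespace prefixes are assumed to be declared elsewhere in the generated output.

delimiters "$", "$"

// Hypothetical template emitting one E78 Collection per source data row
collection(id, label) ::= <<
<http://example.org/cob/collection_$id$> a ecrm:E78_Collection ;
    rdfs:label "$label$" .
>>

The Stellar application applies such templates row by row to the source data to emit the final RDF.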

A further stage was also undertaken to align geospatial elements of the source data with Ordnance Survey data. Where placenames occurred in the source data, these were converted to Ordnance Survey OpenSpace URIs using the OpenRefine platform, drawing on the OS Reconciliation API (see below for details). This included parish and county names, which were linked to the OS BoundaryLine dataset. A further piece of work will take this one step further, adding GeoSPARQL nodes to the Linked Data resource and enabling the data to be more fully included in Linked Geospatial Data graphs.
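
The output of this reconciliation step is, in essence, a set of triples linking places in the archive to their Ordnance Survey counterparts. A minimal Turtle sketch of such a link might look like the following, noting that both URIs here are purely illustrative rather than actual identifiers from the dataset:

@prefix ecrm: <http://erlangen-crm.org/110404/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# Illustrative only: a place of origin linked to its OS BoundaryLine counterpart
<http://example.org/cob/place/example_parish> a ecrm:E53_Place ;
    owl:sameAs <http://data.ordnancesurvey.co.uk/id/7000000000000000> .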

Mapping

This mapping describes the data as it is stored in the digitised version of the Jacobi archive; this means that some concepts are not fully resolved, for example places of origin, as it cannot be stated with certainty that any two identical place names refer to the same place. It is, of course, still possible to query the semantic data using the longer chains: with respect to place names, for example, whilst two occurrences of the same place name are represented in the output Linked Data as two distinct places, they can be linked by virtue of having the same place name appellation.
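
As a sketch of what such a query might look like (class and property names follow the Erlangen CIDOC CRM encoding used elsewhere in this work; the use of rdfs:label for the appellation text is an assumption for illustration):

PREFIX ecrm: <http://erlangen-crm.org/110404/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Pairs of distinct places bearing the same place name appellation
SELECT ?place1 ?place2 ?name
WHERE {
  ?place1 a ecrm:E53_Place ; ecrm:P1_is_identified_by ?app1 .
  ?place2 a ecrm:E53_Place ; ecrm:P1_is_identified_by ?app2 .
  ?app1 rdfs:label ?name .
  ?app2 rdfs:label ?name .
  FILTER(?place1 != ?place2)
}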

The heart of the Linked Data resource is each collection of artefacts described by the original card index. This is modelled as the CIDOC CRM concept E78 Collection. Each artefact collection is described by an index card (E31 Document) which documents the collection itself as well as the ‘site’, ie the place of origin (E53 Place). Where specific spatial coordinates exist, these have been included as appellations of the spatial nodes (E47 Spatial Coordinates). Spatial metadata such as precision is represented as classifications of the spatial nodes (using E54 Dimension, E58 Measurement Unit and E55 Type).

The artefact collection can form part of a larger collection, for example a named collection (E78 Collection) curated by a museum (E40 Legal Body), and is classified using the classificatory schemes used by both Jacobi and Wessex Archaeology through their enhancement (all using E55 Type). Additional information is stored as notes associated with particular concepts as appropriate. The index cards describe collections by material type so the resultant Linked Data does likewise, with each collection having a type of material (E57 Material).

Finally, assertions made through the project by the specialists are modelled as the product of a specific activity (E13 Attribute Assignment). This allows each collection to be associated with an archaeological period (eg Mesolithic), represented here as a purely temporal concept (E49 Time Appellation); this linkage can be extended as required to link to, for example, the Heritage Data period resources, represented using SKOS.
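
Pulling these pieces together, the shape of a single record can be sketched in Turtle along the following lines. All URIs here are hypothetical and the property selection is a simplification for illustration, not an extract from the actual resource:

@prefix ecrm: <http://erlangen-crm.org/110404/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://example.org/cob/> .

# Hypothetical record: an artefact collection documented by an index card
ex:collection_00748 a ecrm:E78_Collection ;
    rdfs:label "Artefact collection 00748" ;
    ecrm:P2_has_type ex:type_mesolithic .       # classification (E55 Type)

ex:card_00748 a ecrm:E31_Document ;
    ecrm:P70_documents ex:collection_00748 ;    # the collection itself
    ecrm:P70_documents ex:place_00748 .         # the 'site', ie place of origin

ex:place_00748 a ecrm:E53_Place ;
    ecrm:P87_is_identified_by ex:coords_00748 . # E47 Spatial Coordinates

ex:coords_00748 a ecrm:E47_Spatial_Coordinates .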

OpenRefine

The OpenRefine platform is a very useful way of converting spatial data stored as text strings (eg names of counties, parishes, etc) into URIs suitable for inclusion in Linked Data resources. For details on how to use it, see the official Ordnance Survey documentation and also this very helpful blog post by John Goodwin.

Visualisation

The Linked Data can be visualised in a variety of ways. The image presented here has been created using Gruff to demonstrate the shape of the Linked Data. It illustrates the overall shape of the resource using one specific record: the collection, associated places, classifications, curatorial organisations and the project itself can all be seen. Also noteworthy are the nodes in the bottom right of the image, which relate to the Ordnance Survey TOIDs referenced in this case.

Example showing the Linked Data for record 00748

Templates

The Stellar templates used for the project are included here for information. They are dependent on the CRM-EH and CRM templates distributed with the Stellar Toolkit. The templates are distributed under a Creative Commons licence, so do make use of them for any purpose; the only restriction is that proper attribution be made.

For more information on writing custom templates, see the Stellar Tools documentation and the StringTemplate documentation, or get in touch.

Acknowledgements

Thanks are due to Chris Brayne and Matt Leivers (Wessex Archaeology) for commissioning this work and for doing the really hard part: successfully undertaking the main Colonisation of Britain project. A big thanks to Ceri Binding (University of South Wales) for support on the Stellar Toolkit, and thanks also to Michael Charno (Archaeology Data Service) for liaising over the ADS handover and specific requirements.


GSTAR: investigation of methods for working with geosemantic data, integrating geospatial data with semantic data


Mapping rubble by Brian Hoffman

The first investigation in the GeoSemantic Technologies for Archaeological Research (GSTAR) research project is nearing completion: an assessment of approaches to integrating geospatial archaeological data into a semantic framework to provide geosemantic capabilities.

The investigation draws on archaeological excavation data lodged with the Archaeology Data Service (ADS) and made available as Linked Data (LD) through the ADS’s Linked Data platform. The data relates to the Cobham Golf Course site and was produced by Oxford Archaeology (OA) as part of the Channel Tunnel Rail Link (CTRL) project, then turned into a Linked Data resource through the Semantic Technologies Enhancing Links and Linked data for Archaeological Resources (STELLAR) project undertaken by the Hypermedia Research Unit at the University of South Wales (USW).

Mapping a feature by Wessex Archaeology

The GSTAR literature review identified two strands of integration approach within the published literature. Emerging from the semantic web and Linked Data communities is an approach involving the direct inclusion of geospatial data within semantic resources, leveraging World Wide Web Consortium (W3C) standards for the Resource Description Framework (RDF) and Open Geospatial Consortium (OGC) standards for Well Known Text (WKT, part of the Simple Features specification) and GeoSPARQL. Emerging from the Geographic Information Science (GISc) community are approaches involving the use of Web Feature Services (WFS) within broader Spatial Data Infrastructures (SDI), running in parallel with and linked to semantic resources.

This initial GSTAR investigation looked at both of these strands with a view to assessing suitable modes for subsequent use in the next phases of the GSTAR project. A WISSKI installation has also been set up to allow for the minting of any additional URIs needed.

GeoSPARQL route

This involved creating geosemantic data aligned with the CRM-EH extension to the CIDOC CRM ontology, stored within the Oracle Spatial & Graph platform and accessed via GeoSPARQL using an Oracle WebLogic web server and the Jena Framework.

[code language="xml"]
<owl:Class rdf:about="http://purl.org/crmeh#EHE0022_ContextDepiction">
<rdfs:isDefinedBy rdf:resource="http://purl.org/crmeh#CRMEH"/>
<rdfs:subClassOf rdf:resource="http://erlangen-crm.org/110404/E47_Spatial_Coordinates"/>
<rdfs:label>Context Depiction</rdfs:label>
<rdfs:comment>
The Spatial co-ordinates of a Context, defining the actual spatial extent of the context. Usually recorded at the time of excavation or other investigative work
</rdfs:comment>
</owl:Class>
[/code]
The OWL definition of the EHE0022 class used to describe depictions

Further triples were also added to describe the depiction using the GeoSPARQL hasGeometry and asWKT properties.

[code language="xml"]
<owl:ObjectProperty rdf:about="#hasGeometry">
<rdfs:isDefinedBy rdf:resource=""/>
<rdfs:isDefinedBy rdf:resource="http://www.opengis.net/spec/geosparql/1.0"/>
<skos:prefLabel xml:lang="en">hasGeometry</skos:prefLabel>
<dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2011-06-16</dc:date>
<dc:contributor>Matthew Perry</dc:contributor>
<dc:description xml:lang="en">
A spatial representation for a given feature.
</dc:description>
<rdfs:range rdf:resource="#Geometry"/>
<rdfs:comment xml:lang="en">
A spatial representation for a given feature.
</rdfs:comment>
<rdfs:domain rdf:resource="#Feature"/>
<rdfs:label xml:lang="en">hasGeometry</rdfs:label>
<dc:creator>OGC GeoSPARQL 1.0 Standard Working Group</dc:creator>
<skos:definition xml:lang="en">
A spatial representation for a given feature.
</skos:definition>
</owl:ObjectProperty>
[/code]

The OWL definition of the hasGeometry property

GIS Server route

A second approach used the same base platform and data but accessed the geospatial component via WFS provided by GeoServer, drawing on the Oracle database.
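
To indicate what this route involves in practice, the spatial component of a hybrid query resolves to a standard WFS GetFeature request along the following lines; the server address, layer name and bounding box are placeholders rather than the actual project endpoint, with middleware then converting the returned features to RDF for use alongside the semantic data:

[code]
http://localhost:8080/geoserver/wfs?service=WFS&version=1.1.0&request=GetFeature&typeName=gstar:context_depictions&srsName=EPSG:27700&bbox=569000,169000,569500,170000,EPSG:27700
[/code]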

Next steps

The results of this stage and the GSTAR project in general will be presented at the Computer Applications and Quantitative Methods in Archaeology (CAA) conference to be held in Paris, France during April 2014.

Thanks are due to the University of South Wales for funding the GSTAR project and to the people and organisations responsible for the tools, technologies and data being used. Thanks also to my PhD supervisor (Prof. Douglas Tudhope, USW), advisors (Dr Mark Ware and Dr Alex Lohfink, USW) and fellow researchers Ceri Binding (USW), Dr Andreas Vlachidis (USW) and Michael Charno (ADS) for their input.


Geosemantics; the story so far


Semantic Web Rubik’s Cube by dullhunk

Into the second month of the PhD now and things are starting to coalesce and take shape. A framework for development, testing and deployment of proposed demonstrators is emerging and I’m making good headway demystifying the world of geosemantics (at least, it’s becoming clearer in my head!).

So, as well as continuing with the literature review, I’m knitting together a whole bunch of tools:

  • Java Development Kit (JDK) – the programming language at the heart of it all
  • Maven – a project management and comprehension tool
  • Eclipse – an open development platform
  • Jena – a Java framework for building Semantic Web applications
  • Oracle 11g – a Relational Database Management System (RDBMS) with Spatial and Semantic components
  • D2RQ – a system for accessing relational databases as virtual, read-only RDF graphs
  • AllegroGraph – a graph database
  • Prolog – logic programming
  • Protégé – an ontology editor and knowledge-base framework
  • GeoSPARQL – a query language for geospatial data stored as RDF
  • ArcGIS – a Geographic Information System for data preparation, processing, etc
  • GeoServer – an open source GIS server written in Java that allows users to share and edit geospatial data

I’ll be posting more along the way. Next steps will be to complete the literature review, submit stage reports and use some real archaeological data. Exciting stuff!
