Data Linking by Indirect Spatial Referencing Frameworks - Place names and other core data

4 - 5 September 2018 (ISC-PIF, 113 Rue Nationale, 75013 Paris, FRANCE)

Co-organized by EuroGeographics and EuroSDR.

September 4th - 5th 2018 (starting at 1:30 pm on the first day and finishing at 4:00 pm on the second day)


This seminar aims at studying the usage of indirect spatial referencing systems – place names and other core data - for data linking.

Program:

Following the classical structure of EuroSDR and EuroGeographics seminars, this seminar will be organised around participants’ presentations of their experiences in data registration and interconnection based on indirect reference systems, scientific literature reviews, and discussions that propose answers to the following questions:

  • What are the new requirements for indirect spatial referencing services or data linking for our societies?
  • What does it imply in terms of data structure, services and organisation?
  • How can EuroSDR contribute in engaging scientific communities and developers on pending issues and new challenges based on ‘real problems illustrated with data’?

On the last day, a specific wrap-up session is devoted to discussions, drafting a EuroSDR report on this topic, and designing a one-year challenge.

Day 1: September 4th (13:30 – 18:00)

13:15-13:30: Registration, welcome coffee

General introduction (pdf)
Bénédicte Bucher 

This seminar aims at studying the usage of indirect spatial referencing systems – place names and other core data - for data linking. Direct referencing refers to traditional geocoding, i.e. directly attaching coordinates to an entity, whereas indirect spatial referencing is achieved by associating an entity with a spatial entity by reference to a URI only. In some contexts, this methodology has several advantages, such as human readability of location, single storage of authoritative data, and more consistent and accurate resources. For example, place names are one of the most ancient and widely used frameworks to locate as well as to interconnect a wide range of information assets, like databases, structured web content, pdf reports, regulations and maps. Scientific and technological advances in information infrastructure, information retrieval, knowledge representation and machine learning have offered even more opportunities to use common identifiers of spatial entities, such as place names but also administrative units and buildings with identifiers, as a key component to locate and interconnect pieces of information.
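The difference between the two approaches can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `PLACE_REGISTER` dictionary stands in for an authoritative register, the `example.org` URI is hypothetical, and the coordinates are approximate.

```python
from dataclasses import dataclass

# Hypothetical authoritative register of spatial entities, keyed by URI.
# The URI and the coordinates are illustrative assumptions only.
PLACE_REGISTER = {
    "https://example.org/id/admin-unit/75113": {
        "name": "Paris 13e Arrondissement",
        "centroid": (48.8283, 2.3622),  # (latitude, longitude), approximate
    },
}


@dataclass
class DirectRecord:
    """Direct referencing: coordinates are stored on the record itself."""
    label: str
    lat: float
    lon: float


@dataclass
class IndirectRecord:
    """Indirect referencing: the record stores only the URI of a spatial entity."""
    label: str
    place_uri: str


def locate(record):
    """Return (lat, lon) for either kind of record."""
    if isinstance(record, DirectRecord):
        return (record.lat, record.lon)
    # The geometry is looked up in the single authoritative source at read time.
    return PLACE_REGISTER[record.place_uri]["centroid"]


direct = DirectRecord("Seminar venue", 48.8283, 2.3622)
indirect = IndirectRecord("Seminar venue", "https://example.org/id/admin-unit/75113")
print(locate(direct), locate(indirect))
```

With the indirect record, a correction made once in the register is picked up by every record that points to the same URI, which is the "single storage of authoritative data" advantage mentioned above.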

Georef, Service and Development Platform
Esa Tiainen (pdf), Thomas Ellett (pdf)

Gazetteers for linking text to space: experiences with contrasting corpora (pdf)
Elise Acheson, University of Zurich, Switzerland 

Gazetteers play a central role in many text-to-space workflows, including for toponym recognition (identifying possible toponyms within texts) and toponym resolution (resolving toponyms to a unique identifier and potentially linking to spatial representations). In this talk we will discuss requirements for gazetteers used in Geographical Information Retrieval (GIR), specifically in the context of georeferencing a range of corpora. Our first use case will highlight requirements and challenges for the detailed annotation and spatial grounding of a geographically-focused corpus containing many fine-grained toponyms, such as documents describing Swiss landscapes. A second use case will look at the process of linking scientific articles to space, and thus dealing with a more global, yet more common, set of locations. Finally we will examine the difficult task of linking social media content to space, which presents a range of challenges including global locations, varying granularities, and very limited context. In each case, gazetteer coverage, completeness, and organization affect the ease of implementing solutions and the success that they can achieve.
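The two steps named in this abstract, toponym recognition and toponym resolution, can be sketched in a few lines of Python. This is a toy illustration and not the speaker's method: the `GAZETTEER` entries, their `gaz:` identifiers and the population values are invented, only single-word names are handled, and the "prefer the most populous candidate" rule is just one simple disambiguation heuristic.

```python
import re

# Toy gazetteer: lower-cased name -> candidate entries.
# Identifiers and population values are hypothetical, for illustration only.
GAZETTEER = {
    "zurich": [
        {"id": "gaz:1", "name": "Zurich", "country": "CH", "population": 400000},
        {"id": "gaz:2", "name": "Zurich", "country": "US", "population": 100},
    ],
    "paris": [
        {"id": "gaz:3", "name": "Paris", "country": "FR", "population": 2000000},
    ],
}


def recognise(text):
    """Toponym recognition: return the words of `text` found in the gazetteer."""
    words = re.findall(r"[^\W\d_]+", text)  # runs of letters, Unicode-aware
    return [w for w in words if w.lower() in GAZETTEER]


def resolve(name, country=None):
    """Toponym resolution: pick one gazetteer entry for `name`.

    If a country hint is available from context, it filters the candidates;
    otherwise the most populous candidate is chosen.
    """
    candidates = GAZETTEER[name.lower()]
    if country is not None:
        filtered = [c for c in candidates if c["country"] == country]
        candidates = filtered or candidates
    return max(candidates, key=lambda c: c["population"])


for toponym in recognise("The lake near Zurich is popular, unlike Paris."):
    print(toponym, "->", resolve(toponym)["id"])
```

The sketch also shows why gazetteer coverage matters: a place that is missing from `GAZETTEER` can be neither recognised nor resolved, whatever the quality of the matching logic.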

Designing data projects: how to value geographical heritage data with state of the art solutions? (pdf)
Julien Homo, Kévin Darty, Foxcub, France 

Foxcub is a young data agency that helps organizations use state-of-the-art data solutions, including solutions that enhance their interconnections.

Assessing the importance of named places: benefits and difficulties (pdf)
Dominique Laurent, IGN France 

Named places and their geographical names are used for two main purposes: as search criteria (e.g. in gazetteers, GeoPortals) and for mapping. The first use case requires data completeness (users want to find the named place associated with any geographical name), whereas the second use case requires selection criteria (as it is frequently impossible to display all names and named places in the limited extent of a paper sheet or of a map screen). In a first step, mapping agencies have selected relevant named places for maps at some given scale(s), following a cartographic viewpoint. However, this selection is very specific, both to a territory (and so difficult to harmonise across Europe) and to a scale or limited set of scales. In a second step, one of the objectives of the INSPIRE Directive is to make existing data interoperable. However, regarding the theme Geographical Names, the data specifications have just included attributes about the least and most detailed viewing resolution, without any guidelines about how to interpret these subjective cartographic notions. This is why, in a third step, the UN-GGIM: Europe Working Group on core data is proposing a more objective approach, by encouraging the estimation of the importance of the named place in the real world, following a topographic, database viewpoint. The “Recommendation for content – Spatial Core data theme GeographicalNames” document promotes the capture of quantifiable criteria measuring the importance of the named place in the real world, such as its area (by capturing “true” geometry) or its population (for populated places). However, there are some remaining issues and questions to be addressed, possibly by the research community. Capturing the “true” geometry of named places is both a challenge (how can it be done reliably when many named places, such as mountain chains or seas, have a fuzzy geometry?) and an opportunity (how much could it improve linking by indirect spatial referencing?).

The other potential research topic is related to the objective selection criteria: in addition to area and population, other criteria (e.g. touristic interest) have to be identified and methods of assessment have to be found.
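As a rough illustration of selection based on quantifiable criteria rather than on cartographic viewing resolution, the following Python sketch ranks named places by a measured attribute. The place records, the attribute names (`population`, `area_km2`) and the function `select_by_importance` are assumptions invented for this example; they are not taken from the recommendation document.

```python
# Hypothetical named places with quantifiable importance criteria.
# A place only carries the criteria that make sense for its kind.
NAMED_PLACES = [
    {"name": "Place A", "kind": "populated place", "population": 120000},
    {"name": "Place B", "kind": "populated place", "population": 800},
    {"name": "Lake C", "kind": "lake", "area_km2": 35.0},
    {"name": "Pond D", "kind": "lake", "area_km2": 0.2},
    {"name": "Hamlet E", "kind": "populated place", "population": 40},
]


def select_by_importance(places, criterion, limit):
    """Return the names of the `limit` most important places for `criterion`.

    Places that do not carry the criterion are ignored, so populated places
    and lakes are never compared on a scale that does not apply to them.
    """
    measured = [p for p in places if criterion in p]
    ranked = sorted(measured, key=lambda p: p[criterion], reverse=True)
    return [p["name"] for p in ranked[:limit]]


print(select_by_importance(NAMED_PLACES, "population", 2))
print(select_by_importance(NAMED_PLACES, "area_km2", 1))
```

Because the ranking depends only on real-world measurements, the same data can serve any map scale by changing `limit`. Criteria such as touristic interest would need an assessment method before they could be used in the same way.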

Discussion on data linking by place names 

- What are the new requirements for data linking by place names for our societies? 

- What does it imply in terms of data structure, services and organization? 

- How can EuroSDR contribute in engaging scientific communities and developers on pending issues and new challenges based on ‘real problems illustrated with data’?


Day 2: September 5th (9:00-15:00)

Finnish Linked Data pilots (pdf)
Kai Koistinen, National Land Survey, Finland 

This presentation will give an overview and some live demos on Linked geospatial data pilots implemented in Finland. Piloted themes include Geographic Names, Buildings, Administrative units and Statistical units.

The challenge of linking or integrating data on Buildings (pdf)
Dominique Laurent, IGN France

In many countries, data on the theme Buildings is scattered between different data producers and different products. Typically, there may be data on Buildings in cadastral, mapping or statistical agencies, in the Housing Ministry, in local governments, and so on. There are also many documents that might be linked to Buildings data, such as building permits, energy performance assessment reports and evacuation plans. Most users would like easy access to the available information, either by having the information of interest integrated in a single data set or by having it linked to reference geometric representation(s).

However, this may be quite difficult to achieve because there is no single view on buildings: the same real-world entity may be considered as various features according to various stakeholders, i.e. data producers will likely use different geometric representations and even different segmentations of buildings. For instance, the CityGML standard doesn’t provide any clear guidelines about use of the Building and BuildingPart concepts. The benefits and difficulties of integrating or linking data from various products on the Buildings theme have been identified by several initiatives, such as the UN-GGIM: Europe Working Group on core data or the French Working Group on unique identification of Buildings. Research may be required to investigate both organizational issues (how to ensure efficient cooperation between various data producers?) and technical issues (which are the most frequent segmentation practices? which linkage mechanisms, e.g. address or unique identification of buildings, are the most efficient?).

Administrative Units as Linked Open Data - A case study from the Norwegian Mapping Authority (pdf)
Thomas Ellett, Kartverket, Norway

In 2017, the Norwegian Mapping Authority started work on a Linked Open Data project to distribute Administrative Units data through the RDF framework. The specific use case was to store administrative unit values in DCAT metadata as URIs, thus enabling better consistency of data, better handling of versioning, and additional information made available to the end user through dereferenceable URIs. The whole project has been completed using open source software and libraries, from Protégé with an Ontop plugin for ontology development and data transformation, to Virtuoso for RDF data storage, and OpenAPI and the Linked Data Theatre for data access endpoints. This presentation will cover some basic theory on Linked Open Data and RDF, then describe the technical elements of the project, both successes and challenges, explain how the general infrastructure has been set up, and give a live demonstration of the different endpoints available.
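To make the use case concrete, here is a minimal Python sketch of what storing an administrative unit as a URI in DCAT metadata could look like, serialised as N-Triples. It is not the project's actual code or data model: the dataset and administrative unit URIs under `example.org` are hypothetical, and the use of `dcterms:spatial` for spatial coverage is an assumption about which DCAT property such a project would pick. Only the standard library is used, so no RDF toolkit is needed to run it.

```python
# Well-known vocabulary terms (RDF, DCAT, Dublin Core Terms).
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCAT_DATASET = "http://www.w3.org/ns/dcat#Dataset"
DCT_SPATIAL = "http://purl.org/dc/terms/spatial"

# Hypothetical identifiers, used for illustration only.
DATASET_URI = "https://example.org/dataset/road-network"
ADMIN_UNIT_URI = "https://example.org/id/admin-unit/0301"


def triple(subject, predicate, obj):
    """Serialise one triple whose three terms are all URIs, in N-Triples syntax."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def dataset_metadata(dataset_uri, admin_unit_uri):
    """Describe a dataset whose spatial coverage is given by URI, not by name."""
    return [
        triple(dataset_uri, RDF_TYPE, DCAT_DATASET),
        triple(dataset_uri, DCT_SPATIAL, admin_unit_uri),
    ]


for line in dataset_metadata(DATASET_URI, ADMIN_UNIT_URI):
    print(line)
```

Because the metadata holds a URI rather than a free-text name, every dataset covering the same unit carries the same value, and a client can dereference the URI to obtain the unit's name, geometry or version history from the authoritative source.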

Wikidata, a short introduction (pdf)
Julien Boisset, Wikimedia foundation, France

Linear indirect reference systems to interconnect data in transportation applications (pdf)
Alain Chaumet, ENSG-Valilab

Location referencing practices stem from two main knowledge domains, each of which needs to designate locations in its daily activities. The first domain aims to describe a given area in order to maintain order in it. This official, state-driven approach developed common techniques for military mapping and the cadastre. Early examples are the Mediterranean maps, world maps and local descriptions, written or drawn, of agricultural land. This knowledge domain draws on the scientific fields needed to build, step by step, geodetic networks and maps with national coverage.

The second knowledge domain concerns information about transportation routes. This information consists of points of interest and towns, with their specific designations, and the distances between these locations. The set of measured distances between known locations forms an indirect location system. The first known example of this kind of location system is the Peutinger Table, which was very useful to travelers in the ancient world.

Recent techniques should make it possible to merge both domains and to develop consistent, easy-to-use applications that translate between direct reference systems (lambda, phi or X, Y) and the indirect systems in common use. We will present an overview of indirect location systems and their associated problems. We will then list the expected benefits of merging direct and indirect location systems in the context of two ongoing projects: the European EU-EIP (European ITS Platform) and the French LaSDIM (Large Scale Data Information for Mobility).
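The translation between an indirect linear reference ("a given distance along a route") and direct X, Y coordinates can be sketched as follows. This is a simplified illustration under stated assumptions, not the approach of the projects mentioned: the route is a hypothetical polyline in planar coordinates, distances are straight-line lengths in the same units, and the function names `locate_along` and `measure_of` are invented for this example.

```python
import math

# Hypothetical route: a polyline in planar (X, Y) coordinates.
ROUTE = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]  # segment lengths 5 and 6


def locate_along(route, distance):
    """Indirect -> direct: return the (x, y) point at `distance` along `route`."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    remaining = distance
    for (x1, y1), (x2, y2) in zip(route, route[1:]):
        seg = math.hypot(x2 - x1, y2 - y1)
        if seg > 0 and remaining <= seg:
            t = remaining / seg  # fraction of this segment to travel
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
        remaining -= seg
    raise ValueError("distance exceeds route length")


def measure_of(route, point):
    """Direct -> indirect: distance along `route` of the route point nearest to `point`."""
    px, py = point
    best = None  # (offset from the route, measure along the route)
    travelled = 0.0
    for (x1, y1), (x2, y2) in zip(route, route[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg = math.hypot(dx, dy)
        if seg == 0:
            continue
        # Project the point on the segment and clamp to its two ends.
        t = ((px - x1) * dx + (py - y1) * dy) / (seg * seg)
        t = max(0.0, min(1.0, t))
        offset = math.hypot(px - (x1 + t * dx), py - (y1 + t * dy))
        if best is None or offset < best[0]:
            best = (offset, travelled + t * seg)
        travelled += seg
    if best is None:
        raise ValueError("route has no segment of positive length")
    return best[1]


print(locate_along(ROUTE, 8.0))
print(measure_of(ROUTE, (4.0, 7.0)))
```

A production system would also have to handle geodetic (lambda, phi) coordinates, route identifiers and changes in route geometry over time, which this sketch deliberately leaves out.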


Lunch

Wrap up, drafting position papers and challenges