Land Book Data Reuse | Land Portal

Introduction


All the statistical data is available as Linked Open Data (LOD), using the Land Book LOD data model.


You can reuse the Land Book data in your own website, project, research and more (read more about our licenses).


 


Is the data available? Where? How?


All our statistical data is accessible as Linked Open Data.


A SPARQL endpoint (http://landportal.info/sparql) is provided to query the data.


The SPARQL queries used to retrieve the statistical data shown on the country pages are available on GitHub.
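As a minimal sketch of how the endpoint could be called from a script, the following Python code builds a GET request as defined by the SPARQL 1.1 Protocol. The query is only an illustration based on the data model described on this page; it is not one of the queries published on GitHub, and the helper name `build_sparql_request` is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint URL as given on this page.
SPARQL_ENDPOINT = "http://landportal.info/sparql"

# Illustrative query (an assumption): list a few indicators with their labels.
EXAMPLE_QUERY = """
PREFIX cex: <http://purl.org/weso/computex/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?indicator ?label
WHERE {
  ?indicator a cex:Indicator ;
             rdfs:label ?label .
}
LIMIT 10
"""


def build_sparql_request(query, endpoint=SPARQL_ENDPOINT):
    """Build a SPARQL Protocol GET request (query passed in the `query` parameter)."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(EXAMPLE_QUERY)
print(request.full_url)
```

To actually run the query, the request can be passed to `urllib.request.urlopen` and the response decoded with `json.load`; that step is left out here so the sketch works without network access.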


 


Land Portal Linked Open Data: generation and consumption flow


 



Offering all the Land Portal data as Linked Open Data is not a trivial task. Several processes are run for this purpose. Let's have a look at them (see the numbers in the circles):


  1. The first process generates RDF, following the Land Book LOD data model, from statistical data that is not yet available as Linked Open Data. The statistical data, which comes from a variety of datasets and in a diversity of formats (Excel files, CSV, APIs, JSON, XML...), is passed through one of the landbook-importers (available in the GitHub repository). After the import process, a set of RDF files has been generated and is ready to be uploaded to the triple store.
  2. The generated RDF files (which contain the statistical information) are uploaded to Virtuoso, a triple store (also known as an RDF store), where the information can be queried using the SPARQL protocol.
  3. The third process uploads the data from landportal.info into the Virtuoso triple store, using the Land Portal LOD data model. landportal.info (this portal, which runs on a Drupal instance) hosts a lot of data stored in a MySQL database. To push all this data into the Land Portal Virtuoso instance, a process is run that uses the SPARQL Update protocol and combines a customized fork of the rdfx module (which shapes the RDF generated by Drupal) with a customized fork of the RDF Drupal Indexer module. The latter module uses the Drupal Search API to publish triples to a triple store (in a selected graph).
  4. Once all the data is available in the Virtuoso triple store, it can be queried using the SPARQL protocol through the SPARQL endpoint (http://landportal.info/sparql).
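To make the idea of publishing triples to a selected graph more concrete, here is a rough Python sketch that wraps N-Triples statements in a SPARQL 1.1 Update `INSERT DATA` request. It is not the code of the importers or of the Drupal modules; the function name, the graph URI and the dataset ID `EXAMPLE-DS` are assumptions made for illustration only.

```python
def build_insert_data(graph_uri, ntriples_lines):
    """Wrap N-Triples statements in an INSERT DATA request targeting one named graph."""
    # N-Triples statements are also valid triple syntax inside a SPARQL Update body.
    body = "\n".join("    " + line.strip() for line in ntriples_lines if line.strip())
    return "INSERT DATA {\n  GRAPH <%s> {\n%s\n  }\n}" % (graph_uri, body)


# Hypothetical graph name; the real graph used by the Land Portal is not stated here.
GRAPH_URI = "http://example.org/graph/landbook"

# One illustrative triple for a made-up dataset ID.
triples = [
    '<http://data.landportal.info/dataset/EXAMPLE-DS> '
    '<http://www.w3.org/2000/01/rdf-schema#label> "Example dataset"@en .',
]

update = build_insert_data(GRAPH_URI, triples)
print(update)
```

The resulting string would be sent to a SPARQL Update endpoint with write access; how that endpoint is configured and secured in Virtuoso is outside the scope of this sketch.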

 


Land Book LOD Data Model


The Land Book LOD Data Model is organized around country-based datasets and indicators, following widely adopted standards for statistical data.


The Land Book LOD data model (part of the whole Land Portal LOD data model) is designed on top of the following existing vocabularies:


  • Dublin Core for properties common to most resources
  • RDF Data Cube provides a means to publish multi-dimensional data, such as statistics, on the web in such a way that it can be linked to related data sets and concepts using RDF
  • Computex (Computing Statistical Indexes) can be seen as an extension of RDF Data Cube vocabulary to handle statistical indexes.
  • SDMX (Statistical Data and Metadata eXchange), an ISO standard for exchanging and sharing statistical data and metadata among organizations.
  • The OWL-Time ontology is an ontology of temporal concepts, for describing the temporal properties of resources in the world
  • The Schema.org vocabulary for properties of all relevant entities (creative works, persons, organizations, events, places)
  • The SKOS vocabulary for all related concepts 

Table 1. Namespaces used in the Land Book LOD data model














Prefix Namespace
cex http://purl.org/weso/computex/ontology#
dct http://purl.org/dc/terms/
lb http://purl.org/weso/landbook/ontology#
owl http://www.w3.org/2002/07/owl#
qb http://purl.org/linked-data/cube#
rdf http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs http://www.w3.org/2000/01/rdf-schema#
schema http://schema.org/
sdmx-attribute http://purl.org/linked-data/sdmx/2009/attribute#
skos http://www.w3.org/2004/02/skos/core#
time http://www.w3.org/2006/time#
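The namespaces in Table 1 can be kept in a small lookup table when writing scripts against the data. The following Python sketch copies the prefixes from Table 1 and shows two hypothetical helpers: one that generates SPARQL `PREFIX` declarations and one that expands a prefixed name into a full URI.

```python
# Prefixes and namespaces copied from Table 1.
NAMESPACES = {
    "cex": "http://purl.org/weso/computex/ontology#",
    "dct": "http://purl.org/dc/terms/",
    "lb": "http://purl.org/weso/landbook/ontology#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "qb": "http://purl.org/linked-data/cube#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "schema": "http://schema.org/",
    "sdmx-attribute": "http://purl.org/linked-data/sdmx/2009/attribute#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "time": "http://www.w3.org/2006/time#",
}


def sparql_prefixes(prefixes=None):
    """Return PREFIX declarations for the given prefixes (all of Table 1 by default)."""
    names = sorted(prefixes) if prefixes is not None else sorted(NAMESPACES)
    return "\n".join("PREFIX %s: <%s>" % (name, NAMESPACES[name]) for name in names)


def expand(curie):
    """Expand a prefixed name such as 'qb:Observation' into a full URI."""
    prefix, _, local_name = curie.partition(":")
    return NAMESPACES[prefix] + local_name


print(sparql_prefixes(["qb", "cex"]))
print(expand("qb:Observation"))
```

An unknown prefix raises a `KeyError`, which is a deliberate choice here so that typos in prefixes are noticed early.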

Entity: Dataset


A dataset is a collection of data, published or curated by a single agent (source), and available for access or download in one or more formats (definition from DCAT).


The fields of a dataset are:


  • Label: Label of the dataset.
  • Description: Description of the dataset.
  • ID: Internal ID of the dataset.
  • Logo: The logo of the dataset.
  • License: The license of the dataset.
  • Copyright details: Detailed copyright statements to be highlighted.
  • Organization: Organization that publishes the dataset.
  • Related Themes: Themes related to the dataset.
  • Related LandVoc Concepts: LandVoc concepts related to the dataset.

RDF types:



skos:Concept, qb:DataSet, dcat:Dataset


 


URI pattern:



http://data.landportal.info/dataset/{dataset-ID}


Values: Taxonomy: Dataset













Properties | RDF predicates | Predicate type (details)
Label | skos:prefLabel, dct:title, rdfs:label | literal
Description | skos:definition, dct:description, rdfs:comment | literal
ID | skos:notation, dct:identifier | literal
Organization (publisher) | dct:publisher | resource (Entity: Organization)
Themes | dct:subject, schema:about (Drupal) | resource (Entity: LandVoc Theme)
Related concepts (LandVoc) | dct:subject, schema:about (Drupal) | resource (Entity: LandVoc Concept)
Logo | schema:image, schema:logo | resource
License | dct:license | resource (Entity: License)
Copyright details | dc:rights | literal
See Also | rdfs:seeAlso | resource (Drupal)
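The mapping above can be illustrated with a short Python sketch that serializes a dataset description as Turtle, using the URI pattern and a subset of the predicates listed in the table. The function names, the dataset ID `EXAMPLE-DS` and the publisher URI are hypothetical. The `dcat` namespace is not listed in Table 1; the standard W3C DCAT namespace is assumed.

```python
DATASET_URI_PATTERN = "http://data.landportal.info/dataset/{dataset_id}"

# Prefixes needed by the description below (dcat is an assumption, see above).
TURTLE_PREFIXES = """\
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix qb: <http://purl.org/linked-data/cube#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
"""


def quote(text):
    """Escape a Python string as a Turtle string literal."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n") + '"'


def dataset_to_turtle(dataset_id, label, description, publisher_uri):
    """Serialize the label, description, ID and publisher of a dataset as Turtle."""
    uri = DATASET_URI_PATTERN.format(dataset_id=dataset_id)
    statements = [
        "a skos:Concept, qb:DataSet, dcat:Dataset",
        "skos:prefLabel %s" % quote(label),
        "dct:title %s" % quote(label),
        "rdfs:label %s" % quote(label),
        "skos:definition %s" % quote(description),
        "dct:description %s" % quote(description),
        "rdfs:comment %s" % quote(description),
        "skos:notation %s" % quote(dataset_id),
        "dct:identifier %s" % quote(dataset_id),
        "dct:publisher <%s>" % publisher_uri,
    ]
    return TURTLE_PREFIXES + "\n<%s>\n    " % uri + " ;\n    ".join(statements) + " .\n"


ttl = dataset_to_turtle(
    "EXAMPLE-DS",                                    # made-up dataset ID
    "Example dataset",
    "A dataset used only to illustrate the model.",
    "http://example.org/organization/example-org",   # made-up publisher URI
)
print(ttl)
```

The remaining properties (themes, LandVoc concepts, logo, license, copyright details) would follow the same pattern with the predicates from the table.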

Entity: Indicator


A statistical indicator is a data element that represents statistical data for a specified time, place, and other characteristics (definition from OECD). Currently, the place is limited to the country level.


The fields of an indicator are:


  • Label: Label of the indicator.
  • Description: Description of the indicator.
  • Picture: Image to describe the indicator.
  • Dataset: Dataset with data of this indicator
  • ID: Internal ID of the indicator.
  • Min: Minimum possible value (integer) of the indicator.
  • Max: Maximum possible value (integer) of the indicator.
  • Measurement unit: Measurement unit, like % or hectares, of the indicator.
  • has Coded Value: The values for this indicator are taken from a controlled term list (could be characters, colors, strings, numbers...).
  • High / Low: High means it is better to have a high value, low means the best value is the lowest one (like in rankings).
  • Related Themes: Themes related to the indicator.
  • Related LandVoc Concepts: LandVoc concepts related to the indicator.

RDF types:



skos:Concept, cex:Indicator


URI pattern:



http://data.landportal.info/indicator/{indicator-ID}


Values: Taxonomy: Indicator












Properties | RDF predicates | Predicate type (details)
Label | skos:prefLabel, dct:title, rdfs:label | literal
Description | skos:definition, dct:description, rdfs:comment | literal
ID | skos:notation, dct:identifier | literal
Dataset | dct:source | resource (Entity: Dataset)
Measurement unit | sdmx-attribute:unitMeasure | literal
Themes | dct:subject, schema:about | resource (Entity: LandVoc Theme)
Related concepts (LandVoc) | dct:subject, schema:about | resource (Entity: LandVoc Concept)
Picture | schema:image | resource
See also | rdfs:seeAlso | resource (Drupal)
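Based on the predicates in this table, a query that lists the indicators of one dataset together with their measurement unit could look like the following Python sketch. The function name is hypothetical, and the character check on the dataset ID is an assumption about what IDs look like (it exists only to keep arbitrary text out of the query).

```python
import re


def indicators_of_dataset_query(dataset_id, limit=50):
    """Build a SPARQL query listing the indicators whose dct:source is the given dataset."""
    # Assumption: dataset IDs only contain letters, digits, dots, underscores and hyphens.
    if not re.fullmatch(r"[A-Za-z0-9._-]+", dataset_id):
        raise ValueError("unexpected characters in dataset ID: %r" % dataset_id)
    return """PREFIX cex: <http://purl.org/weso/computex/ontology#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX sdmx-attribute: <http://purl.org/linked-data/sdmx/2009/attribute#>
SELECT ?indicator ?label ?unit
WHERE {
  ?indicator a cex:Indicator ;
             rdfs:label ?label ;
             dct:source <http://data.landportal.info/dataset/%s> .
  OPTIONAL { ?indicator sdmx-attribute:unitMeasure ?unit }
}
ORDER BY ?label
LIMIT %d""" % (dataset_id, int(limit))


# "EXAMPLE-DS" is a made-up dataset ID used only for illustration.
query = indicators_of_dataset_query("EXAMPLE-DS", limit=5)
print(query)
```

The measurement unit is wrapped in `OPTIONAL` so that indicators without a unit are still returned.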

 



Entity: Observation


An Observation represents a single indicator value for a given year and area.


We consider three main dimensions for each observation:


  • Indicator: The reference indicator
  • Area: A geographic area (a country or a region)
  • Time: The time that the observation refers to (usually a year or time interval), expressed using the Time Ontology in OWL.

And each observation has a value:


  • Value: could be numeric or not (xsd:integer, xsd:double, xsd:string)

Also each observation has:


  • Label: A label generated using the pattern "Value of {Region} in {Time} for indicator {Indicator}"@en
  • Note: An optional comment or note about the observation.
  • Dataset: Dataset to which an observation belongs
  • Timestamp: Date when the observation was generated
  • Computation: The computation from which this observation has been obtained, taken from a closed list (with rdf:type cex:ObsStatus, like cex:Raw)
  • Observation status: Observation status code obtained from a closed list

 


RDF types:



qb:Observation


URI pattern:



http://data.landportal.info/dataset/{dataset-ID}/observation/{observation-ID}


 













Properties | RDF predicates | Predicate type (details)
Indicator | cex:ref-indicator | resource (Entity: Indicator)
Area | cex:ref-area | resource (Entity: Region)
Time | cex:ref-time | resource (time:DateTimeInterval)
Value | cex:value | literal (xsd:integer, xsd:double, xsd:string)
Label | rdfs:label | literal
Note | rdfs:comment | literal
Dataset | qb:dataSet | resource (Entity: Dataset)
Timestamp | dct:issued | literal (xsd:date)
Computation obtained from | cex:computation | resource (with rdf:type cex:ObsStatus)
Observation status codes | sdmx-concept:obsStatus | literal (code list)
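The conventions for observations (URI pattern, generated label and value datatype) can be sketched as a small Python data class. The class and its field names are hypothetical, and all values in the example are made up; the mapping of Python types to XSD datatypes is an assumption based on the three datatypes listed above.

```python
from dataclasses import dataclass
from typing import Union

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass
class Observation:
    dataset_id: str
    observation_id: str
    indicator: str   # human-readable indicator name used in the label
    region: str      # human-readable region name used in the label
    time: str        # for example a year
    value: Union[int, float, str]

    def uri(self):
        """Apply the observation URI pattern."""
        return "http://data.landportal.info/dataset/%s/observation/%s" % (
            self.dataset_id, self.observation_id)

    def label(self):
        """Apply the label pattern (the literal carries the language tag @en)."""
        return "Value of %s in %s for indicator %s" % (
            self.region, self.time, self.indicator)

    def value_datatype(self):
        """Pick one of the three XSD datatypes allowed for cex:value."""
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(self.value, bool):
            raise TypeError("boolean values are not covered by the model")
        if isinstance(self.value, int):
            return XSD + "integer"
        if isinstance(self.value, float):
            return XSD + "double"
        return XSD + "string"


# Made-up example values.
obs = Observation("EXAMPLE-DS", "OBS-1", "Example indicator", "Example country", "2015", 12.5)
print(obs.uri())
print(obs.label())
print(obs.value_datatype())
```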

Entity: Region (Land Book)


RDF types:



lb:Country


URI pattern:



http://data.landportal.info/geo/{ISO 3166-1 alpha-3 code (aka ISO3)}


Values: Taxonomy: Regions (only with ISO3 code)





Properties | RDF predicates | Predicate type
Name | dct:title, rdfs:label | literal
ISO 3166-1 alpha-3 code (aka ISO3) | dct:identifier, skos:notation, geonames:countryCode | literal
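As a small illustration of the region URI pattern, the following Python sketch builds a region URI from an ISO3 code. The function name is hypothetical, and it assumes that the URIs use the upper-case form of the code. It only checks the shape of the code (three letters), not whether the code is actually assigned to a country.

```python
import re

REGION_URI_PREFIX = "http://data.landportal.info/geo/"


def region_uri(iso3):
    """Build a region URI from an ISO 3166-1 alpha-3 code."""
    code = iso3.strip().upper()  # assumption: URIs use upper-case codes
    if not re.fullmatch(r"[A-Z]{3}", code):
        raise ValueError("not a three-letter code: %r" % iso3)
    return REGION_URI_PREFIX + code


print(region_uri("tza"))
```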

 


Previous documentation


This is the old documentation, which is going to be updated soon: http://landportal.github.io/landbook-doc/data/


 


Visualizations


You can use our visualization library: View COuntry DAta (js-view-coda), which is available on GitHub


https://github.com/landportal/js-view-coda/


All the visualizations on the Land Book pages use this library.


 


Country portfolios


A short video tutorial to introduce our new country pages.



 


Share it, get in touch. Questions? Contact us!


Are you using our data for your project? Share the love with us! If you want, we can even broadcast your project to our audience and share it with people who are interested in the same land-related issues as you.


Don't hesitate to contact the Land Portal team if you have any questions or suggestions.


 
