Introduction
All the statistical data is available as Linked Open Data (LOD), using the Land Book LOD data model.
You can reuse the Land Book data in your own website, project, research and more (read more about our licenses).
Is the data available? Where? How?
All our statistical data is accessible as Linked Open Data.
A SPARQL endpoint (http://landportal.info/sparql) is provided to query the data.
The SPARQL queries used to retrieve the statistical data shown in the country pages are available on GitHub.
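As a rough illustration of how the endpoint could be used from a script, the sketch below builds a SPARQL protocol GET request and flattens the result set. It assumes the endpoint accepts the standard `query` parameter and can return the W3C SPARQL JSON results format; the function names are hypothetical and only the Python standard library is used.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://landportal.info/sparql"  # endpoint given in the text above


def build_query_url(query, endpoint=ENDPOINT):
    """Return a SPARQL-protocol GET URL carrying the query."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def flatten_bindings(result):
    """Turn a SPARQL JSON result document into a list of plain dicts."""
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in result["results"]["bindings"]
    ]


def run_query(query, endpoint=ENDPOINT, timeout=30):
    """Send the query and return flattened rows (needs network access)."""
    request = urllib.request.Request(
        build_query_url(query, endpoint),
        # Assumption: the server honours content negotiation for JSON results.
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return flatten_bindings(json.load(response))
```

For example, `run_query("SELECT ?s WHERE { ?s ?p ?o } LIMIT 5")` would return a list of dictionaries keyed by variable name, provided the endpoint is reachable.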
Land Portal Linked Open Data: generation and consumption flow
Offering all the Land Portal data as Linked Open Data is not a trivial task, and several processes are run for this purpose. Let's look at each of them (see the numbers in the circles):
- The first process generates RDF, in the Land Book LOD data model, from statistical data that is not yet available as Linked Open Data. The statistical data, which comes from a variety of datasets and in a diversity of formats (Excel files, CSV, APIs, JSON, XML...), is passed through one of the landbook-importers (available in the GitHub repository). After the import process, a set of RDF files has been generated and is ready to be uploaded to the triple store.
- The generated RDF files (which contain the statistical information) are uploaded to Virtuoso, a triple store (also known as an RDF store), where the information can be queried using the SPARQL protocol.
- The next process focuses on uploading the data from landportal.info into the Virtuoso triple store, using the Land Portal LOD data model. landportal.info (this portal, which runs on a Drupal instance) hosts a lot of data stored in a MySQL database. To push all this data into the Land Portal Virtuoso instance, a dedicated process is run. This process, which uses the SPARQL Update protocol, combines a customized fork of the rdfx module (which shapes the RDF generated by Drupal) and a customized fork of the RDF Drupal Indexer module. The latter module uses the Drupal Search API to publish triples to a triple store (in a selected graph).
- Once all the data is available in the Virtuoso triple store, it can be queried using the SPARQL protocol through the SPARQL endpoint (http://landportal.info/sparql).
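To make the SPARQL Update step more concrete, here is a minimal sketch of how triples could be wrapped in an `INSERT DATA` statement targeting a selected graph. It is not the code of the Drupal modules mentioned above; the function names and the example graph URI are hypothetical, and the triples are assumed to be already serialized terms (IRIs in angle brackets, literals quoted).

```python
import urllib.parse


def insert_data_update(graph_uri, triples):
    """Build a SPARQL 1.1 Update INSERT DATA statement for one named graph.

    `triples` is an iterable of (subject, predicate, object) strings that are
    already valid SPARQL terms.
    """
    body = "\n".join(f"    {s} {p} {o} ." for s, p, o in triples)
    return f"INSERT DATA {{\n  GRAPH <{graph_uri}> {{\n{body}\n  }}\n}}"


def update_request_body(update):
    """Encode the statement as a URL-encoded form body (the `update` parameter)."""
    return urllib.parse.urlencode({"update": update}).encode("utf-8")


if __name__ == "__main__":
    # Hypothetical graph URI and triple, for illustration only.
    statement = insert_data_update(
        "http://example.org/graph/demo",
        [(
            "<http://example.org/node/1>",
            "<http://purl.org/dc/terms/title>",
            '"A demo title"',
        )],
    )
    print(statement)
```

The resulting body would be sent with an HTTP POST to an update-enabled endpoint; authentication and the actual endpoint address depend on the triple store configuration and are not covered here.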
Land Book LOD Data Model
The Land Book LOD Data Model is organized around country-based datasets and indicators, following widely adopted standards for statistical data.
The Land Book LOD data model (part of the whole Land Portal LOD data model) is designed on top of the following existing vocabularies:
- Dublin Core for properties common to most resources
- RDF Data Cube provides a means to publish multi-dimensional data, such as statistics, on the web in such a way that it can be linked to related data sets and concepts using RDF
- Computex (Computing Statistical Indexes) can be seen as an extension of RDF Data Cube vocabulary to handle statistical indexes.
- SDMX (Statistical Data and Metadata eXchange), an ISO standard for exchanging and sharing statistical data and metadata among organizations.
- The OWL-Time ontology is an ontology of temporal concepts, for describing the temporal properties of resources in the world
- The Schema.org vocabulary for properties of all relevant entities (creative works, persons, organizations, events, places)
- The SKOS vocabulary for all related concepts
Table 1. Namespaces used in the Land Book LOD data model
Prefix | Namespace |
---|---|
cex | http://purl.org/weso/computex/ontology# |
dct | http://purl.org/dc/terms/ |
lb | http://purl.org/weso/landbook/ontology# |
owl | http://www.w3.org/2002/07/owl# |
qb | http://purl.org/linked-data/cube# |
rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
rdfs | http://www.w3.org/2000/01/rdf-schema# |
schema | http://schema.org/ |
sdmx-attribute | http://purl.org/linked-data/sdmx/2009/attribute# |
skos | http://www.w3.org/2004/02/skos/core# |
time | http://www.w3.org/2006/time# |
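When writing queries against this model, the prefixes in Table 1 usually need to be declared at the top of each query. The following sketch shows one way to generate those declarations from the table; the `NAMESPACES` dictionary simply mirrors Table 1, and `prefix_header` is a hypothetical helper name.

```python
# Prefix-to-namespace mapping copied from Table 1.
NAMESPACES = {
    "cex": "http://purl.org/weso/computex/ontology#",
    "dct": "http://purl.org/dc/terms/",
    "lb": "http://purl.org/weso/landbook/ontology#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "qb": "http://purl.org/linked-data/cube#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "schema": "http://schema.org/",
    "sdmx-attribute": "http://purl.org/linked-data/sdmx/2009/attribute#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "time": "http://www.w3.org/2006/time#",
}


def prefix_header(prefixes=None):
    """Return SPARQL PREFIX declarations for the given prefixes (all by default)."""
    names = sorted(NAMESPACES) if prefixes is None else prefixes
    return "\n".join(f"PREFIX {name}: <{NAMESPACES[name]}>" for name in names)


if __name__ == "__main__":
    print(prefix_header(["qb", "cex"]))
```

Passing only the prefixes a query needs keeps the query text short; calling `prefix_header()` without arguments declares all eleven.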
Entity: Dataset
A dataset is a collection of data, published or curated by a single agent (source), and available for access or download in one or more formats (definition from DCAT).
The fields of a dataset are:
- Label: Label of the dataset.
- Description: Description of the dataset.
- ID: Internal ID of the dataset.
- Logo: The logo of the dataset
- License: The license of the dataset
- Copyright details: Detailed copyright statements to be highlighted
- Organization: Organization that publishes the dataset.
- Related Themes: Themes related to the dataset.
- Related LandVoc Concepts: LandVoc concepts related to the dataset.
RDF types:
skos:Concept, qb:DataSet, dcat:Dataset
URI pattern:
http://data.landportal.info/dataset/{dataset-ID}
Values: Taxonomy: Dataset
Properties | RDF predicates | Predicate type (details) |
---|---|---|
Label | skos:prefLabel, dct:title, rdfs:label | literal |
Description | skos:definition, dct:description, rdfs:comment | literal |
ID | skos:notation, dct:identifier | literal |
Organization (publisher) | dct:publisher | resource (Entity: Organization) |
Themes | dct:subject, schema:about (Drupal) | resource (Entity: LandVoc Theme) |
Related concepts (LandVoc) | dct:subject, schema:about (Drupal) | resource (Entity: LandVoc Concept) |
Logo | schema:image, schema:logo | resource |
License | dct:license | resource (Entity: License) |
Copyright details | dc:rights | literal |
See also | rdfs:seeAlso | resource (Drupal) |
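As an illustration of how the literal-valued rows of this table map to triples, the sketch below serializes a dataset's types, label, description and ID as N-Triples lines. The dataset ID used in the example is hypothetical, the helper names are assumptions, and the remaining (resource-valued) properties are left out for brevity.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS = "http://www.w3.org/2004/02/skos/core#"
DCT = "http://purl.org/dc/terms/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
QB = "http://purl.org/linked-data/cube#"
DCAT = "http://www.w3.org/ns/dcat#"  # standard DCAT namespace, not listed in Table 1


def literal(text):
    """Quote a string as a plain N-Triples literal, escaping special characters."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def dataset_triples(dataset_id, label, description):
    """Return N-Triples lines for the types and literal properties of a dataset.

    Assumption: dataset_id contains only characters that are safe inside an IRI.
    """
    subject = f"<http://data.landportal.info/dataset/{dataset_id}>"
    pairs = [(RDF_TYPE, f"<{iri}>")
             for iri in (SKOS + "Concept", QB + "DataSet", DCAT + "Dataset")]
    pairs += [(p, literal(label))
              for p in (SKOS + "prefLabel", DCT + "title", RDFS + "label")]
    pairs += [(p, literal(description))
              for p in (SKOS + "definition", DCT + "description", RDFS + "comment")]
    pairs += [(p, literal(dataset_id))
              for p in (SKOS + "notation", DCT + "identifier")]
    return [f"{subject} <{p}> {o} ." for p, o in pairs]


if __name__ == "__main__":
    # "EXAMPLE-DS" is a made-up ID used only for illustration.
    print("\n".join(dataset_triples("EXAMPLE-DS", "Example dataset", "A demo description")))
```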
Entity: Indicator
A statistical indicator is a data element that represents statistical data for a specified time, place, and other characteristics (definition from OECD). Currently, the place is limited to the country level.
The fields of an indicator are:
- Label: Label of the indicator.
- Description: Description of the indicator.
- Picture: Image to describe the indicator.
- Dataset: Dataset with data of this indicator
- ID: Internal ID of the indicator.
- Min: Minimum possible value (integer) of the indicator.
- Max: Maximum possible value (integer) of the indicator.
- Measurement unit: Measurement unit, like % or hectares, of the indicator.
- has Coded Value: The values for this indicator are taken from a controlled term list (could be characters, colors, strings, numbers...).
- High / Low: High means it is better to have a high value, low means the best value is the lowest one (like in rankings).
- Related Themes: Themes related to the indicator.
- Related LandVoc Concepts: LandVoc concepts related to the indicator.
RDF types:
skos:Concept, cex:Indicator
URI pattern:
http://data.landportal.info/indicator/{indicator-ID}
Values: Taxonomy: Indicator
Properties | RDF predicates | Predicate type (details) |
---|---|---|
Label | skos:prefLabel, dct:title, rdfs:label | literal |
Description | skos:definition, dct:description, rdfs:comment | literal |
ID | skos:notation, dct:identifier | literal |
Dataset | dct:source | resource (Entity: Dataset) |
Measurement unit | sdmx-attribute:unitMeasure | literal |
Themes | dct:subject, schema:about | resource (Entity: LandVoc Theme) |
Related concepts (LandVoc) | dct:subject, schema:about | resource (Entity: LandVoc Concept) |
Picture | schema:image | resource |
See also | rdfs:seeAlso | resource (Drupal) |
Entity: Observation
An Observation represents a single indicator value for a given year and area.
We consider three main dimensions for each observation:
- Indicator: The reference indicator
- Area: A geographic area (a country or a region)
- Time: The time that the observation refers to (usually a year or a time interval), expressed with the Time Ontology in OWL.
And each observation has a value:
- Value: could be numeric or not (xsd:integer, xsd:double, xsd:string)
Also each observation has:
- Label: A label generated using the pattern "Value of {Region} in {Time} for indicator {Indicator}"@en
- Note: An optional comment or note about the observation.
- Dataset: Dataset to which an observation belongs
- Timestamp: Date when the observation was generated
- Computation: The computation from which this observation has been obtained, taken from a closed list (with rdf:type cex:ObsStatus, like cex:Raw)
- Observation status: Observation status code, taken from a closed list
RDF types:
qb:Observation
URI pattern:
http://data.landportal.info/dataset/{dataset-ID}/observation/{observation-ID}
Properties | RDF predicates | Predicate type (details) |
---|---|---|
Indicator | cex:ref-indicator | resource (Entity: Indicator) |
Area | cex:ref-area | resource (Entity: Region) |
Time | cex:ref-time | resource (time:DateTimeInterval) |
Value | cex:value | literal (xsd:integer, xsd:double, xsd:string) |
Label | rdfs:label | literal |
Note | rdfs:comment | literal |
Dataset | qb:dataSet | resource (Entity: Dataset) |
Timestamp | dct:issued | literal (xsd:date) |
Computation obtained from | cex:computation | resource (with rdf:type cex:ObsStatus) |
Observation status codes | sdmx-concept:obsStatus | literal (code list) |
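Based on the predicates in the table above, a query for the observations of one indicator in one country could look like the sketch below. It only builds the query text; the indicator ID in the example is hypothetical, `observations_query` is an assumed helper name, and the query has not been verified against the live endpoint.

```python
def observations_query(indicator_id, iso3, limit=100):
    """Build a SPARQL query for observations of one indicator in one country.

    The URIs follow the URI patterns documented for indicators and regions.
    """
    indicator = f"http://data.landportal.info/indicator/{indicator_id}"
    area = f"http://data.landportal.info/geo/{iso3}"
    return f"""PREFIX cex: <http://purl.org/weso/computex/ontology#>
PREFIX qb: <http://purl.org/linked-data/cube#>
SELECT ?obs ?time ?value ?dataset
WHERE {{
  ?obs a qb:Observation ;
       cex:ref-indicator <{indicator}> ;
       cex:ref-area <{area}> ;
       cex:ref-time ?time ;
       cex:value ?value ;
       qb:dataSet ?dataset .
}}
LIMIT {int(limit)}"""


if __name__ == "__main__":
    # "EXAMPLE-IND" is a made-up indicator ID; "TZA" is the ISO3 code for Tanzania.
    print(observations_query("EXAMPLE-IND", "TZA", limit=10))
```

Note that `?time` is returned as a resource (a time interval), so extracting a year would require following its Time Ontology properties, which this sketch leaves out.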
Entity: Region (Land Book)
RDF types:
lb:Country
URI pattern:
http://data.landportal.info/geo/{ISO 3166-1 alpha-3 code (aka ISO3)}
Values: Taxonomy: Regions (only with ISO3 code)
Properties | RDF predicates | Predicate type |
---|---|---|
Name | dct:title, rdfs:label | literal |
ISO 3166-1 alpha-3 code (aka ISO3) | dct:identifier, skos:notation, geonames:countryCode | literal |
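Since region URIs are derived directly from the ISO3 code, they can be built with a small helper such as the one sketched here. The function name is hypothetical, and the check only verifies the shape of the code (three letters), not that it is an assigned ISO 3166-1 code.

```python
import re

_ISO3_SHAPE = re.compile(r"[A-Z]{3}")


def region_uri(iso3):
    """Return the Land Book region URI for an ISO 3166-1 alpha-3 code."""
    code = iso3.strip().upper()
    if not _ISO3_SHAPE.fullmatch(code):
        raise ValueError(f"not a three-letter ISO3 code: {iso3!r}")
    return f"http://data.landportal.info/geo/{code}"


if __name__ == "__main__":
    print(region_uri("tza"))
```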
Previous documentation
This is the old documentation, which is going to be updated soon: http://landportal.github.io/landbook-doc/data/
Visualizations
You can use our visualization library: View COuntry DAta (js-view-coda), which is available on GitHub
https://github.com/landportal/js-view-coda/
All the visualizations on the Land Book pages use this library.
Country portfolios
A short video tutorial to introduce our new country pages.
[video:https://youtu.be/3xRe5JAbli8]
Share it, get in touch. Questions? Contact us!
Are you using our data for your project? Share the love with us! If you want, we can even broadcast your project and share it with people who are interested in the same land-related issues as you.
Don't hesitate to contact the Land Portal team if you have any questions or suggestions.