Connecting to the API

The API entry point is located at:

http(s)://api.uis.unesco.org/sdmx

An API key is required to make requests to the API. To obtain a key, sign in and subscribe to the ‘Starter’ product on the Products page.

Once you have a key, you can specify it using the query string parameter subscription-key or using the header parameter Ocp-Apim-Subscription-key.
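
The two ways of passing the key can be sketched as follows. This is a minimal illustration using Python's standard library, not an official client: HTTPS is assumed for the entry point, the function names are hypothetical, and YOUR_KEY stands in for a real subscription key.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Entry point taken from this document; HTTPS is an assumption.
BASE_URL = "https://api.uis.unesco.org/sdmx"


def request_with_header_key(path: str, api_key: str) -> Request:
    """Carry the key in the Ocp-Apim-Subscription-key header."""
    headers = {"Ocp-Apim-Subscription-key": api_key}
    return Request(f"{BASE_URL}/{path}", headers=headers)


def request_with_query_key(path: str, api_key: str) -> Request:
    """Carry the key in the subscription-key query string parameter."""
    query = urlencode({"subscription-key": api_key})
    return Request(f"{BASE_URL}/{path}?{query}")
```

Either Request object can then be passed to urllib.request.urlopen. The header form keeps the key out of URLs that may end up in logs or browser history.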

API Definition

The API definition is available in WADL and Swagger format from the API page. You need to sign in and then click on the UIS SDMX API.

There you can see the list of endpoints offered by the API and download the definition files.

Specifying Response Format

The API supports many response formats. To specify the response format, use the Accept HTTP header. The following Accept headers are supported:

SDMX Formats

application/vnd.sdmx.structure;version=edi

application/vnd.sdmx.structure+xml;version=1.0

application/vnd.sdmx.structure+xml;version=2.0

application/vnd.sdmx.structure+xml;version=2.1

JSON Format

application/vnd.sdmx.json

RDF Formats

application/vnd.rdf+xml

application/vnd.rdf+json

application/vnd.rdf+turtle

Excel Format

application/vnd.xlsx

Alternatively the format query parameter can be used to specify response format.  The following values are supported:

sdmx (latest version)

sdmx-1.0

sdmx-2.0

sdmx-2.1

sdmx-edi

sdmx-edi-lenient

sdmx-json

rdf-turtle

rdf-json

rdf-xml

xlsx

Example:

format=sdmx-edi

Note: for the remainder of this document, the SDMX-JSON format will be used.
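
As a rough sketch of the two mechanisms, the helpers below select a format either through the Accept header or through the format query parameter. The helper names are hypothetical, HTTPS is assumed, and the set of accepted values is simply copied from the list above.

```python
from urllib.request import Request

BASE_URL = "https://api.uis.unesco.org/sdmx"  # HTTPS is an assumption

# Values of the format query parameter listed in this document.
FORMAT_VALUES = {
    "sdmx", "sdmx-1.0", "sdmx-2.0", "sdmx-2.1", "sdmx-edi",
    "sdmx-edi-lenient", "sdmx-json", "rdf-turtle", "rdf-json",
    "rdf-xml", "xlsx",
}


def url_with_format(path: str, fmt: str) -> str:
    """Select the response format through the format query parameter."""
    if fmt not in FORMAT_VALUES:
        raise ValueError(f"unsupported format value: {fmt}")
    return f"{BASE_URL}/{path}?format={fmt}"


def request_with_accept(path: str, media_type: str) -> Request:
    """Select the response format through the Accept header."""
    return Request(f"{BASE_URL}/{path}", headers={"Accept": media_type})
```

The query parameter is convenient for links typed into a browser, while the Accept header keeps the URL free of presentation details.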

Specifying Response Language

To request content in a specific locale, the Accept-Language HTTP header can be used. Alternatively, the query parameter ‘locale’ can be used, for example:

http://api.uis.unesco.org/sdmx/codelist/all/all/latest?format=sdmx-json&locale=fr

The locale query parameter takes precedence over the Accept-Language header.  If the language does not exist for a given name or description, any other provided language will be used, defaulting to English.

If no language is specified all available languages will be present in the response message (with the exception of a data response whose labels are in a single language defaulting to English).
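
A small sketch of both options, assuming Python's standard library and a hypothetical helper name. When both are supplied, the text above says the locale parameter wins.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://api.uis.unesco.org/sdmx"


def localised_request(path: str, locale: Optional[str] = None,
                      accept_language: Optional[str] = None) -> Request:
    """Build an SDMX-JSON request with an optional locale and/or Accept-Language."""
    params = {"format": "sdmx-json"}
    if locale:
        params["locale"] = locale  # takes precedence over the header
    headers = {"Accept-Language": accept_language} if accept_language else {}
    return Request(f"{BASE_URL}/{path}?{urlencode(params)}", headers=headers)
```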

Reducing Network Bandwidth using Compression

The SDMX API supports GZIP compression on all responses.  GZIP provides a substantial reduction in network traffic by compressing the content of the message before sending it to the client.  GZIP is natively supported by all major browsers.  To make use of gzip compression, the Accept-Encoding  HTTP Request Header should include gzip.
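
Clients that are not browsers usually have to opt in and decompress the body themselves. The sketch below assumes the server signals compression through the standard Content-Encoding response header and that the body is UTF-8 text. The function names are illustrative only.

```python
import gzip
from typing import Optional
from urllib.request import Request


def gzip_request(url: str) -> Request:
    """Ask the server for a gzip-compressed response."""
    return Request(url, headers={"Accept-Encoding": "gzip"})


def decode_body(raw: bytes, content_encoding: Optional[str]) -> str:
    """Decompress the body if the response was gzip-encoded, then decode it."""
    if content_encoding and "gzip" in content_encoding.lower():
        raw = gzip.decompress(raw)
    return raw.decode("utf-8")  # UTF-8 is an assumption
```

With urllib, the raw bytes come from response.read() and the encoding from response.headers.get("Content-Encoding").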

Cross-Origin Resource Sharing  (CORS)

The API supports queries which originate from domains external to the hosting domain.

Client Side Caching

The client side application can make use of HTTP 304 (resource not modified) responses from the server in order to reduce network traffic for repeated queries. 

The server will respond to web service queries and in the HTTP Header include an ETag header, for example:

ETag: MTQ4MDA4NDg5NjcxN180

When the client requests the same resource, it includes the If-None-Match HTTP Header in the request, with the value of the ETag from the previous response.

If-None-Match: MTQ4MDA4NDg5NjcxN180

If the server finds the resource has not been modified since it last issued the ETag, then it will respond with an HTTP 304 status, indicating to the client that it should use its cached copy.  If the server finds that the server copy has been modified, it will respond to the query as normal with an HTTP 200 (Ok) status.

All major Web Browsers and JavaScript frameworks will make use of the ETag caching solution natively so will not require any specific coding.
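
For clients that do not get this behaviour for free, the exchange can be handled by hand. The class below is a minimal in-memory sketch. The class and method names are hypothetical, and a real client would also need eviction and persistence.

```python
from typing import Dict, Optional, Tuple


class EtagCache:
    """Remember the last ETag and body seen for each URL."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[str, bytes]] = {}

    def request_headers(self, url: str) -> Dict[str, str]:
        """Headers to send: If-None-Match when a cached copy exists."""
        entry = self._entries.get(url)
        return {"If-None-Match": entry[0]} if entry else {}

    def handle_response(self, url: str, status: int,
                        etag: Optional[str], body: bytes) -> bytes:
        """Return the cached body on 304, otherwise store and return the new one."""
        if status == 304:
            return self._entries[url][1]
        if etag:
            self._entries[url] = (etag, body)
        return body
```

Note that with urllib.request a 304 response surfaces as an HTTPError whose code attribute is 304, so the status has to be read from the exception in that case.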

Data Discovery

In SDMX terminology a Dataflow is used to represent a ‘Universe of Data’ for a particular domain. Most people think of a Dataflow as a Dataset; however, in SDMX a Dataset is an instance of data for a given point in time, for example the output of a data query.  The Dataflow, on the other hand, represents a queryable domain for which data exists.  For example ‘Feature Film’, ‘Education’ and ‘Exchange Rates’ are all examples of Dataflows for which a user may wish to query data.

The first step in data discovery is therefore determining which Dataflow the client is interested in.  The second step is to allow the client to query the Dataflow for data.  The second step may involve the client placing filters on the Dataflow; for example, the query for ‘Feature Film’ may include filters on ‘Film Type’ and ‘Cinema Type’. Each Dataflow will have its own set of ‘Dimensions’ to which filters can be applied.

List All Dataflows

To list all Dataflows set the structure type to dataflow, and all other path parameters to all.

http://api.uis.unesco.org/sdmx/dataflow/all/all/latest?format=sdmx-json

The following JSON Object will be returned, which contains an array of Dataflows. 

{
   "Dataflow" : [  …. array of dataflows ….]
}

Each Dataflow has an id, urn, name, owning agency, version, and a reference to the underlying Data Structure Definition (DSD)  which is used to describe the data.

{
  "id":"CE",
  "urn":"urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=UNESCO:CE(1.0)",
  "names":[
    {"locale":"en","value":"Cultural employment"},
    {"locale":"es","value":"Empleo cultural"}
  ],
  "agencyId":"UNESCO",
  "version":"1.0",
  "isFinal":false,
  "dataStructureRef":"urn:sdmx:org.sdmx.infomodel.datastructure.DataStructure=UNESCO:CE(1.0)"
}
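
Once the response has been parsed (for example with json.loads), a client typically reduces each Dataflow to the few fields it needs. The sketch below assumes the object layout shown above. The function name and the choice to fall back to the first available name are illustrative.

```python
from typing import Any, Dict, List


def summarise_dataflows(payload: Dict[str, Any], locale: str = "en") -> List[Dict[str, Any]]:
    """Reduce each Dataflow to its id, a display name and its DSD reference."""
    rows = []
    for flow in payload.get("Dataflow", []):
        names = {n["locale"]: n["value"] for n in flow.get("names", [])}
        # Prefer the requested locale, otherwise use whichever name comes first.
        display = names.get(locale) or next(iter(names.values()), None)
        rows.append({
            "id": flow["id"],
            "name": display,
            "dsd": flow.get("dataStructureRef"),
        })
    return rows
```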

List Dataflows By Category

Dataflows can be grouped into Categories.  Categories provide a hierarchical grouping of Dataflows, for example the image below shows a simple category tree, where each Category links to one or more Dataflows.

A Category Scheme is a container for Categories.  A Category Scheme can contain any number of Categories, and each Category may contain any number of child Categories. A Categorisation links a Category to a Dataflow.  This model is shown in the image below.

Obtaining this information from the server requires two separate web service calls. The first call retrieves the Dataflows and their related Categorisations, as shown below:

http://api.uis.unesco.org/sdmx/dataflow/all/all/latest/?references=categorisation&format=sdmx-json

The response is an object containing an array of Categorisation objects and an array of Dataflow objects:

{
"Categorisation" : [ ….],
"Dataflow" : [ ….]
}

Each Dataflow and Categorisation Object contains an id, urn, name, agency, version:

{
 "id": "EDU_REGIONAL_MODULE",
 "urn": "urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=UNESCO:EDU_REGIONAL_MODULE(1.0)",
 "names": [{
 "locale": "en",
 "value": "Education: Regional module"
 }],
 "agencyId": "UNESCO",
 "version": "1.0",
 ...
}

In addition, the Categorisation contains the following two properties, which are used to link a Dataflow to a Category by referencing their URN properties.

"structureReference": "urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=UNESCO:RD(1.0)",
"categoryReference": "urn:sdmx:org.sdmx.infomodel.categoryscheme.Category=UNESCO:THEME_TREE(1.0).STI.RD"

The next step is for the client to request all the Category Schemes, using the following syntax:

http://api.uis.unesco.org/sdmx/categoryscheme/all/all/latest?format=sdmx-json

The response from the server is an Object with a Category Scheme array; each array element is a Category Scheme Object.

{
 "CategoryScheme": [ … ]
}

Each Category Scheme contains an id, urn, name, agency, and items array.

{
 "id": "GLOSSARY",
 "urn": "urn:sdmx:org.sdmx.infomodel.categoryscheme.CategoryScheme=UNESCO:GLOSSARY(1.0)",
 "names": [{
 "locale": "en",
 "value": "UIS Glossary"
 }],
 "agencyId": "UNESCO",
 "version": "1.0",
 "isFinal": false,
 "isPartial": false,
 "items": [ ... ]
}

Each item in the items array is a Category Object as shown below:

{
 "id": "EDU",
 "urn": "urn:sdmx:org.sdmx.infomodel.categoryscheme.Category=UNESCO:GLOSSARY(1.0).EDU",
 "names": [{
 "locale": "en",
 "value": "Education"
 }],
 "items": [ ... ]
}

Each Category has an id, urn, name and optionally an items array, which will contain child Category Objects.  The Category tree has no limit on depth.

The client can combine the Category Scheme, Categorisation, and Dataflow information to present to the user a grouping of Dataflows by Category.
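
One way to combine the three pieces is sketched below, assuming both responses have been parsed into Python lists and that URNs match exactly between the objects. The function names and the slash-separated path labels are choices made for this illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List


def index_categories(schemes: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map every Category URN to its path of ids, walking the tree recursively."""
    index: Dict[str, List[str]] = {}

    def walk(items: List[Dict[str, Any]], trail: List[str]) -> None:
        for item in items:
            path = trail + [item["id"]]
            index[item["urn"]] = path
            walk(item.get("items", []), path)  # child Categories, any depth

    for scheme in schemes:
        walk(scheme.get("items", []), [scheme["id"]])
    return index


def dataflows_by_category(schemes, categorisations, dataflows) -> Dict[str, List[str]]:
    """Group Dataflow ids under the Category path each Categorisation points to."""
    categories = index_categories(schemes)
    flow_ids = {flow["urn"]: flow["id"] for flow in dataflows}
    grouped: Dict[str, List[str]] = defaultdict(list)
    for link in categorisations:
        path = categories.get(link["categoryReference"])
        flow_id = flow_ids.get(link["structureReference"])
        if path and flow_id:  # skip links whose target was not returned
            grouped["/".join(path)].append(flow_id)
    return dict(grouped)
```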

Querying for Data

In order to query for data, the client must first identify the Dataflow to be queried. 

The second step is to apply query filters on the Dimensions and observations.  Observation filters include date restrictions and specifying the maximum number of Observations per Series.

In order to construct a data query, the client must first obtain information about what data exists for the Dataflow.

Determining the Universe of Data for a Dataflow

To determine the available data for a given Dataflow, the following query syntax is used.

Note: the response type must be SDMX-JSON.

SYNTAX:

http://api.uis.unesco.org/sdmx/data/{dataflow agency id},{dataflow id},{dataflow version}/all?format=sdmx-json&detail=structureOnly&dimensionAtObservation=AllDimensions

EXAMPLE:

http://api.uis.unesco.org/sdmx/data/UNESCO,CE,1.0/......SAL....EMP..BE..?format=sdmx-json&detail=structureOnly&dimensionAtObservation=AllDimensions

The response is an object with two properties, shown below:

{
"header" : { ….},
"structure" : { ….}
}

The structure object describes the Dimensions for the Dataflow and the available Values for each Dimension:

"name": "Cultural employment",
"dimensions": {
"observation": [
 {
 "id": "FCS_DOMAIN",
 "name": "FCS domain",
 "keyPosition": 1,
 "role": null,
 "values": [
 {
 "id": "PRINT",
 "name": "Books and press"
 },
 {
 "id": "DESIGN",
 "name": "Design and creative services"
 }, ...
 ]
 },
 {
 "id": "CULT_TYPE_IND2",
 "name": "Type of industry of secondary job",
 "keyPosition": 2,
 "role": null,
 "values": [
 {
 "id": "NON_CULT",
 "name": "Non-cultural"
 }, {
 "id": "_T",
 "name": "Total"
 }, {
 "id": "CULT",
 "name": "Cultural"
 }
 ]
 },
 {
 "id": "TIME_PERIOD",
 "name": "Time period",
 "keyPosition": 16,
 "role": "time",
 "values": [
 {
 "id": "2009",
 "name": "2009"
 }, {
 "id": "2010",
 "name": "2010"
 }, ...
 ]
 }, ...
]
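
A client will usually turn this structure into a lookup of available codes per Dimension. The sketch below assumes the structure object has been parsed into a Python dict with the layout shown above. The function names are hypothetical.

```python
from typing import Any, Dict, List


def dimensions_in_key_order(structure: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the Dimensions sorted by their keyPosition."""
    dims = structure["dimensions"]["observation"]
    return sorted(dims, key=lambda dim: dim["keyPosition"])


def available_codes(structure: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each Dimension id to the ids of the values that have data."""
    return {
        dim["id"]: [value["id"] for value in dim["values"]]
        for dim in dimensions_in_key_order(structure)
    }
```

Sorting by keyPosition matters because the position of each Dimension determines where its codes go in the dimension filter of a data query.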

Building the Data Query

The client can build a data query, by replacing the ‘all’ path parameter with the selected dimension filters. 

SYNTAX:

http://api.uis.unesco.org/sdmx/data/{dataflow agency id},{dataflow id},{dataflow version}/{dimension filters}

DIMENSION FILTERS SYNTAX:

{code ids for dimension 1}.{code ids for dimension 2}.{code ids for dimension 3}.

Multiple codes selected in the same Dimension are concatenated with the plus ‘+’ operator. The omission of a filter in a Dimension is interpreted as ‘no filter’ or ‘all’. The example below queries a Dataflow with 6 Dimensions, where filters have been applied to the 1st and 4th Dimensions.

DIMENSION FILTERS SYNTAX EXAMPLE:

PRINT...BE+BR+CL..

DATA QUERY EXAMPLE:

http://api.uis.unesco.org/sdmx/data/UNESCO,CE,1.0/PRINT............BE+BR+CL..?format=sdmx-json&dimensionAtObservation=AllDimensions
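
The filter syntax can be generated mechanically from the user's selections. The sketch below assumes the caller supplies the ids of the key Dimensions in key order. The ids D1 to D6 in the usage note are placeholders, and the function names are hypothetical.

```python
from typing import Dict, List, Sequence

BASE_URL = "http://api.uis.unesco.org/sdmx"


def build_key(dimension_ids: Sequence[str], selections: Dict[str, List[str]]) -> str:
    """Join selected codes with '+' inside a Dimension and '.' between Dimensions."""
    parts = []
    for dim_id in dimension_ids:
        # A Dimension without a selection contributes an empty part ('all').
        parts.append("+".join(selections.get(dim_id, [])))
    return ".".join(parts)


def data_url(agency: str, dataflow: str, version: str, key: str) -> str:
    """Build an SDMX-JSON data query URL for the given dimension filter."""
    return (f"{BASE_URL}/data/{agency},{dataflow},{version}/{key}"
            "?format=sdmx-json&dimensionAtObservation=AllDimensions")
```

For a Dataflow with six key Dimensions, build_key(["D1", "D2", "D3", "D4", "D5", "D6"], {"D1": ["PRINT"], "D4": ["BE", "BR", "CL"]}) yields the filter PRINT...BE+BR+CL.. from the example above.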

Further restrictions can be applied to limit the observations returned for each series:

Parameter           Purpose                                  Allowed Values

startPeriod         Only return observations if the          Single time period (not a time range).
                    observation time is after the time       Conforms to the ISO-8601 standard.
                    specified.

endPeriod           Only return observations if the          Same format as startPeriod.
                    observation time is before the time
                    specified.

firstNObservations  The first ‘n’ observations to return     Positive integer
                    for each matched series.

lastNObservations   The last ‘n’ observations to return      Positive integer
                    for each matched series.

Supported time periods include:

Period      Example

Annual      2009
Semester    2009-S1
Trimester   2009-T1
Quarterly   2009-Q1
Monthly     2009-01
Daily       2009-01-31
Date Time   2009-01-01T20:22:00
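
These restrictions are ordinary query string parameters, so they can be assembled with a URL-encoding helper. The sketch below is illustrative only: the function and argument names are hypothetical, and the period strings are passed through without validation.

```python
from typing import Optional
from urllib.parse import urlencode


def observation_query(start_period: Optional[str] = None,
                      end_period: Optional[str] = None,
                      first_n: Optional[int] = None,
                      last_n: Optional[int] = None) -> str:
    """Build the query string for an SDMX-JSON data request with restrictions."""
    params = {"format": "sdmx-json"}
    if start_period:
        params["startPeriod"] = start_period
    if end_period:
        params["endPeriod"] = end_period
    for name, count in (("firstNObservations", first_n),
                        ("lastNObservations", last_n)):
        if count is not None:
            if count <= 0:
                raise ValueError(f"{name} must be a positive integer")
            params[name] = str(count)
    return urlencode(params)
```

For instance, observation_query(start_period="2009", last_n=1) asks for the most recent observation of each series from 2009 onwards.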

Server Feedback for Data Query Construction

As the user builds a data query by making code selections, the client can benefit from knowing metrics such as: how many series and observations match the current query criteria; which code selections remain valid based on the current query criteria; what available time periods exist for the current query criteria.

This information can be requested by the client by submitting a valid data query and including the includeMetrics request parameter.  If the detail parameter is set to structureOnly, the response message will omit the observation values which match the data query.

SYNTAX:

http://api.uis.unesco.org/sdmx/data/{dataflow agency id},{dataflow id},{dataflow version}/{dimension filters}?format=sdmx-json&detail=structureOnly&includeMetrics=true

EXAMPLE:

http://api.uis.unesco.org/sdmx/data/UNESCO,CE,1.0/PRINT............BE+BR+CL..?detail=structureOnly&includeMetrics=true&format=sdmx-json&dimensionAtObservation=AllDimensions

The response is shown below:

{
"header" : { ….},
"metrics" : { ….},
"structure" : { ….}
}

With metrics included, the header object contains the data reporting begin and end dates.  These are the earliest and latest observation dates.

{
 "id": "IDREF2edb6279-670f-4ff6-8672-fb43b0d01d08",
 "prepared": "2016-11-25T10:58:13",
 "test": false,
 "reportingBegin": "2013-01-01T00:00:00",
 "reportingEnd": "2014-01-01T23:59:59",
 "sender": {
 "id": "UNESCO-Registry-Staging"
 }
 }

This could be used to create a time slider for example:

The response now includes a metrics section, which contains the following information:

{
 "lastUpdated": "2016-11-16T19:28:23",
 "maxSeries": 1972,
 "actualSeries": 977,
 "maxObs": 12000,
 "actualObs": 4337
}

Metrics include when the dataset was last updated, and the actual number of series and observations that would be returned from the data query.

The max series count is the theoretical maximum size of the cube, assuming every combination of available dimension selections contained data.  If the actual series count is less than the max series count, this indicates that certain Dimension values are not valid when combined with other Dimension values (for example ‘Germany’ may not include any data for ‘Animation’ Films in the dataset).

The maxObs count is a calculated theoretical maximum based on a complete cube.  This is based on the calculation of max series, and using the data start and end dates and frequency to calculate the maximum number of observations for each series.  For example if the data start date is 2010 and the end date is 2015 with an Annual frequency, each series could have a maximum of 5 observations.

The metrics section can be used to prevent the client from creating large data queries.
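
A client-side guard based on the metrics object might look like the sketch below. The threshold of 10000 observations is an arbitrary value chosen for illustration, and the function names are hypothetical.

```python
from typing import Any, Dict


def query_is_small_enough(metrics: Dict[str, Any], max_observations: int = 10000) -> bool:
    """Allow the query only if the expected observation count is within the limit."""
    return metrics["actualObs"] <= max_observations


def series_coverage(metrics: Dict[str, Any]) -> float:
    """Share of the theoretical series that actually contain data."""
    return metrics["actualSeries"] / metrics["maxSeries"]
```

With the sample metrics above (actualObs 4337, actualSeries 977, maxSeries 1972) the query passes the default limit, and just under half of the theoretical series contain data.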

The structure object in the response is the same as a ‘non metrics’ query, however new information is present for each Dimension and Dimension Values.

{
 "id": "FCS_DOMAIN",
 "name": "FCS domain",
 "keyPosition": 1,
 "maxObs": 355,
 "role": null,
 "values": [
 {
 "id": "PRINT",
 "name": "Books and press",
 "actualObs": 30,
 "inDataset": true
 }, {
 "id": "DESIGN",
 "name": "Design and creative services",
 "actualObs": 30,
 "inDataset": false
 }, {
 "id": "V_ARTS",
 "name": "Visual arts and crafts",
 "actualObs": 30,
 "inDataset": false
 }

The new information includes a maxObs calculation for each Dimension, which is the theoretical maximum.  Each Dimension value contains an actualObs calculation, which is the total number of observations that will be in the dataset for the given selection.  The inDataset property is a true/false value indicating whether the Dimension value will be included in the dataset if the data query is executed.  If the value is false, the Dimension value will not be in the response dataset but remains a valid future selection.  If a Dimension value that was previously present is omitted from the values array, that value is no longer a valid selection.

This information can be used to update the client’s dimension filters to represent what the valid choices are based on current query state.
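
The three states described above (in the dataset, selectable later, no longer valid) can be derived as in the sketch below. It assumes the client keeps the list of codes it previously offered for the Dimension. The function name and the result labels are illustrative.

```python
from typing import Any, Dict, Iterable, List


def classify_values(dimension: Dict[str, Any],
                    previously_offered: Iterable[str]) -> Dict[str, List[str]]:
    """Split a Dimension's codes by their state under the current query."""
    current = {value["id"]: value for value in dimension["values"]}
    return {
        # Will appear in the response if the query is executed.
        "in_dataset": [i for i, v in current.items() if v.get("inDataset")],
        # Not in the response, but still a valid future selection.
        "selectable": [i for i, v in current.items() if not v.get("inDataset")],
        # Offered before, now missing from the values array.
        "invalid": sorted(set(previously_offered) - set(current)),
    }
```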

Data Response Format

The SDMX-JSON dataset response has the following top-level structure:

{
"header" : { … },
"dataSets" : [ … ],
"structure" : { … }
}

The dataSets array will be of size 1, unless the query parameter includeHistory=true and the server is recording revisions to observation values. 

A Dataset object contains the following information:

{
 "action": "Information",
 "observation": {
 "0:0:0:0": [ … ],
 "0:0:0:1": [ … ],
 "0:0:0:2": [ … ]
}

Each observation key consists of one or more pointers which reference the structure section of the dataset response.  For example, the observation key:

"0:0:0:0"

can be split into 4 pointers [0, 0, 0, 0].  The first pointer references the first Dimension in the observation array in the structure section.  The value of the first pointer is ‘0’, which references the first value (zero indexed) for this Dimension; in the example below this is ‘PRINT’.  The second pointer references the second Dimension; its value of ‘0’ references the first value for this Dimension, which is ‘_T’ in the example below.

"structure": {
 "name": "Cultural employment",
 "description": null,
 "dimensions": {
 "dataset": [],
 "series": [],
 "observation": [
{
 "id": "FCS_DOMAIN",
 "name": "FCS domain",
 "keyPosition": 0,
 "role": null,
 "values": [
 {
 "id": "PRINT",
 "name": "Books and press"
 }, {
 "id": "DESIGN",
 "name": "Design and creative services"
 }, {
 "id": "V_ARTS",
 "name": "Visual arts and crafts"
 }
 ]
 }, {
 "id": "CULT_TYPE_IND2",
 "name": "Type of industry of secondary job",
 "keyPosition": 1,
 "role": null,
 "values": [{
 "id": "_T",
 "name": "Total"
 }]
 }

Each observation key maps to an array as follows:

["20812", 0, 0, 0, 0]

The first element in the array is the Observation value. The subsequent elements are pointers to the attributes section of the structure object and follow the same rules as the observation key.  An example of the attributes section is shown below, with the first attribute, ‘Observation Status’, being shown.

"structure": {
 "name": "Cultural employment",
 "description": null,
 "dimensions": { … }
 "attributes" : {
 "observation" : [
 {
 "id": "OBS_STATUS",
 "name": "Observation status",
 "role": null,
 "values": [{
 "id": "A",
 "name": "Normal"
 }, {
 "id": "U",
 "name": "Low reliability: publishable but with caution"
 }, ...
 ]
}
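
Resolving the pointers back to codes can be sketched as follows, assuming the dataset and structure objects have been parsed into Python dicts with the property names used in this document. The function name is hypothetical, and the handling of a null attribute pointer is an assumption.

```python
from typing import Any, Dict, List


def decode_observations(dataset: Dict[str, Any],
                        structure: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn pointer-encoded observations into rows of Dimension and Attribute ids."""
    dims = structure["dimensions"]["observation"]
    attrs = structure.get("attributes", {}).get("observation", [])
    rows = []
    for key, obs in dataset["observation"].items():
        # "0:0:0:0" -> [0, 0, 0, 0], one pointer per Dimension.
        pointers = [int(p) for p in key.split(":")]
        row = {dim["id"]: dim["values"][p]["id"] for dim, p in zip(dims, pointers)}
        row["value"] = obs[0]  # first element is the observation value
        for attr, pointer in zip(attrs, obs[1:]):
            # Assumption: a null pointer means the attribute is not set.
            row[attr["id"]] = None if pointer is None else attr["values"][pointer]["id"]
        rows.append(row)
    return rows
```

Rows in this flat shape are straightforward to feed into a table or pivot component.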

The dataset can be processed to build any type of visualisation. Shown below is an example of a pivot table.

It is also possible to provide a full breakdown of information for each individual observation value in the table, as shown below.