6 Data and Structure Query

Last modified by Artur on 2025/07/14 10:19

6.1 Scope of this Chapter


This chapter describes how SDMX opens up data by standardising the means by which it can be discovered. By using SDMX, existing data can easily be exposed regardless of how it is currently stored. An earlier chapter has already detailed how existing data stores can be related to SDMX. This chapter describes the further considerations of exposing such a data store through SDMX query interfaces. While reading this chapter, it will be useful to refer to Annex 5 - Query Samples for detailed examples of both the REST and SOAP syntaxes.

6.2 Query Types

Part 7 of the SDMX standard defines the web services standards for querying data and metadata. There are two standard protocols for querying: REST and SOAP. Because the query interfaces are standardised in each of these protocols, a user can access any SDMX-enabled data source without knowing anything about it beyond its existence. All of the information needed to query for the data can be found using standard SDMX query mechanisms.

The REST and SOAP protocols are described in full in Part 7 of the SDMX standards documentation. The following section briefly discusses the differences between the protocols and when it may be more appropriate to use one over the other.

6.2.1 REST and SOAP

The REST and SOAP protocols complement each other. They are effectively different means to the same end: any structure or data set which is discoverable using SOAP is also discoverable using REST. Both protocols share the same basic design principle, which is to have a predictable query mechanism that aligns closely with the SDMX information model. Therefore, if one understands the information model, one should be able to understand the query mechanisms easily. However, the focus of the REST protocol is simplicity. As such, it implements only the core facets of the information model, namely identification and referencing. The SOAP query supports more detailed searches, allowing for a search based on all facets of the information model. For example, it is possible in SOAP to make a query based on the name of a structural metadata object, whereas REST does not support this.
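As an illustration of querying by identification, the sketch below builds a REST structure query from the identifying parts of an object (resource type, agency, identifier, and version). The endpoint https://example.org/sdmx/rest and the helper name structure_query_url are assumptions made for this example, and the path layout follows the SDMX 2.1 REST conventions as commonly documented; Annex 5 - Query Samples and Part 7 remain the reference for the exact syntax.

```python
from urllib.parse import quote


def structure_query_url(base_url, resource, agency_id="all",
                        resource_id="all", version="latest"):
    """Build a REST structure query that identifies an object by
    resource type, maintenance agency, identifier and version.

    Hypothetical helper: "all" and "latest" are used here as the
    wildcard keywords for unspecified parts of the identification.
    """
    parts = [resource, agency_id, resource_id, version]
    # Percent-encode each path segment so unusual identifiers stay valid.
    path = "/".join(quote(part, safe="") for part in parts)
    return base_url.rstrip("/") + "/" + path


# Example: the code list CL_FREQ, version 1.0, maintained by agency SDMX.
url = structure_query_url("https://example.org/sdmx/rest",
                          "codelist", "SDMX", "CL_FREQ", "1.0")
print(url)
```

Note that the URL carries only identification. A search by the name of an object has no place in this path, which is the kind of query the text above attributes to SOAP.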

6.3 Querying Structures

SDMX data must always conform to a data structure definition, and as such the first step in exposing existing data as SDMX is to define the data structures for the available data. The Defining Data Structures chapter discusses the specific considerations of creating the data structure, but there are other considerations that are more specific to exposing existing data through SDMX queries.

When exposing data as SDMX, it is important to consider how users might discover the data they are seeking. In most cases, users will first want to see which data structures are available from a given source and then get the data for a given structure. Therefore, the way in which these data structures are defined is critical to making data easily discoverable.

One way in which users often find data is through the classification of the nature of the data. For example, one might be seeking exchange rate data, as detailed in the sample scenario in Annex 5 - Query Samples. In SDMX, this is achieved via categorisation of data sets, data structures, data flows, or provision agreements. A user will typically find the category which matches the type of data being sought, and then search for data structures which are categorised against this category. Therefore, it is often useful to categorise your data and structural metadata.

Annex 6: Worked Use Case shows data discovery using this “drill down” approach.
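The drill-down sequence can be sketched as three REST structure queries: list the category schemes, list the categorisations that link categories to data flows, and then fetch the chosen data flow together with the data structure it references. The endpoint, the agency EXAMPLE, and the data flow identifier EXR are hypothetical, and the use of references=children to pull in the referenced data structure is an assumption based on the SDMX 2.1 REST conventions rather than something stated in this chapter.

```python
from urllib.parse import urlencode

BASE = "https://example.org/sdmx/rest"  # hypothetical endpoint


def structure_url(resource, agency="all", res_id="all",
                  version="latest", **params):
    """Build a structure query URL with optional query parameters."""
    url = f"{BASE}/{resource}/{agency}/{res_id}/{version}"
    return f"{url}?{urlencode(params)}" if params else url


drill_down = [
    # Step 1: which categories does the source offer?
    structure_url("categoryscheme"),
    # Step 2: which objects are categorised against those categories?
    structure_url("categorisation"),
    # Step 3: fetch the chosen data flow and the structure it references.
    structure_url("dataflow", "EXAMPLE", "EXR", "1.0",
                  references="children"),
]

for step, url in enumerate(drill_down, start=1):
    print(step, url)
```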

Another way in which users might discover data is through known concepts or code lists. This further emphasises the point made in Chapter 4, that standard concepts and code lists should be used whenever possible. For example, suppose a user seeking demographic data knows that they are looking for data which uses a live birth concept as a measure. The SDMX SOAP query allows for such queries, but the query will only return data if the data source uses the standard concept.

6.4 Querying Data

Once a user has discovered the structural metadata on which data is based, more specific queries can be made to get the data being sought. Often a user will not wish to receive an entire data set, but rather data for particular keys within the data set. The structural metadata gives the user enough detail to know how to query for data, but it makes no guarantees about the presence of data. It is quite possible that the data source is a sparse cube, that is to say that there may not be data for every possible key permutation. To help users with this issue, there are two possible approaches.

The first approach is to describe the available content of the data source as a content constraint. A content constraint is structural metadata which details the key sets for which data is present. The constraint is attached to a particular data set, provision agreement, data flow, or data structure. Therefore, if the user application knows the structural metadata which the data is based on, it can query for the content constraint in order to determine which data is present.
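The idea can be illustrated with a deliberately simplified stand-in for a content constraint: a set of keys for which data is present. The dimension order (FREQ, CURRENCY, CURRENCY_DENOM), the key values, and the function name has_data are all hypothetical, and a real constraint would be retrieved and parsed from SDMX structural metadata rather than written out by hand.

```python
# Hypothetical, simplified stand-in for a content constraint: the set of
# keys (FREQ, CURRENCY, CURRENCY_DENOM) for which the source holds data.
AVAILABLE_KEYS = {
    ("D", "USD", "EUR"),
    ("M", "USD", "EUR"),
    ("M", "GBP", "EUR"),
}


def has_data(key, available=AVAILABLE_KEYS):
    """Return True if the constraint lists the given key as present."""
    return tuple(key) in available


# The structure allows daily GBP/EUR rates, but the cube is sparse and
# the constraint shows that only the monthly series actually exists.
print(has_data(("M", "GBP", "EUR")))  # True
print(has_data(("D", "GBP", "EUR")))  # False
```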

The second approach is to support the data query mechanism which allows for only key values to be returned. In this approach, the user queries for data in the usual way, but specifies that the detail returned should contain no observations or data attributes. This information is effectively equivalent to a content constraint but does not require the extra structural metadata to be maintained. The keys can be stored locally or cached by the application and processed to ensure that the user is presented with dimension value choices that will return data.
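A minimal sketch of this second approach follows. It assumes the SDMX 2.1 REST data query form, in which detail=serieskeysonly asks for keys without observations or attributes, and it uses a hypothetical endpoint and data flow EXR. The list sample_keys stands in for keys that an application would have parsed from the response and cached; the parsing itself is not shown.

```python
from urllib.parse import urlencode


def series_keys_url(base_url, flow_ref, key="all"):
    """Build a data query that asks for series keys only."""
    query = urlencode({"detail": "serieskeysonly"})
    return f"{base_url.rstrip('/')}/data/{flow_ref}/{key}?{query}"


def remaining_choices(keys, dimensions, selected):
    """For each dimension not yet selected, list the values that still
    lead to data given the selections made so far."""
    matching = [
        k for k in keys
        if all(k[dimensions.index(d)] == v for d, v in selected.items())
    ]
    return {
        d: sorted({k[i] for k in matching})
        for i, d in enumerate(dimensions)
        if d not in selected
    }


keys_url = series_keys_url("https://example.org/sdmx/rest", "EXR")

# Hypothetical cached result of the key-only query.
dimensions = ["FREQ", "CURRENCY", "CURRENCY_DENOM"]
sample_keys = [("D", "USD", "EUR"), ("M", "USD", "EUR"), ("M", "GBP", "EUR")]

# After the user picks monthly data, offer only currencies that have data.
choices = remaining_choices(sample_keys, dimensions, {"FREQ": "M"})
print(keys_url)
print(choices)
```

Filtering the cached keys on the client avoids a round trip for every selection the user makes, at the cost of keeping the cache in step with the data source.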

Finally, it is important to note that Part 7 of the specification is clear that a query web service does not need to implement every feature of the various query protocols. It requires only that the query be accepted, even if the result is an error stating that the given feature is not implemented. Therefore, the implementer need not be overwhelmed by all of the finer details of the queries.
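For a client, this means a "not implemented" answer should be treated as an expected outcome rather than a failure. The sketch below tries the key-only data query first and falls back to the content constraint when the service reports that the feature is missing. It assumes that a REST service signals this with HTTP status 501; the function discover_available_keys, the fetch callable, and the URLs are hypothetical.

```python
NOT_IMPLEMENTED = 501  # assumed status for an unimplemented feature


def discover_available_keys(fetch, keys_url, constraint_url):
    """Try the key-only query, then fall back to the content constraint.

    `fetch` is a caller-supplied callable taking a URL and returning a
    (status_code, body) pair, so no particular HTTP library is implied.
    """
    status, body = fetch(keys_url)
    source = "series keys"
    if status == NOT_IMPLEMENTED:
        status, body = fetch(constraint_url)
        source = "content constraint"
    if status != 200:
        raise RuntimeError(f"query failed with status {status}")
    return source, body


def stub_fetch(url):
    """Stand-in service that does not implement key-only data queries."""
    if "serieskeysonly" in url:
        return NOT_IMPLEMENTED, ""
    return 200, "constraint document"


result = discover_available_keys(
    stub_fetch,
    "https://example.org/sdmx/rest/data/EXR/all?detail=serieskeysonly",
    "https://example.org/sdmx/rest/contentconstraint/all/all/latest",
)
print(result)
```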