10 Constraints


10.1 Introduction

Constraints are used to restrict what data can be reported, or to report what data exists in a given context. There are three types of Constraint, each serving a different purpose:

An Availability Constraint defines the data that exists in the context of a data query.

They form part of the response message from the Availability REST API. Availability Constraints are dynamically generated by a system based on the data that exists and the query context.  Availability Constraints are therefore not Identifiable structures (they have no URN).

A Dimension Constraint is a property of a Dataflow; it is used to fix the set of Dimensions that the Dataflow uses from its Data Structure Definition. Dimension Constraints enable Data Structure Definitions to evolve over time by having new Dimensions added, without having to undergo a major version change.

A Reporting Constraint is used to define the set of allowed and/or disallowed values that can be reported in a data or metadata set.

10.2 Availability Constraint

An Availability Constraint is not a maintained structure; instead, it is generated dynamically as a response to the availability REST API. The purpose of the Availability Constraint is to define the distinct set of values for which data exist, over one or more Dimensions. Unlike a Data or Metadata Constraint, which can attach to multiple Constrainable structures (of the same type), an Availability Constraint can attach to only one structure. The attachment defines the context of the response (i.e., the structure in whose context the data exist). The subset of Constrainable structures the Availability Constraint can attach to are:

10.3 Dimension Constraint

A Dimension Constraint is a property of a Dataflow; its purpose is to explicitly list the Dimensions from the corresponding DSD that are being used by the Dataflow.  

Dimension Constraints were introduced in SDMX 3.1 and are not required for most Dataflows, where the dataset must always contain the full complement of Dimensions as defined by the corresponding DSD. However, some complex data collections may span long periods, and the full complement of required Dimensions is not necessarily known at design time, so the DSD is subject to increasing its dimensionality over time. In this scenario it is possible to define the DSD as an evolving structure. This property tells the user that the DSD can have new Dimensions added without having to undergo a major version change; a DSD at version 1.0.0, for example, would be able to add a new Dimension and move to version 1.1.0, a change that would not ordinarily be allowed. A minor version change on the addition of a new Dimension is only possible if the DSD defines itself as an evolving structure.

The evolving structure property was introduced in version 3.1 of the DSD to satisfy this use case. It is either true or false, defaulting to false if not specified. Setting the evolving structure property to true requires a major version change, and can therefore only be introduced on an x.0.0 release (e.g. 1.0.0). Setting the evolving structure property to false indicates that no additional Dimensions will be added to the Data Structure under the same major version number; setting the property to false does not require a major version change on the Data Structure.

When a Dataflow references a DSD with late binding on the minor version, and the DSD has the evolving structure property set to true, the Dataflow must contain a Dimension Constraint to protect its dimensionality from changing over time without a version change.

The Dimension Constraint provides the explicit list of Dimensions that the Dataflow uses from the DSD that it references.  This enables the DSD to evolve over time without breaking the compatibility of datasets against the Dataflow.  
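As an illustration of the versioning rule described above, the following Python sketch shows the minimum version bump implied when a Dimension is added; the function and flag names are hypothetical and not defined by the standard.

```python
# Minimal sketch (not part of the specification): the version-bump rule for DSDs
# depending on a hypothetical 'evolving_structure' flag.

def minimum_bump_for_new_dimension(current_version: str, evolving_structure: bool) -> str:
    """Return the smallest version a DSD may move to when a Dimension is added.

    If the DSD is declared as an evolving structure, adding a Dimension is a
    minor change (1.0.0 -> 1.1.0); otherwise it requires a major change
    (1.0.0 -> 2.0.0).
    """
    major, minor, patch = (int(part) for part in current_version.split("."))
    if evolving_structure:
        return f"{major}.{minor + 1}.0"   # minor bump is sufficient
    return f"{major + 1}.0.0"             # otherwise a major bump is required


print(minimum_bump_for_new_dimension("1.0.0", evolving_structure=True))   # 1.1.0
print(minimum_bump_for_new_dimension("1.0.0", evolving_structure=False))  # 2.0.0
```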

Rules for a Dimension Constraint

Example Datasets with Evolving Structures

A dataset is built against a Data Structure Definition. The dataset contains data for two Dataflows. Dataflow ‘DF_POP’ uses a Dimension Constraint which fixes its Dimensions to FREQ and REF_AREA. Dataflow ‘DF_POP_SA’ does not reference a Dimension Constraint, and as such includes all Dimensions as specified by the DSD.

The resulting dataset contains empty values for both the SEX and AGE Dimensions for the series related to DF_POP.

Dataflow  | FREQ | REF_AREA | SEX | AGE | OBS_VALUE | TIME_PERIOD | UNIT
DF_POP    | A    | UK       |     |     | 65        | 2022        | 6
DF_POP    | A    | FR       |     |     | 50        | 2022        | 6
DF_POP_SA | A    | UK       | M   | 1   | 1.2       | 2022        | 6
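The following Python sketch (illustrative only; the data structures are hypothetical) shows how a Dimension Constraint fixes the Dimensions a Dataflow uses and why the unused Dimensions are reported empty in the combined dataset.

```python
# Illustrative sketch only: a Dimension Constraint is modelled as a list of
# dimension IDs per Dataflow; unused Dimensions are reported as empty strings.

DSD_DIMENSIONS = ["FREQ", "REF_AREA", "SEX", "AGE"]

DIMENSION_CONSTRAINTS = {
    "DF_POP": ["FREQ", "REF_AREA"],   # Dimension Constraint fixes these two
    "DF_POP_SA": None,                # no constraint: uses the full DSD set
}

def effective_dimensions(dataflow: str) -> list[str]:
    constrained = DIMENSION_CONSTRAINTS.get(dataflow)
    return constrained if constrained is not None else DSD_DIMENSIONS

def series_key(dataflow: str, values: dict[str, str]) -> list[str]:
    """Build a full-width key; Dimensions not used by the Dataflow stay empty."""
    used = set(effective_dimensions(dataflow))
    return [values.get(dim, "") if dim in used else "" for dim in DSD_DIMENSIONS]

print(series_key("DF_POP", {"FREQ": "A", "REF_AREA": "UK"}))
# ['A', 'UK', '', '']
print(series_key("DF_POP_SA", {"FREQ": "A", "REF_AREA": "UK", "SEX": "M", "AGE": "1"}))
# ['A', 'UK', 'M', '1']
```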

10.4 Reporting Constraints

A Reporting Constraint is a Maintainable Artefact which restricts the values that can be reported in a dataset or metadata set based on one or more inclusion or exclusion rules.  

A Reporting Constraint is one of the following concrete types:

10.4.1 Data Constraint

A Data Constraint is used to add additional restrictions to the allowable values reported in a dataset. Data Constraints can be applied to the following structures, which are collectively known as Constrainable structures:

  • Data Structure Definition
  • Dataflow
  • Provision Agreement
  • Data Provider

Note that, regardless of the Constrainable structure, the restricted values relate to the allowable content for the Components of the DSD to which the constrained object relates.

10.4.2 Metadata Constraint

A Metadata Constraint is used to add additional restrictions to the allowable values reported in a metadata set. Metadata Constraints can be applied to the following structures, which are collectively known as Constrainable structures:

  • Metadata Structure Definition
  • Metadataflow
  • Metadata Provision Agreement
  • Metadata Provider

Note that, regardless of the Constrainable structure, the restricted values relate to the allowable content for the Components of the MSD to which the constrained object relates.

10.4.3 Scope of a Constraint

A Constraint is used to specify the content of a data or metadata source in terms of the component values or the keys.

In terms of data the components are:

In terms of reference metadata the components are:

For a Constraint based on a DSD the Constraint can reference one or more of:

  • Data Structure Definition
  • Dataflow
  • Provision Agreement
  • Data Provider

For a Constraint based on an MSD the Constraint can reference one or more of:

  • Metadata Structure Definition
  • Metadataflow
  • Metadata Provision Agreement
  • Metadata Provider
  • Metadata Set

Furthermore, there can be more than one Constraint specified for a specific object e.g., more than one Constraint for a specific DSD.

In view of the flexibility of Constraint attachment, clear rules on their usage are required. These are elaborated below.

10.4.4 Multiple Constraints

There can be many Constraints for any Constrainable Artefact (e.g., DSD), subject to the following restrictions:

10.4.4.1 Cube Region

A Constraint can contain multiple Member Selections (e.g., Dimensions).

  • A specific Member Selection (e.g., Dimension FREQ) can only be contained in one Cube Region for any one attached object (e.g., a specific DSD or specific Dataflow).
  • Component values within a Member Selection may define a validity period. Otherwise, the value is valid for the whole validity of the Cube Region.
  • For partial reference resolution purposes (as per the SDMX REST API), the latest non-draft Constraint must be considered.
  • A Member Selection may include wildcarding of values (using the character ‘%’ to represent zero or more occurrences of any character), as well as cascading through hierarchic structures (e.g., parents in a Codelist), or localised values (e.g., text for English only); lack of a locale means any language may match. Cascading values are mutually exclusive with localised values, as the former refer to coded values, while the latter refer to uncoded values. A sketch of the wildcard and array-value rules follows this list.
  • Any values included in a Member Selection for Components with an array data type (i.e., Measures, Attributes or Metadata Attributes) are applied as single values and are not combined with other values to match all possible array values. For example, including the Code ‘A’ for an Attribute will allow any instance of the Attribute that includes ‘A’, like [‘A’, ‘B’] or [‘A’, ‘C’, ‘D’]. Similarly, if Code ‘A’ was excluded, all those arrays of values would also be excluded.
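The following Python sketch (illustrative only, not part of the specification; function names are hypothetical) shows one way the ‘%’ wildcard and the single-value rule for array-typed Components could be evaluated.

```python
# Non-normative sketch of the '%' wildcard and the array-value rule described
# in the two bullets above.
import re

def matches_wildcard(pattern: str, value: str) -> bool:
    """'%' stands for zero or more occurrences of any character."""
    regex = "^" + ".*".join(re.escape(part) for part in pattern.split("%")) + "$"
    return re.match(regex, value) is not None

def included(selection_value: str, reported) -> bool:
    """Array-valued Components match if ANY reported value matches the selection."""
    reported_values = reported if isinstance(reported, list) else [reported]
    return any(matches_wildcard(selection_value, v) for v in reported_values)

print(matches_wildcard("M%", "MX"))   # True  (starts with M)
print(included("A", ["A", "B"]))      # True  ('A' is one of the array values)
print(included("A", ["C", "D"]))      # False
```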

10.4.4.2 Key Set

Key Sets are processed in the order they appear in the Constraint, and wildcards can be used (e.g., any key position not referenced explicitly is deemed to be "all values").

As Key Sets can be "included" or "excluded", it is recommended that Key Sets with wildcards are declared before Key Sets with specific series keys. This minimises the risk that keys are inadvertently included or excluded.

In addition, Attribute, Measure and Metadata Attribute constraints may accompany Key Sets, in order to specify the allowed values per Key. These are expressed following the rules for Cube Regions, as explained above.

Finally, a validity period may be specified per Key.
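As a non-normative illustration of the ordering recommendation above, the following Python sketch assumes that a later matching Key Set overrides an earlier one, which is why wildcarded Key Sets should be declared first; the data structures and function names are assumptions.

```python
# Conceptual sketch, not the normative algorithm: Key Sets are applied in
# document order; a key position that is not referenced acts as a wildcard.

def key_matches(pattern: dict, key: dict) -> bool:
    """A pattern matches when every constrained position has the same value."""
    return all(key.get(dim) == val for dim, val in pattern.items())

def is_allowed(key: dict, key_sets: list) -> bool:
    """key_sets is an ordered list of (is_included, [patterns]) pairs; under the
    assumption used here, the last matching Key Set wins."""
    allowed = False
    for is_included, patterns in key_sets:
        if any(key_matches(p, key) for p in patterns):
            allowed = is_included
    return allowed

key_sets = [
    (True,  [{"FREQ": "A"}]),                    # include all annual keys (wildcarded)
    (False, [{"FREQ": "A", "VIS_CTY": "MX"}]),   # but exclude this specific series
]
print(is_allowed({"FREQ": "A", "VIS_CTY": "DE"}, key_sets))  # True
print(is_allowed({"FREQ": "A", "VIS_CTY": "MX"}, key_sets))  # False
```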

10.4.5 Versioning

When Data and Metadata Constraints are versioned, the latest version of the Constraint is used to generate the reporting restriction rules; all previous versions are for historical information only.

If restrictions are applicable to certain periods in time, the validFrom and validTo properties can be set on the specific values. This allows Constraints to evolve over time, increasing their version number as they do so, whilst being able to maintain a complete set of reporting restrictions for current and past datasets.

Example:

Data Constraint 1.0.0

Component | Valid Value | Valid from | Valid to
COUNTRY   | UK          |            |
          | FR          |            |
          | DE          |            |

Data Constraint 1.1.0

Component | Valid Value | Valid from | Valid to
COUNTRY   | UK          |            |
          | FR          |            | 2012
          | DE          |            |

When both versions of the Data Constraint are in a system, an observation value reported against COUNTRY FR for time period 2013 would be deemed invalid as the 1.1.0 rule would be applied. 
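A minimal sketch of how the validity periods in version 1.1.0 could be evaluated (the data structures and function are hypothetical, shown only to make the example concrete):

```python
# Illustrative sketch: the latest Constraint version (1.1.0) is used for
# validation, and a 'valid to' of 2012 on FR makes a 2013 observation invalid.

CONSTRAINT_1_1_0 = {
    "COUNTRY": [
        {"value": "UK"},
        {"value": "FR", "valid_to": 2012},
        {"value": "DE"},
    ]
}

def is_valid(component: str, value: str, period: int, constraint: dict) -> bool:
    for rule in constraint.get(component, []):
        if rule["value"] != value:
            continue
        if "valid_from" in rule and period < rule["valid_from"]:
            return False
        if "valid_to" in rule and period > rule["valid_to"]:
            return False
        return True
    return False

print(is_valid("COUNTRY", "FR", 2013, CONSTRAINT_1_1_0))  # False
print(is_valid("COUNTRY", "FR", 2011, CONSTRAINT_1_1_0))  # True
```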

10.4.6 Inheritance

10.4.6.1 Attachment levels of a Constraint

There are three levels of constraint attachment for which these inheritance rules apply:

It is not necessary for a Constraint to be attached to a higher-level artefact; e.g., it is valid to have a Constraint for a Provision Agreement where there are no constraints attached to the relevant Dataflow or DSD.

10.4.6.2 Cascade rules for processing Constraints

The processing of the constraints on either Dataflow/Metadataflow or Provision Agreement must take into account the constraints declared at higher levels. The rules for the lower-level constraints (attached to Dataflow/ Metadataflow and Provision Agreement) are detailed below.

Note that there can be a situation where a constraint is specified at a lower level before a constraint is specified at a higher level. Therefore, it is possible that a higher-level constraint makes a lower-level constraint invalid. SDMX makes no rules on how such a conflict should be handled when processing the constraint for attachment. However, the cascade rules on evaluating constraints for usage are clear – the higher-level constraint takes precedence in any conflicts that result in a less restrictive specification at the lower level.

10.4.6.3 Cube Region

It is not necessary to have a Constraint on the higher-level artefact (e.g., DSD referenced by the Dataflow), but if there is such a Constraint at the higher level(s) then:

Note that it is possible for a Constraint at a higher level to constrain, say, four Dimensions in a single Constraint, and a Constraint at a lower level to constrain the same four in two, three, or four Constraints.

10.4.6.4 Key Set

It is not necessary to have a Constraint on the higher-level artefact (e.g., DSD referenced by the Dataflow), but if there is such a Constraint at the higher level(s) then:

Note that a Key in a Key Set can have wildcarded Components. For instance, the Constraint may simply constrain the Dimension FREQ to "A", and all keys where the FREQ="A" are therefore valid. 

The following logic explains how the inheritance mechanism works. Note that this is conceptual logic and actual systems may differ in the way this is implemented.

  1. Determine all possible keys that are valid at the higher level.
  2. These keys are deemed to be inherited by the lower-level constrained object, subject to the Constraints specified at the lower level.
  3. Determine all possible keys that are possible using the Constraints specified at the lower level.
  4. At the lower level inherit all keys that match with the higher-level Constraint.
  5. If there are keys in the lower-level Constraint that are not inherited then the key is invalid (i.e., it is less restrictive).
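The following Python sketch illustrates the five conceptual steps by enumerating keys explicitly; real systems would typically not enumerate keys, and all names and structures here are illustrative only.

```python
# Conceptual sketch of the five steps above, using explicit key enumeration.
from itertools import product

def keys_from_cube(cube: dict) -> set:
    """Enumerate every key allowed by a cube region {dimension: [values]}."""
    dims = sorted(cube)
    return {tuple(zip(dims, combo)) for combo in product(*(cube[d] for d in dims))}

higher = {"FREQ": ["A", "Q"], "GEO": ["IT", "FR"]}   # e.g. Constraint on the DSD
lower  = {"FREQ": ["A"],      "GEO": ["IT", "DE"]}   # e.g. Constraint on the Dataflow

higher_keys = keys_from_cube(higher)                  # step 1
lower_keys  = keys_from_cube(lower)                   # step 3
inherited   = lower_keys & higher_keys                # steps 2 and 4
invalid     = lower_keys - higher_keys                # step 5: less restrictive keys

print(sorted(inherited))  # only FREQ=A with GEO=IT survives
print(sorted(invalid))    # FREQ=A with GEO=DE is not allowed by the higher level
```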

10.4.7 Constraints Examples

10.4.7.1 Data Constraint and Cascading

The following scenario is used.

A DSD contains the following Dimensions:

  • GEO – Geography
  • SEX – Sex
  • AGE – Age
  • CAS – Current Activity Status

In the DSD, common code lists are used, and the requirement is to restrict these at various levels to specify the actual codes that are valid for the object to which the Constraint is attached.


Figure 20: Example Scenario for Constraints 

Constraints are declared as follows:


Figure 21: Example Constraints

Notes:

  • AGE is constrained for the DSD and is further restricted for the Dataflow CENSUS_CUBE1.

  • The same Constraint applies to both Provision Agreements.

The cascade rules elaborated above result as follows:

DSD

Dataflow CENSUS_CUBE1

Dataflow CENSUS_CUBE2

Provision Agreement CENSUS_CUBE1_IT

Provision Agreement CENSUS_CUBE2_IT

The Constraints are defined as follows:

DSD Constraint


Dataflow Constraints


Provision Agreement Constraint


10.4.7.2 Combination of Constraints

The possible combinations of constraining terms are explained in this section through a few examples.

Let’s assume a DSD with the following Components:

Component type    | Component ID
Dimension         | FREQ
Dimension         | JD_TYPE
Dimension         | JD_CATEGORY
Dimension         | VIS_CTY
TimeDimension     | TIME_PERIOD
Attribute         | OBS_STATUS
Attribute         | UNIT
Attribute         | COMMENT
MetadataAttribute | CONTACT
Measure           | MULTISELECT
Measure           | CHOICE

On the above, let’s assume the following use cases with their constraining requirements:

Use Case 1: A Constraint on allowed values for some Dimensions

R1: Allow monthly and quarterly data

R2: Allow Mexico for vis-à-vis country 

This is expressed with the following CubeRegion:

FREQ    | M, Q
VIS_CTY | MX
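A minimal sketch, assuming a simple in-memory representation of the CubeRegion above, of how a reported series could be checked against it (names are illustrative only):

```python
# Every constrained Dimension must take one of the listed values;
# unconstrained Dimensions (e.g. JD_TYPE) are not checked.

CUBE_REGION = {"FREQ": {"M", "Q"}, "VIS_CTY": {"MX"}}

def allowed_by_cube(series: dict) -> bool:
    return all(series.get(dim) in values for dim, values in CUBE_REGION.items())

print(allowed_by_cube({"FREQ": "M", "VIS_CTY": "MX", "JD_TYPE": "X"}))  # True
print(allowed_by_cube({"FREQ": "A", "VIS_CTY": "MX", "JD_TYPE": "X"}))  # False (annual not allowed)
```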

Use Case 2: A Constraint on allowed combinations for some Dimensions

R1: Allow monthly data for Germany 

R2: Allow quarterly data for Mexico

This is expressed with the following DataKeySet:

Key1 | FREQ    | M
     | VIS_CTY | DE
Key2 | FREQ    | Q
     | VIS_CTY | MX

Use Case 3: A Constraint on allowed values for some Dimensions combined with allowed values for some Attributes

R1: Allow monthly and quarterly data

R2: Allow Mexico for vis-à-vis country

R3: Allow present for status 

This may be expressed with the following CubeRegion:

FREQ       | M, Q
VIS_CTY    | MX
OBS_STATUS | A

Use Case 4: A Constraint on allowed combinations for some Dimensions combined with specific Attribute values

R1: Allow monthly data, for Germany, with unit euro

R2: Allow quarterly data, for Mexico, with unit usd

This may be expressed with the following DataKeySet:

Key1 | FREQ    | M
     | VIS_CTY | DE
     | UNIT    | EUR
Key2 | FREQ    | Q
     | VIS_CTY | MX
     | UNIT    | USD

Use Case 5: A Constraint on allowed values for some Dimensions together with some combination of Dimension values

R1: For annual and quarterly data, for Mexico and Germany, only A status is allowed

R2: For monthly data, for Mexico and Germany, only F status is allowed

Considering the above examples, the following CubeRegions would be created:

CubeRegion1 | FREQ       | Q, A
            | VIS_CTY    | MX, DE
            | OBS_STATUS | A
CubeRegion2 | FREQ       | M
            | VIS_CTY    | MX, DE
            | OBS_STATUS | F

The problem with this approach is that, according to the business rule for Constraints, only one Member Selection should be specified per Component. Thus, software performing conflict resolution would end up with empty sets for FREQ and OBS_STATUS (as the two CubeRegions do not share any values for these Components).

Nevertheless, there is a much easier approach: the cascading mechanism of Constraints (as shown in 10.4.7.1). These rules can be expressed at two levels of Constraints, e.g., the DSD and the Dataflows:

DSD CubeRegion:

FREQ       | M, Q, A
VIS_CTY    | MX, DE
OBS_STATUS | A, F

Dataflow1 CubeRegion:

FREQ       | Q, A
VIS_CTY    | MX, DE
OBS_STATUS | A

Dataflow2 CubeRegion:

FREQ       | M
VIS_CTY    | MX, DE
OBS_STATUS | F

Use case 6: A Constraint on allowed values for some Dimensions combined with allowed values for Measures

R1: Allow monthly data, for Germany, with unit euro, and measure choice 'A'

R2: Allow quarterly data, for Mexico, with unit usd, and measure choice 'B'

This may be expressed with the following DataKeySet:

Key1 | FREQ    | M
     | VIS_CTY | DE
     | UNIT    | EUR
     | CHOICE  | A
Key2 | FREQ    | Q
     | VIS_CTY | MX
     | UNIT    | USD
     | CHOICE  | B

Use Case 7: A Constraint with wildcards for Codes and removePrefix property

For this example, we assume that the codes in the VIS_CTY representation have been prefixed with ‘AREA_’. In this Constraint, the prefix needs to be removed.

R1: Allow monthly and quarterly data

R2: Allow vis-à-vis countries that start with M

R3: Remove the prefix ‘AREA_’

This may be expressed with the following CubeRegion:

FREQ                           | M, Q
VIS_CTY (removePrefix=’AREA_’) | M%
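A small sketch of how the removePrefix behaviour might combine with the ‘%’ wildcard (the function and parameter names are assumptions, not part of the standard): the prefix is stripped from the reported code before the pattern is evaluated.

```python
# Illustrative only: strip the declared prefix, then apply the wildcard pattern.

def matches(value: str, pattern: str, remove_prefix: str = "") -> bool:
    if remove_prefix and value.startswith(remove_prefix):
        value = value[len(remove_prefix):]
    # only a trailing '%' is needed for this example
    return value.startswith(pattern.rstrip("%")) if pattern.endswith("%") else value == pattern

print(matches("AREA_MX", "M%", remove_prefix="AREA_"))  # True  -> MX starts with M
print(matches("AREA_DE", "M%", remove_prefix="AREA_"))  # False
```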

Use Case 8: A Constraint with multilingual support on Attributes

R1: Allow monthly and quarterly data

R2: Allow Mexico for vis-à-vis country

R3: Allow a comment, in English, which includes the term ‘adjusted’

This may be expressed with the following CubeRegion:

FREQ                | M, Q
VIS_CTY             | MX
COMMENT (lang=’en’) | %adjusted%

Use Case 9: A Constraint on allowed values for Dimensions combined with allowed values for Metadata Attributes

R1: Allow monthly and quarterly data

R2: Allow Mexico for vis-à-vis country

R3: Allow John Doe for contact 

This may be expressed with the following CubeRegion:

FREQ    | M, Q
VIS_CTY | MX
CONTACT | John Doe