Wiki source code of 10 Constraints

Version 1.1 by Helena on 2025/06/16 12:03

1 {{box title="**Contents**"}}
2 {{toc/}}
3 {{/box}}
4
5 1.
6 11. Introduction
7
8 Constraints are used to restrict what data can be reported, or to report what data exists in a given context. There are three types of Constraint, which serve different purposes:
9
10 * Availability Constraint
11 * Dimension Constraint
12 * Reporting Constraint
13
14 An Availability Constraint defines the data that exists in the context of a data query.
15
16 Availability Constraints form part of the response message from the Availability REST API. They are dynamically generated by a system based on the data that exists and the query context. Availability Constraints are therefore not Identifiable structures (they have no URN).
17
18 A Dimension Constraint is a property of a Dataflow; it fixes the Dimensions that the Dataflow uses from the Data Structure Definition it references. Dimension Constraints enable Data Structure Definitions to evolve over time by having new Dimensions added, without having to undergo a major version change.
19
20 A Reporting Constraint is used to define the set of allowed and/or disallowed values that can be reported in a data or metadata set.
21
22 1.
23 11. Availability Constraint
24
25 An Availability Constraint is not a maintained structure; instead, it is generated dynamically as a response to the availability REST API. The purpose of the Availability Constraint is to define the distinct set of values that have data over one or more Dimensions. Unlike Data and Metadata Constraints, which can attach to multiple Constrainable structures (of the same type), an Availability Constraint can attach to only one structure. The attachment defines the context of the response (i.e., data exists for the listed component values in the context of the attached structure). The subset of Constrainable structures the Availability Constraint can attach to is:
26
27 * Data Structure Definition
28 * Dataflow
29 * Provision Agreement
30
1.
11. Dimension Constraint
31
32 A Dimension Constraint is a property of a Dataflow; its purpose is to explicitly list the Dimensions from the corresponding DSD that are being used by the Dataflow. 
33
34 Dimension Constraints were introduced in SDMX 3.1. They are not required for most Dataflows, where the dataset must always contain the full complement of Dimensions as defined by the corresponding DSD. However, for some complex data collections, which may span long periods and where the full complement of required Dimensions is not necessarily known at design time, the DSD may need to increase its dimensionality over time. In this scenario the DSD can be defined as an evolving structure. This property tells the user that the DSD can have new Dimensions added without having to undergo a major version change; for example, a DSD at version 1.0.0 would be able to add a new Dimension and move to version 1.1.0, a change that would not ordinarily be allowed. A minor version change on the addition of a new Dimension is only possible if the DSD defines itself as an evolving structure. This is a new property of the DSD, introduced in version 3.1 to satisfy this use case. The evolving structure property is either true or false, defaulting to false if not specified. Setting the evolving structure property to true requires a major version change, and it can therefore only be introduced on an x.0.0 release (e.g. 1.0.0). The evolving structure property can be set to false to indicate that no additional Dimensions will be added to the Data Structure under the same major version number; setting it to false does not require a major version change on the Data Structure.
35
36 When a Dataflow references a DSD using late binding on the minor release, and the DSD has the evolving structure property set to true, the Dataflow must contain a Dimension Constraint to protect its dimensionality from changing over time without a version change.
37
38 The Dimension Constraint provides the explicit list of Dimensions that the Dataflow uses from the DSD that it references.  This enables the DSD to evolve over time without breaking the compatibility of datasets against the Dataflow. 
39
40 ==== Rules for a Dimension Constraint ====
41
42 * A Dataflow must contain a Dimension Constraint if the DSD it uses declares itself an evolving structure and the Dataflow is late binding on the minor release (i.e., it references the latest minor release of a given major version, e.g. 1.0+.0).
43 * The Dimension Constraint can only include Dimensions from the DSD that is referenced by the Dataflow.
44 * A Dimension Constraint can only be changed if the Dataflow undergoes a major version change.
45 * Datasets reported against the Dataflow must only contain reported values for the Dimensions specified in the Dimension Constraint. 
46 * When exporting data for the Dataflow, the dataset should only include the Dimensions specified by the Dimension Constraint.
47 * When exporting data for the DSD the dataset must contain the full set of Dimensions as specified by the DSD. The tilde ‘~~’ character is used to represent a value which is not present due to the Dimension not being included in the corresponding Dataflow.
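Some of the rules above can be checked mechanically. The following is a minimal sketch in Python, assuming that a DSD and a Dimension Constraint are represented as plain lists of Dimension IDs and a series key as a dictionary; the function names and data shapes are hypothetical and are not part of any SDMX library.

```python
def unknown_dimensions(dsd_dimensions, constraint_dimensions):
    """Dimensions listed in the Dimension Constraint that the DSD does not define."""
    known = set(dsd_dimensions)
    return [d for d in constraint_dimensions if d not in known]


def unexpected_dimensions(series_key, constraint_dimensions):
    """Dimensions reported in a series key but absent from the Dimension Constraint."""
    allowed = set(constraint_dimensions)
    return sorted(d for d in series_key if d not in allowed)


# Hypothetical DSD and Dimension Constraint, using IDs from the example that follows.
dsd_dims = ["FREQ", "REF_AREA", "SEX", "AGE"]
constraint = ["FREQ", "REF_AREA"]

unknown = unknown_dimensions(dsd_dims, constraint)
extra = unexpected_dimensions({"FREQ": "A", "REF_AREA": "UK", "SEX": "M"}, constraint)
```

An empty result from both functions means the constraint only references Dimensions of the DSD and the series key only reports Dimensions listed in the constraint.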
48
49 ==== Example Datasets with Evolving Structures ====
50
51 A dataset is built against a Data Structure Definition. The dataset contains data for two Dataflows. Dataflow ‘DF_POP’ uses a Dimension Constraint which fixes its Dimensions to FREQ and REF_AREA. Dataflow ‘DF_POP_SA’ does not reference a Dimension Constraint, and as such includes all Dimensions as specified by the DSD.
54
55 The resulting dataset contains the value ‘~~’ for both the SEX and AGE Dimensions for the series related to DF_POP.
56
57
58 |**Dataflow**|**FREQ**|**REF_AREA**|**SEX**|**AGE**|**OBS_VALUE**|**TIME_PERIOD**|**UNIT**
59 |**DF_POP**|A|UK|~~|~~|65|2022|6
60 |**DF_POP**|A|FR|~~|~~|50|2022|6
61 |**DF_POP_SA**|A|UK|M|1|1.2|2022|6
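The padding of series keys in a DSD-level export can be sketched as follows. This is an illustration only, assuming series keys are held as dictionaries keyed by Dimension ID; the helper name `to_dsd_key` is hypothetical.

```python
ABSENT = "~"  # value used for a Dimension that the Dataflow does not include


def to_dsd_key(series_key, dsd_dimensions):
    """Return the key values in DSD order, padding missing Dimensions with the tilde."""
    return [series_key.get(dim, ABSENT) for dim in dsd_dimensions]


dsd_dims = ["FREQ", "REF_AREA", "SEX", "AGE"]

# DF_POP is restricted to FREQ and REF_AREA by its Dimension Constraint.
row_pop = to_dsd_key({"FREQ": "A", "REF_AREA": "UK"}, dsd_dims)

# DF_POP_SA has no Dimension Constraint and reports every Dimension of the DSD.
row_pop_sa = to_dsd_key({"FREQ": "A", "REF_AREA": "UK", "SEX": "M", "AGE": "1"}, dsd_dims)
```

The two resulting rows correspond to the first and last rows of the example table.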
62
63 1.
64 11. Reporting Constraints
65
66 A Reporting Constraint is a Maintainable Artefact which restricts the values that can be reported in a dataset or metadata set based on one or more inclusion or exclusion rules. 
67
68 A Reporting Constraint is one of the following concrete types:
69
70 * Data Constraint
71 * Metadata Constraint
72
1.
11.
73 111. Data Constraint
74
75 A Data Constraint is used to add additional restrictions to the allowable values reported in a dataset. Data Constraints can be applied to the following structures, which are collectively known as Constrainable structures:
76
77 * Data Structure Definition
78 * Dataflow
79 * Provision Agreement
80 * Data Provider
81
82 **Note:** regardless of the Constrainable structure, the restricted values relate to the allowable content for the Component of the DSD to which the constrained object relates.
83
84 1.
85 11.
86 111. Metadata Constraint
87
88 A Metadata Constraint is used to add additional restrictions to the allowable values reported in a metadataset. Metadata Constraints can be applied to the following structures, which are collectively known as Constrainable structures:
89
90 * Metadata Structure Definition
91 * Metadataflow
92 * Metadata Provision Agreement
93 * Metadata Provider
94
95 **Note:** regardless of the Constrainable structure, the restricted values relate to the allowable content for the Component of the MSD to which the constrained object relates.
96
97 1.
98 11.
99 111. Scope of a Constraint
100
101 A Constraint is used to specify the content of a data or metadata source in terms of the component values or the keys.
102
103 In terms of data the components are:
104
105 * Dimension
106 * Time Dimension
107 * Data Attribute
108 * Measure
109 * Metadata Attribute
110 * DataKeySets: the keys are the content of the KeyDescriptor – i.e., the series keys composed, for each key, by a value for each Dimension.
111
112 In terms of reference metadata the components are:
113
114 * Metadata Attribute
115
116 For a Constraint based on a DSD the Constraint can reference one or more of:
117
118 * Data Structure Definition
119 * Dataflow
120 * Provision Agreement
121 * Data Provider
122
123 For a Constraint based on an MSD the Constraint can reference one or more of:
124
125 * Metadata Structure Definition
126 * Metadataflow
127 * Metadata Provision Agreement
128 * Metadata Provider
129 * Metadata Set
130
131 Furthermore, there can be more than one Constraint specified for a specific object e.g., more than one Constraint for a specific DSD.
132
133 In view of the flexibility of Constraint attachment, clear rules on usage are required. These are elaborated below.
134
135 1.
136 11.
137 111. Multiple Constraints
138
139 There can be many Constraints for any Constrainable Artefact (e.g., DSD), subject to the following restrictions:
140
141 **10.4.4.1 Cube Region**
142
143 A Constraint can contain multiple Member Selections (e.g., Dimensions).
144
145 * A specific Member Selection (e.g., Dimension FREQ) can only be contained in one Cube Region for any one attached object (e.g., a specific DSD or specific Dataflow).
146 * Component values within a Member Selection may define a validity period. Otherwise, the value is valid for the whole validity of the Cube Region.
147 * For partial reference resolution purposes (as per the SDMX REST API), the latest non-draft Constraint must be considered.
148 * A Member Selection may include wildcarding of values (using character ‘%’ to represent zero or more occurrences of any character), as well as cascading through hierarchic structures (e.g., parents in Codelist), or localised values (e.g., text for English only). Lack of locale means any language may match. Cascading values are mutually exclusive with localised values, as the former refer to coded values, while the latter refer to uncoded values.
149 * Any values included in a Member Selection for Components with an array data type (i.e., Measures, Attributes or Metadata Attributes), will be applied as single values and will not be assessed combined with other values to match all possible array values. For example, including the Code ‘A’ for an Attribute will allow any instance of the Attribute that includes ‘A’, like [‘A’, ‘B’] or [‘A’, ‘C’, ‘D’]. Similarly, if Code ‘A’ was excluded, all those arrays of values would also be excluded.
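The wildcard and array-value rules above can be illustrated with a short Python sketch. It assumes values are plain strings and array values are lists; the function names are hypothetical, and translating ‘%’ into a regular expression is one possible implementation choice rather than something the standard prescribes.

```python
import re


def value_matches(pattern, value):
    """True if value matches pattern, where '%' stands for zero or more characters."""
    # Escape each literal fragment, then join the fragments with '.*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


def array_matches(selection_values, reported_array):
    """True if any single element of the reported array is in the Member Selection."""
    return any(item in selection_values for item in reported_array)


starts_with_m = value_matches("M%", "MX")
contains_term = value_matches("%adjusted%", "seasonally adjusted series")
array_hit = array_matches({"A"}, ["A", "C", "D"])
```

For an included Member Selection, an array for which `array_matches` returns true is allowed; for an excluded one, the same array is rejected.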
150
151 **10.4.4.2 Key Set**
152
153 Key Sets will be processed in the order they appear in the Constraint and wildcards can be used (e.g., any key position not referenced explicitly is deemed to be "all values").
154
155 As the Key Sets can be "included" or "excluded" it is recommended that Key Sets with wildcards are declared before KeySets with specific series keys. This will minimize the risk that keys are inadvertently included or excluded.
156
157 In addition, Attribute, Measure and Metadata Attribute constraints may accompany KeySets, in order to specify the allowed values per Key. Those are expressed following the rules for Cube Regions, as explained above.
158
159 Finally, a validity period may be specified per Key.
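The effect of declaration order can be sketched in Python as follows. The sketch assumes that, when several Key Sets match a series key, the last one processed decides the outcome; this "last match wins" behaviour is an assumption made for illustration and is not stated in the text above. The data shapes and function names are likewise hypothetical.

```python
def key_matches(partial_key, series_key):
    """A position missing from the partial key is a wildcard ("all values")."""
    return all(series_key.get(dim) == value for dim, value in partial_key.items())


def evaluate(series_key, key_sets):
    """Process Key Sets in order; the last matching Key Set decides (assumption).

    Returns True (included), False (excluded) or None (no Key Set matched).
    """
    decision = None
    for is_included, keys in key_sets:
        if any(key_matches(key, series_key) for key in keys):
            decision = is_included
    return decision


key_sets = [
    (True, [{"FREQ": "M"}]),                    # wildcarded: every monthly key
    (False, [{"FREQ": "M", "VIS_CTY": "DE"}]),  # specific key, declared afterwards
]

monthly_mexico = evaluate({"FREQ": "M", "VIS_CTY": "MX"}, key_sets)
monthly_germany = evaluate({"FREQ": "M", "VIS_CTY": "DE"}, key_sets)
```

Under this assumption, declaring the wildcarded Key Set first lets the specific exclusion take effect; with the order reversed, the wildcard would re-include the excluded key.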
160
161 1.
162 11.
163 111. Versioning
164
165 When Data and Metadata Constraints are versioned, the latest version of the Constraint is used to generate the reporting restriction rules; all previous versions are for historical information only.
166
167 If restrictions are applicable to certain periods in time, the validFrom and validTo properties can be set on the specific values. This allows Constraints to evolve over time, increasing their version number as they do so, whilst being able to maintain a complete set of reporting restrictions for current and past datasets.
168
169 Example:
170
171 Data Constraint 1.0.0
172
173 |Component|Valid Value|Valid from|Valid to
174 |(% rowspan="3" %)COUNTRY|UK| |
175 |FR| |
176 |DE| |
177
178 Data Constraint 1.1.0
179
180 |Component|Valid Value|Valid from|Valid to
181 |(% rowspan="3" %)COUNTRY|UK| |
182 |FR| |2012
183 |DE| |
184
185 When both versions of the Data Constraint are in a system, an observation value reported against COUNTRY FR for time period 2013 would be deemed invalid as the 1.1.0 rule would be applied.
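The example can be reproduced with a small Python sketch. It assumes each version of the Constraint is held as a dictionary of values with optional validity bounds expressed as years, and that the "valid to" bound is inclusive; these representations and the function names are hypothetical.

```python
def latest_version(versions):
    """Pick the highest version, comparing major.minor.patch numerically."""
    return max(versions, key=lambda v: tuple(int(part) for part in v.split(".")))


def is_valid(rules, component, value, year):
    """Check a reported value and period against the rules of one Constraint version."""
    entry = rules.get(component, {}).get(value)
    if entry is None:
        return False  # the value is not listed for the component
    valid_from, valid_to = entry
    if valid_from is not None and year < valid_from:
        return False
    if valid_to is not None and year > valid_to:  # inclusive bound (assumption)
        return False
    return True


constraints = {
    "1.0.0": {"COUNTRY": {"UK": (None, None), "FR": (None, None), "DE": (None, None)}},
    "1.1.0": {"COUNTRY": {"UK": (None, None), "FR": (None, 2012), "DE": (None, None)}},
}

active = constraints[latest_version(constraints)]
fr_2013 = is_valid(active, "COUNTRY", "FR", 2013)
fr_2011 = is_valid(active, "COUNTRY", "FR", 2011)
```

Only the rules of the latest version are consulted, so the FR observation for 2013 is rejected while one for 2011 is accepted.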
186
187 1.
188 11.
189 111. Inheritance
190
191 **10.4.6.1 Attachment levels of a Constraint**
192
193 There are three levels of constraint attachment for which these inheritance rules apply:
194
195 * DSD/MSD – top level
196 ** Dataflow/Metadataflow – second level
197 *** Provision Agreement – third level
198
199 It is not necessary for a Constraint to be attached to a higher-level artefact. For example, it is valid to have a Constraint for a Provision Agreement where there are no Constraints attached to the relevant Dataflow or DSD.
200
201 **10.4.6.2 Cascade rules for processing Constraints**
202
203 The processing of the constraints on either Dataflow/Metadataflow or Provision Agreement must take into account the constraints declared at higher levels. The rules for the lower-level constraints (attached to Dataflow/Metadataflow and Provision Agreement) are detailed below.
204
205 Note that there can be a situation where a constraint is specified at a lower level before a constraint is specified at a higher level. Therefore, it is possible that a higher-level constraint makes a lower-level constraint invalid. SDMX makes no rules on how such a conflict should be handled when processing the constraint for attachment. However, the cascade rules on evaluating constraints for usage are clear – the higher-level constraint takes precedence in any conflicts that result in a less restrictive specification at the lower level.
206
207 **10.4.6.3 Cube Region**
208
209 It is not necessary to have a Constraint on the higher-level artefact (e.g., DSD referenced by the Dataflow), but if there is such a Constraint at the higher level(s) then:
210
211 * The lower-level Constraint cannot be less restrictive than the Constraint specified for the same Member Selection (e.g. Dimension) at the next higher level, which constrains that Member Selection. For example, if the Dimension FREQ is constrained to A, Q in a DSD, then the Constraint at the Dataflow or Provision Agreement cannot be A, Q, M or even just M – it can only further constrain A, Q.
212 * The Constraint at the lower level for any one Member Selection further constrains the content for the same Member Selection at the higher level(s).
213 * Any Member Selection which is not referenced in a Constraint is deemed to be constrained according to the Constraint specified at the next higher level which constrains that Member Selection.
214 * If there is a conflict when resolving the Constraint in terms of a lower-level Constraint being less restrictive than a higher-level Constraint, then the Constraint at the higher-level is used.
215
216 Note that it is possible for a Constraint at a higher level to constrain, say, four Dimensions in a single Constraint, and a Constraint at a lower level to constrain the same four in two, three, or four Constraints.
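The Cube Region cascade rules can be sketched in Python as below. The sketch assumes each attachment level is reduced to a dictionary mapping a Member Selection (e.g., a Dimension ID) to its set of included values, ordered from the top level downwards; exclusions, wildcards and validity periods are left out, and the function name is hypothetical.

```python
def resolve(levels):
    """Resolve the effective included values per Member Selection.

    `levels` is ordered from top (DSD) to bottom (Provision Agreement).
    """
    effective = {}
    for level in levels:
        for member, values in level.items():
            inherited = effective.get(member)
            if inherited is None or set(values) <= inherited:
                # No higher-level Constraint, or the lower level further restricts it.
                effective[member] = set(values)
            # Otherwise the lower level is less restrictive: keep the higher level.
    return effective


dsd = {"FREQ": {"A", "Q"}}
dataflow_ok = {"FREQ": {"A"}}
dataflow_conflict = {"FREQ": {"A", "Q", "M"}}

restricted = resolve([dsd, dataflow_ok])
conflict = resolve([dsd, dataflow_conflict])
```

A Member Selection that a lower level does not mention simply keeps the values resolved at the level above it.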
217
218 **10.4.6.4 Key Set**
219
220 It is not necessary to have a Constraint on the higher-level artefact (e.g., DSD referenced by the Dataflow), but if there is such a Constraint at the higher level(s) then:
221
222 * The lower-level Constraint cannot be less restrictive than the Constraint specified at the higher level.
223 * The Constraint at the lower level for any one Member Selection further constrains the keys specified at the higher level(s).
224 * Any Member Selection which is not referenced in a Constraint is deemed to be constrained according to the Constraint specified at the next higher level which constrains that Member Selection.
225 * If there is a conflict when resolving the keys in the Constraint at two levels, in terms of a lower-level constraint being less restrictive than a higher-level Constraint, then the offending keys specified at the lower level are not deemed part of the Constraint.
226
227 Note that a Key in a Key Set can have wildcarded Components. For instance, the Constraint may simply constrain the Dimension FREQ to "A", and all keys where the FREQ="A" are therefore valid.
228
229 The following logic explains how the inheritance mechanism works. Note that this is conceptual logic and actual systems may differ in the way this is implemented.
230
232 1. Determine all possible keys that are valid at the higher level.
233 1. These keys are deemed to be inherited by the lower-level constrained object, subject to the Constraints specified at the lower level.
234 1. Determine all keys that are possible using the Constraints specified at the lower level.
235 1. At the lower level, inherit all keys that match the higher-level Constraint.
236 1. If there are keys in the lower-level Constraint that are not inherited, then those keys are invalid (i.e., the lower-level Constraint is less restrictive).
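The conceptual steps above can be sketched in Python. The sketch assumes small code lists so that all keys can be enumerated, and represents each Constraint as a dictionary of allowed values per Dimension (a Dimension that is not mentioned takes every code); the function names and the code lists are hypothetical.

```python
from itertools import product


def expand(constraint, codelists):
    """Enumerate every full key allowed by a Constraint."""
    dims = list(codelists)
    choices = [constraint.get(dim, codelists[dim]) for dim in dims]
    return set(product(*choices))


def inherit(higher, lower, codelists):
    """Return (valid, invalid) keys for the lower-level Constraint."""
    higher_keys = expand(higher, codelists)   # step 1
    lower_keys = expand(lower, codelists)     # step 3
    valid = lower_keys & higher_keys          # steps 2 and 4
    invalid = lower_keys - higher_keys        # step 5
    return valid, invalid


codelists = {"FREQ": ["A", "Q", "M"], "REF_AREA": ["UK", "FR"]}
higher = {"FREQ": ["A", "Q"]}
lower = {"FREQ": ["A", "M"], "REF_AREA": ["UK"]}

valid, invalid = inherit(higher, lower, codelists)
```

Here the monthly key is not among the keys valid at the higher level, so it is reported as invalid rather than inherited.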
237
1.
11.
111. Constraints Examples
238
239 **10.4.7.1 Data Constraint and Cascading**

The following scenario is used.
240
241 A DSD contains the following Dimensions:
242
243 * GEO – Geography
244 * SEX – Sex
245 * AGE – Age
246 * CAS – Current Activity Status
247
248 In the DSD, common code lists are used and the requirement is to restrict these at various levels to specify the actual codes that are valid for the object to which the Constraint is attached.
249
250
251 **Figure 20: Example Scenario for Constraints**

Constraints are declared as follows:
252
253
254 **Figure 21: Example Constraints**
255
256 Notes:
257
258 * AGE is constrained for the DSD and is further restricted for the Dataflow CENSUS_CUBE1.
260 * The same Constraint applies to both Provision Agreements.
261
262 Applying the cascade rules elaborated above gives the following results:
263
264 DSD
265
266 * Constrained by eliminating code 001 from the code list for the AGE Dimension.
267
268 Dataflow CENSUS_CUBE1
269
270 * Constrained by restricting the code list for the AGE Dimension to codes 002 and 003 (note that this is a more restrictive constraint than that declared for the DSD which specifies all codes except code 001).
271 ** Restricts the CAS codes to 003 and 004.
272
273 Dataflow CENSUS_CUBE2
274
275 * Restricts the code list for the CAS Dimension to codes TOT and NAP.
276 ** Inherits the AGE constraint applied at the level of the DSD.
277
278 Provision Agreement CENSUS_CUBE1_IT
279
280 * Restricts the codes for the GEO Dimension to IT and its children.
281 ** Inherits the constraints from Dataflow CENSUS_CUBE1 for the AGE and CAS Dimensions.
282
283
284
285 Provision Agreement CENSUS_CUBE2_IT
286
287 * Restricts the codes for the GEO Dimension to IT and its children.
288 ** Inherits the constraints from Dataflow CENSUS_CUBE2 for the CAS Dimension.
289 ** Inherits the AGE constraint applied at the level of the DSD.
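The AGE part of this cascade can be traced with a few lines of Python. The content of the AGE code list is not given in the text, so the four codes below are purely hypothetical; the helper `apply_selection` is also an illustration and does not handle conflicts between levels.

```python
def apply_selection(allowed, selection):
    """Narrow a set of allowed codes with an include or exclude selection."""
    mode, codes = selection
    if mode == "include":
        return allowed & set(codes)
    return allowed - set(codes)


age_codelist = {"001", "002", "003", "004"}  # hypothetical content

# DSD: code 001 is eliminated from the AGE code list.
dsd_age = apply_selection(age_codelist, ("exclude", ["001"]))

# Dataflow CENSUS_CUBE1: AGE is further restricted to 002 and 003.
cube1_age = apply_selection(dsd_age, ("include", ["002", "003"]))

# Dataflow CENSUS_CUBE2: no AGE constraint of its own, so the DSD level is inherited.
cube2_age = dsd_age
```

The Provision Agreements then inherit `cube1_age` and `cube2_age` respectively, since they only constrain GEO.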
290
291 The Constraints are defined as follows:
292
293 DSD Constraint
294
295 Dataflow Constraints
296
297 Provision Agreement Constraint
298
299 **10.4.7.2 Combination of Constraints**
300
301 The possible combinations of constraining terms are explained in this section through a few examples.
302
303 Let’s assume a DSD with the following Components:
304
305 |Dimension|FREQ
306 |Dimension|JD_TYPE
307 |Dimension|JD_CATEGORY
308 |Dimension|VIS_CTY
309 |TimeDimension|TIME_PERIOD
310 |Attribute|OBS_STATUS
311 |Attribute|UNIT
312 |Attribute|COMMENT
313 |MetadataAttribute|CONTACT
314 |Measure|MULTISELECT
315 |Measure|CHOICE
316
317 Given the above, let’s assume the following use cases with their constraining requirements:
318
319 **Use Case 1: A Constraint on allowed values for some Dimensions**
320
321 R1: Allow monthly and quarterly data
322
323 R2: Allow Mexico for vis-à-vis country
324
325 This is expressed with the following CubeRegion:
326
327 |FREQ|M, Q
328 |VIS_CTY|MX
329
330 **Use Case 2: A Constraint on allowed combinations for some Dimensions**
331
332 R1: Allow monthly data for Germany
333
334 R2: Allow quarterly data for Mexico
335
336 This is expressed with the following DataKeySet:
337
338 |(% rowspan="2" %)Key1|FREQ|M
339 |VIS_CTY|DE
340 |(% rowspan="2" %)Key2|FREQ|Q
341 |VIS_CTY|MX
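The difference between the CubeRegion of Use Case 1 and the DataKeySet of Use Case 2 can be made concrete with a small Python sketch. It assumes a CubeRegion is a dictionary of allowed values per Dimension and a DataKeySet is a list of (possibly partial) keys; both representations and the function names are hypothetical.

```python
def cube_region_allows(region, series_key):
    """Each constrained Dimension is checked independently of the others."""
    return all(series_key.get(dim) in values for dim, values in region.items())


def key_set_allows(keys, series_key):
    """The series key must match one listed combination as a whole."""
    return any(
        all(series_key.get(dim) == value for dim, value in key.items())
        for key in keys
    )


# Use Case 1: monthly and quarterly data, Mexico only.
region = {"FREQ": {"M", "Q"}, "VIS_CTY": {"MX"}}

# Use Case 2: monthly data for Germany, quarterly data for Mexico.
key_set = [{"FREQ": "M", "VIS_CTY": "DE"}, {"FREQ": "Q", "VIS_CTY": "MX"}]

monthly_mexico = {"FREQ": "M", "VIS_CTY": "MX"}
in_region = cube_region_allows(region, monthly_mexico)
in_key_set = key_set_allows(key_set, monthly_mexico)
```

Monthly data for Mexico passes the CubeRegion, because each Dimension value is allowed on its own, but fails the DataKeySet, because that combination is not listed.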
342
343 **Use Case 3: A Constraint on allowed values for some Dimensions combined with allowed values for some Attributes**

R1: Allow monthly and quarterly data
344
345 R2: Allow Mexico for vis-à-vis country
346
347 R3: Allow present for status
348
349 This may be expressed with the following CubeRegion:
350
351 |FREQ|M, Q
352 |VIS_CTY|MX
353 |OBS_STATUS|A
354
355 **Use Case 4: A Constraint on allowed combinations for some Dimensions combined with specific Attribute values**
356
357 R1: Allow monthly data, for Germany, with unit euro
358
359 R2: Allow quarterly data, for Mexico, with unit usd
360
361 This may be expressed with the following DataKeySet:
362
363 |(% rowspan="3" %)Key1|FREQ|M
364 |VIS_CTY|DE
365 |UNIT|EUR
366 |(% rowspan="3" %)Key2|FREQ|Q
367 |VIS_CTY|MX
368 |UNIT|USD
369
370 **Use Case 5: A Constraint on allowed values for some Dimensions together with some combination of Dimension values**
371
372 R1: For annual and quarterly data, for Mexico and Germany, only A status is allowed
373
374 R2: For monthly data, for Mexico and Germany, only F status is allowed
375
376 Considering the above examples, the following CubeRegions would be created:
377
378 |(% rowspan="3" %)CubeRegion1|FREQ|Q, A
379 |VIS_CTY|MX, DE
380 |OBS_STATUS|A
381 |(% rowspan="3" %)CubeRegion2|FREQ|M
382 |VIS_CTY|MX, DE
383 |OBS_STATUS|F
384
385 The problem with this approach is that, according to the business rule for Constraints, a Member Selection for a given Component can only be contained in one Cube Region for any one attached object. Thus, software that performs conflict resolution would end up with empty sets for FREQ and OBS_STATUS (as the two Cube Regions do not share any values for them).
386
387 Nevertheless, there is a much simpler approach: the cascading mechanism of Constraints (as shown in 10.4.7.1). The rules would be expressed at two levels of Constraints, e.g., DSD and Dataflows:
388
389 DSD CubeRegion:
390
391 |FREQ|M, Q, A
392 |VIS_CTY|MX, DE
393 |OBS_STATUS|A, F
394
395 Dataflow1 CubeRegion:
396
397 |FREQ|Q, A
398 |VIS_CTY|MX, DE
399 |OBS_STATUS|A
400
401 Dataflow2 CubeRegion:
402
403 |FREQ|M
404 |VIS_CTY|MX, DE
405 |OBS_STATUS|F
406
407 **Use Case 6: A Constraint on allowed values for some Dimensions combined with allowed values for Measures**
408
409 R1: Allow monthly data, for Germany, with unit euro, and measure choice is 'A'

R2: Allow quarterly data, for Mexico, with unit usd, and measure choice is 'B'
410
411 This may be expressed with the following DataKeySet:
412
413 |(% rowspan="4" %)Key1|FREQ|M
414 |VIS_CTY|DE
415 |UNIT|EUR
416 |CHOICE|A
417 |(% rowspan="4" %)Key2|FREQ|Q
418 |VIS_CTY|MX
419 |UNIT|USD
420 |CHOICE|B
421
422 **Use Case 7: A Constraint with wildcards for Codes and removePrefix property**

For this example, we assume that the VIS_CTY representation has been prefixed with ‘AREA_’. In this Constraint, we need to remove the prefix.
423
424 R1: Allow monthly and quarterly data
425
426 R2: Allow vis-à-vis countries that start with M
427
428 R3: Remove the prefix ‘AREA_’
429
430 This may be expressed with the following CubeRegion:
431
432 |FREQ|M, Q
433 |VIS_CTY (removePrefix=’AREA_’)|M%
434
435 **Use Case 8: A Constraint with multilingual support on Attributes**
436
437 R1: Allow monthly and quarterly data
438
439 R2: Allow Mexico for vis-à-vis country
440
441 R3: Allow a comment, in English, which includes the term adjusted for status
442
443 This may be expressed with the following CubeRegion:
444
445 |FREQ|M, Q
446 |VIS_CTY|MX
447 |COMMENT (lang=’en’)|%adjusted%
448
449 **Use Case 9: A Constraint on allowed values for Dimensions combined with allowed values for Metadata Attributes**

R1: Allow monthly and quarterly data
450
451 R2: Allow Mexico for vis-à-vis country
452
453 R3: Allow John Doe for contact
454
455 This may be expressed with the following CubeRegion:
456
457 |FREQ|M, Q
458 |VIS_CTY|MX
459 |CONTACT|John Doe