Changes for page 12 Validation and Transformation Language (VTL)
Last modified by Artur on 2025/09/10 11:19
Summary
-
Page properties (1 modified, 0 added, 0 removed)
Details
- Page properties
- Content
... ...
@@ -19,7 +19,6 @@
19 19 This section does not explain the VTL language or any of the content published in the VTL guides. Rather, this is a description of how the VTL can be used in the SDMX context and applied to SDMX artefacts.
20 20
21 21 == 12.2 References to SDMX artefacts from VTL statements ==
22 -
23 23 === 12.2.1 Introduction ===
24 24
25 25 The VTL can manipulate SDMX artefacts (or objects) by referencing them through predefined conventional names (aliases).
... ...
@@ -49,8 +49,10 @@
49 49
50 50 The generic structure of the URN is the following:
51 51
52 -SDMXprefix.SDMX-IM-package-name.class-name=agency-id:maintainedobject-id (maintainedobject-version).*container-object-id.object-id
51 +SDMXprefix.SDMX-IM-package-name.class-name=agency-id:maintainedobject-id
53 53
53 +(maintainedobject-version).*container-object-id.object-id
54 +
54 54 The **SDMXprefix** is "urn:sdmx:org", always the same for all SDMX artefacts.
55 55
56 56 The SDMX-IM-package-name is the concatenation of the string "sdmx.infomodel." with the package-name, which the artefact belongs to. For example, for referencing a Dataflow the SDMX-IM-package-name is "sdmx.infomodel.datastructure", because the class Dataflow belongs to the package "datastructure".
... ...
@@ -71,19 +71,24 @@
71 71
72 72 The maintainedobject-version is the version, according to the SDMX versioning rules, of the maintained object which the artefact belongs to (for example, possible versions might be 1.0, 2.3, 1.0.0, 2.1.0 or 3.1.2).
73 73
74 -The container-object-id does not apply to the classes that can be referenced in VTL Transformations, therefore is not present in their URN .
75 +The container-object-id does not apply to the classes that can be referenced in VTL Transformations, therefore is not present in their URN
75 75
76 76 The object-id is the name of the non-maintainable artefact (when the artefact is maintainable its name is already specified as the maintainedobject-id, see above), in particular it has to be specified:
77 77
78 -* if the artefact is a Dimension, TimeDimension, Measure or DataAttribute (the object-id is the name of one of the artefacts above, which are data structure components)
79 +* if the artefact is a Dimension, TimeDimension, Measure or
80 +
81 +DataAttribute (the object-id is the name of one of the artefacts above, which are data structure components)
82 +
79 79 * if the artefact is a Concept (the object-id is the name of the Concept)
80 80
81 81 For example, by using the URN, the VTL Transformation that sums two SDMX Dataflows DF1 and DF2 and assigns the result to a third persistent Dataflow DFR, assuming that DF1, DF2 and DFR are the maintainedobject-id of the three Dataflows, that their version is 1.0.0 and their Agency is AG, would be written as{{footnote}}Since these references to SDMX objects include non-permitted characters as per the VTL ID notation, they need to be included between single quotes, according to the VTL rules for irregular names.{{/footnote}}:
82 82
83 -> 'urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=AG:DFR(1.0.0)' <-
84 -> 'urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=AG:DF1(1.0.0)' +
85 -> 'urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=AG:DF2(1.0.0)'
87 +'urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=AG:DFR(1.0.0)' <-
86 86
89 +'urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=AG:DF1(1.0.0)' +
90 +
91 +'urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=AG:DF2(1.0.0)'
92 +
87 87 === 12.2.3 Abbreviation of the URN ===
88 88
89 89 The complete formulation of the URN described above is exhaustive but verbose, even for very simple statements. In order to reduce the verbosity through a simplified identifier and make the work of transformation definers easier, proper abbreviations of the URN are possible. Using this approach, the referenced artefacts remain intelligible in the VTL code by a human reader.
... ...
@@ -92,13 +92,10 @@
92 92
93 93 * The SDMXprefix can be omitted for all the SDMX objects, because it is a fixed string (urn:sdmx:org), always the same for SDMX objects.
94 94 * The SDMX-IM-package-name can be omitted as well because it can be deduced from the class-name that follows it (the table of the SDMX-IM packages and classes that allows this deduction is in the SDMX 2.1 Standards - Section 5 - Registry Specifications, paragraph 6.2.3). In particular, considering the object classes of the artefacts that VTL can reference, the package is:
95 -** "datastructure" for the classes Dataflow, Dimension, TimeDimension, Measure, DataAttribute,
96 -** "conceptscheme" for the class Concept,
97 -** "codelist" for the class Codelist.
101 +** "datastructure" for the classes Dataflow, Dimension, TimeDimension, Measure, DataAttribute, o "conceptscheme" for the class Concept, o "codelist" for the class Codelist.
98 98 * The class-name can be omitted as it can be deduced from the VTL invocation. In particular, starting from the VTL class of the invoked artefact (e.g. dataset, component, identifier, measure, attribute, variable, valuedomain), which is known given the syntax of the invoking VTL operator{{footnote}}For the syntax of the VTL operators see the VTL Reference Manual{{/footnote}}, the SDMX class can be deduced from the mapping rules between VTL and SDMX (see the section "Mapping between VTL and SDMX" hereinafter){{footnote}}In case the invoked artefact is a VTL component, which can be invoked only within the invocation of a VTL data set (SDMX Dataflow), the specific SDMX class-name (e.g. Dimension, TimeDimension, Measure or DataAttribute) can be deduced from the data structure of the SDMX Dataflow, which the component belongs to.{{/footnote}}.
99 99 * If the agency-id is not specified, it is assumed by default equal to the agency-id of the TransformationScheme, UserDefinedOperatorScheme or RulesetScheme from which the artefact is invoked. For example, the agency-id can be omitted if it is the same as the invoking TransformationScheme and cannot be omitted if the artefact comes from another agency{{footnote}}If the Agency is composite (for example AgencyA.Dept1.Unit2), the agency is considered different even if only part of the composite name is different (for example AgencyA.Dept1.Unit3 is a different Agency than the previous one). Moreover the agency-id cannot be omitted in part (i.e., if a TransformationScheme owned by AgencyA.Dept1.Unit2 references an artefact coming from AgencyA.Dept1.Unit3, the specification of the agency-id becomes mandatory and must be complete, without omitting the possibly equal parts like AgencyA.Dept1){{/footnote}}. Note also that, according to the VTL consistency rules, the agency of the result of a Transformation must be the same as its TransformationScheme, therefore the agency-id can be omitted for all the results (left part of Transformation statements).
100 -* As for the maintainedobject-id, this is essential in some cases while in other cases it can be omitted:
101 -** if the referenced artefact is a Dataflow, which is a maintainable class, the maintainedobject-id is the dataflow-id and obviously cannot be omitted;
104 +* As for the maintainedobject-id, this is essential in some cases while in other cases it can be omitted: o if the referenced artefact is a Dataflow, which is a maintainable class, the maintainedobject-id is the dataflow-id and obviously cannot be omitted;
102 102 ** if the referenced artefact is a Dimension, TimeDimension, Measure, DataAttribute, which are not maintainable and belong to the DataStructure maintainable class, the maintainedobject-id is the dataStructure-id and can be omitted, given that these components are always invoked within the invocation of a Dataflow, whose dataStructure-id can be deduced from the SDMX structural definitions;
103 103 ** if the referenced artefact is a Concept, which is not maintainable and belongs to the ConceptScheme maintainable class, the maintainedobject-id is the conceptScheme-id and cannot be omitted;
104 104 ** if the referenced artefact is a Codelist, which is a maintainable class, the maintainedobject-id is the codelist-id and obviously cannot be omitted.
... ...
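The URN structure of 12.2.2 and the abbreviation rules of 12.2.3 can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the SDMX or VTL specifications: the names `ArtefactRef`, `full_urn` and `abbreviate` are hypothetical, the `keep_version` switch is an assumption (the rule for the version is not shown in this excerpt), and the `agency:id(version)` form used when the agency differs is assumed from the generic URN structure. Quoting of VTL irregular names with single quotes is not handled.

```python
from dataclasses import dataclass
from typing import Optional

SDMX_PREFIX = "urn:sdmx:org"

# Package of each SDMX class that VTL can reference (as listed in 12.2.3).
PACKAGE_BY_CLASS = {
    "Dataflow": "datastructure",
    "Dimension": "datastructure",
    "TimeDimension": "datastructure",
    "Measure": "datastructure",
    "DataAttribute": "datastructure",
    "Concept": "conceptscheme",
    "Codelist": "codelist",
}

# Data structure components: always invoked within a Dataflow.
COMPONENT_CLASSES = {"Dimension", "TimeDimension", "Measure", "DataAttribute"}


@dataclass(frozen=True)
class ArtefactRef:
    """Hypothetical holder for the parts of a reference to an SDMX artefact."""
    class_name: str
    agency_id: str
    maintained_id: str
    version: str
    object_id: Optional[str] = None  # only for non-maintainable artefacts


def full_urn(ref: ArtefactRef) -> str:
    """Build the complete URN from its parts."""
    package = PACKAGE_BY_CLASS[ref.class_name]
    urn = (f"{SDMX_PREFIX}.sdmx.infomodel.{package}.{ref.class_name}"
           f"={ref.agency_id}:{ref.maintained_id}({ref.version})")
    if ref.object_id:
        urn += f".{ref.object_id}"
    return urn


def abbreviate(ref: ArtefactRef, invoking_agency: str,
               keep_version: bool = True) -> str:
    """Drop the parts that can be deduced from the invoking scheme.

    keep_version is an assumption of this sketch, not a rule from the text.
    """
    if ref.class_name in COMPONENT_CLASSES:
        if ref.object_id is None:
            raise ValueError("a component reference needs an object_id")
        # The dataStructure-id is deducible from the invoking Dataflow.
        return ref.object_id
    name = ref.maintained_id
    if keep_version:
        name += f"({ref.version})"
    if ref.agency_id != invoking_agency:
        # The agency-id must be given in full when it differs.
        name = f"{ref.agency_id}:{name}"
    if ref.object_id:
        name += f".{ref.object_id}"
    return name


df1 = ArtefactRef("Dataflow", "AG", "DF1", "1.0.0")
concept = ArtefactRef("Concept", "AG", "CS1", "1.0.0", "SECTOR")
print(full_urn(df1))
print(abbreviate(df1, "AG", keep_version=False))
print(abbreviate(concept, "AG"))
```

Keeping the package lookup in a single table mirrors the rule that the package can always be deduced from the class-name, which is why it never needs to appear in the abbreviated form.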
@@ -110,47 +110,51 @@
110 110
111 111 For example, the full formulation that uses the complete URN shown at the end of the previous paragraph:
112 112
113 -> 'urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=AG:DFR(1.0.0)' :=
114 -> 'urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=AG:DF1(1.0.0)' +
115 -> 'urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=AG:DF2(1.0.0)'
116 +'urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=AG:DFR(1.0.0)' :=
116 116
118 +'urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=AG:DF1(1.0.0)' +
119 +
120 +'urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=AG:DF2(1.0.0)'
121 +
117 117 by omitting all the non-essential parts would become simply:
118 118
119 - >DFR:=DF1 + DF2
124 +DFR := DF1 + DF2
120 120
121 121 The references to the Codelists can be simplified similarly. For example, given the non-abbreviated reference to the Codelist AG:CL_FREQ(1.0.0), which is{{footnote}}Single quotes are needed because this reference is not a VTL regular name.{{/footnote}}:
122 122
123 - >'urn:sdmx:org.sdmx.infomodel.codelist.Codelist=AG:CL_FREQ(1.0.0)'
128 +'urn:sdmx:org.sdmx.infomodel.codelist.Codelist=AG:CL_FREQ(1.0.0)'
124 124
125 125 if the Codelist is referenced from a RulesetScheme belonging to the agency AG, omitting all the optional parts, the abbreviated reference would become simply{{footnote}}Single quotes are not needed in this case because CL_FREQ is a VTL regular name.{{/footnote}}:
126 126
127 - >CL_FREQ
132 +CL_FREQ
128 128
129 129 As for the references to the components, it can be enough to specify the component-id, given that the dataStructure-id can be omitted. An example of non-abbreviated reference, if the data structure is DST1 and the component is SECTOR, is the following:
130 130
131 - >'urn:sdmx:org.sdmx.infomodel.datastructure.DataStructure=AG:DST1(1.0.0).SECTOR'
136 +'urn:sdmx:org.sdmx.infomodel.datastructure.DataStructure=AG:DST1(1.0.0).S
132 132
138 +ECTOR'
139 +
133 133 The corresponding fully abbreviated reference, if made from a TransformationScheme belonging to AG, would become simply:
134 134
135 - >SECTOR
142 +SECTOR
136 136
137 137 For example, the Transformation for renaming the component SECTOR of the Dataflow DF1 into SEC can be written as{{footnote}}The result DFR(1.0.0) is equal to DF1(1.0.0) save that the component SECTOR is called SEC{{/footnote}}:
138 138
139 - >'DFR(1.0.0)' := 'DF1(1.0.0)' [rename SECTOR to SEC]
146 +'DFR(1.0.0)' := 'DF1(1.0.0)' [rename SECTOR to SEC]
140 140
141 141 In the references to the Concepts, which can exist for example in the definition of the VTL Rulesets, at least the conceptScheme-id and the concept-id must be specified.
142 142
143 143 An example of non-abbreviated reference, if the conceptScheme-id is CS1 and the concept-id is SECTOR, is the following:
144 144
145 - >'urn:sdmx:org.sdmx.infomodel.conceptscheme.Concept=AG:CS1(1.0.0).SECTOR'
152 +'urn:sdmx:org.sdmx.infomodel.conceptscheme.Concept=AG:CS1(1.0.0).SECTOR'
146 146
147 147 The corresponding fully abbreviated reference, if made from a RulesetScheme belonging to AG, would become simply:
148 148
149 - >CS1(1.0.0).SECTOR
156 +CS1(1.0.0).SECTOR
150 150
151 151 The Codes and in general all the Values can be written without any other specification. For example, the Transformation to check if the values of the measures of the Dataflow DF1 are between 0 and 25000 can be written as follows:
152 152
153 - >'DFR(1.0.0)' := between ( 'DF1(1.0.0)', 0, 25000 )
160 +'DFR(1.0.0)' := between ( 'DF1(1.0.0)', 0, 25000 )
154 154
155 155 The artefact (Component, Concept, Codelist …) to which the Values refer can be deduced from the context in which the reference is made, taking also into account the VTL syntax. In the Transformation above, for example, the values 0 and 25000 are compared to the values of the measures of DF1(1.0.0).
156 156
... ...
@@ -173,7 +173,6 @@
173 173 In the body of the Rulesets, the Codes and in general all the Values can be written without any other specification, because the artefact to which the Values refer (Codelist, Concept) can be deduced from the Ruleset signature.
174 174
175 175 == 12.3 Mapping between SDMX and VTL artefacts ==
176 -
177 177 === 12.3.1 When the mapping occurs ===
178 178
179 179 The mapping methods between the VTL and SDMX object classes allow transforming an SDMX definition into a VTL one and vice versa for the artefacts to be manipulated. It should be remembered that VTL programs (i.e. Transformation Schemes) are represented in SDMX through the TransformationScheme maintainable class which is composed of Transformations (nameable artefacts). Each Transformation assigns the outcome of the evaluation of a VTL expression to a result: the input operands of the expression and the result can be SDMX artefacts. Every time an SDMX object is referenced in a VTL Transformation as an input operand, there is the need to generate a VTL definition of the object, so that the VTL operations can take place. This can be done starting from the SDMX definition and applying an SDMX-VTL mapping method in the direction from SDMX to VTL. The possible mapping methods from SDMX to VTL are described in the following paragraphs and are conceived to allow the automatic deduction of the VTL definition of the object from the knowledge of the SDMX definition.
... ...
@@ -206,12 +206,11 @@
206 206
207 207 When transforming **from SDMX to VTL**, this method consists in leaving the components unchanged and maintaining their names and roles, according to the following table:
208 208
209 -(% style="width:468.294px" %)
210 -|(% style="width:196px" %)**SDMX**|(% style="width:269px" %)**VTL**
211 -|(% style="width:196px" %)Dimension|(% style="width:269px" %)(Simple) Identifier
212 -|(% style="width:196px" %)TimeDimension|(% style="width:269px" %)(Time) Identifier
213 -|(% style="width:196px" %)Measure|(% style="width:269px" %)Measure
214 -|(% style="width:196px" %)DataAttribute|(% style="width:269px" %)Attribute
215 +|**SDMX**|**VTL**
216 +|Dimension|(Simple) Identifier
217 +|TimeDimension|(Time) Identifier
218 +|Measure|Measure
219 +|DataAttribute|Attribute
215 215
216 216 In VTL, all SDMX DataAttributes are considered "at data point / observation level" (i.e. dependent on all the VTL Identifiers), because VTL has no equivalent of the SDMX AttributeRelationship, which defines the construct to which the DataAttribute is related (e.g. observation, dimension or set or group of dimensions, whole data set).
217 217
... ...
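The SDMX-to-VTL direction of the mapping table above can be expressed as a simple lookup. The following Python fragment is a minimal sketch under stated assumptions: the function name `to_vtl_components` and the component names in the example (FREQ, TIME_PERIOD, OBS_VALUE, OBS_STATUS) are hypothetical, chosen only for illustration.

```python
# SDMX component class -> VTL role, following the mapping table above.
SDMX_TO_VTL_ROLE = {
    "Dimension": "(Simple) Identifier",
    "TimeDimension": "(Time) Identifier",
    "Measure": "Measure",
    "DataAttribute": "Attribute",
}


def to_vtl_components(sdmx_components):
    """Map (name, SDMX class) pairs to (name, VTL role) pairs.

    Component names are left unchanged; only the class is translated.
    """
    return [(name, SDMX_TO_VTL_ROLE[sdmx_class])
            for name, sdmx_class in sdmx_components]


# Hypothetical data structure, used only to exercise the mapping.
example = [
    ("FREQ", "Dimension"),
    ("TIME_PERIOD", "TimeDimension"),
    ("OBS_VALUE", "Measure"),
    ("OBS_STATUS", "DataAttribute"),
]
for name, role in to_vtl_components(example):
    print(f"{name}: {role}")
```

Every DataAttribute maps to a plain VTL Attribute regardless of its SDMX attachment level, so the lookup needs no information about the attribute relationship.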
@@ -534,7 +534,6 @@
534 534 It also remains up to the SDMX-VTL definer to ensure consistency between a VTL Ruleset defined on Variables and the SDMX Components on which the Ruleset is applied. In fact, a VTL Ruleset is expressed by means of the values of the Variables (i.e. SDMX Concepts), i.e. assuming definite representations for them (e.g. ISO-alpha-3 for country). If the Ruleset is applied to SDMX Components that have the same name as the Concept they refer to but different representations (e.g. ISO-alpha-2 for country), the Ruleset cannot work properly.
535 535
536 536 == 12.4 Mapping between SDMX and VTL Data Types ==
537 -
538 538 === 12.4.1 VTL Data types ===
539 539
540 540 According to the VTL User Guide the possible operations in VTL depend on the data types of the artefacts. For example, numbers can be multiplied but text strings cannot. In the VTL Transformations, the compliance between the operators and the data types of their operands is statically checked, i.e., violations result in compile-time errors.