Changes for page 12 Validation and Transformation Language (VTL)
Last modified by Artur on 2025/09/10 11:19
Summary
- Page properties (1 modified, 0 added, 0 removed)
Details
- Page properties
- Content
... ...
@@ -2,9 +2,10 @@
2 2 {{toc/}}
3 3 {{/box}}
4 4
5 -== 12.1 Introduction ==
5 +1.
6 +11. Introduction
6 6
7 -The Validation and Transformation Language (VTL) supports the definition of Transformations, which are algorithms to calculate new data starting from already existing ones {{footnote}}The Validation and Transformation Language is a standard language designed and published under the SDMX initiative. VTL is described in the VTL User and Reference Guides available on the SDMX website https://sdmx.org.{{/footnote}}. The purpose of the VTL in the SDMX context is to enable the:
8 +The Validation and Transformation Language (VTL) supports the definition of Transformations, which are algorithms to calculate new data starting from already existing ones[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[1~]^^>>path:#_ftn1]](%%). The purpose of the VTL in the SDMX context is to enable the:
8 8
9 9 * definition of validation and transformation algorithms, in order to specify how to calculate new data from existing ones;
10 10 * exchange of the definition of VTL algorithms, also together with the definition of the data structures of the involved data (for example, exchange the data structures of a reporting framework together with the validation rules to be applied, exchange the input and output data structures of a calculation task together with the VTL Transformations describing the calculation algorithms);
... ...
@@ -12,30 +12,33 @@
12 12
13 13 It is important to note that the VTL has its own information model (IM), derived from the Generic Statistical Information Model (GSIM) and described in the VTL User Guide.
The VTL IM is designed to be compatible with more standards, like SDMX, DDI (Data Documentation Initiative) and GSIM, and includes the model artefacts that can be manipulated (inputs and/or outputs of Transformations, e.g. "Data Set", "Data Structure") and the model artefacts that allow the definition of the transformation algorithms (e.g. "Transformation", "Transformation Scheme").
14 14
15 -The VTL language can be applied to SDMX artefacts by mapping the SDMX IM model artefacts to the model artefacts that VTL can manipulate {{footnote}}In this chapter, in order to distinguish VTL and SDMX model artefacts, the VTL ones are written in the Arial font while the SDMX ones in Courier New.{{/footnote}}. Thus, the SDMX artefacts can be used in VTL as inputs and/or outputs of Transformations. It is important to be aware that the artefacts do not always have the same names in the SDMX and VTL IMs, nor do they always have the same meaning. The more evident example is given by the SDMX Dataset and the VTL "Data Set", which do not correspond one another: as a matter of fact, the VTL "Data Set" maps to the SDMX "Dataflow", while the SDMX "Dataset" has no explicit mapping to VTL (such an abstraction is not needed in the definition of VTL Transformations). A SDMX "Dataset", however, is an instance of a SDMX "Dataflow" and can be the artefact on which the VTL transformations are executed (i.e., the Transformations are defined on Dataflows and are applied to Dataflow instances that can be Datasets).
16 +The VTL language can be applied to SDMX artefacts by mapping the SDMX IM model artefacts to the model artefacts that VTL can manipulate[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[2~]^^>>path:#_ftn2]](%%). Thus, the SDMX artefacts can be used in VTL as inputs and/or outputs of Transformations.
It is important to be aware that the artefacts do not always have the same names in the SDMX and VTL IMs, nor do they always have the same meaning. The more evident example is given by the SDMX Dataset and the VTL "Data Set", which do not correspond one another: as a matter of fact, the VTL "Data Set" maps to the SDMX "Dataflow", while the SDMX "Dataset" has no explicit mapping to VTL (such an abstraction is not needed in the definition of VTL Transformations). A SDMX "Dataset", however, is an instance of a SDMX "Dataflow" and can be the artefact on which the VTL transformations are executed (i.e., the Transformations are defined on Dataflows and are applied to Dataflow instances that can be Datasets).
16 16
17 17 The VTL programs (Transformation Schemes) are represented in SDMX through the TransformationScheme maintainable class which is composed of Transformation (nameable artefact). Each Transformation assigns the outcome of the evaluation of a VTL expression to a result.
18 18
19 19 This section does not explain the VTL language or any of the content published in the VTL guides. Rather, this is a description of how the VTL can be used in the SDMX context and applied to SDMX artefacts.
20 20
21 -== 12.2 References to SDMX artefacts from VTL statements ==
22 -=== 12.2.1 Introduction ===
22 +1.
23 +11. References to SDMX artefacts from VTL statements
24 +111. Introduction
23 23
24 24 The VTL can manipulate SDMX artefacts (or objects) by referencing them through predefined conventional names (aliases).
25 25
26 26 The alias of an SDMX artefact can be its URN (Universal Resource Name), an abbreviation of its URN or another user-defined name.
27 27
28 -In any case, the aliases used in the VTL Transformations have to be mapped to the SDMX artefacts through the VtlMappingScheme and VtlMapping classes (see the section of the SDMX IM relevant to the VTL).
A VtlMapping allows specifying the aliases to be used in the VTL Transformations, Rulesets {{footnote}}See also the section "VTL-DL Rulesets" in the VTL Reference Manual.{{/footnote}} or User Defined Operators {{footnote}}The VTL Mappings are used also for User Defined Operators (UDO). Although UDOs are envisaged to be defined on generic operands, so that the specific artefacts to be manipulated are passed as parameters at their invocation, it is also possible that an UDO invokes directly some specific SDMX artefacts. These SDMX artefacts have to be mapped to the corresponding aliases used in the definition of the UDO through the VtlMappingScheme and VtlMapping classes as well.{{/footnote}} to reference SDMX artefacts. A VtlMappingScheme is a container for zero or more VtlMapping.
30 +In any case, the aliases used in the VTL Transformations have to be mapped to the SDMX artefacts through the VtlMappingScheme and VtlMapping classes (see the section of the SDMX IM relevant to the VTL). A VtlMapping allows specifying the aliases to be used in the VTL Transformations, Rulesets[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[3~]^^>>path:#_ftn3]](%%) or User Defined Operators[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[4~]^^>>path:#_ftn4]](%%) to reference SDMX artefacts. A VtlMappingScheme is a container for zero or more VtlMapping.
29 29
30 30 The correspondence between an alias and a SDMX artefact must be one-to-one, meaning that a generic alias identifies one and just one SDMX artefact while a SDMX artefact is identified by one and just one alias.
In other words, within a VtlMappingScheme an artefact can have just one alias and different artefacts cannot have the same alias.
31 31
32 32 The references through the URN and the abbreviated URN are described in the following paragraphs.
33 33
34 -=== 12.2.2 References through the URN ===
36 +1.
37 +11.
38 +111. References through the URN
35 35
36 36 This approach has the advantage that in the VTL code the URN of the referenced artefacts is directly intelligible by a human reader but has the drawback that the references are verbose.
37 37
38 -The SDMX URN {{footnote}}For a complete description of the structure of the URN see the SDMX 2.1 Standards - Section 5 - Registry Specifications, paragraph 6.2.2 ("Universal Resource Name (URN)").{{/footnote}}(% style="font-size:12px" %)(%%) is the concatenation of the following parts, separated by special symbols like dot, equal, asterisk, comma, and parenthesis:
42 +The SDMX URN[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[5~]^^>>path:#_ftn5]](%%) is the concatenation of the following parts, separated by special symbols like dot, equal, asterisk, comma, and parenthesis:
39 39
40 40 * SDMXprefix
41 41 * SDMX-IM-package-name
... ...
@@ -43,7 +43,7 @@
43 43 * agency-id
44 44 * maintainedobject-id
45 45 * maintainedobject-version
46 -* container-object-id {{footnote}}The container-object-id can repeat and may not be present.{{/footnote}}
50 +* container-object-id [[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[6~]^^>>path:#_ftn6]]
47 47 * object-id
48 48
49 49 The generic structure of the URN is the following:
... ...
@@ -60,7 +60,7 @@
60 60
61 61 The agency-id is the acronym of the agency that owns the definition of the artefact (for example, for the Eurostat artefacts the agency-id is "ESTAT"). The agency-id can be composite (for example AgencyA.Dept1.Unit2).
62 62
63 -The maintainedobject-id is the name of the maintained object which the artefact belongs to, and in case the artefact itself is maintainable {{footnote}}i.e., the artefact belongs to a maintainable class{{/footnote}}, coincides with the name of the artefact. Therefore the maintainedobject-id depends on the class of the artefact:
67 +The maintainedobject-id is the name of the maintained object which the artefact belongs to, and in case the artefact itself is maintainable[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[7~]^^>>path:#_ftn7]](%%), coincides with the name of the artefact. Therefore the maintainedobject-id depends on the class of the artefact:
64 64
65 65 * if the artefact is a Dataflow, which is a maintainable class, the maintainedobject-id is the Dataflow name (dataflow-id);
66 66 * if the artefact is a Dimension, Measure, TimeDimension or DataAttribute, which are not maintainable and belong to the
... ...
@@ -82,7 +82,7 @@
82 82
83 83 * if the artefact is a Concept (the object-id is the name of the Concept)
84 84
85 -For example, by using the URN, the VTL Transformation that sums two SDMX Dataflows DF1 and DF2 and assigns the result to a third persistent Dataflow DFR, assuming that DF1, DF2 and DFR are the maintainedobject-id of the three Dataflows, that their version is 1.0.0 and their Agency is AG, would be written as {{footnote}}Since these references to SDMX objects include non-permitted characters as per the VTL ID notation, they need to be included between single quotes, according to the VTL rules for irregular names.{{/footnote}}:
89 +For example, by using the URN, the VTL Transformation that sums two SDMX Dataflows DF1 and DF2 and assigns the result to a third persistent Dataflow DFR, assuming that DF1, DF2 and DFR are the maintainedobject-id of the three Dataflows, that their version is 1.0.0 and their Agency is AG, would be written as[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[8~]^^>>path:#_ftn8]](%%):
86 86
87 87 'urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=AG:DFR(1.0.0)' <-
88 88
... ...
@@ -90,7 +90,9 @@
90 90
91 91 'urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=AG:DF2(1.0.0)'
92 92
93 -=== 12.2.3 Abbreviation of the URN ===
97 +1.
98 +11.
99 +111. Abbreviation of the URN
94 94
95 95 The complete formulation of the URN described above is exhaustive but verbose, even for very simple statements. In order to reduce the verbosity through a simplified identifier and make the work of transformation definers easier, proper abbreviations of the URN are possible. Using this approach, the referenced artefacts remain intelligible in the VTL code by a human reader.
96 96
... ...
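As an illustration of the full (non-abbreviated) references shown in the example above, the following Python sketch assembles the URN of a maintainable artefact from its parts. It is a minimal sketch, not part of the SDMX or VTL specifications: the function name `build_urn` and the constant `INFOMODEL_SEGMENT` are hypothetical, the `sdmx.infomodel` segment is simply taken from the example URNs, and the optional container-object-id and object-id parts are deliberately not handled.

```python
# Fixed prefix, as stated in the text ("urn:sdmx:org").
SDMX_PREFIX = "urn:sdmx:org"
# Assumption: constant segment observed in the example URNs of this page.
INFOMODEL_SEGMENT = "sdmx.infomodel"


def build_urn(package: str, class_name: str, agency_id: str,
              maintainedobject_id: str, version: str) -> str:
    """Build the URN of a maintainable artefact (e.g. a Dataflow or a Codelist).

    Hypothetical helper: container-object-id and object-id are not supported.
    """
    return (
        f"{SDMX_PREFIX}.{INFOMODEL_SEGMENT}.{package}.{class_name}"
        f"={agency_id}:{maintainedobject_id}({version})"
    )


def quote_irregular(name: str) -> str:
    """Wrap a reference in single quotes, as the text requires for names
    that contain characters not permitted in VTL regular names."""
    return f"'{name}'"


# The result operand of the example Transformation above.
dfr = quote_irregular(build_urn("datastructure", "Dataflow", "AG", "DFR", "1.0.0"))
print(dfr)
```

Keeping the parts separate makes it easy to see which of them the abbreviation rules allow a definer to drop.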
@@ -99,10 +99,14 @@
99 99 * The SDMXprefix can be omitted for all the SDMX objects, because it is a prefixed string (urn:sdmx:org), always the same for SDMX objects.
100 100 * The SDMX-IM-package-name** **can be omitted as well because it can be deduced from the class-name that follows it (the table of the SDMX-IM packages and classes that allows this deduction is in the SDMX 2.1 Standards - Section 5 - Registry Specifications, paragraph 6.2.3). In particular, considering the object classes of the artefacts that VTL can reference, the package is:
101 101 ** "datastructure" for the classes Dataflow, Dimension, TimeDimension, Measure, DataAttribute, o "conceptscheme" for the class Concept, o "codelist" for the class Codelist.
102 -* The class-name can be omitted as it can be deduced from the VTL invocation. In particular, starting from the VTL class of the invoked artefact (e.g. dataset, component, identifier, measure, attribute, variable, valuedomain), which is known given the syntax of the invoking VTL operator {{footnote}}For the syntax of the VTL operators see the VTL Reference Manual{{/footnote}}, the SDMX class can be deduced from the mapping rules between VTL and SDMX (see the section "Mapping between VTL and SDMX" hereinafter){{footnote}}In case the invoked artefact is a VTL component, which can be invoked only within the invocation of a VTL data set (SDMX Dataflow), the specific SDMX class-name (e.g. Dimension, TimeDimension, Measure or DataAttribute) can be deduced from the data structure of the SDMX Dataflow, which the component belongs to.{{/footnote}}.
103 -* If the agency-id is not specified, it is assumed by default equal to the agency-id of the TransformationScheme, UserDefinedOperatorScheme or RulesetScheme from which the artefact is invoked.
For example, the agencyid can be omitted if it is the same as the invoking TransformationScheme and cannot be omitted if the artefact comes from another agency {{footnote}}If the Agency is composite (for example AgencyA.Dept1.Unit2), the agency is considered different even if only part of the composite name is different (for example AgencyA.Dept1.Unit3 is a different Agency than the previous one). Moreover the agency-id cannot be omitted in part (i.e., if a TransformationScheme owned by AgencyA.Dept1.Unit2 references an artefact coming from AgencyA.Dept1.Unit3, the specification of the agency-id becomes mandatory and must be complete, without omitting the possibly equal parts like AgencyA.Dept1){{/footnote}}. Take also into account that, according to the VTL consistency rules, the agency of the result of a Transformation must be the same as its TransformationScheme, therefore the agency-id can be omitted for all the results (left part of Transformation statements).
108 +* The class-name can be omitted as it can be deduced from the VTL invocation. In particular, starting from the VTL class of the invoked artefact (e.g. dataset, component, identifier, measure, attribute, variable, valuedomain), which is known given the syntax of the invoking VTL operator[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[9~]^^>>path:#_ftn9]](%%), the SDMX class can be deduced from the mapping rules between VTL and SDMX (see the section "Mapping between VTL and SDMX" hereinafter)[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[10~]^^>>path:#_ftn10]](%%).
109 +* If the agency-id is not specified, it is assumed by default equal to the agency-id of the TransformationScheme, UserDefinedOperatorScheme or RulesetScheme from which the artefact is invoked. For example, the agencyid can be omitted if it is the same as the invoking TransformationScheme and cannot be omitted if the artefact comes from another agency[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[11~]^^>>path:#_ftn11]](%%). Take also into account that, according to the VTL consistency rules, the agency of the result of a Transformation must be the same as its TransformationScheme, therefore the agency-id can be omitted for all the results (left part of Transformation statements).
104 104 * As for the maintainedobject-id, this is essential in some cases while in other cases it can be omitted: o if the referenced artefact is a Dataflow, which is a maintainable class, the maintainedobject-id is the dataflow-id and obviously cannot be omitted;
105 -** if the referenced artefact is a Dimension, TimeDimension, Measure, DataAttribute, which are not maintainable and belong to the DataStructure maintainable class, the maintainedobject-id is the dataStructure-id and can be omitted, given that these components are always invoked within the invocation of a Dataflow, whose dataStructure-id can be deduced from the SDMX structural definitions;
111 +** if the referenced artefact is a Dimension, TimeDimension, Measure,
112 +
113 +DataAttribute, which are not maintainable and belong to the DataStructure maintainable class, the maintainedobject-id is the dataStructure-id and can be omitted, given that these components are always invoked within the invocation of a Dataflow, whose dataStructure-id can be deduced from the SDMX structural definitions;
114 +
115 +*
106 106 ** if the referenced
artefact is a Concept, which is not maintainable and belongs to the ConceptScheme maintainable class, the maintained object is the conceptScheme-id and cannot be omitted;
107 107 ** if the referenced artefact is a Codelist, which is a maintainable class, the maintainedobject-id is the codelist-id and obviously cannot be omitted.
108 108 * When the maintainedobject-id is omitted, the maintainedobject-version is omitted too. When the maintainedobject-id is not omitted and the maintainedobject-version is omitted, the version 1.0 is assumed by default.
... ...
@@ -123,7 +123,7 @@
123 123
124 124 DFR := DF1 + DF2
125 125
126 -The references to the Codelists can be simplified similarly. For example, given the non-abbreviated reference to the Codelist AG:CL_FREQ(1.0.0), which is {{footnote}}Single quotes are needed because this reference is not a VTL regular name. 19 Single quotes are not needed in this case because CL_FREQ is a VTL regular name.{{/footnote}}:
136 +The references to the Codelists can be simplified similarly. For example, given the non-abbreviated reference to the Codelist AG:CL_FREQ(1.0.0), which is[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[12~]^^>>path:#_ftn12]](%%):
127 127
128 128 'urn:sdmx:org.sdmx.infomodel.codelist.Codelist=AG:CL_FREQ(1.0.0)'
129 129
... ...
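The abbreviation rules for maintainable artefacts such as Dataflows and Codelists can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: `abbreviate` and `vtl_name` are hypothetical helpers, the regular expression accepts only the simple URN shape used in the examples of this page (no container-object-id or object-id), the version is always kept, and the test for a VTL "regular name" is a simplification (a letter followed by letters, digits or underscores) rather than the rule given in the VTL Reference Manual.

```python
import re

# Assumption: only URNs of maintainable artefacts, shaped like the examples in this page.
URN_PATTERN = re.compile(
    r"^urn:sdmx:org\.sdmx\.infomodel\.(?P<package>\w+)\.(?P<cls>\w+)"
    r"=(?P<agency>[\w.]+):(?P<id>\w+)\((?P<version>[\w.+\-]+)\)$"
)

# Simplified stand-in for the VTL definition of a regular name.
REGULAR_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def abbreviate(urn: str, scheme_agency: str) -> str:
    """Drop prefix, package and class-name; drop the agency-id only when it is
    identical (as a whole, composite parts included) to the agency of the
    invoking scheme."""
    match = URN_PATTERN.match(urn)
    if match is None:
        raise ValueError(f"unsupported URN: {urn}")
    reference = f"{match['id']}({match['version']})"
    if match["agency"] != scheme_agency:
        reference = f"{match['agency']}:{reference}"
    return reference


def vtl_name(reference: str) -> str:
    """Single-quote the reference unless it is a (simplified) regular name."""
    return reference if REGULAR_NAME.match(reference) else f"'{reference}'"


full = "urn:sdmx:org.sdmx.infomodel.codelist.Codelist=AG:CL_FREQ(1.0.0)"
print(vtl_name(abbreviate(full, "AG")))      # agency equal to the scheme's: omitted
print(vtl_name(abbreviate(full, "OTHER")))   # different agency: kept in full
```

The agency comparison is made on the complete identifier, so a composite agency that differs in only one part is treated as a different agency and is never shortened.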
@@ -141,7 +141,7 @@
141 141
142 142 SECTOR
143 143
144 -For example, the Transformation for renaming the component SECTOR of the Dataflow DF1 into SEC can be written as {{footnote}}The result DFR(1.0.0) is equal to DF1(1.0.0) save that the component SECTOR is called SEC{{/footnote}}:
154 +For example, the Transformation for renaming the component SECTOR of the Dataflow DF1 into SEC can be written as[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[13~]^^>>path:#_ftn13]](%%):
145 145
146 146 'DFR(1.0.0)' := 'DF1(1.0.0)' [rename SECTOR to SEC]
147 147
... ...
@@ -161,38 +161,45 @@
161 161
162 162 The artefact (Component, Concept, Codelist …) which the Values are referred to can be deduced from the context in which the reference is made, taking also into account the VTL syntax. In the Transformation above, for example, the values 0 and 2500 are compared to the values of the measures of DF1(1.0.0).
163 163
164 -=== 12.2.4 User-defined alias ===
174 +1.
175 +11.
176 +111. User-defined alias
165 165
166 166 The third possibility for referencing SDMX artefacts from VTL statements is to use user-defined aliases not related to the SDMX URN of the artefact.
167 167
168 168 This approach gives preference to the use of symbolic names for the SDMX artefacts. As a consequence, in the VTL code the referenced artefacts may become not directly intelligible by a human reader. In any case, the VTL aliases are associated to the SDMX URN through the VtlMappingScheme and VtlMapping classes. These classes provide for structured references to SDMX artefacts whatever kind of reference is used in VTL statements (URN, abbreviated URN or user-defined aliases).
169 169
170 -=== 12.2.5 References to SDMX artefacts from VTL Rulesets ===
182 +1.
183 +11.
184 +111.
References to SDMX artefacts from VTL Rulesets
171 171
172 172 The VTL Rulesets allow defining sets of reusable Rules that can be applied by some VTL operators, like the ones for validation and hierarchical roll-up. A "Rule" consists in a relationship between Values belonging to some Value Domains or taken by some Variables, for example: (i) when the Country is USA then the Currency is USD; (ii) the Benelux is composed by Belgium, Luxembourg, Netherlands.
173 173
174 174 The VTL Rulesets have a signature, in which the Value Domains or the Variables on which the Ruleset is defined are declared, and a body, which contains the Rules.
175 175
176 -In the signature, given the mapping between VTL and SDMX better described in the following paragraphs, a reference to a VTL Value Domain becomes a reference to a SDMX Codelist, while a reference to a VTL Represented Variable becomes a reference to a SDMX Concept, assuming for it a definite representation {{footnote}}Rulesets of this kind cannot be reused when the referenced Concept has a different representation.{{/footnote}}.
190 +In the signature, given the mapping between VTL and SDMX better described in the following paragraphs, a reference to a VTL Value Domain becomes a reference to a SDMX Codelist, while a reference to a VTL Represented Variable becomes a reference to a SDMX Concept, assuming for it a definite representation[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[14~]^^>>path:#_ftn14]](%%).
177 177
178 -In general, for referencing SDMX Codelists and Concepts, the conventions described in the previous paragraphs apply.
In the Ruleset syntax, the elements that reference SDMX artefacts are called "valueDomain" and "variable" for the Datapoint Rulesets and "ruleValueDomain", "ruleVariable", "condValueDomain", "condVariable" for the Hierarchical Rulesets. The syntax of the Ruleset signature also allows defining aliases of the elements above; these aliases are valid only within the specific Ruleset definition statement and cannot be mapped to SDMX. {{footnote}}See also the section "VTL-DL Rulesets" in the VTL Reference Manual.{{/footnote}}
192 +In general, for referencing SDMX Codelists and Concepts, the conventions described in the previous paragraphs apply. In the Ruleset syntax, the elements that reference SDMX artefacts are called "valueDomain" and "variable" for the Datapoint Rulesets and "ruleValueDomain", "ruleVariable", "condValueDomain", "condVariable" for the Hierarchical Rulesets. The syntax of the Ruleset signature also allows defining aliases of the elements above; these aliases are valid only within the specific Ruleset definition statement and cannot be mapped to SDMX.[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[15~]^^>>path:#_ftn15]]
179 179
180 180 In the body of the Rulesets, the Codes and in general all the Values can be written without any other specification, because the artefact to which the Values refer (Codelist, Concept) can be deduced from the Ruleset signature.
181 181
182 -== 12.3 Mapping between SDMX and VTL artefacts ==
183 -=== 12.3.1. When the mapping occurs ===
196 +1.
197 +11. Mapping between SDMX and VTL artefacts
198 +111. When the mapping occurs
184 184
185 185 The mapping methods between the VTL and SDMX object classes allow transforming a SDMX definition in a VTL one and vice-versa for the artefacts to be manipulated.
It should be remembered that VTL programs (i.e. Transformation Schemes) are represented in SDMX through the TransformationScheme maintainable class which is composed of Transformations (nameable artefacts). Each Transformation assigns the outcome of the evaluation of a VTL expression to a result: the input operands of the expression and the result can be SDMX artefacts. Every time a SDMX object is referenced in a VTL Transformation as an input operand, there is the need to generate a VTL definition of the object, so that the VTL operations can take place. This can be made starting from the SDMX definition and applying a SDMX-VTL mapping method in the direction from SDMX to VTL. The possible mapping methods from SDMX to VTL are described in the following paragraphs and are conceived to allow the automatic deduction of the VTL definition of the object from the knowledge of the SDMX definition.
186 186
187 -In the opposite direction, every time an object calculated by means of VTL must be treated as a SDMX object (for example for exchanging it through SDMX), there is the need of a SDMX definition of the object, so that the SDMX operations can take place. The SDMX definition is needed for the VTL objects for which a SDMX use is envisaged {{footnote}}If a calculated artefact is persistent, it needs a persistent definition, i.e. a SDMX definition in a SDMX environment. In addition, possible calculated artefacts that are not persistent may require a SDMX definition, for example when the result of a non-persistent calculation is disseminated through SDMX tools (like an inquiry tool).{{/footnote}}.
202 +In the opposite direction, every time an object calculated by means of VTL must be treated as a SDMX object (for example for exchanging it through SDMX), there is the need of a SDMX definition of the object, so that the SDMX operations can take place.
The SDMX definition is needed for the VTL objects for which a SDMX use is envisaged[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[16~]^^>>path:#_ftn16]](%%).
188 188
189 189 The mapping methods from VTL to SDMX are described in the following paragraphs as well; however, they do not allow the complete SDMX definition to be automatically deduced from the VTL definition, mainly because the former typically contains additional information with respect to the latter. For example, the definition of a SDMX DSD includes also some mandatory information not available in VTL (like the concept scheme to which the SDMX components refer, the ‘usage’ and ‘attributeRelationship’ for the DataAttributes and so on). Therefore the mapping methods from VTL to SDMX provide only a general guidance for generating SDMX definitions properly starting from the information available in VTL, independently of how the SDMX definition is actually generated (manually, automatically or part and part).
190 190
191 -=== 12.3.2 General mapping of VTL and SDMX data structures ===
206 +1.
207 +11.
208 +111.
General mapping of VTL and SDMX data structures
192 192
193 -This section makes reference to the VTL "Model for data and their structure" {{footnote}}See the VTL 2.0 User Manual{{/footnote}} and the correspondent SDMX "Data Structure Definition" {{footnote}}See the SDMX Standards Section 2 – Information Model{{/footnote}}.
210 +This section makes reference to the VTL "Model for data and their structure"[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[17~]^^>>path:#_ftn17]](%%) and the correspondent SDMX "Data Structure Definition"[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[18~]^^>>path:#_ftn18]](%%).
194 194
195 -The main type of artefact that the VTL can manipulate is the VTL Data Set, which in general is mapped to the SDMX Dataflow. This means that a VTL Transformation, in the SDMX context, expresses the algorithm for calculating a derived Dataflow starting from some already existing Dataflows (either collected or derived). {{footnote}}Besides the mapping between one SDMX Dataflow and one VTL Data Set, it is also possible to map distinct parts of a SDMX Dataflow to different VTL Data Sets, as explained in a following paragraph.{{/footnote}}
212 +The main type of artefact that the VTL can manipulate is the VTL Data Set, which in general is mapped to the SDMX Dataflow.
This means that a VTL Transformation, in the SDMX context, expresses the algorithm for calculating a derived Dataflow starting from some already existing Dataflows (either collected or derived).[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[19~]^^>>path:#_ftn19]]
196 196
197 197 While the VTL Transformations are defined in terms of Dataflow definitions, they are assumed to be executed on instances of such Dataflows, provided at runtime to the VTL engine (the mechanisms for identifying the instances to be processed are not part of the VTL specifications and depend on the implementation of the VTL-based systems). As already said, the SDMX Datasets are instances of SDMX Dataflows, therefore a VTL Transformation defined on some SDMX Dataflows can be applied on some corresponding SDMX Datasets.
198 198
... ...
@@ -204,9 +204,11 @@
204 204
205 205 The possible mapping options are described in more detail in the following sections.
206 206
207 -=== 12.3.2 Mapping from SDMX to VTL data structures ===
224 +1.
225 +11.
226 +111. Mapping from SDMX to VTL data structures
208 208
209 -==== 12.3.3.1 Basic Mapping ====
228 +**12.3.3.1 Basic Mapping**
210 210
211 211 The main mapping method from SDMX to VTL is called **Basic **mapping. This is considered as the default mapping method and is applied unless a different method is specified through the VtlMappingScheme and VtlDataflowMapping classes.
212 212
... ...
@@ -228,7 +228,7 @@
228 228
229 229 In SDMX 2.1 the MeasureDimension was a subclass of DimensionComponent like Dimension and TimeDimension. In the current SDMX version, this subclass does not exist anymore, however a Dimension can have the role of measure dimension (i.e. a Dimension that contributes to the identification of the measures).
In SDMX 2.1 a DataStructure could have zero or one MeasureDimension; in the current version of the standard, zero to many Dimensions may have the role of measure dimension. Hereinafter a Dimension that plays the role of measure dimension is referred to, for simplicity, as “MeasureDimension”, i.e. maintaining the capital letters and the courier font even if the MeasureDimension is no longer a class in the SDMX Information Model of the current SDMX version. For the sake of simplicity, the description below considers just one Dimension having the role of MeasureDimension (i.e., the simplest and most common case). Nevertheless, it remains valid if the DataStructure contains several Dimensions with the role of MeasureDimension: in this case what is said about the MeasureDimension must be applied to the combination of all the 230 230 231 -MeasureDimensions considered as a joint variable {{footnote}}E.g., if in the data structure there exist 3 Dimensions C,D,E having the role of MeasureDimension, they should be considered as a joint MeasureDimension Z=(C,D,E); therefore when the description says “each possible value Cj of the MeasureDimension …” it means “each possible combination of values (Cj, Dk, Ew) of the joint MeasureDimension Z=(C,D,E)”.{{/footnote}}. 250 +MeasureDimensions considered as a joint variable[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[20~]^^>>path:#_ftn20]](%%). 232 232 233 233 Among other things, the Pivot method also provides backward compatibility with the SDMX 2.1 data structures that contained a MeasureDimension. 234 234 ... ... @@ -383,19 +383,19 @@ 383 383 384 384 (corresponding to one VTL Data Set) or as the union of many sets of data observations (each one corresponding to a distinct VTL Data Set). 
385 385 386 -As a matter of fact, in some cases it can be useful to define VTL operations involving specific parts of a SDMX Dataflow rather than the whole. {{footnote}}A typical example of this kind is the validation, and more generally the manipulation, of individual time series belonging to the same Dataflow, identifiable through the DimensionComponents of the Dataflow except the TimeDimension. The coding of this kind of operations might be simplified by mapping distinct time series (i.e. different parts of a SDMX Dataflow) to distinct VTL Data Sets.{{/footnote}} 405 +As a matter of fact, in some cases it can be useful to define VTL operations involving specific parts of a SDMX Dataflow rather than the whole.[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[21~]^^>>path:#_ftn21]] 387 387 388 -Therefore, in order to make the coding of VTL operations simpler when applied on parts of SDMX Dataflows, it is allowed to map distinct parts of a SDMX Dataflow to distinct VTL Data Sets according to the following rules and conventions. This kind of mapping is possible both from SDMX to VTL and from VTL to SDMX, as explained below. {{footnote}}Please note that this kind of mapping is only an option at the disposal of the definer of VTL Transformations; in fact it always remains possible to manipulate the needed parts of SDMX Dataflows by means of VTL operators (e.g. “sub”, “filter”, “calc”, “union” …), maintaining a one-to-one mapping between SDMX Dataflows and VTL Data Sets.{{/footnote}} 407 +Therefore, in order to make the coding of VTL operations simpler when applied on parts of SDMX Dataflows, it is allowed to map distinct parts of a SDMX Dataflow to distinct VTL Data Sets according to the following rules and conventions. 
This kind of mapping is possible both from SDMX to VTL and from VTL to SDMX, as explained below.[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[22~]^^>>path:#_ftn22]] 389 389 390 390 Given a SDMX Dataflow and some predefined Dimensions of its DataStructure, it is allowed to map the subsets of observations that have the same combination of values for such Dimensions to corresponding VTL Data Sets. 391 391 392 392 For example, assuming that the SDMX Dataflow DF1(1.0.0) has the Dimensions INDICATOR, TIME_PERIOD and COUNTRY, and that the user declares the Dimensions INDICATOR and COUNTRY as the basis for the mapping (i.e. the mapping dimensions): the observations that have the same values for INDICATOR and COUNTRY would be mapped to the same VTL Data Set (and vice versa). In practice, this kind of mapping is obtained as follows: 393 393 394 -* For a given SDMX Dataflow, the user (VTL definer) declares the DimensionComponents on which the mapping will be based, in a given order. {{footnote}}This definition is made through the ToVtlSubspace and ToVtlSpaceKey classes and/or the FromVtlSuperspace and FromVtlSpaceKey classes, depending on the direction of the mapping (“key” means “dimension”). 
The mapping of Dataflow subsets can be applied independently in the two directions, also according to different Dimensions. When no Dimension is declared for a given direction, it is assumed that the option of mapping different parts of a SDMX Dataflow to different VTL Data Sets is not used.{{/footnote}} Following the example above, imagine that the user declares the Dimensions INDICATOR and COUNTRY. 413 +* For a given SDMX Dataflow, the user (VTL definer) declares the DimensionComponents on which the mapping will be based, in a given order.[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[23~]^^>>path:#_ftn23]](%%) Following the example above, imagine that the user declares the Dimensions INDICATOR and COUNTRY. 395 395 * The VTL Data Set is given a name using a special notation also called “ordered concatenation” and composed of the following parts: 396 396 ** The reference to the SDMX Dataflow (expressed according to the rules described in the previous paragraphs, i.e. URN, abbreviated URN or another alias); for example DF(1.0.0); 397 -** a slash (“/”) as a separator; {{footnote}}As a consequence of this formalism, a slash in the name of the VTL Data Set assumes the specific meaning of separator between the name of the Dataflow and the values of some of its Dimensions.{{/footnote}} 398 -** The reference to a specific part of the SDMX Dataflow above, expressed as the concatenation of the values that the SDMX DimensionComponents declared above must have, separated by dots (“.”) and written in the order in which these DimensionComponents are defined {{footnote}}This is the order in which the dimensions are defined in the ToVtlSpaceKey class or in the FromVtlSpaceKey class, depending on the direction of the mapping.{{/footnote}}. 
For example 416 +** a slash (“/”) as a separator; [[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[24~]^^>>path:#_ftn24]] 417 +** The reference to a specific part of the SDMX Dataflow above, expressed as the concatenation of the values that the SDMX DimensionComponents declared above must have, separated by dots (“.”) and written in the order in which these DimensionComponents are defined[[(% class="wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink wikiinternallink" %)^^~[25~]^^>>path:#_ftn25]](%%). For example 399 399 400 400 POPULATION.USA would mean that such a VTL Data Set is mapped to the SDMX observations for which the dimension //INDICATOR// is equal to POPULATION and the dimension //COUNTRY// is equal to USA. 401 401 ... ... @@ -928,4 +928,55 @@ 928 928 929 929 ---- 930 930 950 +[[~[1~]>>path:#_ftnref1]] The Validation and Transformation Language is a standard language designed and published under the SDMX initiative. VTL is described in the VTL User and Reference Guides available on the SDMX website [[https:~~/~~/sdmx.org>>url:https://sdmx.org/]]. 951 + 952 +[[~[2~]>>path:#_ftnref2]] In this chapter, in order to distinguish VTL and SDMX model artefacts, the VTL ones are written in the Arial font while the SDMX ones in Courier New. 953 + 954 +[[~[3~]>>path:#_ftnref3]] See also the section "VTL-DL Rulesets" in the VTL Reference Manual. 955 + 956 +[[~[4~]>>path:#_ftnref4]] The VTLMappings are used also for User Defined Operators (UDO). 
Although UDOs are envisaged to be defined on generic operands, so that the specific artefacts to be manipulated are passed as parameters at their invocation, it is also possible that a UDO directly invokes some specific SDMX artefacts. These SDMX artefacts have to be mapped to the corresponding aliases used in the definition of the UDO through the VtlMappingScheme and VtlMapping classes as well. 957 + 958 +[[~[5~]>>path:#_ftnref5]] For a complete description of the structure of the URN see the SDMX 2.1 Standards - Section 5 - Registry Specifications, paragraph 6.2.2 ("Universal Resource Name (URN)"). 959 + 960 +[[~[6~]>>path:#_ftnref6]] The container-object-id may be repeated and may be absent. 961 + 962 +[[~[7~]>>path:#_ftnref7]] i.e., the artefact belongs to a maintainable class. 963 + 964 +[[~[8~]>>path:#_ftnref8]] Since these references to SDMX objects include non-permitted characters as per the VTL ID notation, they need to be included between single quotes, according to the VTL rules for irregular names. 965 + 966 +[[~[9~]>>path:#_ftnref9]] For the syntax of the VTL operators see the VTL Reference Manual. 967 + 968 +[[~[10~]>>path:#_ftnref10]] In case the invoked artefact is a VTL component, which can be invoked only within the invocation of a VTL data set (SDMX Dataflow), the specific SDMX class-name (e.g. Dimension, TimeDimension, Measure or DataAttribute) can be deduced from the data structure of the SDMX Dataflow to which the component belongs. 969 + 970 +[[~[11~]>>path:#_ftnref11]] If the Agency is composite (for example AgencyA.Dept1.Unit2), the agency is considered different even if only part of the composite name is different (for example AgencyA.Dept1.Unit3 is a different Agency than the previous one). 
Moreover, the agency-id cannot be omitted in part (i.e., if a TransformationScheme owned by AgencyA.Dept1.Unit2 references an artefact coming from AgencyA.Dept1.Unit3, the specification of the agency-id becomes mandatory and must be complete, without omitting the possibly equal parts like AgencyA.Dept1). 971 + 972 +[[~[12~]>>path:#_ftnref12]] Single quotes are needed because this reference is not a VTL regular name. ^^19^^ Single quotes are not needed in this case because CL_FREQ is a VTL regular name. 973 + 974 +[[~[13~]>>path:#_ftnref13]] The result DFR(1.0.0) is equal to DF1(1.0.0) save that the component SECTOR is called SEC. 975 + 976 +[[~[14~]>>path:#_ftnref14]] Rulesets of this kind cannot be reused when the referenced Concept has a different representation. 977 + 978 +[[~[15~]>>path:#_ftnref15]] See also the section "VTL-DL Rulesets" in the VTL Reference Manual. 979 + 980 +[[~[16~]>>path:#_ftnref16]] If a calculated artefact is persistent, it needs a persistent definition, i.e. a SDMX definition in a SDMX environment. In addition, calculated artefacts that are not persistent may also require a SDMX definition, for example when the result of a non-persistent calculation is disseminated through SDMX tools (like an inquiry tool). 981 + 982 +[[~[17~]>>path:#_ftnref17]] See the VTL 2.0 User Manual. 983 + 984 +[[~[18~]>>path:#_ftnref18]] See the SDMX Standards Section 2 – Information Model. 985 + 986 +[[~[19~]>>path:#_ftnref19]] Besides the mapping between one SDMX Dataflow and one VTL Data Set, it is also possible to map distinct parts of a SDMX Dataflow to different VTL Data Sets, as explained in a following paragraph. 
987 + 988 +[[~[20~]>>path:#_ftnref20]] E.g., if in the data structure there exist 3 Dimensions C,D,E having the role of MeasureDimension, they should be considered as a joint MeasureDimension Z=(C,D,E); therefore when the description says “each possible value Cj of the MeasureDimension …” it means “each possible combination of values (Cj, Dk, Ew) of the joint MeasureDimension Z=(C,D,E)”. 989 + 990 +[[~[21~]>>path:#_ftnref21]] A typical example of this kind is the validation, and more generally the manipulation, of individual time series belonging to the same Dataflow, identifiable through the DimensionComponents of the Dataflow except the TimeDimension. The coding of this kind of operations might be simplified by mapping distinct time series (i.e. different parts of a SDMX Dataflow) to distinct VTL Data Sets. 991 + 992 +[[~[22~]>>path:#_ftnref22]] Please note that this kind of mapping is only an option at the disposal of the definer of VTL Transformations; in fact it always remains possible to manipulate the needed parts of SDMX Dataflows by means of VTL operators (e.g. “sub”, “filter”, “calc”, “union” …), maintaining a one-to-one mapping between SDMX Dataflows and VTL Data Sets. 993 + 994 +[[~[23~]>>path:#_ftnref23]] This definition is made through the ToVtlSubspace and ToVtlSpaceKey classes and/or the FromVtlSuperspace and FromVtlSpaceKey classes, depending on the direction of the mapping (“key” means “dimension”). The mapping of Dataflow subsets can be applied independently in the two directions, also according to different Dimensions. When no Dimension is declared for a given direction, it is assumed that the option of mapping different parts of a SDMX Dataflow to different VTL Data Sets is not used. 995 + 996 +[[~[24~]>>path:#_ftnref24]] As a consequence of this formalism, a slash in the name of the VTL Data Set assumes the specific meaning of separator between the name of the Dataflow and the values of some of its Dimensions. 
997 + 998 +[[~[25~]>>path:#_ftnref25]] This is the order in which the dimensions are defined in the ToVtlSpaceKey class or in the FromVtlSpaceKey class, depending on the direction of the mapping. 999 + 1000 + 931 931 {{putFootnotes/}}
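
The “ordered concatenation” naming convention described above (Dataflow reference, a slash, then the values of the declared mapping Dimensions joined by dots in their declared order) can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions: the function names `vtl_dataset_name` and `split_vtl_dataset_name` are hypothetical and are not part of SDMX or VTL, the TIME_PERIOD value is invented for the example, and the sketch assumes that dimension values contain neither “/” nor “.”.

```python
def vtl_dataset_name(dataflow_ref, mapping_dimensions, values):
    """Build a VTL Data Set name by ordered concatenation.

    dataflow_ref: reference to the SDMX Dataflow (URN, abbreviated URN or alias).
    mapping_dimensions: the declared mapping Dimensions, in their declared order.
    values: mapping from Dimension id to the value identifying the subset.
    (Hypothetical helper, not an SDMX or VTL API.)
    """
    missing = [dim for dim in mapping_dimensions if dim not in values]
    if missing:
        raise KeyError(f"no value given for mapping dimensions: {missing}")
    # Values are joined by dots, in the order in which the Dimensions were declared.
    key = ".".join(values[dim] for dim in mapping_dimensions)
    # The slash separates the Dataflow reference from the dimension values.
    return f"{dataflow_ref}/{key}"


def split_vtl_dataset_name(name, mapping_dimensions):
    """Recover the Dataflow reference and the dimension values from a name.

    Assumption: the last slash in the name is the separator.
    """
    dataflow_ref, _, key = name.rpartition("/")
    parts = key.split(".")
    if len(parts) != len(mapping_dimensions):
        raise ValueError("number of values does not match the mapping dimensions")
    return dataflow_ref, dict(zip(mapping_dimensions, parts))


# Example from the text: DF1(1.0.0) with mapping Dimensions INDICATOR and COUNTRY.
example_name = vtl_dataset_name(
    "DF1(1.0.0)",
    ["INDICATOR", "COUNTRY"],
    {"INDICATOR": "POPULATION", "COUNTRY": "USA", "TIME_PERIOD": "2020"},
)
```

TIME_PERIOD is ignored here because it was not declared as a mapping Dimension, so all observations sharing the same INDICATOR and COUNTRY values fall under the same name. Since such a name contains a slash, it is presumably subject to the single-quote rule for irregular VTL names mentioned in the footnotes when used inside a VTL Transformation.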