Changes for page 12 Validation and Transformation Language (VTL)
Last modified by Helena on 2025/09/10 11:19
Summary
-
Page properties (1 modified, 0 added, 0 removed)
-
Attachments (0 modified, 6 added, 0 removed)
Details
- Page properties
-
- Content
-
... ... @@ -14,10 +14,8 @@ 14 14 15 15 The VTL language can be applied to SDMX artefacts by mapping the SDMX IM model artefacts to the model artefacts that VTL can manipulate{{footnote}}In this chapter, in order to distinguish VTL and SDMX model artefacts, the VTL ones are written in the Arial font while the SDMX ones in Courier New{{/footnote}}. Thus, the SDMX artefacts can be used in VTL as inputs and/or outputs of Transformations. It is important to be aware that the artefacts do not always have the same names in the SDMX and VTL IMs, nor do they always have the same meaning. The most evident example is given by the SDMX Dataset and the VTL "Data Set", which do not correspond to one another: as a matter of fact, the VTL "Data Set" maps to the SDMX "Dataflow", while the SDMX "Dataset" has no explicit mapping to VTL (such an abstraction is not needed in the definition of VTL Transformations). An SDMX "Dataset", however, is an instance of an SDMX "Dataflow" and can be the artefact on which the VTL transformations are executed (i.e., the Transformations are defined on Dataflows and are applied to Dataflow instances that can be Datasets). 16 16 17 -The VTL programs (Transformation Schemes) are represented in SDMX through the TransformationScheme maintainable class which is composed of 17 +The VTL programs (Transformation Schemes) are represented in SDMX through the TransformationScheme maintainable class which is composed of Transformation (nameable artefact). Each Transformation assigns the outcome of the evaluation of a VTL expression to a result. 18 18 19 -Transformation (nameable artefact). Each Transformation assigns the outcome of the evaluation of a VTL expression to a result. 20 - 21 21 This section does not explain the VTL language or any of the content published in the VTL guides. Rather, this is a description of how the VTL can be used in the SDMX context and applied to SDMX artefacts. 22 22 23 23 == 12.2 References to SDMX artefacts from VTL statements == ... 
... @@ -28,10 +28,8 @@ 28 28 29 29 The alias of an SDMX artefact can be its URN (Universal Resource Name), an abbreviation of its URN or another user-defined name. 30 30 31 -In any case, the aliases used in the VTL Transformations have to be mapped to the 29 +In any case, the aliases used in the VTL Transformations have to be mapped to the SDMX artefacts through the VtlMappingScheme and VtlMapping classes (see the section of the SDMX IM relevant to the VTL). A VtlMapping allows specifying the aliases to be used in the VTL Transformations, Rulesets{{footnote}}See also the section "VTL-DL Rulesets" in the VTL Reference Manual.{{/footnote}} or User Defined Operators{{footnote}}The VTLMappings are used also for User Defined Operators (UDO). Although UDOs are envisaged to be defined on generic operands, so that the specific artefacts to be manipulated are passed as parameters at their invocation, it is also possible that an UDO invokes directly some specific SDMX artefacts. These SDMX artefacts have to be mapped to the corresponding aliases used in the definition of the UDO through the VtlMappingScheme and VtlMapping classes as well.{{/footnote}} to reference SDMX artefacts. A VtlMappingScheme is a container for zero or more VtlMapping. 32 32 33 -SDMX artefacts through the VtlMappingScheme and VtlMapping classes (see the section of the SDMX IM relevant to the VTL). A VtlMapping allows specifying the aliases to be used in the VTL Transformations, Rulesets{{footnote}}See also the section "VTL-DL Rulesets" in the VTL Reference Manual.{{/footnote}} or User Defined Operators{{footnote}}The VTLMappings are used also for User Defined Operators (UDO). Although UDOs are envisaged to be defined on generic operands, so that the specific artefacts to be manipulated are passed as parameters at their invocation, it is also possible that an UDO invokes directly some specific SDMX artefacts. 
These SDMX artefacts have to be mapped to the corresponding aliases used in the definition of the UDO through the VtlMappingScheme and VtlMapping classes as well.{{/footnote}} to reference SDMX artefacts. A VtlMappingScheme is a container for zero or more VtlMapping. 34 - 35 35 The correspondence between an alias and a SDMX artefact must be one-to-one, meaning that a generic alias identifies one and just one SDMX artefact while a SDMX artefact is identified by one and just one alias. In other words, within a VtlMappingScheme an artefact can have just one alias and different artefacts cannot have the same alias. 36 36 37 37 The references through the URN and the abbreviated URN are described in the following paragraphs. ... ... @@ -202,7 +202,7 @@ 202 202 203 203 === 12.3.3 Mapping from SDMX to VTL data structures === 204 204 205 - **12.3.3.1 Basic Mapping**201 +==== 12.3.3.1 Basic Mapping ==== 206 206 207 207 The main mapping method from SDMX to VTL is called **Basic **mapping. This is considered as the default mapping method and is applied unless a different method is specified through the VtlMappingScheme and VtlDataflowMapping classes. When transforming **from SDMX to VTL**, this method consists in leaving the components unchanged and maintaining their names and roles, according to the following table: 208 208 ... ... @@ -232,18 +232,11 @@ 232 232 The SDMX structures that contain a MeasureDimension are mapped as described below (this mapping is equivalent to a pivoting operation): 233 233 234 234 * A SDMX simple dimension becomes a VTL (simple) identifier and a SDMX TimeDimension becomes a VTL (time) identifier; 235 -* Each possible Code Cj of the SDMX MeasureDimension is mapped to a VTL Measure, having the same name as the SDMX Code (i.e. 
Cj); the VTL Measure Cj is a new VTL component even if the SDMX data structure has not such a 236 - 237 -Component; 238 - 231 +* Each possible Code Cj of the SDMX MeasureDimension is mapped to a VTL Measure, having the same name as the SDMX Code (i.e. Cj); the VTL Measure Cj is a new VTL component even if the SDMX data structure has not such a Component; 239 239 * The SDMX MeasureDimension is not mapped to VTL (it disappears in the VTL Data Structure); 240 240 * The SDMX Measure is not mapped to VTL as well (it disappears in the VTL Data Structure); 241 241 * An SDMX DataAttribute is mapped in different ways according to its AttributeRelationship: 242 -** If, according to the SDMX AttributeRelationship, the values of the DataAttribute do not depend on the values of the MeasureDimension, the SDMX DataAttribute becomes a VTL Attribute having the same name. This happens if the 243 - 244 -AttributeRelationship is not specified (i.e. the DataAttribute does not depend on any DimensionComponent and therefore is at data set level), or if it refers to a set (or a group) of dimensions which does not include the MeasureDimension; 245 - 246 -* 235 +** If, according to the SDMX AttributeRelationship, the values of the DataAttribute do not depend on the values of the MeasureDimension, the SDMX DataAttribute becomes a VTL Attribute having the same name. This happens if the AttributeRelationship is not specified (i.e. the DataAttribute does not depend on any DimensionComponent and therefore is at data set level), or if it refers to a set (or a group) of dimensions which does not include the MeasureDimension; 247 247 ** Otherwise, if, according to the SDMX AttributeRelationship, the values of the DataAttribute depend on the MeasureDimension, the SDMX DataAttribute is mapped to one VTL Attribute for each possible Code of the SDMX MeasureDimension. 
By default, the names of the VTL Attributes are obtained by concatenating the name of the SDMX DataAttribute and the name of the corresponding Code of the MeasureDimension, separated by an underscore. For example, if the SDMX DataAttribute is named DA and the possible Codes of the SDMX MeasureDimension are named C1, C2, …, Cn, then the corresponding VTL Attributes will be named DA_C1, DA_C2, …, DA_Cn (if different names are desired, they can be achieved afterwards by renaming the Attributes through VTL operators). 248 248 ** Like in the Basic mapping, the resulting VTL Attributes are considered as dependent on all the VTL identifiers (i.e. "at data point / observation level"), because VTL does not have the SDMX notion of Attribute Relationship. 249 249 ... ... @@ -266,10 +266,7 @@ 266 266 At observation / data point level, calling Cj (j=1, … n) the j^^th^^ Code of the MeasureDimension: 267 267 268 268 * The set of SDMX observations having the same values for all the Dimensions except the MeasureDimension becomes one multi-measure VTL Data Point, having one Measure for each Code Cj of the SDMX MeasureDimension; 269 -* The values of the SDMX simple Dimensions, TimeDimension and DataAttributes not depending on the MeasureDimension (these components by definition have always the same values for all the observations of the set above) become the values of the corresponding VTL (simple) 270 - 271 -Identifiers, (time) Identifier and Attributes. 272 - 258 +* The values of the SDMX simple Dimensions, TimeDimension and DataAttributes not depending on the MeasureDimension (these components by definition have always the same values for all the observations of the set above) become the values of the corresponding VTL (simple) Identifiers, (time) Identifier and Attributes. 
273 273 * The value of the Measure of the SDMX observation belonging to the set above and having MeasureDimension=Cj becomes the value of the VTL Measure Cj 274 274 * For the SDMX DataAttributes depending on the MeasureDimension, the value of the DataAttribute DA of the SDMX observation belonging to the set above and having MeasureDimension=Cj becomes the value of the VTL Attribute DA_Cj 275 275 ... ... @@ -362,7 +362,7 @@ 362 362 The mapping table is the following: 363 363 364 364 (% style="width:689.294px" %) 365 -|(% style="width:344px" %)VTL|(% style="width:341px" %)SDMX 351 +|(% style="width:344px" %)**VTL**|(% style="width:341px" %)**SDMX** 366 366 |(% style="width:344px" %)(Simple) Identifier|(% style="width:341px" %)Dimension 367 367 |(% style="width:344px" %)(Time) Identifier|(% style="width:341px" %)TimeDimension 368 368 |(% style="width:344px" %)Some Measures|(% style="width:341px" %)Measure ... ... @@ -422,26 +422,14 @@ 422 422 423 423 SDMX Dataflow having INDICATOR=//INDICATORvalue //and COUNTRY=// COUNTRYvalue//. For example, the VTL dataset ‘DF1(1.0.0)/POPULATION.USA’ would contain all the observations of DF1(1.0.0) having INDICATOR = POPULATION and COUNTRY = USA. 424 424 425 -In order to obtain the data structure of these VTL Data Sets from the SDMX one, it is assumed that the SDMX DimensionComponents on which the mapping is based are dropped, i.e. not maintained in the VTL data structure; this is possible because their values are fixed for each one of the invoked VTL Data Sets{{footnote}}If these DimensionComponents would not be dropped, the various VTL Data Sets resulting from this kind of mapping would have non-matching values for the Identifiers corresponding to the mapping Dimensions (e.g. POPULATION and COUNTRY). 
As a consequence, taking into account that the typical binary VTL operations at dataset level (+, -, *, / and so on) are executed on the observations having matching values for the identifiers, it would not be possible to compose the resulting VTL datasets one another (e.g. it would not be possible to calculate the population ratio between USA and CANADA).{{/footnote}}. After that, the mapping method from SDMX to VTL specified for the Dataflow DF1(1.0.0) is applied (i.e. 411 +In order to obtain the data structure of these VTL Data Sets from the SDMX one, it is assumed that the SDMX DimensionComponents on which the mapping is based are dropped, i.e. not maintained in the VTL data structure; this is possible because their values are fixed for each one of the invoked VTL Data Sets{{footnote}}If these DimensionComponents would not be dropped, the various VTL Data Sets resulting from this kind of mapping would have non-matching values for the Identifiers corresponding to the mapping Dimensions (e.g. POPULATION and COUNTRY). As a consequence, taking into account that the typical binary VTL operations at dataset level (+, -, *, / and so on) are executed on the observations having matching values for the identifiers, it would not be possible to compose the resulting VTL datasets one another (e.g. it would not be possible to calculate the population ratio between USA and CANADA).{{/footnote}}. After that, the mapping method from SDMX to VTL specified for the Dataflow DF1(1.0.0) is applied (i.e. basic, pivot …). 426 426 427 -basi c, pivot…).413 +In the example above, for all the datasets of the kind ‘DF1(1.0.0)///INDICATORvalue//.//COUNTRYvalue//’, the dimensions INDICATOR and COUNTRY would be dropped so that the data structure of all the resulting VTL Data Sets would have the identifier TIME_PERIOD only. 
 428 428 429 -In the example above, for all the datasets of the kind 430 - 431 -‘DF1(1.0.0)///INDICATORvalue//.//COUNTRYvalue//’, the dimensions INDICATOR and COUNTRY would be dropped so that the data structure of all the resulting VTL Data Sets would have the identifier TIME_PERIOD only. 432 - 433 433 It should be noted that the desired VTL Data Sets (i.e. of the kind ‘DF1(1.0.0)///INDICATORvalue//.//COUNTRYvalue//’) can also be obtained by applying the VTL operator “**sub**” (subspace) to the Dataflow DF1(1.0.0), as in the following VTL expression: 434 434 435 - ‘DF1(1.0.0)/POPULATION.USA’ := 417 +[[image:1747388275998-621.png]] 436 436 437 -DF1(1.0.0) [ sub INDICATOR=“POPULATION”, COUNTRY=“USA” ]; 438 - 439 -‘DF1(1.0.0)/POPULATION.CANADA’ := 440 - 441 -DF1(1.0.0) [ sub INDICATOR=“POPULATION”, COUNTRY=“CANADA” ]; 442 - 443 -… … … 444 - 445 445 In fact, the VTL operator “sub” has exactly the same behaviour. Therefore, mapping different parts of an SDMX Dataflow to different VTL Data Sets in the direction from SDMX to VTL through the ordered concatenation notation is equivalent to a proper use of the operator “**sub**” on such a Dataflow.{{footnote}}In case the ordered concatenation notation is used, the VTL Transformation described above, e.g. ‘DF1(1.0)/POPULATION.USA’ := DF1(1.0) [ sub INDICATOR=“POPULATION”, COUNTRY=“USA”], is implicitly executed. In order to test the overall compliance of the VTL program to the VTL consistency rules, it has to be considered as part of the VTL program even if it is not explicitly coded.{{/footnote}} 446 446 447 447 In the direction from SDMX to VTL it is allowed to omit the value of one or more ... ... 
@@ -452,10 +452,8 @@ 452 452 453 453 This is equivalent to the application of the VTL “sub” operator only to the identifier //INDICATOR//: 454 454 455 - ‘DF1(1.0.0)/POPULATION.’ := 429 +[[image:1747388244829-693.png]] 456 456 457 -DF1(1.0.0) [ sub INDICATOR=“POPULATION” ]; 458 - 459 459 Therefore the VTL Data Set ‘DF1(1.0.0)/POPULATION.’ would have the identifiers COUNTRY and TIME_PERIOD. 460 460 461 461 Heterogeneous invocations of the same Dataflow are allowed, i.e. omitting different ... ... @@ -481,54 +481,18 @@ 481 481 482 482 Some examples follow, for some specific values of INDICATOR and COUNTRY: 483 483 484 - ‘DF2(1.0.0)/GDPPERCAPITA.USA’ <- expression11; ‘DF2(1.0.0)/GDPPERCAPITA.CANADA’ <- expression12; 456 +[[image:1747388222879-916.png]] 485 485 486 - … … … 458 +[[image:1747388206717-256.png]] 487 487 488 -‘DF2(1.0.0)/POPGROWTH.USA’ <- expression21; 489 - 490 -‘DF2(1.0.0)/POPGROWTH.CANADA’ <- expression22; 491 - 492 -… … … 493 - 494 494 As said, it is assumed that these VTL derived Data Sets have TIME_PERIOD as the only identifier. In the mapping from VTL to SDMX, the Dimensions INDICATOR and COUNTRY are added to the VTL data structure in order to obtain the SDMX one, with the following values respectively: 495 495 496 - VTL dataset INDICATOR value COUNTRY value 462 +[[image:1747388148322-387.png]] 497 497 498 -‘DF2(1.0.0)/GDPPERCAPITA.USA’ GDPPERCAPITA USA 499 - 500 -‘DF2(1.0.0)/GDPPERCAPITA.CANADA’ GDPPERCAPITA CANADA … … … 501 - 502 -‘DF2(1.0.0)/POPGROWTH.USA’ POPGROWTH USA 503 - 504 -‘DF2(1.0.0)/POPGROWTH.CANADA’ POPGROWTH CANADA 505 - 506 -… … … 507 - 508 508 It should be noted that the application of this many-to-one mapping from VTL to SDMX is equivalent to an appropriate sequence of VTL Transformations. 
These use the VTL operator “calc” to add the proper VTL identifiers (in the example, INDICATOR and COUNTRY) and to assign to them the proper values, and the operator “union” in order to obtain the final VTL dataset (in the example DF2(1.0.0)), which can be mapped one-to-one to the homonymous SDMX Dataflow. Following the same example, these VTL Transformations would be: 509 509 510 - DF2bis_GDPPERCAPITA_USA:= ‘DF2(1.0.0)/GDPPERCAPITA.USA’ [calc identifier INDICATOR := ”GDPPERCAPITA”, identifier COUNTRY := ”USA”]; 466 +[[image:1747388179021-814.png]] 511 511 512 -DF2bis_GDPPERCAPITA_CANADA := ‘DF2(1.0.0)/GDPPERCAPITA.CANADA’ [calc identifier INDICATOR:=”GDPPERCAPITA”, identifier COUNTRY:=”CANADA”]; … … … 513 - 514 -DF2bis_POPGROWTH_USA := ‘DF2(1.0.0)/POPGROWTH.USA’ 515 - 516 -[calc identifier INDICATOR := ”POPGROWTH”, identifier COUNTRY := ”USA”]; 517 - 518 -DF2bis_POPGROWTH_CANADA’ := ‘DF2(1.0.0)/POPGROWTH.CANADA’ [calc identifier INDICATOR := ”POPGROWTH”, identifier COUNTRY := ”CANADA”]; … … … 519 - 520 -DF2(1.0) <- UNION (DF2bis_GDPPERCAPITA_USA’, 521 - 522 -DF2bis_GDPPERCAPITA_CANADA’, 523 - 524 -… , 525 - 526 -DF2bis_POPGROWTH_USA’, 527 - 528 -DF2bis_POPGROWTH_CANADA’ 529 - 530 -…); 531 - 532 532 In other words, starting from the datasets explicitly calculated through VTL (in the example ‘DF2(1.0)/GDPPERCAPITA.USA’ and so on), the first step consists in calculating other (non-persistent) VTL datasets (in the example 533 533 534 534 DF2bis_GDPPERCAPITA_USA and so on) by adding the identifiers INDICATOR and COUNTRY with the desired values (//INDICATORvalue// and //COUNTRYvalue//). Finally, all these non-persistent Data Sets are united and give the final result DF2(1.0){{footnote}}The result is persistent in this example but it can also be non-persistent if needed.{{/footnote}}, which can be mapped one-to-one to the homonymous SDMX Dataflow having the dimension components TIME_PERIOD, INDICATOR and COUNTRY. ... ... 
@@ -542,37 +542,30 @@ 542 542 With reference to the VTL “model for Variables and Value domains”, the following additional mappings have to be considered: 543 543 544 544 (% style="width:1170.29px" %) 545 -|**VTL**|(% style="width:7 54px" %)**SDMX**546 -|**Data Set Component**|(% style="width:7 54px" %)Although this abstraction exists in SDMX, it does not have an explicit definition and correspond to a Component (either a DimensionComponent or a Measure or a DataAttribute) belonging to one specific Dataflow{{footnote}}Through SDMX Constraints, it is possible to specify the values that a Component of a Dataflow can assume.{{/footnote}}547 -|**Represented Variable**|(% style="width:7 54px" %)(((481 +|(% style="width:392px" %)**VTL**|(% style="width:776px" %)**SDMX** 482 +|(% style="width:392px" %)**Data Set Component**|(% style="width:776px" %)Although this abstraction exists in SDMX, it does not have an explicit definition and correspond to a Component (either a DimensionComponent or a Measure or a DataAttribute) belonging to one specific Dataflow{{footnote}}Through SDMX Constraints, it is possible to specify the values that a Component of a Dataflow can assume.{{/footnote}} 483 +|(% style="width:392px" %)**Represented Variable**|(% style="width:776px" %)((( 548 548 **Concept** with a definite 549 549 550 550 Representation 551 551 ))) 552 -|**Value Domain**|(% style="width:754px" %)((( 553 -**Representation** (see the Structure 554 - 555 -Pattern in the Base Package) 488 +|(% style="width:392px" %)**Value Domain**|(% style="width:776px" %)((( 489 +**Representation** (see the Structure Pattern in the Base Package) 556 556 ))) 557 -|**Enumerated Value Domain / Code List**|(% style="width:7 54px" %)**Codelist**558 -|**Code**|(% style="width:7 54px" %)(((491 +|(% style="width:392px" %)**Enumerated Value Domain / Code List**|(% style="width:776px" %)**Codelist** 492 +|(% style="width:392px" %)**Code**|(% style="width:776px" %)((( 559 559 **Code** (for enumerated 560 560 
561 561 DimensionComponent, Measure, DataAttribute) 562 562 ))) 563 -|**Described Value Domain**|(% style="width:754px" %)((( 564 -non-enumerated** Representation** 565 - 566 -(having Facets / ExtendedFacets, see the Structure Pattern in the Base Package) 497 +|(% style="width:392px" %)**Described Value Domain**|(% style="width:776px" %)((( 498 +non-enumerated** Representation **(having Facets / ExtendedFacets, see the Structure Pattern in the Base Package) 567 567 ))) 568 -|**Value**|(% style="width:754px" %)Although this abstraction exists in SDMX, it does not have an explicit definition and correspond to a **Code** of a Codelist (for enumerated Representations) or 569 -| |(% style="width:754px" %)((( 570 -to a valid **value **(for non-enumerated** **Representations) 571 -))) 572 -|**Value Domain Subset / Set**|(% style="width:754px" %)This abstraction does not exist in SDMX 573 -|**Enumerated Value Domain Subset / Enumerated Set**|(% style="width:754px" %)This abstraction does not exist in SDMX 574 -|**Described Value Domain Subset / Described Set**|(% style="width:754px" %)This abstraction does not exist in SDMX 575 -|**Set list**|(% style="width:754px" %)This abstraction does not exist in SDMX 500 +|(% style="width:392px" %)**Value**|(% style="width:776px" %)Although this abstraction exists in SDMX, it does not have an explicit definition and correspond to a **Code** of a Codelist (for enumerated Representations) or to a valid **value **(for non-enumerated** **Representations) 501 +|(% style="width:392px" %)**Value Domain Subset / Set**|(% style="width:776px" %)This abstraction does not exist in SDMX 502 +|(% style="width:392px" %)**Enumerated Value Domain Subset / Enumerated Set**|(% style="width:776px" %)This abstraction does not exist in SDMX 503 +|(% style="width:392px" %)**Described Value Domain Subset / Described Set**|(% style="width:776px" %)This abstraction does not exist in SDMX 504 +|(% style="width:392px" %)**Set list**|(% style="width:776px" 
%)This abstraction does not exist in SDMX 576 576 577 577 The main difference between VTL and SDMX relies on the fact that the VTL artefacts for defining subsets of Value Domains do not exist in SDMX, therefore the VTL features for referring to predefined subsets are not available in SDMX. These artefacts are the Value Domain Subset (or Set), either enumerated or described, the Set List (list of values belonging to enumerated subsets) and the Data Set Component (aimed at defining the set of values that the Component of a Data Set can take, possibly a subset of the codes of Value Domain). 578 578
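The pivot mapping described in 12.3.3 (each Code Cj of the MeasureDimension becomes a VTL Measure named Cj, and each DataAttribute DA that depends on the MeasureDimension becomes DA_Cj) can be illustrated with a minimal Python sketch. This is not part of the SDMX or VTL specifications: the function name `pivot_observations`, the representation of observations as dictionaries, and the component names used in the sample data (INDICATOR, OBS_VALUE, OBS_STATUS) are assumptions made only for illustration.

```python
def pivot_observations(observations, measure_dimension, measure,
                       dependent_attributes=()):
    """Pivot SDMX-like observations into multi-measure VTL-like data points.

    observations: list of dicts, one per SDMX observation (hypothetical layout).
    measure_dimension: name of the MeasureDimension component.
    measure: name of the single SDMX Measure.
    dependent_attributes: DataAttributes whose values depend on the
        MeasureDimension; each becomes one attribute per Code (DA_Cj).
    """
    # Components that do not survive as-is in the VTL structure.
    dropped = {measure_dimension, measure, *dependent_attributes}
    data_points = {}
    for obs in observations:
        code = obs[measure_dimension]
        # Simple Dimensions, TimeDimension and independent DataAttributes are
        # kept unchanged; by definition they share the same values within the
        # set of observations that forms one data point, so they can serve
        # as the grouping key.
        kept = {k: v for k, v in obs.items() if k not in dropped}
        key = tuple(sorted(kept.items()))
        point = data_points.setdefault(key, dict(kept))
        # The SDMX Measure value becomes the value of the VTL Measure Cj.
        point[code] = obs[measure]
        # Dependent attributes are renamed DA_Cj (name, underscore, Code).
        for attr in dependent_attributes:
            point[f"{attr}_{code}"] = obs[attr]
    return list(data_points.values())


sample = [
    {"COUNTRY": "USA", "TIME_PERIOD": "2020", "INDICATOR": "C1",
     "OBS_VALUE": 10, "OBS_STATUS": "A"},
    {"COUNTRY": "USA", "TIME_PERIOD": "2020", "INDICATOR": "C2",
     "OBS_VALUE": 20, "OBS_STATUS": "E"},
]
for data_point in pivot_observations(sample, "INDICATOR", "OBS_VALUE",
                                     ["OBS_STATUS"]):
    print(data_point)
```

In this sketch the MeasureDimension and the original Measure disappear from the result, mirroring the rule that neither is mapped to the VTL Data Structure; renaming the generated attributes afterwards would be done with VTL operators, which the sketch does not attempt to model.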
- 1747388148322-387.png
-
- Author
-
... ... @@ -1,0 +1,1 @@ 1 +XWiki.helena - Size
-
... ... @@ -1,0 +1,1 @@ 1 +23.0 KB - Content
- 1747388179021-814.png
-
- Author
-
... ... @@ -1,0 +1,1 @@ 1 +XWiki.helena - Size
-
... ... @@ -1,0 +1,1 @@ 1 +66.2 KB - Content
- 1747388206717-256.png
-
- Author
-
... ... @@ -1,0 +1,1 @@ 1 +XWiki.helena - Size
-
... ... @@ -1,0 +1,1 @@ 1 +10.3 KB - Content
- 1747388222879-916.png
-
- Author
-
... ... @@ -1,0 +1,1 @@ 1 +XWiki.helena - Size
-
... ... @@ -1,0 +1,1 @@ 1 +9.9 KB - Content
- 1747388244829-693.png
-
- Author
-
... ... @@ -1,0 +1,1 @@ 1 +XWiki.helena - Size
-
... ... @@ -1,0 +1,1 @@ 1 +7.4 KB - Content
- 1747388275998-621.png
-
- Author
-
... ... @@ -1,0 +1,1 @@ 1 +XWiki.helena - Size
-
... ... @@ -1,0 +1,1 @@ 1 +19.2 KB - Content