Changes for page 12 Validation and Transformation Language (VTL)
Last modified by Artur on 2025/09/10 11:19
Summary
-
Page properties (1 modified, 0 added, 0 removed)
-
Attachments (0 modified, 0 added, 1 removed)
Details
- Page properties
-
- Content
-
... ...
@@ -410,14 +410,16 @@

It should be noted that the desired VTL Data Sets (i.e. of the kind ‘DF1(1.0.0)/// INDICATORvalue//.//COUNTRYvalue//’) can also be obtained by applying the VTL operator “**sub**” (subspace) to the Dataflow DF1(1.0.0), as in the following VTL expression:

-> ‘DF1(1.0.0)/POPULATION.USA’ :=
-> DF1(1.0.0) [ sub INDICATOR=“POPULATION”, COUNTRY=“USA” ];
->
-> ‘DF1(1.0.0)/POPULATION.CANADA’ :=
-> DF1(1.0.0) [ sub INDICATOR=“POPULATION”, COUNTRY=“CANADA” ];
->
-> … … …
+‘DF1(1.0.0)/POPULATION.USA’ :=

+DF1(1.0.0) [ sub INDICATOR=“POPULATION”, COUNTRY=“USA” ];
+
+‘DF1(1.0.0)/POPULATION.CANADA’ :=
+
+DF1(1.0.0) [ sub INDICATOR=“POPULATION”, COUNTRY=“CANADA” ];
+
+… … …
+
In fact the VTL operator “sub” has exactly the same behaviour. Therefore, mapping different parts of an SDMX Dataflow to different VTL Data Sets in the direction from SDMX to VTL through the ordered concatenation notation is equivalent to a proper use of the operator “**sub**” on such a Dataflow.{{footnote}}In case the ordered concatenation notation is used, the VTL Transformation described above, e.g. ‘DF1(1.0)/POPULATION.USA’ := DF1(1.0) [ sub INDICATOR=“POPULATION”, COUNTRY=“USA”], is implicitly executed. In order to test the overall compliance of the VTL program with the VTL consistency rules, it has to be considered as part of the VTL program even if it is not explicitly coded.{{/footnote}}

In the direction from SDMX to VTL, the value of one or more DimensionComponents on which the mapping is based may be omitted, provided that all the separating dots are kept (so two or more consecutive dots, or dots at the beginning or at the end, may occur). The absence of a value means that for the corresponding Dimension all the values are kept and the Dimension is not dropped.
... ...
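The behaviour of “sub” described above can be sketched in Python as a filter on the fixed Dimensions followed by dropping them. This is only an illustration of the semantics under stated assumptions, not VTL itself: the helper name `sub`, the row layout and the observation values are hypothetical.

```python
from typing import Dict, List

Row = Dict[str, object]


def sub(rows: List[Row], **fixed: str) -> List[Row]:
    """Keep the data points whose identifiers match `fixed`, then drop those identifiers."""
    kept = [r for r in rows if all(r[name] == value for name, value in fixed.items())]
    return [{k: v for k, v in r.items() if k not in fixed} for r in kept]


# Hypothetical content of DF1(1.0.0); the observation values are placeholders.
df1 = [
    {"INDICATOR": "POPULATION", "COUNTRY": "USA", "TIME_PERIOD": "2020", "OBS_VALUE": 1.0},
    {"INDICATOR": "POPULATION", "COUNTRY": "CANADA", "TIME_PERIOD": "2020", "OBS_VALUE": 2.0},
    {"INDICATOR": "GDPPERCAPITA", "COUNTRY": "USA", "TIME_PERIOD": "2020", "OBS_VALUE": 3.0},
]

# Both Dimensions are given a value: both are fixed and dropped.
population_usa = sub(df1, INDICATOR="POPULATION", COUNTRY="USA")

# The COUNTRY value is omitted: all countries are kept and COUNTRY stays an identifier.
population_any = sub(df1, INDICATOR="POPULATION")
```

In this sketch `population_usa` keeps only TIME_PERIOD as identifier, while `population_any` keeps COUNTRY and TIME_PERIOD, mirroring the rule that an omitted Dimension value leaves the Dimension in place.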
@@ -426,9 +426,10 @@

This is equivalent to the application of the VTL “sub” operator only to the identifier //INDICATOR//:

-> ‘DF1(1.0.0)/POPULATION.’ :=
-> DF1(1.0.0) [ sub INDICATOR=“POPULATION” ];
+‘DF1(1.0.0)/POPULATION.’ :=

+DF1(1.0.0) [ sub INDICATOR=“POPULATION” ];
+
Therefore the VTL Data Set ‘DF1(1.0.0)/POPULATION.’ would have the identifiers COUNTRY and TIME_PERIOD.

Heterogeneous invocations of the same Dataflow are allowed, i.e. omitting different Dimensions in different invocations.
... ...
@@ -446,38 +446,41 @@

The corresponding VTL Transformations, assuming that the result needs to be persistent, would be of this kind:{{footnote}}the symbol of the VTL persistent assignment is used (<-){{/footnote}}

- >‘DF2(1.0.0)/INDICATORvalue.COUNTRYvalue’ <- expression
+‘DF2(1.0.0)/INDICATORvalue.COUNTRYvalue’ <- expression

Some examples follow, for some specific values of INDICATOR and COUNTRY:

-> ‘DF2(1.0.0)/GDPPERCAPITA.USA’ <- expression11; ‘DF2(1.0.0)/GDPPERCAPITA.CANADA’ <- expression12;
-> … … …
-> ‘DF2(1.0.0)/POPGROWTH.USA’ <- expression21;
-> ‘DF2(1.0.0)/POPGROWTH.CANADA’ <- expression22;
-> … … …
+‘DF2(1.0.0)/GDPPERCAPITA.USA’ <- expression11; ‘DF2(1.0.0)/GDPPERCAPITA.CANADA’ <- expression12;
+… … …

+‘DF2(1.0.0)/POPGROWTH.USA’ <- expression21;
+‘DF2(1.0.0)/POPGROWTH.CANADA’ <- expression22;
+… … …
+
As said, it is assumed that these VTL derived Data Sets have the TIME_PERIOD as the only identifier.
In the mapping from VTL to SDMX, the Dimensions INDICATOR and COUNTRY are added to the VTL data structure in order to obtain the SDMX one, with the following values respectively:

VTL dataset INDICATOR value COUNTRY value

-> ‘DF2(1.0.0)/GDPPERCAPITA.USA’ GDPPERCAPITA USA
-> ‘DF2(1.0.0)/GDPPERCAPITA.CANADA’ GDPPERCAPITA CANADA … … …
-> ‘DF2(1.0.0)/POPGROWTH.USA’ POPGROWTH USA
-> ‘DF2(1.0.0)/POPGROWTH.CANADA’ POPGROWTH CANADA
-> … … …
+‘DF2(1.0.0)/GDPPERCAPITA.USA’ GDPPERCAPITA USA
+‘DF2(1.0.0)/GDPPERCAPITA.CANADA’ GDPPERCAPITA CANADA … … …
+‘DF2(1.0.0)/POPGROWTH.USA’ POPGROWTH USA
+‘DF2(1.0.0)/POPGROWTH.CANADA’ POPGROWTH CANADA

+… … …
+
It should be noted that the application of this many-to-one mapping from VTL to SDMX is equivalent to an appropriate sequence of VTL Transformations. These use the VTL operator “calc” to add the proper VTL identifiers (in the example, INDICATOR and COUNTRY) and to assign to them the proper values, and the operator “union” to obtain the final VTL dataset (in the example DF2(1.0.0)), which can be mapped one-to-one to the homonymous SDMX Dataflow.
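As a rough illustration of the calc-then-union sequence just described, the following Python sketch adds constant identifiers to each partial Data Set and then concatenates the results. The helper names `calc_identifiers` and `union`, the row layout and the observation values are assumptions made for the example; they are not VTL operators or SDMX artefacts.

```python
from typing import Dict, List

Row = Dict[str, object]


def calc_identifiers(rows: List[Row], **identifiers: str) -> List[Row]:
    """Add constant-valued identifiers to every data point (the role played by "calc identifier")."""
    return [{**row, **identifiers} for row in rows]


def union(*datasets: List[Row]) -> List[Row]:
    """Concatenate data sets sharing one structure; keys are disjoint by construction here."""
    return [row for dataset in datasets for row in dataset]


# Hypothetical derived Data Sets whose only identifier is TIME_PERIOD.
gdp_usa = [{"TIME_PERIOD": "2020", "OBS_VALUE": 1.0}]
gdp_canada = [{"TIME_PERIOD": "2020", "OBS_VALUE": 2.0}]

# Each part receives its own (INDICATOR, COUNTRY) pair before being united.
df2 = union(
    calc_identifiers(gdp_usa, INDICATOR="GDPPERCAPITA", COUNTRY="USA"),
    calc_identifiers(gdp_canada, INDICATOR="GDPPERCAPITA", COUNTRY="CANADA"),
)
```

Because every part is tagged with a distinct pair of identifier values, the united result has the identifiers TIME_PERIOD, INDICATOR and COUNTRY and no two data points share the same key.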
Following the same example, these VTL Transformations would be:

-> DF2bis_GDPPERCAPITA_USA := ‘DF2(1.0.0)/GDPPERCAPITA.USA’ [calc identifier INDICATOR := ”GDPPERCAPITA”, identifier COUNTRY := ”USA”];
-> DF2bis_GDPPERCAPITA_CANADA := ‘DF2(1.0.0)/GDPPERCAPITA.CANADA’ [calc identifier INDICATOR:=”GDPPERCAPITA”, identifier COUNTRY:=”CANADA”]; … … …
-> DF2bis_POPGROWTH_USA := ‘DF2(1.0.0)/POPGROWTH.USA’ [calc identifier INDICATOR := ”POPGROWTH”, identifier COUNTRY := ”USA”];
-> DF2bis_POPGROWTH_CANADA := ‘DF2(1.0.0)/POPGROWTH.CANADA’ [calc identifier INDICATOR := ”POPGROWTH”, identifier COUNTRY := ”CANADA”]; … … …
-> DF2(1.0) <- UNION (DF2bis_GDPPERCAPITA_USA,
-> DF2bis_GDPPERCAPITA_CANADA,
-> … ,
-> DF2bis_POPGROWTH_USA,
-> DF2bis_POPGROWTH_CANADA
-> …);
+DF2bis_GDPPERCAPITA_USA := ‘DF2(1.0.0)/GDPPERCAPITA.USA’ [calc identifier INDICATOR := ”GDPPERCAPITA”, identifier COUNTRY := ”USA”];
+DF2bis_GDPPERCAPITA_CANADA := ‘DF2(1.0.0)/GDPPERCAPITA.CANADA’ [calc identifier INDICATOR:=”GDPPERCAPITA”, identifier COUNTRY:=”CANADA”]; … … …
+DF2bis_POPGROWTH_USA := ‘DF2(1.0.0)/POPGROWTH.USA’
+[calc identifier INDICATOR := ”POPGROWTH”, identifier COUNTRY := ”USA”];
+DF2bis_POPGROWTH_CANADA := ‘DF2(1.0.0)/POPGROWTH.CANADA’ [calc identifier INDICATOR := ”POPGROWTH”, identifier COUNTRY := ”CANADA”]; … … …
+DF2(1.0) <- UNION (DF2bis_GDPPERCAPITA_USA,
+DF2bis_GDPPERCAPITA_CANADA,
+… ,
+DF2bis_POPGROWTH_USA,
+DF2bis_POPGROWTH_CANADA
+…);

In other words, starting from the datasets explicitly calculated through VTL (in the example ‘DF2(1.0)/GDPPERCAPITA.USA’ and so on), the first step consists in calculating other (non-persistent) VTL datasets (in the example DF2bis_GDPPERCAPITA_USA and so on) by adding the identifiers INDICATOR and COUNTRY with the desired values (//INDICATORvalue// and //COUNTRYvalue//).
Finally, all these non-persistent Data Sets are united and give the final result DF2(1.0){{footnote}}The result is persistent in this example but it can also be non-persistent if needed.{{/footnote}}, which can be mapped one-to-one to the homonymous SDMX Dataflow having the dimension components TIME_PERIOD, INDICATOR and COUNTRY.

... ...
@@ -489,26 +489,25 @@

With reference to the VTL “model for Variables and Value domains”, the following additional mappings have to be considered:

-(% style="width:706.294px" %)
-|(% style="width:257px" %)VTL|(% style="width:446px" %)SDMX
-|(% style="width:257px" %)**Data Set Component**|(% style="width:446px" %)Although this abstraction exists in SDMX, it does not have an explicit definition and corresponds to a Component (either a DimensionComponent or a Measure or a DataAttribute) belonging to one specific Dataflow^^43^^
-|(% style="width:257px" %)**Represented Variable**|(% style="width:446px" %)**Concept** with a definite Representation
-|(% style="width:257px" %)**Value Domain**|(% style="width:446px" %)(((
+|VTL|SDMX
+|**Data Set Component**|Although this abstraction exists in SDMX, it does not have an explicit definition and corresponds to a Component (either a DimensionComponent or a Measure or a DataAttribute) belonging to one specific Dataflow^^43^^
+|**Represented Variable**|**Concept** with a definite Representation
+|**Value Domain**|(((
**Representation** (see the Structure
Pattern in the Base Package)
)))
-| (% style="width:257px" %)**Enumerated Value Domain / Code List**|(% style="width:446px" %)**Codelist**
-| (% style="width:257px" %)**Code**|(% style="width:446px" %)**Code** (for enumerated DimensionComponent, Measure, DataAttribute)
-| (% style="width:257px" %)**Described Value Domain**|(% style="width:446px" %)(((
+|**Enumerated Value Domain / Code List**|**Codelist**
+|**Code**|**Code** (for enumerated DimensionComponent, Measure, DataAttribute)
+|**Described Value Domain**|(((
non-enumerated** Representation**
(having Facets / ExtendedFacets, see the Structure Pattern in the Base Package)
)))
-| (% style="width:257px" %)**Value**|(% style="width:446px" %)Although this abstraction exists in SDMX, it does not have an explicit definition and corresponds to a **Code** of a Codelist (for enumerated Representations) or
-| (%style="width:257px" %)|(% style="width:446px" %)to a valid **value **(for non-enumerated** **Representations)
-| (% style="width:257px" %)**Value Domain Subset / Set**|(% style="width:446px" %)This abstraction does not exist in SDMX
-| (% style="width:257px" %)**Enumerated Value Domain Subset / Enumerated Set**|(% style="width:446px" %)This abstraction does not exist in SDMX
-| (% style="width:257px" %)**Described Value Domain Subset / Described Set**|(% style="width:446px" %)This abstraction does not exist in SDMX
-| (% style="width:257px" %)**Set list**|(% style="width:446px" %)This abstraction does not exist in SDMX
+|**Value**|Although this abstraction exists in SDMX, it does not have an explicit definition and corresponds to a **Code** of a Codelist (for enumerated Representations) or
+| |to a valid **value **(for non-enumerated** **Representations)
+|**Value Domain Subset / Set**|This abstraction does not exist in SDMX
+|**Enumerated Value Domain Subset / Enumerated Set**|This abstraction does not exist in SDMX
+|**Described Value Domain Subset / Described Set**|This abstraction does not exist in SDMX
+|**Set list**|This abstraction does not exist in SDMX

The main difference between VTL and SDMX lies in the fact that the VTL artefacts for defining subsets of Value Domains do not exist in SDMX, therefore the VTL features for referring to predefined subsets are not available in SDMX.
These artefacts are the Value Domain Subset (or Set), either enumerated or described, the Set List (list of values belonging to enumerated subsets) and the Data Set Component (aimed at defining the set of values that the Component of a Data Set can take, possibly a subset of the codes of a Value Domain).

... ...
@@ -516,10 +516,8 @@

Therefore, it is important to be aware that some VTL operations (for example the binary operations at data set level) are consistent only if the components having the same names in the operated VTL Data Sets also have the same representation (i.e. the same Value Domain as for VTL). For example, it is possible to obtain correct results from the VTL expression

- >DS_c := DS_a + DS_b (where DS_a, DS_b, DS_c are VTL Data Sets)
+DS_c := DS_a + DS_b (where DS_a, DS_b, DS_c are VTL Data Sets) if the matching components in DS_a and DS_b (e.g. ref_date, geo_area, sector …) refer to the same general representation. In simpler words, DS_a and DS_b must use the same values/codes (for ref_date, geo_area, sector … ), otherwise the relevant values would not match and the result of the operation would be wrong.

-if the matching components in DS_a and DS_b (e.g. ref_date, geo_area, sector …) refer to the same general representation. In simpler words, DS_a and DS_b must use the same values/codes (for ref_date, geo_area, sector … ), otherwise the relevant values would not match and the result of the operation would be wrong.
-
As mentioned, the property above is not enforced by construction in SDMX, and different representations of the same Concept may be incompatible with one another (for example, geo_area may be represented by ISO alpha-3 codes in DS_a and by ISO alpha-2 codes in DS_b). Therefore, it will be up to the definer of VTL

Transformations to ensure that the VTL expressions are consistent with the actual representations of the corresponding SDMX Concepts.
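The consequence of mismatched representations can be illustrated with a small Python sketch. It assumes, as a simplification, that a data-set-level addition pairs data points by their identifier values and returns only the matching ones; the function name `add_datasets`, the dictionary layout and the numbers are hypothetical.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]  # (ref_date, geo_area)


def add_datasets(ds_a: Dict[Key, float], ds_b: Dict[Key, float]) -> Dict[Key, float]:
    """Add the measures of the data points whose identifier values match in both operands."""
    return {key: ds_a[key] + ds_b[key] for key in ds_a.keys() & ds_b.keys()}


# DS_a uses ISO alpha-3 codes for geo_area (placeholder values).
ds_a = {("2020", "USA"): 10.0, ("2020", "CAN"): 20.0}

# Same representation as DS_a: every data point finds its counterpart.
ds_b_alpha3 = {("2020", "USA"): 1.0, ("2020", "CAN"): 2.0}

# ISO alpha-2 codes: the same countries, but no identifier value matches.
ds_b_alpha2 = {("2020", "US"): 1.0, ("2020", "CA"): 2.0}

consistent = add_datasets(ds_a, ds_b_alpha3)
mismatched = add_datasets(ds_a, ds_b_alpha2)
```

In this sketch `consistent` contains one summed value per country, whereas `mismatched` is empty even though both operands describe the same countries, which is the kind of silently wrong result the text warns about.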
... ...
@@ -534,9 +534,8 @@

The VTL data types are subdivided into scalar types (like integers, strings, etc.), which are the types of the scalar values, and compound types (like Data Sets, Components, Rulesets, etc.), which are the types of the compound structures. See below the diagram of the VTL data types, taken from the VTL User Manual:

+[[image:1750067055028-964.png]]

-[[image:1750070288958-132.png]]
-
**Figure 22 – VTL Data Types**

The VTL scalar types are in turn subdivided into basic scalar types, which are elementary (not defined in terms of other data types), and Value Domain and Set scalar types, which are defined in terms of the basic scalar types.
- 1750070288958-132.png
-
- Author
-
... ...
@@ -1,1 +1,0 @@
-xwiki:XWiki.helena
- Size
-
... ...
@@ -1,1 +1,0 @@
-45.9 KB
- Content