Changes for page SDMX 2.1 Standards. Section 6. Technical Notes
Last modified by Artur on 2025/08/19 10:43
Summary
Page properties (1 modified, 0 added, 0 removed)
Details
- Page properties
- Content
@@ -63,18 +63,12 @@

 The following section provides a brief overview of the differences between the various SDMX formats.

-Version 2.0 was characterised by 4 data messages, each with a distinct format: Generic, Compact, Cross-Sectional and Utility. Because of the design, data in some formats could not always be related to another format. In version 2.1, this issue has been addressed by merging some formats and eliminating others. As a result, in
+Version 2.0 was characterised by 4 data messages, each with a distinct format: Generic, Compact, Cross-Sectional and Utility. Because of the design, data in some formats could not always be related to another format. In version 2.1, this issue has been addressed by merging some formats and eliminating others. As a result, in SDMX 2.1 there are just two types of data formats: //GenericData// and //StructureSpecificData// (i.e. specific to one Data Structure Definition).

-SDMX 2.1 there are just two types of data formats: //GenericData// and
-
-//StructureSpecificData// (i.e. specific to one Data Structure Definition).
-
 Both of these formats are now flexible enough to allow for data to be oriented in series with any dimension used to disambiguate the observations (as opposed to only time or a cross sectional measure in version 2.0). The formats have also been expanded to allow for ungrouped observations.

-To allow for applications which only understand time series data, variations of these formats have been introduced in the form of two data messages;
+To allow for applications which only understand time series data, variations of these formats have been introduced in the form of two data messages; //GenericTimeSeriesData// and //StructureSpecificTimeSeriesData//. It is important to note that these variations are built on the same root structure and can be processed in the same manner as the base format so that they do NOT introduce additional processing requirements.

-//GenericTimeSeriesData// and //StructureSpecificTimeSeriesData//. It is important to note that these variations are built on the same root structure and can be processed in the same manner as the base format so that they do NOT introduce additional processing requirements.
-
 === //Structure Definition// ===

 The SDMX-ML Structure Message supports the use of annotations to the structure, which is not supported by the SDMX-EDI syntax.

@@ -83,10 +83,8 @@

 === //Validation// ===

-SDMX-EDI – as is typical of EDIFACT syntax messages – leaves validation to dedicated applications (“validation” being the checking of syntax, data typing, and adherence of the data message to the structure as described in the structural
+SDMX-EDI – as is typical of EDIFACT syntax messages – leaves validation to dedicated applications (“validation” being the checking of syntax, data typing, and adherence of the data message to the structure as described in the structural definition.)

-definition.)
-
 The SDMX-ML Generic Data Message also leaves validation above the XML syntax level to the application.

 The SDMX-ML DSD-specific messages will allow validation of XML syntax and datatyping to be performed with a generic XML parser, and enforce agreement between the structural definition and the data to a moderate degree with the same tool.

@@ -97,17 +97,13 @@

 === //Character Encodings// ===

-All SDMX-ML messages use the UTF-8 encoding, while SDMX-EDI uses the ISO 8859-1 character encoding. There is a greater capacity with UTF-8 to express some character sets (see the “APPENDIX: MAP OF ISO 8859-1 (UNOC) CHARACTER
+All SDMX-ML messages use the UTF-8 encoding, while SDMX-EDI uses the ISO 8859-1 character encoding. There is a greater capacity with UTF-8 to express some character sets (see the “APPENDIX: MAP OF ISO 8859-1 (UNOC) CHARACTER SET (LATIN 1 OR “WESTERN”)” in the document “SYNTAX AND DOCUMENTATION VERSION 2.0”.) Many transformation tools are available which allow XML instances with UTF-8 encodings to be expressed as ISO 8859-1-encoded characters, and to transform UTF-8 into ISO 8859-1. Such tools should be used when transforming SDMX-ML messages into SDMX-EDI messages and vice-versa.

-SET (LATIN 1 OR “WESTERN”)” in the document “SYNTAX AND
-
-DOCUMENTATION VERSION 2.0”.) Many transformation tools are available which allow XML instances with UTF-8 encodings to be expressed as ISO 8859-1-encoded characters, and to transform UTF-8 into ISO 8859-1. Such tools should be used when transforming SDMX-ML messages into SDMX-EDI messages and vice-versa.
-
 === //Data Typing// ===

 The XML syntax and EDIFACT syntax have different data-typing mechanisms. The section below provides a set of conventions to be observed when support for messages in both syntaxes is required. For more information on the SDMX-ML representations of data, see below.

-==== 3.3.2
+==== 3.3.2 Data Types ====

 The XML syntax has a very different mechanism for data-typing than the EDIFACT syntax, and this difference may create some difficulties for applications which support both EDIFACT-based and XML-based SDMX data formats. This section provides a set of conventions for the expression of data in all formats, to allow for clean interoperability between them.

@@ -123,7 +123,8 @@
 1*. Maximum 70 characters.
 1*. From ISO 8859-1 character set (including accented characters)
 1. **Descriptions** are:
-1*. Maximum 350 characters; From ISO 8859-1 character set.
+1*. Maximum 350 characters;
+1*. From ISO 8859-1 character set.
 1. **Code values** are:
 1*. Maximum 18 characters;
 1*. Any of A..Z (upper case alphabetic), 0..9 (numeric), _ (underscore), / (solidus, slash), = (equal sign), - (hyphen);

@@ -132,21 +132,25 @@

 A..Z (upper case alphabetic), 0..9 (numeric), _ (underscore)

-1. **Observation values** are:
-1*. Decimal numerics (signed only if they are negative);
-1*. The maximum number of significant figures is:
-1*. 15 for a positive number
-1*. 14 for a positive decimal or a negative integer
-1*. 13 for a negative decimal
-1*. Scientific notation may be used.
-1. **Uncoded statistical concept** text values are:
-1*.
-1**. Maximum 1050 characters;
-1**. From ISO 8859-1 character set.
-1. **Time series keys**:
+**5. Observation values** are:

-In principle, the maximum permissible length of time series keys used in a data exchange does not need to be restricted. However, for working purposes, an effort is made to limit the maximum length to 35 characters; in this length, also (for SDMX-EDI) one (separator) position is included between all successive dimension values; this means that the maximum length allowed for a pure series key (concatenation of dimension values) can be less than 35 characters. The separator character is a colon (“:”) by conventional usage.
+* Decimal numerics (signed only if they are negative);
+* The maximum number of significant figures is:
+* 15 for a positive number
+* 14 for a positive decimal or a negative integer
+* 13 for a negative decimal
+* Scientific notation may be used.

+**6. Uncoded statistical concept** text values are:
+
+*
+** Maximum 1050 characters;
+** From ISO 8859-1 character set.
+
+**7. Time series keys**:
+
+In principle, the maximum permissible length of time series keys used in a data exchange does not need to be restricted. However, for working purposes, an effort is made to limit the maximum length to 35 characters; in this length, also (for SDMX-EDI) one (separator) position is included between all successive dimension values; this means that the maximum length allowed for a pure series key (concatenation of dimension values) can be less than 35 characters. The separator character is a colon (“:”) by conventional usage.
+
 == 3.4 SDMX-ML and SDMX-EDI Best Practices ==

 === 3.4.1 Reporting and Dissemination Guidelines ===