Last modified by Artur on 2025/08/19 10:43

From version 4.2
edited by Helena
on 2025/05/21 21:28
Change comment: There is no comment for this version
To version 4.6
edited by Helena
on 2025/05/21 21:32
Change comment: There is no comment for this version

Page properties
Content
... ... @@ -63,18 +63,12 @@
63 63  
64 64  The following section provides a brief overview of the differences between the various SDMX formats.
65 65  
66 -Version 2.0 was characterised by 4 data messages, each with a distinct format: Generic, Compact, Cross-Sectional and Utility. Because of the design, data in some formats could not always be related to another format. In version 2.1, this issue has been addressed by merging some formats and eliminating others. As a result, in
66 +Version 2.0 was characterised by 4 data messages, each with a distinct format: Generic, Compact, Cross-Sectional and Utility. Because of the design, data in some formats could not always be related to another format. In version 2.1, this issue has been addressed by merging some formats and eliminating others. As a result, in SDMX 2.1 there are just two types of data formats: //GenericData// and //StructureSpecificData// (i.e. specific to one Data Structure Definition).
67 67  
68 -SDMX 2.1 there are just two types of data formats: //GenericData// and
69 -
70 -//StructureSpecificData// (i.e. specific to one Data Structure Definition).
71 -
72 72  Both of these formats are now flexible enough to allow for data to be oriented in series with any dimension used to disambiguate the observations (as opposed to only time or a cross sectional measure in version 2.0). The formats have also been expanded to allow for ungrouped observations.
73 73  
74 -To allow for applications which only understand time series data, variations of these formats have been introduced in the form of two data messages:
70 +To allow for applications which only understand time series data, variations of these formats have been introduced in the form of two data messages: //GenericTimeSeriesData// and //StructureSpecificTimeSeriesData//. It is important to note that these variations are built on the same root structure and can be processed in the same manner as the base format so that they do NOT introduce additional processing requirements.
75 75  
76 -//GenericTimeSeriesData// and //StructureSpecificTimeSeriesData//. It is important to note that these variations are built on the same root structure and can be processed in the same manner as the base format so that they do NOT introduce additional processing requirements.
77 -
78 78  === //Structure Definition// ===
79 79  
80 80  The SDMX-ML Structure Message supports the use of annotations to the structure, which is not supported by the SDMX-EDI syntax.
... ... @@ -83,10 +83,8 @@
83 83  
84 84  === //Validation// ===
85 85  
86 -SDMX-EDI – as is typical of EDIFACT syntax messages – leaves validation to dedicated applications (“validation” being the checking of syntax, data typing, and adherence of the data message to the structure as described in the structural
80 +SDMX-EDI – as is typical of EDIFACT syntax messages – leaves validation to dedicated applications (“validation” being the checking of syntax, data typing, and adherence of the data message to the structure as described in the structural definition.)
87 87  
88 -definition.)
89 -
90 90  The SDMX-ML Generic Data Message also leaves validation above the XML syntax level to the application.
91 91  
92 92  The SDMX-ML DSD-specific messages will allow validation of XML syntax and datatyping to be performed with a generic XML parser, and enforce agreement between the structural definition and the data to a moderate degree with the same tool.
... ... @@ -97,17 +97,13 @@
97 97  
98 98  === //Character Encodings// ===
99 99  
100 -All SDMX-ML messages use the UTF-8 encoding, while SDMX-EDI uses the ISO 8859-1 character encoding. There is a greater capacity with UTF-8 to express some character sets (see the “APPENDIX: MAP OF ISO 8859-1 (UNOC) CHARACTER
92 +All SDMX-ML messages use the UTF-8 encoding, while SDMX-EDI uses the ISO 8859-1 character encoding. There is a greater capacity with UTF-8 to express some character sets (see the “APPENDIX: MAP OF ISO 8859-1 (UNOC) CHARACTER SET (LATIN 1 OR “WESTERN”)” in the document “SYNTAX AND DOCUMENTATION VERSION 2.0”.) Many transformation tools are available which allow XML instances with UTF-8 encodings to be expressed as ISO 8859-1-encoded characters, and to transform UTF-8 into ISO 8859-1. Such tools should be used when transforming SDMX-ML messages into SDMX-EDI messages and vice versa.
101 101  
102 -SET (LATIN 1 OR “WESTERN”)” in the document “SYNTAX AND
103 -
104 -DOCUMENTATION VERSION 2.0”.) Many transformation tools are available which allow XML instances with UTF-8 encodings to be expressed as ISO 8859-1-encoded characters, and to transform UTF-8 into ISO 8859-1. Such tools should be used when transforming SDMX-ML messages into SDMX-EDI messages and vice versa.
105 -
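As an illustration of the transcoding step described under Character Encodings, the following is a minimal Python sketch using only the standard codecs. The function names are hypothetical and are not part of any SDMX tool. It assumes the SDMX-EDI side uses ISO 8859-1 (Latin 1), and that the caller decides how to handle characters that have no Latin 1 equivalent.

```python
def utf8_to_latin1(utf8_bytes: bytes, errors: str = "strict") -> bytes:
    """Re-encode UTF-8 bytes as ISO 8859-1 (hypothetical helper).

    With errors="strict", characters outside Latin 1 raise UnicodeEncodeError;
    with errors="replace", they are replaced by "?".
    """
    text = utf8_bytes.decode("utf-8")
    return text.encode("iso-8859-1", errors=errors)


def latin1_to_utf8(latin1_bytes: bytes) -> bytes:
    """Re-encode ISO 8859-1 bytes as UTF-8; every Latin 1 byte is representable."""
    return latin1_bytes.decode("iso-8859-1").encode("utf-8")


if __name__ == "__main__":
    sample = "Café".encode("utf-8")
    print(utf8_to_latin1(sample))
    print(latin1_to_utf8(utf8_to_latin1(sample)).decode("utf-8"))
```

The conversion from ISO 8859-1 to UTF-8 never loses information, whereas the opposite direction can, which is why the sketch exposes the error-handling policy as a parameter.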
106 106  === //Data Typing// ===
107 107  
108 108  The XML syntax and EDIFACT syntax have different data-typing mechanisms. The section below provides a set of conventions to be observed when support for messages in both syntaxes is required. For more information on the SDMX-ML representations of data, see below.
109 109  
110 -==== 3.3.2 Data Types ====
98 +==== 3.3.2 Data Types ====
111 111  
112 112  The XML syntax has a very different mechanism for data-typing than the EDIFACT syntax, and this difference may create some difficulties for applications which support both EDIFACT-based and XML-based SDMX data formats. This section provides a set of conventions for the expression of data in all formats, to allow for clean interoperability between them.
113 113  
... ... @@ -123,7 +123,8 @@
123 123  1*. Maximum 70 characters.
124 124  1*. From ISO 8859-1 character set (including accented characters)
125 125  1. **Descriptions** are:
126 -1*. Maximum 350 characters;  From ISO 8859-1 character set.
114 +1*. Maximum 350 characters;
115 +1*. From ISO 8859-1 character set.
127 127  1. **Code values** are:
128 128  1*. Maximum 18 characters;
129 129  1*. Any of A..Z (upper case alphabetic), 0..9 (numeric), _ (underscore), / (solidus, slash), = (equal sign), - (hyphen);
... ... @@ -132,21 +132,25 @@
132 132  
133 133  A..Z (upper case alphabetic), 0..9 (numeric), _ (underscore)
134 134  
135 -1. **Observation values** are:
136 -1*. Decimal numerics (signed only if they are negative);
137 -1*. The maximum number of significant figures is:
138 -1*. 15 for a positive number
139 -1*. 14 for a positive decimal or a negative integer
140 -1*. 13 for a negative decimal
141 -1*. Scientific notation may be used.
142 -1. **Uncoded statistical concept** text values are:
143 -1*.
144 -1**. Maximum 1050 characters;
145 -1**. From ISO 8859-1 character set.
146 -1. **Time series keys**:
124 +**5. Observation values** are:
147 147  
148 -In principle, the maximum permissible length of time series keys used in a data exchange does not need to be restricted. However, for working purposes, an effort is made to limit the maximum length to 35 characters; in this length, also (for SDMXEDI) one (separator) position is included between all successive dimension values; this means that the maximum length allowed for a pure series key (concatenation of dimension values) can be less than 35 characters.  The separator character is a colon (“:”) by conventional usage.
126 +* Decimal numerics (signed only if they are negative);
127 +* The maximum number of significant figures is:
128 +* 15 for a positive number
129 +* 14 for a positive decimal or a negative integer
130 +* 13 for a negative decimal
131 +* Scientific notation may be used.
149 149  
133 +**6. Uncoded statistical concept** text values are:
134 +
135 +*
136 +** Maximum 1050 characters;
137 +** From ISO 8859-1 character set.
138 +
139 +**7. Time series keys**:
140 +
141 +In principle, the maximum permissible length of time series keys used in a data exchange does not need to be restricted. However, for working purposes, an effort is made to limit the maximum length to 35 characters. For SDMX-EDI, this length also includes one separator position between each pair of successive dimension values, so the maximum length allowed for a pure series key (the concatenation of dimension values) can be less than 35 characters. The separator character is a colon (“:”) by conventional usage.
142 +
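The conventions for code values, observation values and time series keys can be checked mechanically. The following Python sketch is one possible reading of them, not a reference implementation: all function names are hypothetical, only the rules quoted above are covered, and it assumes that "significant figures" means the digits of the decimal coefficient, with the limit of 15 reduced by one for a minus sign and by one for a decimal point. How the limit applies to scientific notation is not stated above, so that case is treated like any other number here.

```python
import re
from decimal import Decimal, InvalidOperation

# Code values: at most 18 characters from A..Z, 0..9, "_", "/", "=", "-".
CODE_VALUE_RE = re.compile(r"[A-Z0-9_/=\-]{1,18}")


def is_valid_code_value(value: str) -> bool:
    """Check the length and character-set rules for a code value."""
    return CODE_VALUE_RE.fullmatch(value) is not None


def is_valid_observation_value(text: str) -> bool:
    """Check sign and significant-figure limits for an observation value."""
    if text.strip().startswith("+"):
        return False  # values are signed only if they are negative
    try:
        value = Decimal(text)
    except InvalidOperation:
        return False
    if not value.is_finite():
        return False  # NaN and infinities are not decimal numerics
    sign, digits, exponent = value.as_tuple()
    limit = 15
    if sign:
        limit -= 1  # assumption: the minus sign uses one position
    if exponent < 0:
        limit -= 1  # assumption: the decimal point uses one position
    return len(digits) <= limit


def build_series_key(dimension_values, separator=":"):
    """Join dimension values with the conventional colon separator."""
    return separator.join(dimension_values)


def series_key_within_limit(key: str, max_length: int = 35) -> bool:
    """Check the working limit of 35 characters, separators included."""
    return len(key) <= max_length


if __name__ == "__main__":
    key = build_series_key(["M", "DE", "N", "A", "B"])
    print(key, series_key_within_limit(key))
    print(is_valid_code_value("ABC_1"), is_valid_observation_value("-1.5"))
```

Because the 35-character limit is described as a working convention rather than a hard rule, the sketch reports it as a boolean instead of raising an error.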
150 150  == 3.4 SDMX-ML and SDMX-EDI Best Practices ==
151 151  
152 152  === 3.4.1 Reporting and Dissemination Guidelines ===