Version 1.1 by Helena K. on 2026/01/16 00:35

1 {{box title="**Contents**"}}
2 {{toc/}}
3 {{/box}}
4
5 |**Name:**|**R1. Data completeness - rate**
6 |Definition:|The ratio of the number of data cells (entities to be specified by the Eurostat domain manager) provided to the number of data cells required by Eurostat or relevant. The ratio is computed for a chosen dataset and a given period.
7 |Applicability:|(((
8 The rate of available data is applicable:
9
10 * to all statistical processes (including use of administrative sources);
11 * to users and producers, with different focus and calculation formulae.
12
13 Computed only by Eurostat but recommended also for inclusion in national quality reports.
14 )))
15 |(((
16 Calculation formulae:
17
18
19
20
21
22
23
24 )))|(((
**For a specific key variable:**

**For producers:**

{{formula}}R1^{PDR} = \frac{\# A_{D^{rqd}}}{\# D^{rqd}}{{/formula}}

//D^^rqd^^// in the denominator is the set of data cells required (i.e. excluding derogations/confidentiality) and //A,,D,,^^rqd^^// in the numerator is the corresponding subset of available/provided data cells. The notation #//D// means the number of elements in the set //D// (the cardinality).
31
32 **For users**
33
{{formula}}R1^{U} = \frac{\# A_{D^{rel}}}{\# D^{rel}}{{/formula}}

//D^^rel^^// in the denominator is the set of relevant data cells (full coverage, i.e. excluding only those entities for which the data would not be relevant, such as the fishing fleet in Hungary) and //A,,D,,^^rel^^// in the numerator is the corresponding subset of available/provided data cells. The notation #//D// means the number of elements in the set //D// (the cardinality).
37
38
The main difference between the two formulas lies in the set of data cells used in the denominator.
40
41 Regarding the first formula, for **producers**, this set comprises the required data cells excluding derogations/confidentiality, since producers are interested in assessing the level of compliance with the requirements.
42
For **users**, on the other hand, the formula gives the ratio of provided data cells to those that are theoretically relevant. Cells missing because of derogations, confidentiality or any other reason therefore remain in the denominator; only cells for which data would not be relevant (e.g. the fishing fleet in Hungary) are left out.
44 )))
|Target value:|The target value for this indicator is 1, meaning that 100% of the required or relevant data cells are available.
|Aggregation levels and principles:|(((
The calculation is done at subject-matter domain level, for a meaningful choice made by the domain manager. Aggregations are recommended at EU level for the user-oriented indicator.
48
49 The number of data cells provided and the number of data cells required/relevant are aggregated separately, from which a ratio is then computed.
50 )))
51
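The two rates and the aggregation rule (sum the counts first, then take the ratio) can be sketched as follows. This is only an illustration with invented cell counts; the class `CellCounts` and the function names are hypothetical and are not part of any Eurostat tool.

```python
from dataclasses import dataclass


@dataclass
class CellCounts:
    """Hypothetical cell counts for one country, dataset and period."""
    provided_required: int  # #A_{D^rqd}: provided cells among the required ones
    required: int           # #D^rqd: required cells (excl. derogations/confidentiality)
    provided_relevant: int  # #A_{D^rel}: provided cells among the relevant ones
    relevant: int           # #D^rel: relevant cells (full coverage)


def producer_rate(c: CellCounts) -> float:
    """Producer-oriented completeness rate: provided / required."""
    return c.provided_required / c.required


def user_rate(c: CellCounts) -> float:
    """User-oriented completeness rate: provided / relevant."""
    return c.provided_relevant / c.relevant


def aggregate_user_rate(counts: list[CellCounts]) -> float:
    """Aggregate numerator and denominator separately, then take the ratio."""
    provided = sum(c.provided_relevant for c in counts)
    relevant = sum(c.relevant for c in counts)
    return provided / relevant


# Invented figures for two countries.
country_a = CellCounts(provided_required=90, required=100,
                       provided_relevant=90, relevant=120)
country_b = CellCounts(provided_required=50, required=50,
                       provided_relevant=50, relevant=80)

print(producer_rate(country_a))
print(user_rate(country_a))
print(aggregate_user_rate([country_a, country_b]))
```

Note that the aggregate is not the mean of the country rates: the counts are pooled before the division, as required by the aggregation principle above.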
52 |(% style="width:204px" %)Interpretation:  |(% style="width:1526px" %)(((
53 The indicator shows to what extent statistics are available compared to what should be available.
54
55 **For producers:**
56
It can be used to evaluate the degree of compliance by a given Member State for a given dataset and period to be specified by the domain manager.
60
61 **For users:**
62
63 At EU level, it can be used to
64
65 * identify whether important variables are missing for some individual Member State or alternatively
66 * give users an overall measurement (aggregate across countries and/or key variables) of the availability of statistics.
67 )))
68 |(% style="width:204px" %)Specific guidance:|(% style="width:1526px" %)(((
The indicator should be accompanied by information about which variables are missing and the reasons for incompleteness as well as, where relevant, the impact of the missing data on the EU aggregate and plans for improving completeness in the future.
70
71 Calculation would need intervention by the Eurostat domain manager at the initial stage (to define the key variables and the period to be monitored). Later on, the indicators should be calculated automatically.
72
Both formulas are to be computed per key variable; an aggregate for all variables can nevertheless also be calculated.
74
75 **For producers:**
76
77 This indicator forms part of Eurostat compliance monitoring, thus for producers it should be computed per Member State.
78
79 **For users:**
80
81 If certain relevant variables are not reported, the statistics are incomplete. This can be due to data not being collected or data being of low quality or confidential. For users an aggregate across countries for all the key variables could suffice.
82 )))
83 |(% style="width:204px" %)References:|(% style="width:1526px" %)(((
84 * ESS Handbook for Quality Reports – 2009 Edition (Eurostat).
85 * ESS Standard for Quality Reports – 2009 Edition (Eurostat).
* ISO/IEC FDIS 11179-1 "Information technology – Metadata registries – Part 1: Framework", March 2004 (according to the SDMX Metadata Common Vocabulary draft, Feb. 2008).
89 )))
90
91 |(% style="width:204px" %)**Name:**|(% style="width:1526px" %)**A1. Sampling error - indicators**
92 |(% style="width:204px" %)(((
93 Definition:
94
95
96 )))|(% style="width:1526px" %)(((
97 The sampling error can be expressed:
98
1. in relative terms, in which case the relative standard error or, synonymously, the coefficient of variation (CV) is used. (The standard error of the estimator {{formula}}\hat{\theta}{{/formula}} is the square root of its variance {{formula}}V(\hat{\theta}){{/formula}}.) The estimated relative standard error (the estimated CV) is the estimated standard error of the estimator divided by the estimated value of the parameter; see the calculation formulae below.
1. in terms of confidence intervals, i.e. intervals that include the true value of a parameter θ with a given level of confidence. The width of the interval is related to the standard error.
101
102 The estimator should take into account the sampling design and should further integrate the effect on precision of adjustments for non-response, corrections for misclassifications, use of auxiliary information through calibration methods etc.
103 )))
104 |(% style="width:204px" %)Applicability:|(% style="width:1526px" %)(((
Sampling error indicators are applicable:

* to statistical processes based on probability samples or other sampling procedures allowing computation of such information;
* to users and producers, with different levels of detail given.
108
109
110 )))
111 |(% style="width:204px" %)(((
112 Calculation formulae:
113
114
115 )))|(% style="width:1526px" %)(((
116 **Coefficient of variation:**
{{formula}}CV_e(\hat{\theta}) = \frac{\sqrt{\hat{V}(\hat{\theta})}}{\hat{\theta}}{{/formula}}

119 Remark: The subscript "e" stands for estimate.
120
121 **Confidence interval, symmetric:**
122
{{formula}}[\hat{\theta} - d;\ \hat{\theta} + d]{{/formula}} or {{formula}}\hat{\theta} \pm d{{/formula}}

The length of the interval, which is 2∙//d//, depends on the confidence level (e.g. 95%), the assumptions concerning the distribution of the estimator of the parameter, and the sampling error. In many cases //d// has the form {{formula}}d = t \times \sqrt{\hat{V}(\hat{\theta})}{{/formula}}, where //t// depends on the distribution and the confidence level.
126
127 In case of totals, means and ratios, formulas for aggregation of coefficients of variation at EU level can be found in the third reference below.
128
The calculation formulae depend on the sampling design, the estimator, and the method chosen for estimating the variance {{formula}}V(\hat{\theta}){{/formula}}.
130 )))
131 |(% style="width:204px" %)Target value:|(% style="width:1526px" %)(((
132 The smaller the CV, the standard error, and the width of the confidence interval, the more accurate is the estimator. Survey regulations may include specifications for precision thresholds at different population levels.
133 )))
|(% style="width:204px" %)Aggregation levels and principles:|(% style="width:1526px" %)(((
The calculation is done for all statistics based on probability sample surveys or equivalent. Aggregations are possible at Member State and EU levels, depending on estimators and degree of harmonisation.
136
137 The principle for computing the coefficient of variation of an aggregate depends on the method for aggregation of the estimator belonging to that variable.
138 )))
139 |(% style="width:204px" %)(((
140 Interpretation:
141
142
143 )))|(% style="width:1526px" %)(((
144 The CV is a relative (dimensionless) measure of the precision of a statistical estimator, often expressed as a percentage. More specifically, it has the property of eliminating measurement units from precision measures and one of its roles is to make possible comparisons between precision of estimates of different indicators. 
145
146 However, this property has no value added in case of proportions (which are by definition dimensionless indicators).
147 )))
148 |(% style="width:204px" %)(((
149 Specific guidance:
150
151
152 )))|(% style="width:1526px" %)(((
153 There are several precision measures which can be used to estimate the random variation of an estimator due to sampling, such as coefficients of variation, standard errors and confidence intervals.
154
The coefficient of variation is suitable for quantitative variables with large positive values. It is not robust for percentages or changes and cannot be used for estimates with negative values; in those cases it may be replaced by absolute measures of precision (standard errors or confidence intervals).
156
157 The confidence interval is usually the precision measure preferred by data users. It is the clearest way of understanding and interpreting the sampling variability.
158
159 Provision of confidence intervals is voluntary.
160
161 The CV has the advantage of being dimensionless. The standard error or a confidence interval is sometimes preferable, as discussed.
162 )))
163 |(% style="width:204px" %)Reference:|(% style="width:1526px" %)(((
164 * ESS Handbook for Quality Reports – 2009 Edition (Eurostat).
165 * ESS Standard for Quality Reports – 2009 Edition (Eurostat).
166 * Variance estimation methods in the European Union, Monographs of official Statistics, 2002 edition.
167 )))
168
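As a minimal sketch of the A1 measures, the example below computes the estimated CV and a symmetric confidence interval for a sample mean. It assumes simple random sampling with a negligible sampling fraction and a normal approximation (//t// = 1.96 for a 95% level); these are assumptions of the illustration, since the actual formulae depend on the sampling design and the estimator. The function name and the data are invented.

```python
import math
from statistics import mean, variance


def cv_and_interval(sample, t=1.96):
    """Return (estimated CV, (lower, upper)) for the sample mean.

    Assumes simple random sampling and ignores the finite population
    correction; t=1.96 corresponds to a 95% normal-approximation interval.
    """
    n = len(sample)
    theta_hat = mean(sample)            # estimate of the parameter
    var_hat = variance(sample) / n      # estimated variance of the estimator
    std_error = math.sqrt(var_hat)      # estimated standard error
    cv = std_error / theta_hat          # estimated relative standard error
    d = t * std_error                   # half-width of the interval
    return cv, (theta_hat - d, theta_hat + d)


cv, (lower, upper) = cv_and_interval([10.0, 12.0, 14.0, 16.0, 18.0])
print(f"CV = {cv:.1%}, interval = [{lower:.2f}; {upper:.2f}]")
```

For a complex design, `var_hat` would have to come from a design-appropriate variance estimator instead of the simple formula used here.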
169 = A2. Over-coverage - rate =
170
|(% style="width:200px" %)**Name:**|(% style="width:1530px" %)**A2. Over-coverage - rate**
176 |(% style="width:200px" %)(((
177 Definition:
178
179
180 )))|(% style="width:1530px" %)(((
181 The rate of over-coverage is the proportion of units accessible via the frame that do not belong to the target population (are out-of-scope).
182
The //target population// is the population for which inferences are made. The //frame// (or frames) is a device that permits access to population units. The //frame population// is the set of population units which can be accessed through the frame. The concept of a frame is traditionally used for sample surveys, but applies equally to several other statistical processes, e.g. censuses, processes using administrative sources, and processes involving multiple data sources. Coverage deficiencies may be due to delays in reporting (typical for business statistics) and to errors in unit identification, classification, coding etc. This is the case also when administrative data are used.
184
185 The rate may be calculated either as un-weighted or as weighted to refer to the overall level (frame/population rather than sample). Units of unknown eligibility provide an inherent difficulty; see below.
186 )))
187 |(% style="width:200px" %)Applicability :|(% style="width:1530px" %)(((
188 The rate of over-coverage is applicable:
189
190 * to all statistical processes (including use of administrative sources);
191 * to producers.
192
193 If the survey has more than one unit type, a rate may be calculated for each type.
194
195 If there is more than one frame or if over-coverage rates vary strongly between sub-populations, rates should be separated.
196 )))
197 |(% style="width:200px" %)Calculation formulae:|(% style="width:1530px" %)(((
The over-coverage rate has three main versions, all written with one and the same formula as the weighted over-coverage rate //OCr,,w,,//:

{{formula}}OCr_w = \frac{\sum_{O} w_j + (1-\alpha)\sum_{Q} w_j}{\sum_{O} w_j + \sum_{E} w_j + \sum_{Q} w_j}{{/formula}}

O  set of out-of-scope units (over-coverage, resolved and not belonging to the target population)

E  set of in-scope units (resolved units belonging to the target population; eligible units)

Q  set of units of unknown eligibility

//w,,j,,//  weight of unit //j//, described below

α  the estimated proportion of cases of unknown eligibility that are actually eligible. It should be set equal to 1 unless there is strong evidence at country level for assuming otherwise.
209
210 The three main cases are:
211
Un-weighted rate: //w,,j,,// = 1

Design-weighted rate: //w,,j,,// = //d,,j,,//, where basically //d,,j,,// = 1 / //π,,j,,//, meaning that the design weight is the inverse of the selection probability.

Size-weighted rate: //w,,j,,// = //d,,j,,// //x,,j,,//, where //x,,j,,// is the value of a variable X.
217
218 The variable X, which is chosen subjectively, shows the size or importance of the units. The value should be known for all units. X is auxiliary information, often available in the frame. Examples are turnover for businesses and population for municipalities.
219
220 For the over-coverage rate the un-weighted and the design-weighted alternatives are the ones mostly used, see Interpretation below.
221
The design-weighted rate is mainly used for sample surveys, but it may also apply, e.g., to price index processes or processes with multiple data sources. The weight //d,,j,,// is a “raising” factor when unit //j// represents more than itself; otherwise //d,,j,,// is equal to one. Hence, when dealing with administrative sources, the un-weighted and the size-weighted versions of the rate are normally the ones of interest.
223 )))
224
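The weighted over-coverage rate can be sketched as below. The encoding of units as `(status, weight)` pairs, with status `"O"`, `"E"` or `"Q"`, is an assumption made for this illustration, as are the function name and the figures.

```python
def over_coverage_rate(units, alpha=1.0):
    """Weighted over-coverage rate.

    units: iterable of (status, weight) pairs, where status is
           "O" (out-of-scope), "E" (in-scope) or "Q" (unknown eligibility).
    alpha: assumed proportion of unknown-eligibility units that are eligible.
    """
    totals = {"O": 0.0, "E": 0.0, "Q": 0.0}
    for status, weight in units:
        totals[status] += weight
    numerator = totals["O"] + (1.0 - alpha) * totals["Q"]
    denominator = totals["O"] + totals["E"] + totals["Q"]
    return numerator / denominator


# Un-weighted case (w_j = 1): 2 out-of-scope, 7 in-scope, 1 unresolved unit.
units = [("O", 1.0)] * 2 + [("E", 1.0)] * 7 + [("Q", 1.0)]

print(over_coverage_rate(units))             # default alpha = 1
print(over_coverage_rate(units, alpha=0.5))  # half of the unresolved units assumed ineligible
```

With the default α = 1 the unresolved units count only in the denominator; a smaller α moves part of their weight into the numerator and so raises the rate.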
|(% style="width:202px" %)Target value:|(% style="width:1528px" %)The target value for this indicator is as close to 0 as possible.
226 |(% style="width:202px" %)(((
227 Aggregation levels and principles:
228
229
230 )))|(% style="width:1528px" %)(((
231 * MS: the indicator is to be calculated for frame populations where meaningful, e.g. over industries. Then separate frame populations are treated as one frame population.
* EU: the indicator can be aggregated across countries only where statistical production processes are fully harmonised. For the statistical processes involved, the separate frame populations are treated as one frame population. Where production processes differ across countries, lower and higher over-coverage rates can be shown to indicate the range.
)))
|(% style="width:202px" %)Interpretation:|(% style="width:1528px" %)(((
233
234 //Over-coverage//: there are units accessible via the frame, which do not belong to the target population (e.g., deceased persons still listed in a Population Register or no longer operating enterprises still in the Business Register).
235
236 The interest of the indicator depends on the statistical process and the ways of identification of over-coverage. If administrative data are used also to define the target population, this indicator normally has little value added, except possibly duplicates, if they are found. It may provide an overall idea of the quality of the register/frame and the rate of change of the population.
237
The un-weighted over-coverage rate gives the number of units that have been found not to belong to the target population, as a proportion of the total number of observed units. The number refers to the sample, the census or the register population studied.
239
240 The design-weighted over-coverage rate is an estimate for the frame population in comparison with the target population, based on the information at hand, usually a sample.
241
242 The size-weighted over-coverage rate expresses the rate in terms of a chosen size variable, e.g. turnover in business statistics. (This case is less interesting for over-coverage than for non-response.)
243 )))
244 |(% style="width:202px" %)(((
245 Specific guidance:
246
247
248 )))|(% style="width:1528px" %)-
|(% style="width:202px" %)References:|(% style="width:1528px" %)(((
* ESS Handbook for Quality Reports – 2009 Edition (Eurostat).
* ESS Standard for Quality Reports – 2009 Edition (Eurostat).
)))

|(% style="width:202px" %)**Name:**|(% style="width:1528px" %)**A3. Common units - proportion**
253 |(% style="width:202px" %)(((
254 Definition:
255 )))|(% style="width:1528px" %)The proportion of units covered by both the survey and the administrative sources in relation to the total number of units in the survey.
256 |(% style="width:202px" %)Applicability:|(% style="width:1528px" %)(((
257 The proportion is applicable
258
259 * to mixed statistical processes where some variables or data for some units come from survey data and others from administrative source(s);
260 * to producers.
261 )))
262 |(% style="width:202px" %)Calculation formulae:|(% style="width:1528px" %)(((
{{formula}}Ad = \frac{\text{No. of common units across survey data and admin. sources}}{\text{No. of unique units in survey data}}{{/formula}}
268 )))
269 |(% style="width:202px" %)Target value:|(% style="width:1528px" %)-
|(% style="width:202px" %)Aggregation levels and principles:|(% style="width:1528px" %)-
271 |(% style="width:202px" %)(((
272 Interpretation:
273
274
275 )))|(% style="width:1528px" %)(((
276 The indicator is used when administrative data is combined with survey data in such a way that data on unit level are obtained from both the survey and one or more administrative sources (some variables come from the survey and other variables from the administrative data) or when data for part of the units come from survey data and for another part of the units from one or more administrative sources.
277
278 The indicator provides an idea of completeness/coverage of the sources – to what extent units exist in both administrative data and survey data. This indicator does not apply if administrative data is used only to produce estimates without being combined with survey data.
279 )))
280 |(% style="width:202px" %)Specific guidance:|(% style="width:1528px" %)(((
281 Common units refer to those units that are included in the data stemming from an administrative source and survey data.
282
283
284 For the purpose of this indicator, the “unique units in survey data” in the denominator means that if a unit exists in more than one source it should only be counted once.
285
286
If the survey is conducted for only some of the units in the administrative source (e.g. a survey covering only larger enterprises), this indicator should be calculated only for the relevant subset.
288
289
290 Linking errors should be detected and resolved before this indicator is calculated.
291
292
293 If there are few common units due to the design of the statistical output (e.g. a combination of survey and administrative data), this should be explained.
294 )))
295 |(% style="width:202px" %)(((
296 References:
297
298
299 )))|(% style="width:1528px" %)ESSNet use of administrative and accounts data in business statistics, WP6 Quality Indicators when using Administrative Data in Statistical Operations, November 2010.
300
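A minimal sketch of the A3 proportion, assuming that units carry a common identifier and that linking errors have already been resolved. The identifiers and the function name are invented for illustration.

```python
def common_units_proportion(survey_ids, admin_ids):
    """Share of unique survey units that also appear in the admin source(s)."""
    survey_units = set(survey_ids)            # each unit is counted once
    common_units = survey_units & set(admin_ids)
    return len(common_units) / len(survey_units)


# Unit "b" appears twice in the survey data but is counted only once.
survey = ["a", "b", "c", "d", "b"]
admin = ["b", "c", "e"]

print(common_units_proportion(survey, admin))
```

Using sets enforces the rule that a unit present in more than one source is counted only once in the denominator.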
302
303
|**Name:**|**A4. Unit non-response - rate**
317 |Definition:|The ratio of the number of units with no information or not usable information (non-response, etc.) to the total number of in-scope (eligible) units. The ratio can be weighted or un-weighted.
318 |Applicability:|(((
319 The unit non-response rate is applicable:
320
* to all statistical processes (including direct data collection and administrative data). The terminology varies between statistical processes, but the basic principle is the same. In some cases it may be difficult to distinguish between unit non-response and under-coverage, especially for administrative data sources: in the former case units are known to exist but data are missing (e.g. due to very late reporting or to quality so low that the information is useless), whereas in the latter case the units are not known when the frame is constructed;
322 * to users and producers, with different level of details given.
323
324
325 )))
326 |Calculation formulae:|(((
The non-response rate has three main versions, all written with one and the same formula as the weighted unit non-response rate //NRr,,w,,//:

{{formula}}NRr_w = 1 - \frac{\sum_{R} w_j}{\sum_{R} w_j + \sum_{NR} w_j + \alpha \sum_{Q} w_j}{{/formula}}
334
335 R the set of responding eligible units
336
337 NR the set of non-responding eligible units
338
339 Q the set of selected units with unknown eligibility (un-resolved selected units)
340
341 //w,,j,,// weight of unit //j//, described below
342
α the estimated proportion of cases of unknown eligibility that are actually eligible. It should be set equal to 1 unless there is strong evidence at country level for assuming otherwise.
346
347 The three main cases are:
348
Un-weighted rate: //w,,j,,// = 1

Design-weighted rate: //w,,j,,// = //d,,j,,//, where basically //d,,j,,// = 1 / //π,,j,,//, meaning that the design weight is the inverse of the selection probability.

Size-weighted rate: //w,,j,,// = //d,,j,,// //x,,j,,//, where //x,,j,,// is the value of a variable X.

354
355
356 The variable X, which is chosen subjectively, shows the size or importance of the units. The value should be known for all units. X is auxiliary information, often available in the frame. Examples are turnover for businesses and population for municipalities.
357
358
359 For the unit non-response rate all three alternatives are frequently used, see Interpretation below.
360
361
The design-weighted rate is mainly used for sample surveys, but it may also apply, e.g., to price index processes or processes with multiple data sources. The weight //d,,j,,// is a “raising” factor when unit //j// represents more than itself; otherwise //d,,j,,// is equal to one. Hence, when dealing with administrative sources, the un-weighted and the size-weighted versions of the rate are normally the ones of interest.
369
370
371 )))
372 |Target value:|(((
373 The target value for this indicator is as close to 0 as possible.
374
375
376 )))
377 |Aggregation levels and principles:  |(((
378 * MS: the indicator is to be calculated at statistical process level
* EU: rather than aggregating this indicator over countries or calculating a mean, lower and higher unit non-response rates can be shown by Eurostat for a given variable at statistical process level.
380 )))
381 |(((
382 Interpretation:
383
384
385 )))|(((
386 Unit non-response occurs when no data about an eligible unit are recorded (or data are so few or so low in quality that they are deleted).
387
388
389 The un-weighted unit non-response rate shows the result of the data collection in the sample (the units included), rather than an indirect measure of the potential bias associated with non-response. If α=1, it assumes that all the units with unknown eligibility are eligible, so it provides a conservative estimate of A4 with regard to other choices of  α .
390
391
392 The design-weighted unit non-response rate shows how well the data collection worked considering the population of interest.
393
394
395 The size-weighted unit non-response rate would represent an indirect indicator of potential bias caused by non-response prior to any calibration adjustments.
396
397
398 Note overall that the bias may be low even if the non-response rate is high, depending on the pattern of the non-responses and the possibilities to adjust successfully for non-response.
399 )))
400 |Specific guidance:|(((
401 Non-response is a source of errors in survey statistics mainly for two reasons:
402
* it reduces the number of responses and therefore the precision of the estimates (this may be particularly relevant when samples are used);
* it might introduce bias. The size of the bias depends on the non-response rate, but also on the differences between respondents and non-respondents with respect to the variable of interest, and furthermore on the strength of the auxiliary information.
404 )))
405 |(((
406 References:
407
408
409 )))|(((
410 * ESS Handbook for Quality Reports – 2009 Edition (Eurostat).
411 * ESS Standard for Quality Reports – 2009 Edition (Eurostat).
412 * U.S. Census Bureau Statistical Quality Standards, Reissued 2010.
413 * Trépanier, Julien, and Kovar. “Reporting Response Rates when Survey and Administrative Data are Combined.” //Proceedings of the Federal Committee on Statistical Methodology Research Conference 2005.//
414 )))
415
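The three weighting variants of the unit non-response rate can be sketched as follows. The record layout `(status, selection probability, size)` and all names and figures are assumptions made for this illustration.

```python
def unit_weight(pi, x, kind):
    """Weight w_j of a unit for the chosen variant of the rate."""
    if kind == "unweighted":
        return 1.0
    design_weight = 1.0 / pi          # d_j = 1 / pi_j
    if kind == "design":
        return design_weight
    if kind == "size":
        return design_weight * x      # w_j = d_j * x_j
    raise ValueError(f"unknown weighting: {kind}")


def unit_non_response_rate(units, kind="unweighted", alpha=1.0):
    """units: iterable of (status, pi, x) with status "R", "NR" or "Q"."""
    totals = {"R": 0.0, "NR": 0.0, "Q": 0.0}
    for status, pi, x in units:
        totals[status] += unit_weight(pi, x, kind)
    denominator = totals["R"] + totals["NR"] + alpha * totals["Q"]
    return 1.0 - totals["R"] / denominator


# Invented sample: two respondents, one non-respondent, one unresolved unit.
sample = [
    ("R", 0.5, 10.0),
    ("R", 0.5, 30.0),
    ("NR", 0.25, 20.0),
    ("Q", 0.25, 40.0),
]

for kind in ("unweighted", "design", "size"):
    print(kind, unit_non_response_rate(sample, kind))
```

In this invented sample the non-responding and unresolved units have smaller selection probabilities and carry much of the size variable, so the design-weighted and size-weighted rates come out higher than the un-weighted one.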
416
417
|**Name:**|**A5. Item non-response - rate**
429 |(((
430 Definition:
431
432
433 )))|The item non-response rate for a given variable is defined as the (weighted) ratio between in-scope units that have not responded and in-scope units that are required to respond to the particular item.
434 |Applicability :|(((
435 The item non-response rate is applicable:
436
* to all statistical processes (including direct data collection and administrative data); the terminology varies between statistical processes, but the basic principle is the same;
438 * to users and producers, for selected key variables or for variables with very high item non-response rates, and with different level of details given.
439
440
441
If the survey has more than one unit type or data source, a rate may be calculated for each type or data source.
443
444 If there is more than one frame, or if rates vary strongly between subpopulations, rates should (also) be calculated for separate sub-populations (or strata, groups).
445 )))
446 |(((
447 Calculation formulae:
448
449
450
451
452
453
454
455
456
457 )))|(((
The item non-response rate has three main versions, all written with one and the same formula as the weighted item non-response rate //NR,,Y,,r,,w,,//, which is calculated as follows:

{{formula}}NR_Y r_w = 1 - \frac{\sum_{R_Y} w_j}{\sum_{R_Y} w_j + \sum_{NR_Y} w_j}{{/formula}}
463
464
465
466
467 //R,,Y,,// the set of eligible units responding to item //Y// (as required)
468
//NR,,Y,,//  the set of eligible units not responding to item //Y// although this item is required

The denominator corresponds to the set of units for which item //Y// is required. (Other units do not get this item because their answers to earlier items made them skip it; they were “filtered away”.)
470
471 //w,,j,,// weight of unit //j//, described below
472
473
474 The three main cases are:
475
Un-weighted rate: //w,,j,,// = 1

Design-weighted rate: //w,,j,,// = //d,,j,,//, where basically //d,,j,,// = 1 / //π,,j,,//, meaning that the design weight is the inverse of the selection probability.

Size-weighted rate: //w,,j,,// = //d,,j,,// //x,,j,,//, where //x,,j,,// is the value of a variable X.
481
482
483 The variable X, which is chosen subjectively, shows the size or importance of the units. The value should be known for all units. X is auxiliary information, often available in the frame. Examples are turnover for businesses and population for municipalities.
484
485
In the computation of final estimates, the design weight may be modified to correct for non-response, under-coverage etc. This modified weight should be used if the rates are to apply to final estimates.
487
488
The design-weighted rate is mainly used for sample surveys, but it may also apply, e.g., to price index processes or processes with multiple data sources.

The weight //d,,j,,// is a “raising” factor when unit //j// represents more than itself; otherwise //d,,j,,// is equal to one. Hence, when dealing with administrative sources, the un-weighted and the size-weighted versions of the rate are normally the ones of interest.
495
496
497 )))
498 |(((
499 Target value:
500
501
502 )))|(((
503 The target value for this indicator is as close to 0 as possible.
504
505
506 )))
507 |(((
508
509
510 Aggregation levels and principles:
511
512
513 )))|(((
514
515
* MS: the indicator is to be calculated at statistical process level for key variables and for variables with low item response rates (i.e. high item non-response).
517 * EU: rather than to aggregate this indicator over countries or to calculate a mean, lower and higher item non-response rates can be shown by Eurostat for a given variable at statistical process level.
518 )))
519 |(((
520 Interpretation:
521
522
523 )))|(((
A high item non-response rate indicates difficulties in providing information, e.g. a sensitive question or unclear wording for social statistics or information not available in the accounting system for business statistics.
527
528
The indicator is a proxy indicator of the possible bias caused by item non-response. Even when the item non-response rate is high, the bias may still be low, depending on the causes, the response pattern, and the auxiliary information available to adjust/impute.
530 )))
531 |(((
532 Specific guidance
533
534
)))|The un-weighted item non-response rate should be calculated before data editing and imputation in order to measure the impact of item non-response for the key variables.
536 |References|(((
* ESS Handbook for Quality Reports – 2009 Edition (Eurostat).
* ESS Standard for Quality Reports – 2009 Edition (Eurostat).
* U.S. Census Bureau Statistical Quality Standards, Reissued 2010.
* Trépanier, Julien, and Kovar. “Reporting Response Rates when Survey and Administrative Data are Combined.” //Proceedings of the Federal Committee on Statistical Methodology Research Conference 2005.//
543 )))
544
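The item non-response rate can be sketched as below. Representing each unit as a `(required, answered, weight)` triple is an assumption of this illustration; units that were “filtered away” are marked as not required and are left out of both sums.

```python
def item_non_response_rate(records):
    """Weighted item non-response rate for one item Y.

    records: iterable of (required, answered, weight) triples for
             eligible units; `required` is False when the unit was
             filtered away from item Y by an earlier answer.
    """
    responded = 0.0       # sum of w_j over R_Y
    required_total = 0.0  # sum of w_j over R_Y and NR_Y
    for required, answered, weight in records:
        if not required:
            continue      # unit does not belong to the denominator
        required_total += weight
        if answered:
            responded += weight
    return 1.0 - responded / required_total


# Un-weighted example: four units must answer item Y, three of them do,
# and a fifth unit was filtered away.
records = [
    (True, True, 1.0),
    (True, True, 1.0),
    (True, True, 1.0),
    (True, False, 1.0),
    (False, False, 1.0),
]

print(item_non_response_rate(records))
```

Design-weighted or size-weighted rates are obtained by passing the corresponding weights instead of 1.0.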
546
547
|**Name:**|(% colspan="4" %)**A6. Data revision - average size**
561 |(((
562 Definition:
563
564
565 )))|(% colspan="4" %)(((
566 The average over a time period of the revisions of a key indicator. The “revision” is defined as the difference between a later and an earlier estimate of the key item.
567
568
569 The number of releases (//K//) of a key item (number of times it is published) is fixed and specified in the revision policy. Usually, revisions involve a time series: when publishing an estimate of the key indicator referring to time //t//, it is a common practice to release the revised version of the indicator referring to a set of previous periods.
570
571
The following table illustrates this situation for a revision analysis in which the policy specifies //K// releases and //n// reference periods are included in the analysis.
573
574
|=Releases|=(% colspan="5" %)Reference periods
|= |=1|=…|=//t//|=…|=//n//
|1^^st^^ release|//X,,11,,//|…|//X,,1t,,//|…|//X,,1n,,//
|…|…|…|…|…|…
|//k//^^th^^ release|//X,,k1,,//|…|//X,,kt,,//|…|//X,,kn,,//
|…|…|…|…|…|…
|//K//^^th^^ and final release|//X,,K1,,//|…|//X,,Kt,,//|…|//X,,Kn,,//
586
587
588 Different indicators can be derived by different ways of averaging the revisions for a time series (revisions can be averaged in absolute value or not, the indicator can be absolute or relative).
589 )))
590 |Applicability:|(% colspan="4" %)(((
591 The average size of revisions is applicable:
592
593 * to statistical processes where initial and subsequent (revised) estimates are published according to a revision policy (quarterly national accounts, short term statistics);
594 * to users and producers, with different level of details given.
595
596
597 )))
598 |(((
599 Calculation formulae:
600
601
602
603
604
605
606
607
608
609
610 )))|(% colspan="4" %)(((
With reference to the two-dimensional situation described in the definition, there are several strategies for computing indicators: with or without sign, absolute or relative values, for specific pairs of revisions over time or over a sequence of revisions etc. The main suggestion here is to consider an average for a given revision step over a set of //n// reference periods.
612
613
614 **MAR (Mean Absolute Revision):**
615
618 {{formula}}MAR = \frac{1}{n}\sum_{t=1}^{n}\left|X_{Lt} - X_{Pt}\right| {{/formula}}
619
620
621 where:
622
623 //X,,Lt,,// … “later” estimate, //L//^^th^^ release of the item at time reference //t//;
624
625 //X,,Pt,,// … “earlier” estimate, //P//^^th^^ release of the item at time reference //t//;
628
629
630 )))
631
632
633
634 | |(% colspan="3" %)(((
635 //n// = No. of estimates (reference periods) in the time series taken into account. //n//≥ 20 is recommended for quarterly estimates while //n//≥ 30 is recommended for monthly estimates. The indicator is not recommended for annual estimates.
636
637
638 MAR provides an idea of the average size of a given revision step.
639
640
641 This indicator can alternatively be expressed in relative terms:
642
643
644 **RMAR: Relative Mean Absolute Revision**
645
648 {{formula}}RMAR = \sum_{t=1}^{n}\frac{\left|X_{Lt} - X_{Pt}\right|}{\left|X_{Lt}\right|}\cdot\frac{\left|X_{Lt}\right|}{\sum_{t=1}^{n}\left|X_{Lt}\right|} = \frac{\sum_{t=1}^{n}\left|X_{Lt} - X_{Pt}\right|}{\sum_{t=1}^{n}\left|X_{Lt}\right|} {{/formula}}
651
652
653
654
655
656
657 In addition – at the level of Eurostat – and where the sign is interesting, there is the mean revision from Release //P// to Release //L// over the //n// reference periods:
658
659
660 **MR (Mean Revision):**
661
662 {{formula}}MR = \frac{1}{n}\sum_{t=1}^{n}\left(X_{Lt} - X_{Pt}\right) {{/formula}}
663
664
665 Different combinations of //P// and //L// can be considered. For instance, the OECD suggests comparing the following releases:
666
667
668 **Monthly data** **Quarterly data**
669 )))
670 | |**//Release L//**|**//Release P//**|**//Release L//**|**//Release P//**
671 | |After 2 Months|First|After 5 Months|First
672 | |After 3 Months|First|After 1 Year|After 5 Months
673 | |After 3 Months|After 2 Months|After 1 Year|First
674 | |After 1 Year|First|After 2 Years|First
675 | |After 2 Years|First|Latest available|First
676 | |Latest available|First|After 2 Years|After 1 Year
677 | |After 2 Years|After 1 Year| |
683 |Target value:|-| |
684 |Aggregation levels and principles: |(% colspan="3" %)(((
685 * MS: the indicator is to be calculated at statistical process level.
686 * EU: the indicator is calculated on the revisions made on the EU aggregate/indicator.
687
688
689 )))
690 |(((
691 Interpretation:
692
693
694 )))|(% colspan="3" %)(((
695 **MAR** provides an idea of the average size of a given revision step for a key item over time.
696
697
698 The **RMAR** indicator normalises the MAR measure using the final estimates. It facilitates international comparisons and comparisons over time periods. When estimating growth rates this measure corrects the MAR for the size of growth and, so, takes account of the fact that revisions might be expected to be larger in periods of high growth than in periods of slow growth.
699
700
701 Both MAR and RMAR indicators provide information on the stability of the estimates. They do not provide information on the direction of revisions, since the absolute values of revisions are considered. Such information is provided by **MR**. A positive sign means an upward revision (the earlier estimate was too low), and a negative sign means that the earlier estimate was too high. MR is sometimes referred to as ‘average bias’, but a nonzero MR is not sufficient to establish whether revisions are systematically biased in a given direction. To ascertain the presence of bias, it has to be assessed whether MR is statistically different from zero (given no changes in definitions, methodologies, etc.).
705
706
707 )))
708 |Specific guidance:|(% colspan="3" %)Either MAR or RMAR should be presented under this indicator. In addition MR could also be calculated at EU-level.
709 |(((
710 References:
711
712
713 )))|(% colspan="3" %)OECD: [[http:~~/~~/stats.oecd.org/mei/default.asp?rev=1>>url:http://stats.oecd.org/mei/default.asp?rev=1]]
714
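The three revision indicators defined above (MAR, RMAR and MR) can be sketched in Python as follows. This is a minimal illustration rather than an official implementation: the function names and the sample figures are hypothetical, and the four-period series is used only for brevity (the text recommends //n// ≥ 20 for quarterly estimates).

```python
from typing import Sequence


def _check(later: Sequence[float], earlier: Sequence[float]) -> None:
    """Both series must cover the same n >= 1 reference periods."""
    if len(later) != len(earlier) or len(later) == 0:
        raise ValueError("series must be non-empty and of equal length")


def mean_absolute_revision(later: Sequence[float], earlier: Sequence[float]) -> float:
    """MAR: average of |X_Lt - X_Pt| over the n reference periods."""
    _check(later, earlier)
    return sum(abs(x_l - x_p) for x_l, x_p in zip(later, earlier)) / len(later)


def relative_mean_absolute_revision(later: Sequence[float], earlier: Sequence[float]) -> float:
    """RMAR: sum of |X_Lt - X_Pt| divided by the sum of |X_Lt|."""
    _check(later, earlier)
    total_later = sum(abs(x_l) for x_l in later)
    if total_later == 0:
        raise ValueError("RMAR is undefined when all later estimates are zero")
    return sum(abs(x_l - x_p) for x_l, x_p in zip(later, earlier)) / total_later


def mean_revision(later: Sequence[float], earlier: Sequence[float]) -> float:
    """MR: signed average of X_Lt - X_Pt (positive means upward revision)."""
    _check(later, earlier)
    return sum(x_l - x_p for x_l, x_p in zip(later, earlier)) / len(later)


# Hypothetical growth rates (%): release L (later) and release P (earlier).
later = [2.0, 1.0, 3.0, 2.0]
earlier = [1.5, 1.5, 2.5, 2.0]

print(mean_absolute_revision(later, earlier))           # 0.375
print(relative_mean_absolute_revision(later, earlier))  # 0.1875
print(mean_revision(later, earlier))                    # 0.125
```

In this example MR is smaller than MAR because one downward revision partly offsets the upward ones, which is why the signed and absolute indicators are reported separately.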
718
719 |**Name:**|**A7. Imputation - rate**
732 |(((
733 Definition:
734
735
736
737 )))|(((
738 Imputation is the process used to assign replacement values for missing, invalid or inconsistent data that have failed edits. This includes automatic and manual imputations; it excludes follow-up with respondents and the corresponding corrections (if applicable). Thus, imputation as defined above occurs after data collection, no matter from which source or mix of sources the data have been obtained, including administrative data. After imputation, the data file should normally only contain plausible and internally consistent data records.
739
740
741 This indicator is influenced both by the item non-response and the editing process. It measures both the relative amount of imputed values and the relative influence on the final estimates from the imputation procedures.
742
743
744 The un-weighted imputation rate for a variable is the ratio of the number of imputed values to the total number of values requested for the variable.
745
746
747 The weighted rate shows the relative contribution to a statistic from imputed values; typically a total for a quantitative variable. For a qualitative variable, the relative contribution is based on the number of units with an imputed value for the qualitative item.
748 )))
749 |Applicability :|(((
750 The imputation rate is applicable:
751
752 * to all statistical processes (with micro data; hence, e.g., direct data collection and administrative data);
753 * to producers.
755
756
757 )))
758 |(((
759 Calculation formulae:
760
761
762
763 )))|(((
764 1. Un-weighted on the statistical process and variable level:
765
766 [[image:1768512459010-647.jpeg]]
767
768 //n,,AV,,// and //n,,OV,,// are the numbers of assigned values and observed values, respectively.
769
770
771 1. The contribution of imputed values is calculated in an analogous way, but weighted and with variable values.
772
773
774
775
776 {{formula}}\frac{\sum_{j \in AV} w_j y_j}{\sum_{j \in AV} w_j y_j + \sum_{j \in OV} w_j y_j} {{/formula}}
777
778 Here, //AV// and //OV// are the sets of units with assigned and observed values, respectively. In addition, //w,,j,,// is the weight of unit //j// (normally the weight used for estimation, which takes into account the sample design as well as adjustment for unit non-response and final calibration) and //y,,j,,// is the value of the variable for unit //j//. In case of a qualitative variable, //y,,j,,// = 1 if the //j//th unit shows a given characteristic and 0 otherwise.
780
782
783 When imputation is counted the following changes have to be considered:
784 )))
785 | |(((
786 i. imputation of a (non-blank) value for a missing item;
787 ii. imputation of a (non-blank) value to correct an observed invalid (non-blank) value;
788 iii. imputation of a blank value to correct an undue invalid (non-blank) response.
790
791
792 The two main cases for the imputation rate are:
793
794
795 Design-weighted rate: //w,,j,,// = //d,,j,,//, where basically //d,,j,,// = 1/π,,j,,, meaning that the design weight is the inverse of the selection probability.
796
797 Size-weighted rate: //w,,j,,// = //d,,j,,// //x,,j,,//, where //x,,j,,// is the value of a variable //X//
798 )))
799 |Target value:|A value equal to or close to zero is desirable; imputation indicates missing and invalid values.
800 |Aggregation levels and principles:|(((
801 * MS: The calculation is done for key variables at statistical process level.
802 * EU: Aggregations can be made at the level of EU on the basis of harmonised statistical production processes across Member States, considering this as a single statistical process. Alternatively, Eurostat can report lower and higher imputation rates for a given variable at statistical process level.
803 )))
804 |(((
805 Interpretation:
806
807
808 )))|(((
809 The un-weighted rate shows, for a particular variable, the proportion of units for which a value has been imputed due to the original value being a missing, implausible, or inconsistent value in comparison with the number of units with a value for this variable. Units with imputation of a blank value to correct an undue invalid (non-blank) response (type iii) have to be included in both numerator and denominator.
810
811 The weighted rate shows, for a particular variable, the relative contribution of imputed values to the estimate of this item/variable. This weighted indicator is meaningful when the objective of a survey is to estimate the total or the average of a variable; it is not meaningful when the objective is to estimate complex indices.
812 )))
813 |Specific guidance:|-
814 |References:|(((
815 * ESS Handbook for Quality Reports – 2009 Edition (Eurostat).
816 * ESS Standard for Quality Reports – 2009 Edition (Eurostat).
817 * Statistics Canada Quality Guidelines, Fifth Edition – October 2009
818 )))
819
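The two imputation rates can be sketched as follows. This is an illustrative sketch under stated assumptions: the un-weighted formula above is only available as an image, so the denominator used here (assigned plus observed values) is inferred from the definitions of //n,,AV,,// and //n,,OV,,//; the weighted rate is taken as the weighted total of imputed values over the weighted total of all values. The `Unit` class, the function names and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Unit:
    value: float    # y_j, value of the variable for unit j
    weight: float   # w_j, estimation weight of unit j
    imputed: bool   # True if the value was assigned (AV), False if observed (OV)


def unweighted_imputation_rate(units: Iterable[Unit]) -> float:
    """Assumed form: n_AV / (n_AV + n_OV)."""
    unit_list: List[Unit] = list(units)
    if not unit_list:
        raise ValueError("at least one unit is needed")
    n_av = sum(1 for u in unit_list if u.imputed)
    return n_av / len(unit_list)


def weighted_imputation_rate(units: Iterable[Unit]) -> float:
    """Assumed form: sum over AV of w_j*y_j, divided by the sum over AV and OV."""
    unit_list: List[Unit] = list(units)
    total = sum(u.weight * u.value for u in unit_list)
    if total == 0:
        raise ValueError("weighted total is zero; the rate is undefined")
    imputed_total = sum(u.weight * u.value for u in unit_list if u.imputed)
    return imputed_total / total


# Hypothetical micro data: two observed and two imputed values.
units = [
    Unit(value=10.0, weight=2.0, imputed=False),
    Unit(value=20.0, weight=1.0, imputed=True),
    Unit(value=15.0, weight=2.0, imputed=False),
    Unit(value=10.0, weight=1.0, imputed=True),
]

print(unweighted_imputation_rate(units))  # 0.5
print(weighted_imputation_rate(units))    # 0.375
```

Half of the values are imputed, but they contribute only 37.5% of the weighted total, which illustrates why the two rates answer different questions.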
820
821
822
823 |**Name:**|**TP1. Time lag - first results**
834 |(((
835 Definition:
836
837
838
839 )))|(((
840 //General definition~://
841
842 The timeliness of statistical outputs is the length of time between the end of the event or phenomenon they describe and their availability.
843
844
845 //Specific definition~://
846
847 The number of days (or weeks or months) from the last day of the reference period to the day of publication of first results.
848 )))
849 |Applicability :|(((
850 This indicator is applicable:
851
852 * to all statistical processes with **preliminary data releases**;
853 * to producers.
854
855 T1 is **not** applicable for statistical processes with only one, directly final, set of results/statistics – then only T2 is used.
856 )))
857 |(((
858 Calculation formulae:
859
860
861
862
863
864
865 )))|(((
866 {{formula}}T1 = d_{frst} - d_{refp} {{/formula}}
867
868
869 //d,,frst,,// … Release date of first results;
870
871 //d,,refp,,//… Last day (date) of the reference period of the statistics
872
873
874 //Measurement units//: date format (calendar days; if the number of days is large, it may be converted into weeks or months)
875
876 Instead of a period, the reference can also be a time point.
877 )))
878 |Target value:|The target values are usually fixed by legislation or gentlemen's agreement. In general, smaller values denote higher timeliness.
879 |Aggregation levels and principles: |(((
880 The calculation is done, for a meaningful choice, at subject matter domain level. It could refer to the current production round or be an average over a time period. Aggregations are possible at EU and domain (e.g. social statistics, business statistics) level.
881
882
883 )))
884 |(((
885 Interpretation:
886
887
888 )))|(((
889 This indicator quantifies the gap between the release date of first results and the date of reference for the data.
890
891
892 Comparisons could be made among statistical processes with the same periodicity.
893 )))
894 |(((
895 Specific guidance
896
897
898 )))|(((
899 The reasons for possible long production times should be explained and efforts to improve the situation should be described.
900
901
902 For annual statistics or where timeliness is measured in years rather than in days a sentence stating timeliness would be sufficient.
903 )))
904 |References:|ESS Handbook for Quality Reports – 2009 Edition (Eurostat); ESS Standard for Quality Reports – 2009 Edition (Eurostat).
905
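The time lag defined above is a plain difference between two calendar dates. A minimal sketch using Python's standard `datetime` module is given below; the function name and the example dates are hypothetical.

```python
from datetime import date


def time_lag_days(release_date: date, reference_period_end: date) -> int:
    """Calendar days from the last day of the reference period to the release."""
    return (release_date - reference_period_end).days


# Hypothetical example: first quarter of 2024, first results released mid-May.
d_refp = date(2024, 3, 31)
d_frst = date(2024, 5, 15)

print(time_lag_days(d_frst, d_refp))  # 45
```

The same function applies when the reference is a time point rather than a period: the time point is simply passed as `reference_period_end`.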
909
910 |**Name:**|**TP2. Time lag - final results**
921 |(((
922 Definition:
923
924
925
926 )))|(((
927 //General definition~://
928
929 The timeliness of statistical outputs is the length of time between the end of the event or phenomenon they describe and their availability.
930
931
932 //Specific definition~://
933
934 The number of days (or weeks or months) from the last day of the reference period to the day of publication of complete and final results.
935 )))
936 |Applicability :|(((
937 This indicator is applicable:
938
939 * to all statistical processes;
940 * to users and producers, with different levels of detail given.
941 )))
942 |(((
943 Calculation formulae:
944
945
946
947
948 )))|(((
949 {{formula}}T2 = d_{finl} - d_{refp} {{/formula}}
950
951 //d,,finl,,// … Release date of final results;
952
953 //d,,refp,,//… Last day (date) of the reference period of the statistics
954
955
956 //Measurement units//: date format (calendar days; if the number of days is large, it may be converted into weeks or months)
957
958 Instead of a period, the reference can also be a time point.
959 )))
960 |Target value:|The target values are usually fixed by legislation or gentlemen's agreement. In general, smaller values denote higher timeliness.
961 |Aggregation levels and principles: |The calculation is done, for a meaningful choice, at subject matter domain level. It could refer to the current production round or be an average over a time period. Aggregations are possible at EU and domain (e.g. social statistics, business statistics) level.
962 |(((
963 Interpretation:
964
965
966 )))|(((
967 This indicator quantifies the gap between the release date of the final results and the end of the reference period.
968
969
970 Comparisons could be made among statistical processes with the same periodicity.
971 )))
972 |Specific guidance|(((
973 The reasons for possible long production times should be explained and efforts to improve the situation should be described.
974
975
976 What counts as "final results" is to be further defined by each subject matter domain, taking the revision policy into account.
977
978
979 For annual statistics or where timeliness is measured in years rather than in days a sentence stating timeliness would be sufficient.
980 )))
981 |References:|ESS Handbook for Quality Reports – 2009 Edition (Eurostat); ESS Standard for Quality Reports – 2009 Edition (Eurostat).
982
984
985
986 |**Name:**|**TP3. Punctuality - delivery and publication**
997 |(((
998 Definition:
999
1000
1001 )))|Punctuality is the time lag between the actual delivery/release date of data and the target date for delivery/release, as announced in an official release calendar, laid down by Regulations or previously agreed among partners.
1002 |Applicability :|(((
1003 The punctuality of publication is applicable:
1004
1005 * to all statistical processes with fixed/pre-announced release dates;
1006 * to users and producers, with different aspects and calculation formulae.
1007
1008
1009
1010 Computed only by Eurostat but recommended also for inclusion in national quality reports.
1011 )))
1012 |(((
1013 Calculation formulae:
1014
1015
1016 )))|(((
1017 **For producers:**
1018
1019 **Punctuality of data delivery (P3)**
1020
1022 {{formula}}P3 = d_{act} - d_{sch} {{/formula}}
1023
1024 //d,,act,,// … Actual date of the effective provision of the statistics
1025 //d,,sch,,// … Scheduled date of the effective provision of the statistics
1027
1028 //Measurement units//: date format (calendar days)
1029
1032 **For users:**
1033
1036 **Rate of punctuality of data publication (P3,,R,,)**, relevant for a group of statistics/results
1037
1038 P3,,R,, is the rate of datasets that have met the release calendar date in a group of datasets.
1039
1040 {{formula}}P3_R = \frac{m_{pc}}{m_{pc} + m_{up}} {{/formula}}
1041
1042 //m,,pc,,// … Number of statistics/results that have been published on the date announced in the calendar or have been released earlier (punctual); //m,,up,,// … Number of statistics/results that have not met the date announced in the calendar (unpunctual).
1043 )))
1044 |Target value:|(((
1045 The target value for P3 is 0, meaning that there is no delay in the delivery/transmission of data.
1046
1047
1048 For P3,,R,, the target value is 1, meaning that 100% of the items were published on the pre-fixed calendar date.
1049
1050
1051 )))
1052 |Aggregation levels and principles: |(((
1053 There are two aspects:
1054
1055 * national data deliveries to Eurostat (producer-oriented);
1056 * publication/release by Eurostat (user-oriented).
1057
1058 The calculation is done at statistical process level. Aggregations are to be made at EU-level over countries and over domains.
1059 )))
1060 |(((
1061 Interpretation:
1062
1063
1064 )))|(((
1065 The indicator **Punctuality of data delivery** quantifies the difference (time lag) between actual and target date.
1066
1067
1068 This should be interpreted according to the periodicity of the statistical process.
1069
1070
1076 The indicator **Rate of punctuality of release** (P3,,R,,) evaluates the punctuality of release of a group of particular datasets.
1077
1078
1079 )))
1080 |(((
1081 Specific guidance
1082
1083
1084 )))|(((
1085 **For producers:**
1086
1087 For compliance monitoring purposes Eurostat domain managers should monitor this indicator for individual countries. This information can be pre-filled by Eurostat as it is known when data are received from the MS. Formula P3 should be applied in this case.
1088
1089
1090 This indicator can be presented in table format for the different MS.
1091
1092
1093 The reasons for late or non-punctual delivery should be stated along with their effect on the statistical product, meaning that because of late data deliveries the quality assurance procedures for the whole product/series might not be completed.
1094
1095
1096 **For users:**
1097
1098 It is enough to compile this indicator as an aggregate at Eurostat level. Formula P3,,R,, should be applied in this case.
1099
1100
1101 Some explanations should be given to users concerning non-punctual publication.
1102
1103
1104 )))
1105 |References:|ESS Handbook for Quality Reports – 2009 Edition (Eurostat); ESS Standard for Quality Reports – 2009 Edition (Eurostat).
1106
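Both punctuality measures can be sketched in a few lines of Python. The sketch below is illustrative only: the function names and the release calendar are hypothetical, and a release counts as punctual when it happens on or before the scheduled date, as described for //m,,pc,,// above.

```python
from datetime import date
from typing import Iterable, Tuple


def delivery_punctuality_days(actual: date, scheduled: date) -> int:
    """P3: calendar days between actual and scheduled delivery (0 = on time)."""
    return (actual - scheduled).days


def punctuality_rate(releases: Iterable[Tuple[date, date]]) -> float:
    """P3_R: share of (actual, scheduled) releases published on time or earlier."""
    release_list = list(releases)
    if not release_list:
        raise ValueError("at least one release is needed")
    m_pc = sum(1 for actual, scheduled in release_list if actual <= scheduled)
    m_up = len(release_list) - m_pc
    return m_pc / (m_pc + m_up)


# Hypothetical release calendar: (actual date, scheduled date).
releases = [
    (date(2024, 1, 31), date(2024, 1, 31)),  # on time
    (date(2024, 2, 27), date(2024, 2, 29)),  # early
    (date(2024, 4, 2), date(2024, 3, 29)),   # late
    (date(2024, 4, 30), date(2024, 4, 30)),  # on time
]

print(delivery_punctuality_days(date(2024, 4, 2), date(2024, 3, 29)))  # 4
print(punctuality_rate(releases))  # 0.75
```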
1107
1108
1109
1110
1111 |**Name:**|**CC1. Asymmetry for mirror flows statistics - coefficient**
1122 |(((
1123 Definition:
1124
1125
1126
1127 )))|(((
1128 //General definition~://
1129
1130 Discrepancies between data related to flows, e.g. for pairs of countries.
1131
1132
1133 //Specific definition (a few versions are provided) Bilateral mirror statistics~://
1134
1135 The difference or the absolute difference of inbound and outbound flows between a pair of countries divided by the average of these two values.
1136
1137
1138 //Comment//
1139
1140 Outbound and inbound flows should be considered to be any kind of flows specific to each subject matter domain (amounts of products traded, number of people visiting a country for tourism purposes, etc.)
1141 )))
1142 |Applicability :|(((
1143 The asymmetry coefficient for mirror flows statistics is applicable:
1144
1145 * to domains in which mirror statistics (flows concerning trade, migration, tourism statistics, FATS, balance of payments, etc.) are available;
1146 * to producers.
1147
1148 Computed by Eurostat (pre-filled in quality report)
1149 )))
1150 |(((
1151 Calculation formulae:
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169 )))|(((
1170 **Bilateral mirror statistics:**
1171
1172 For each pair of countries, suppose:
1173
1174 * A – Country A
1175 * B – Country B
1176
1179 {{formula}}CC2_{AB} = \frac{OF_{AB} - mIF_{AB}}{\dfrac{OF_{AB} + mIF_{AB}}{2}} {{/formula}}
1180
1183 {{formula}}CC2_{BA} = \frac{OF_{BA} - mIF_{BA}}{\dfrac{OF_{BA} + mIF_{BA}}{2}} {{/formula}}
1184
1187 A joint measure can be obtained from the two differences in relation to an average flow (several possibilities, one is given below):
1188
1189 {{formula}}CC2_{AB} = \frac{\left|OF_{AB} - mIF_{AB}\right| + \left|OF_{BA} - mIF_{BA}\right|}{\dfrac{OF_{AB} + mIF_{AB}}{2} + \dfrac{OF_{BA} + mIF_{BA}}{2}} {{/formula}}
1190
1193 //OF,,AB,,// – outbound flow going from country A to country B
1194 //mIF,,AB,,// – mirror inbound flow
1195 //IF,,BA,,// – mirror inbound flow to country B from country A
1196 //mOF,,AB,,// – mirror outbound flow
1197
1198 **Multilateral mirror statistics: **
1199
1200 //OF,,AiOj,,// – outbound flow going from country A,,i,, to any other country O,,j,,
1201 //mIF,,AiOj,,// – mirror inbound flow
1202 A,,i,, – country A,,i,,
1203
1204 O,,j,, – another country O,,j,,
1205
1206 //K// – the number of countries country A,,i,, may have contacts with
1207 //C// – group of countries EU + EFTA
1208
1209 )))
1210
1211
1212
1213 | |(((
1214 {{formula}}CC = \frac{\sum_{i=1}^{C}\sum_{j=1}^{K}\left|OF_{A_iO_j} - mIF_{A_iO_j}\right|}{\sum_{i=1}^{C}\sum_{j=1}^{K}\dfrac{OF_{A_iO_j} + mIF_{A_iO_j}}{2}} {{/formula}}
1265 )))
1266 |Target value:|The value of this indicator should be as close to zero as possible, since – at least in theory – the value of inbound and outbound flows between pairs of countries should match.
1267 |Aggregation levels and principles:|(((
1268 * MS: The calculation is done for key variables/sub-series to be selected by the Eurostat domain manager.
1269 * EU: Aggregations are possible at EU-level (see multilateral mirror statistics formulae). Alternatively, where e.g. not all information is available, lower and higher values of bilateral mirror statistics can be reported to indicate the range.
1270 )))
1271 |(((
1272 Interpretation:
1273
1274
1275 )))|(((
1276 In domains where mirror statistics are available it is possible to assess geographical comparability measuring the discrepancies between inbound and outbound flows for pairs of countries.
1277
1278
1279 Mirror data can help to check the consistency of the data, of the reporting process and of the definitions used. They can also help to estimate missing data. For users, the asymmetry indicators provide some indication of overall data credibility.
1280
1281
1282 There is perfect symmetry (outbound flows are equal to mirror inbound flows) when the coefficient is equal to zero. The more the coefficient diverges from zero, the larger the asymmetry between outbound flows and mirror inbound flows.
1283 )))
1284 |(((
1285 Specific guidance:
1286
1287
1288 )))|(((
1289 Indicators CC2A,,B,, and CC2B,,A,, can be negative or positive. Indicator CC2AB is always non-negative.
1290
1291
1292 Outbound flows from Member State A to Member State B, as reported by A, should be almost equal to inbound flows into B coming from A, as reported by B. Because some domains use a different valuation principle, inbound flows can be slightly different from outbound flows. Therefore comparisons dealing with mirror statistics have to be made cautiously and should take into account the existence of these discrepancies.
1293
1294
1295 The asymmetry coefficient CC2AB is useful because it can be monitored over time. 
1296
1297
1298 Indicators CC2A,,B,, and CC2B,,A,, can be either positive or negative and can be used to estimate if a country is globally declaring higher or lower level of flows compared with the mirror flows declared by its partner countries.  Indicators CC2A,,B,, and CC2B,,A,, should be presented in a table (example foreign trade statistics).
1299 )))
1300 |References:|(((
1301 * ESS Handbook for Quality Reports – 2009 Edition (Eurostat).
1302 * ESS Standard for Quality Reports – 2009 Edition (Eurostat).
1303 * International trade in services statistics - Monitoring progress on implementation of the Manual and assessing data quality – OECD Eurostat Trade in services experts meeting 2005.
1304 )))
1305
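The bilateral, joint and multilateral asymmetry coefficients described above can be sketched as follows. This is an illustrative sketch, assuming that the joint and multilateral measures use absolute differences (consistent with the statement that the joint indicator is always non-negative); the function names and the flow values are hypothetical.

```python
from typing import Iterable, Tuple


def bilateral_asymmetry(outbound: float, mirror_inbound: float) -> float:
    """Signed difference of the two flows divided by their average."""
    average = (outbound + mirror_inbound) / 2
    if average == 0:
        raise ValueError("average flow is zero; the coefficient is undefined")
    return (outbound - mirror_inbound) / average


def multilateral_asymmetry(flows: Iterable[Tuple[float, float]]) -> float:
    """Sum of absolute differences over the sum of average flows.

    `flows` holds one (outbound, mirror inbound) pair per country pair (i, j).
    """
    flow_list = list(flows)
    total_average = sum((of + mif) / 2 for of, mif in flow_list)
    if total_average == 0:
        raise ValueError("total average flow is zero; the coefficient is undefined")
    return sum(abs(of - mif) for of, mif in flow_list) / total_average


def joint_asymmetry(of_ab: float, mif_ab: float, of_ba: float, mif_ba: float) -> float:
    """Joint bilateral measure: the multilateral formula applied to both directions."""
    return multilateral_asymmetry([(of_ab, mif_ab), (of_ba, mif_ba)])


# Hypothetical flows between countries A and B (e.g. trade values).
cc_ab = bilateral_asymmetry(110.0, 90.0)   # A reports more than B mirrors
cc_ba = bilateral_asymmetry(45.0, 55.0)    # B reports less than A mirrors
cc_joint = joint_asymmetry(110.0, 90.0, 45.0, 55.0)

print(round(cc_ab, 4), round(cc_ba, 4), round(cc_joint, 4))  # 0.2 -0.2 0.2
```

The signed coefficients show which partner reports the higher flow, while the joint coefficient only measures the size of the discrepancy.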
1306
1307
1308 |**Name:**|**CC2. Length of comparable time series**
1321 |(((
1322 Definition:
1323
1324
1325
1326 )))|(((
1327 Number of reference periods in time series from last break.
1328
1329
1330 //Comment//
1331
1332 Breaks in statistical time series may occur when there is a change in the definition of the parameter to be estimated (e.g. variable or population) or the methodology used for the estimation. Sometimes a break can be prevented, e.g. by linking.
1333 )))
1334 |Applicability:|(((
1335 The length of comparable time series is applicable:
1336
1337 * to all statistical processes producing time-series;
1338 * to users and producers, with different levels of detail given.
1339
1340
1341
1342 Computed only by Eurostat but recommended also for inclusion in national quality reports.
1343 )))
1344 |(((
1345 Calculation formula:
1346
1347
1348 )))|(((
1349 The reference periods are numbered.
1350
1351
1352 {{formula}}CC1 = J_{last} - J_{first} + 1 {{/formula}}
1353
1354 //J,,last,,// … number of the last reference period with disseminated statistics.
1355
1356 //J,,first,,// … number of the first reference period with comparable statistics.
1357 )))
1358 |Target value:|A long time series may seem desirable, but changes may be justified, e.g. because reality calls for new concepts or to achieve coherence with other statistics.
1359 |Aggregation levels and principles:|(((
1360 The calculation is done at statistical process level. Aggregations are possible at MS, EU, and Domain (e.g. social statistics, business statistics) level.
1361
1362
1363 The indicator for the EU or domain level should be calculated by Eurostat considering the time series of the EU aggregate.
1364 )))
1365 |(((
1366 Interpretation:
1367
1368
1369 )))|If there has not been any break, the indicator is equal to the number of the time points in the time series.
1370 |Specific guidance:|(((
1371 The length of the series with comparable statistics is expressed as the number of time periods (points) in this series. It is counted from the first time period with statistics after the break onwards. The result does not depend on the length of the reference period.
1372
1373
1374 Only applicable for the statistical data disseminated in the sequence of regular time periods (points).
1375
1376
1377 If more than one series exist for one statistical process the domain manager should select the appropriate ones for calculation.
1378
1379
1380 )))
1381 |(((
1382 References:
1383
1384
1385 )))|ESS Handbook for Quality Reports – 2009 Edition (Eurostat); ESS Standard for Quality Reports – 2009 Edition (Eurostat).
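
The length of the comparable time series can be sketched as follows: the reference periods are numbered from 1, and the count runs from the first period after the last break to the last disseminated period. The function name, the parameter `first_comparable` and the yearly series are hypothetical.

```python
from typing import Optional, Sequence


def comparable_series_length(periods: Sequence[str],
                             first_comparable: Optional[str] = None) -> int:
    """J_last - J_first + 1, with periods numbered 1..N in chronological order.

    `first_comparable` is the first period after the last break; when it is
    None, no break is assumed and the whole series is comparable.
    """
    if not periods:
        raise ValueError("the series has no reference periods")
    j_last = len(periods)
    j_first = 1 if first_comparable is None else periods.index(first_comparable) + 1
    return j_last - j_first + 1


# Hypothetical annual series 2015-2024 with a break before 2019.
years = [str(year) for year in range(2015, 2025)]

print(comparable_series_length(years))          # 10
print(comparable_series_length(years, "2019"))  # 6
```

The result counts periods, not calendar time, so the same function applies to monthly or quarterly series.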
1386 |**Name:**|**AC1. Data tables – consultations [[(% class="wikiinternallink" %)^^**~[1~]**^^>>path:#_ftn1]](%%) **
1393 |(((
1394 Definition:
1395
1396
1397 )))|(((
1398 Number of consultations of data tables within a statistical domain for a given time period.
1399
1400 The "number of consultations" means the number of data table views, where multiple views in a single session count only once.
1401
1402 Some information is available through the monthly Monitoring report on Eurostat Electronic Dissemination and its Excel files with detailed figures.
1405 )))
1406 |Applicability:|(((
1407 The number of consultations of data tables is applicable:
1408
1409 * to all statistical processes using on-line data tables for dissemination of statistics;
1410 * to producers (Eurostat domain managers).
1411
1412 Computed only by Eurostat but recommended also for inclusion in national quality reports.
1413 )))
1414 |(((
1415 Calculation formulae:
1416
1417
1418
1419
1420
1421
1422 )))|(((
1423 AC2 = #//CONS//
1424
1425
1426 where #//CONS// denotes the absolute number of elements in the set CONS (this is also called the cardinality of the set). In this case CONS represents the consultations of a data table for a specific subject-matter domain. The frequency of collection of the figures for this indicator should be monthly.
1427
1428 Remark: internal page views will be excluded.
1429 )))
1430 |Target value:|There is no immediate interpretation of low and high values of this indicator, and there is no particular target.
1431 |Aggregation levels and principles: |(((
1432 The calculation is done at statistical process level. Aggregation is possible at the following levels:
1433
1434 * Domain-specific data tables.
1435 * Annual aggregation.
1436
1437
1438
1439 The principle is to calculate the number of consultations of data tables by subject matter.
1440 )))
1441 |Interpretation:|(((
1442 This indicator should be carefully analysed and combined with other information that will complement the analysis.
1443
1444 The indicator contributes to the assessment of users' demand for data (level of interest) and hence to the assessment of the relevance of subject-matter domains.
1445
1446
1447 A ratio can be computed to give insight into the proportion of consultations of the data tables in question in comparison to the total number of consultations for all the domains.
1448 )))
1449 |Specific guidance: |(((
1450 An informative and straightforward way to represent the output of this indicator is by plotting the figures over time in a graph. In particular, it would be a graph where the horizontal (x) axis would represent months and the vertical (y) axis would represent the number of datasets consulted. It would be possible to monitor the interest of users for each dataset at the domain specific level.
1451
1452
1453 A graph of both the number of consultations of data tables and ESMS files (AC1), with the appropriate tuning, would be interesting to display.
1454 )))
1455 |(((
1456 References:
1457
1458
1459 )))|ESS Handbook for Quality Reports – 2009 Edition (Eurostat); ESS Standard for Quality Reports – 2009 Edition (Eurostat).
1460
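The counting rule for data table consultations (#//CONS//) can be sketched as follows. The sketch assumes that "multiple views in a single session count only once" applies per data table and per session, and that each logged view carries a flag marking internal page views; the `View` record, its fields and the sample log are hypothetical.

```python
from typing import Iterable, NamedTuple


class View(NamedTuple):
    session_id: str  # identifier of the browsing session
    table_id: str    # identifier of the data table viewed
    internal: bool   # True for internal page views, which are excluded


def count_consultations(views: Iterable[View]) -> int:
    """Number of distinct (session, table) pairs among external views."""
    return len({(v.session_id, v.table_id) for v in views if not v.internal})


# Hypothetical view log for one domain and one month.
log = [
    View("s1", "table_a", False),
    View("s1", "table_a", False),  # repeat view in the same session
    View("s1", "table_b", False),
    View("s2", "table_a", False),
    View("s3", "table_a", True),   # internal page view
]

print(count_consultations(log))  # 3
```

Collecting this count monthly per domain gives the series that the specific guidance suggests plotting over time.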
1462
1463
1464 |**Name:**|**AC2. Metadata - consultations [[(% class="wikiinternallink" %)^^**~[2~]**^^>>path:#_ftn2]](%%) **
1475 |(((
1476 Definition:
1477
1478
1479 )))|(((
1480 Number of metadata consultations (ESMS) within a statistical domain for a given time period.
1481
1482 The "number of consultations" means the number of times a metadata file is viewed.
1483
1485 Some information is available through the monthly Monitoring report on Eurostat Electronic Dissemination and its Excel files with detailed figures.
1488 )))
1489 |Applicability|(((
1490 This indicator is applicable:
1491
1492 * to all statistical processes;
1493 * to producers (Eurostat domain managers).
1494
1495 Computed only by Eurostat.
1496 )))
1497 |(((
1498 Calculation formulae:
1499
1500
1501
1502
1503
1504 )))|(((
1505 AC1 = #//ESMS//
1506
1507
1508 where #//ESMS// denotes the absolute number of elements in the set ESMS (this is also called the cardinality of the set). In this case the set ESMS represents the ESMS files consulted for a specific subject-matter domain for a given time period.
1511
1512
1513 Remark: internal page views will be excluded.
1514 )))
1515 |Target value:|There is no immediate interpretation of low and high values of this indicator, and there is no particular target.
1516 |Aggregation levels and principles: |(((
1517 The calculation is done at statistical process level. Aggregation is possible at the following levels:
1518
1519 * Domain-specific ESMS files.
1520 * Annual aggregation.
1521
1522
1523
1524 The principle is to calculate the number of consultations of ESMS files by subject matter domains.
1525 )))
1526 |(((
1527 Interpretation:
1528
1529
1530 )))|(((
1531 The indicator contributes to the assessment of users' demand for metadata (level of interest) and hence to the assessment of the relevance of subject-matter domains.
1532
1533
1534 A ratio can be computed to give insight to the proportion of consultation of the ESMS files in question in comparison to the total number of consultations for all the domains.
1535 )))
1536 |(((
1537 Specific guidance:
1538
1539
1540 )))|(((
1541 An informative and straightforward way to present this indicator is to plot the figures over time, with months on the horizontal (x) axis and the number of ESMS files consulted on the vertical (y) axis. This makes it possible to monitor user interest in each ESMS file at the domain-specific level.
1542
1543
1544 A graph showing, over time, both the number of consultations of data tables (indicator AC1) and of the corresponding metadata (ESMS) files, with appropriate tuning, would be interesting to display.
1545 )))
1546 |References:|ESS Handbook for Quality Reports – 2009 Edition (Eurostat); ESS Standard for Quality Reports – 2009 Edition (Eurostat).
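
As a minimal sketch of the AC2 calculation described above, the Python snippet below counts ESMS page views per domain after excluding internal views, derives a domain's share of all consultations, and builds the monthly series suggested for plotting. The `PageView` record, its field names, the domain codes and the sample figures are hypothetical; the actual layout of the monthly Monitoring report is not assumed here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PageView:
    """One view of an ESMS file (hypothetical record layout)."""
    domain: str     # subject-matter domain of the ESMS file
    month: str      # reference month, e.g. "2013-01"
    internal: bool  # True for internal page views, which are excluded


def ac2_by_domain(views):
    """AC2 = #ESMS per domain, counting external page views only."""
    return Counter(v.domain for v in views if not v.internal)


def consultation_share(counts, domain):
    """Share of one domain in the total consultations of all domains."""
    total = sum(counts.values())
    return counts[domain] / total if total else 0.0


def monthly_series(views, domain):
    """External consultations of one domain per month, sorted by month."""
    per_month = Counter(
        v.month for v in views if v.domain == domain and not v.internal
    )
    return dict(sorted(per_month.items()))


# Invented sample data, for illustration only.
views = [
    PageView("lfs", "2013-01", False),
    PageView("lfs", "2013-01", True),   # internal view, not counted
    PageView("lfs", "2013-02", False),
    PageView("lfs", "2013-02", False),
    PageView("sbs", "2013-01", False),
]
counts = ac2_by_domain(views)
print(counts["lfs"], consultation_share(counts, "lfs"))
print(monthly_series(views, "lfs"))
```

The monthly dictionary can be passed to any plotting tool to obtain the graph of consultations over time described under the specific guidance.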
1547
1548
1549
1550 |(((
1551
1552
1553 **Name:**
1554
1555
1556 )))|**AC3. Metadata completeness - rate**
1557 |(((
1558 Definition:
1559
1560
1561 )))|The ratio of the number of metadata elements provided to the total number of metadata elements applicable.
1562 |Applicability:|(((
1563 The rate of completeness of metadata is applicable:
1564
1565 * to all statistical processes;
1566 * to producers (Eurostat domain managers).
1567
1568
1569
1570 Computed only by Eurostat but recommended also for inclusion in national quality reports.
1571 )))
1572 |(((
1573 Calculation formulae:
1574
1575
1576
1577 )))|(((
1578 //AC//3//,,C,,// = (∑#//M,,L,,//) / (∑#//L//)
1579
1580
1581
1582
1583
1584 //L// in the denominator is the set of applicable metadata elements under consideration, and //M,,L,,// in the numerator is the subset of //L// for which metadata are available. The notation #//L// means the number of elements in the set //L// (the cardinality). The subscript //C// on the left-hand side of the formula stands for both EU and EFTA countries.
1585
1586
1587 The set //L// is obtained for a group of metadata elements, as explained below, over a geographical entity (MS or the EU+EFTA), a statistical domain, etc.
1588
1589
1590 There are three groups of metadata, described below together with a categorisation using the current EURO-SDMX concepts (only the main concepts are included in the following breakdown).
1591
1592
1593 1. Metadata about statistical outputs: concepts 3, 4, 5, 8.1, 9, 10;
1594 1. Metadata about statistical processes: concepts 11, 20.1, 20.2, 20.3, 20.4, 20.5, 20.6;
1595 1. Metadata about quality: concepts 12-19.
1596
1597
1598
1599 Computations are made separately for each of the three groups and for each combination (group of metadata, EU level, etc.).
1600 )))
1601 |Target value:|The target value is 1, meaning that 100% of the metadata required/applicable for the statistical process, or aggregate, in question is available.
1602 |Aggregation levels and principles: |(((
1603 The calculation is done at the level of ESMS files.
1604
1605 Aggregations are possible at MS, EU, and Domain (e.g. social statistics, business statistics) level.
1606
1607
1608 The principle is to calculate the indicator as an unweighted rate at the level of MS and EU for a statistical domain (social statistics, business statistics, etc.).
1609 )))
1610 |(((
1611 Interpretation:
1612
1613
1614 )))|(((
1615 Each indicator shows to what extent metadata of a specific type is available compared to what should be available.
1616
1617
1618 This indicator should be analysed carefully, since the rate reflects only the amount of metadata existing for a given statistical process, not the quality of that information.
1619
1620 )))
1621 |Specific guidance:|(((
1622 All the information is to be retrieved from ESMS files.
1623
1624 If the ESMS is empty for the categories specified above, no calculation is needed, but a descriptive text should be provided instead.
1625
1626
1627 For Eurostat, these files are directly accessible through Eurostat's website, whereas for MS the ESMS files will become accessible, in the near future, through the National RME tool.
1628
1629
1630 It should be taken into account what availability of metadata actually means.
1631 )))
1632 |(((
1633 References:
1634
1635
1636 )))|(((
1637 * ESS Handbook for Quality Reports – 2009 Edition (Eurostat).
1638 * ESS Standard for Quality Reports – 2009 Edition (Eurostat).
1639 * Euro SDMX Metadata Structure, version March 2009.
1640 )))
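
The AC3 rate defined above can be sketched in Python as follows. The sketch assumes each ESMS file is represented as a dictionary mapping a concept number to its text, that a concept absent from the dictionary is not applicable, and that `None` or blank text means the element is applicable but not provided. These conventions, the function names and the sample files are illustrative assumptions, not part of the indicator definition; in particular, what counts as "available" metadata has to be decided by the compiler.

```python
# The three metadata groups, keyed by EURO-SDMX concept number (as strings).
GROUPS = {
    "outputs": {"3", "4", "5", "8.1", "9", "10"},
    "processes": {"11", "20.1", "20.2", "20.3", "20.4", "20.5", "20.6"},
    "quality": {str(n) for n in range(12, 20)},  # concepts 12-19
}


def is_available(value):
    """Assumed rule: an element is available if it holds non-blank text."""
    return isinstance(value, str) and value.strip() != ""


def ac3_rate(esms_files, group):
    """Unweighted rate: sum of #M_L divided by sum of #L over the files.

    Returns None when no element of the group is applicable, in which
    case a descriptive text is expected instead of a figure.
    """
    concepts = GROUPS[group]
    provided = 0
    applicable = 0
    for esms in esms_files:
        for concept, value in esms.items():
            if concept in concepts:
                applicable += 1
                if is_available(value):
                    provided += 1
    return provided / applicable if applicable else None


# Invented sample files, for illustration only.
files = [
    {"3": "Data description", "4": "Unit of measure", "5": None,
     "12": "Relevance", "13": ""},
    {"3": "Data description", "4": None, "11": "Source data"},
]
for name in GROUPS:
    print(name, ac3_rate(files, name))
```

Passing the files of a single country gives the MS-level rate, while passing all files of a domain gives the aggregate. Pooling the counts before dividing matches the sums in the numerator and denominator of the formula.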
1641
1642
1643
1644
1645 ----
1646
1647 [[~[1~]>>url:file:///C:/Users/axyli/Downloads/02-ESS-Quality-and-performance-Indicators-2014.pdf#_ftnref1]] The indicator must be collected in collaboration with Unit D4 - Dissemination.
1648
1649 [[~[2~]>>url:file:///C:/Users/axyli/Downloads/02-ESS-Quality-and-performance-Indicators-2014.pdf#_ftnref2]] The indicator must be collected in collaboration with Unit D4 - Dissemination.
1650
1651
© Semantic R&D Group, 2026