Data Quality

During the preparatory stage of FALCOT 92 and its implementation, every effort was made to ensure strict adherence to the methodology developed and the field protocol designed. However, classic problems may arise in any survey regardless of the control procedures used. The quality of the data collected is usually determined by how well measurement errors (non-sampling errors) are controlled, provided that a proper methodology is used to keep sampling errors, which cannot be eliminated, to a minimum. Having made every effort to control non-sampling errors, and assuming that the questionnaire was carefully designed and pretested, it is natural to expect a data set of fairly good quality. Nevertheless, response bias and field worker bias may still lower the quality of the data.

A general assessment of the data quality of FALCOT 92 is not the subject of this section. However, because of the sensitivity of the parameters we are trying to estimate and of the procedures used, we present here a brief investigation of the quality of the variables entering the estimates of childhood mortality and fertility. The overall quality of these variables will be assessed by evaluating age structures, average parities, sex ratios of children ever born (CEB) and the proportion of dead children.

Age Structure
The data set represents the responses of 1223 females between 15 and 85 years of age at the time of field work. In this and the next section, only the 995 (81.4%) female respondents of child-bearing age are used in analyses based on data by age group. Only the 665 (54.4%) ever married women (EMW) who reported a marriage duration of 34 years or less are used in analyses based on data classified by duration of marriage.

Table 2.4 shows the age structure of the females qualified for analysis. The breakdown of all females of child-bearing age into 5-year categories is very similar to the overall age structure of females in the same age interval, which is compiled from the listing of ages of all household members. The only exception is the share of females in the age groups 20-24 and 25-29: the first is under-represented and the second over-represented. Assuming that the overall age structure is accurate, this slight misrepresentation may be viewed as an indicator of selection bias during randomization or during field work. The over- and under-representation in these two age groups will have a direct impact on the estimates of IMR and U5MR, because the average parities of these two age groups are used extensively in the calculations.

Table 2.4 Age structure of qualified females for child mortality estimation

Age Group   # All Females   Percent   # of EMW   Percent   % EMW of All
15-19                 258      25.9         44       7.1           17.1
20-24                 184      18.5        134      21.5           72.8
25-29                 179      18.0        124      19.9           69.3
30-34                 128      12.9        105      16.8           82.0
35-39                  90       9.1         75      12.1           83.3
40-44                  84       8.4         76      12.1           90.5
45-49                  72       7.2         67      10.7           93.1
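
As a rough cross-check, the last column of Table 2.4 can be recomputed from the two count columns. The sketch below is only an illustration: the variable names are hypothetical, and it recomputes only the "% EMW of All" column and the total number of females, not the two "Percent" columns.

```python
# (age group, number of all females, number of ever married women),
# copied from Table 2.4.
table_2_4 = [
    ("15-19", 258, 44),
    ("20-24", 184, 134),
    ("25-29", 179, 124),
    ("30-34", 128, 105),
    ("35-39", 90, 75),
    ("40-44", 84, 76),
    ("45-49", 72, 67),
]

# Total number of females of child-bearing age across all groups.
total_females = sum(females for _, females, _ in table_2_4)

# Share of ever married women among all females, per age group, in percent.
pct_emw_of_all = {
    group: round(100.0 * emw / females, 1)
    for group, females, emw in table_2_4
}

print(total_females)
for group, pct in pct_emw_of_all.items():
    print(group, pct)
```

The total agrees with the 995 females of child-bearing age mentioned in the text.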

Age Reporting
Age misreporting, in particular digit preference for ages ending in 0 or 5, is common in censuses and sample surveys in most developing countries. This is also evident in the FAFO survey. An indicator of the degree of age heaping is Whipple's Index, which ranges from 100, when there is no preference for 0 and 5, up to 500, when only ages ending in 0 and 5 are reported (Newell 1988: 24). Whipple's Index for the age range 15-49 in the FAFO survey is 141, which indicates a high but not severe case of digit preference. The United Nations describes data with an Index of this magnitude as "rough" (Newell 1988: 25).
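
The index described above can be sketched in a few lines. This is a minimal illustration, not the computation actually used for the survey: the function name and the sample inputs are hypothetical, and the age range is left as a parameter because Whipple's Index is conventionally computed over ages 23-62, whereas the text applies it to ages 15-49.

```python
from collections import Counter


def whipple_index(ages, low=15, high=49):
    """Whipple-type index for single-year ages within [low, high].

    Returns 100 when ages ending in 0 or 5 hold exactly one fifth of the
    respondents, and 500 when every reported age ends in 0 or 5.
    """
    if (high - low + 1) % 5 != 0:
        # Any run of 5k consecutive ages contains exactly k ages ending in
        # 0 or 5, which is what makes the "one fifth" baseline valid.
        raise ValueError("age range must cover a multiple of five years")
    counts = Counter(age for age in ages if low <= age <= high)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no respondents in the requested age range")
    heaped = sum(n for age, n in counts.items() if age % 5 == 0)
    # Equivalent to 100 * heaped / (total / 5).
    return 500.0 * heaped / total


# Hypothetical inputs: one respondent at every age, then only "round" ages.
no_heaping = list(range(15, 50))
full_heaping = [15, 20, 25, 30, 35, 40, 45]
print(whipple_index(no_heaping))
print(whipple_index(full_heaping))
```

The two hypothetical inputs reproduce the end points of the scale quoted from Newell (100 and 500).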

There is also a clear case of age misreporting at age 16, with more than twice as many respondents (98) at age 16 as at ages 15 (44) and 17 (45). Over-sampling at other ages seems to have happened to a lesser extent (ages 32, 42 and 48).

Age misreporting may result in a slight bias in parity calculations since the heaping usually occurs at the start of age groups. However, since age heaping seems to have occurred in all categories, we expect the net effect to be negligible. Had field workers been instructed to verify ages from official documents, the effect of this phenomenon could have been reduced.

The distribution of age at first marriage is not as smoothly positively skewed as it should be. Calculations indicate possible over-sampling of ever married women, or age heaping by the respondents when reporting their age at first marriage. The age heaping is very clear at ages ending in 0 or 5. Other ages suspected of heaping or over-sampling are 32, 42 and 48, as was also found in the age reporting of all females.

Sex Ratios and Proportion Dead
An acceptable sex ratio of male (MCEB) to female (FCEB) children ever born usually falls between 1.02 and 1.07. Sex ratios falling outside this interval may indicate errors in sampling or under-reporting of births of one sex, many of whom probably died at an early age. The official reports of the Israeli CBS for the occupied territories indicate that the sex ratio at birth was approximately 1.5 for 1991 (ICBS, 1992). The overall sex ratio for CEB in the data set used here is 1.12. This slight deviation is not serious enough to raise concern about the quality of the data. The age-specific sex ratios and the proportion dead by age category are presented in Table 2.5.

The sex ratios in the table may indicate under-reporting of female births, but could also stem from sampling error. Such under-reporting would probably be due to omissions by mothers of female children who died shortly (within days or weeks) after birth. On the other hand, the sex ratio of dead children could indicate severe under-reporting of dead males in three age groups. The abnormality of the sex ratios for both births and deaths casts some doubt on the quality of the data set. These abnormalities will directly affect our estimates of childhood mortality, and pave the way for an over-estimation of female infant mortality and an under-estimation of male infant mortality.

As for the proportion of dead children, we notice a drop in this proportion for the 25-29 age group. Moreover, this indicator does not increase normally for the first three age groups, as opposed to the last three age groups. Both the drop in the proportion of dead children and the slow increase for the first three categories are indicators of problems with the data quality.

Table 2.5 Sex ratios of children ever born, dead children and proportion of dead children by age of mother at the time of interview

Age Group   Sex Ratio at Birth   Sex Ratio of     Proportion of Dead Children
            (MCEB/FCEB)          Dead Children    (# DEAD/# FEMALES)
15-19             0.82               ----                  0.0104
20-24             1.02               0.40                  0.0512
25-29             1.48               1.06                  0.0497
30-34             1.27               0.83                  0.0656
35-39             0.86               0.38                  0.0921
40-44             1.10               1.55                  0.1109
45-49             1.07               1.11                  0.1499
Total             1.12               0.92                  0.0914
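
The two checks discussed in this section can be applied mechanically to the age-specific rows of Table 2.5. The following is a minimal sketch under stated assumptions: the variable names are hypothetical, the 1.02-1.07 interval from the text is treated as inclusive, and a "drop" simply means a proportion dead lower than in the preceding age group.

```python
# Acceptable interval for the sex ratio at birth, as stated in the text.
ACCEPTABLE_SEX_RATIO = (1.02, 1.07)

# (age group, sex ratio at birth, sex ratio of dead children,
#  proportion of dead children), copied from Table 2.5.
# None marks the entry shown as "----" in the table.
table_2_5 = [
    ("15-19", 0.82, None, 0.0104),
    ("20-24", 1.02, 0.40, 0.0512),
    ("25-29", 1.48, 1.06, 0.0497),
    ("30-34", 1.27, 0.83, 0.0656),
    ("35-39", 0.86, 0.38, 0.0921),
    ("40-44", 1.10, 1.55, 0.1109),
    ("45-49", 1.07, 1.11, 0.1499),
]

low, high = ACCEPTABLE_SEX_RATIO

# Age groups whose sex ratio at birth falls outside the acceptable interval.
outside_interval = [
    group for group, ratio_at_birth, _, _ in table_2_5
    if not (low <= ratio_at_birth <= high)
]

# Age groups whose proportion dead is lower than in the preceding group.
drops = [
    current[0]
    for previous, current in zip(table_2_5, table_2_5[1:])
    if current[3] < previous[3]
]

print(outside_interval)
print(drops)
```

Only the 20-24 and 45-49 groups lie inside the interval (both on its boundary), and the single drop in the proportion dead is the one at 25-29 noted in the text.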

Average Parity
Questions relating to children ever born were put to all ever married women (EMW). Investigation of the data shows that no EMW were classified as having missing parity, and that 72 (10.8%) EMW were classified as having had no children (zero parity). Most of these are relatively newly wed females. An average parity that increases systematically with the age of the woman, with no sharp jumps, is a sign of acceptable data quality. The average parity is calculated by dividing the number of births by the total number of female respondents in each age group; see Table 2.11 in the fertility section.
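
The calculation described above amounts to one division per age group, followed by a check that the result rises with age. The sketch below only illustrates the arithmetic: the function names are hypothetical, and the births and respondent counts are made-up numbers, not survey results (the actual figures are in Table 2.11).

```python
def average_parity(births, women):
    """Births divided by the number of female respondents, per age group."""
    return {group: births[group] / women[group] for group in women}


def increases_with_age(parities, ordered_groups):
    """True if average parity never decreases from one age group to the next."""
    values = [parities[group] for group in ordered_groups]
    return all(later >= earlier for earlier, later in zip(values, values[1:]))


# Hypothetical counts for illustration only, NOT taken from the survey.
groups = ["15-19", "20-24", "25-29"]
births = {"15-19": 10, "20-24": 60, "25-29": 150}
women = {"15-19": 100, "20-24": 50, "25-29": 50}

parities = average_parity(births, women)
print(parities)
print(increases_with_age(parities, groups))
```

This simple test only detects decreases; what counts as a "sharp jump" would still need a threshold chosen by the analyst.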

Conclusion
Every measure we have used to check the data quality has revealed a problem of some sort, the most serious being related to sex ratios. Our conclusion is that the available data, although suitable for many other analytical purposes, may not be good enough to produce reliable estimates of the parameters of childhood mortality and fertility. Therefore, all estimates and analyses should be treated with caution.

----------------

al@mashriq                       960715