The Research Value of Data

It is important to note that data can take a number of forms. The terms "raw" or "primary" data refer to information available in the form in which it was originally recorded: for example, a registration record, a daily count of new hospital cases, or a list of secondary schools. In contrast, statistics are numbers produced to describe or summarize patterns in primary data: the number of families registered in a given Field, average hospital admissions per month, or the percentage of districts containing secondary schools. Statistics are often easier than primary data to access and manipulate, but because they are essentially summaries, they always imply a loss of original information. This may limit their usefulness for some types of research.
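The loss of information implied by a statistic can be illustrated with a minimal Python sketch. The daily case counts below are invented for illustration and are not UNRWA figures; the function name `monthly_average` is likewise only an assumption.

```python
from statistics import mean

# Hypothetical primary data: daily counts of new hospital cases recorded
# at two clinics (values invented for illustration only).
daily_cases_a = [10, 10, 10, 10]
daily_cases_b = [0, 0, 0, 40]


def monthly_average(daily_counts):
    """Summarize a list of primary records as a single statistic."""
    return mean(daily_counts)


avg_a = monthly_average(daily_cases_a)
avg_b = monthly_average(daily_cases_b)

# The two summaries coincide although the underlying records differ,
# so the primary data cannot be recovered from the statistic alone.
same_summary = avg_a == avg_b
same_records = daily_cases_a == daily_cases_b
print(avg_a, avg_b, same_summary, same_records)
```

A researcher interested in, say, sudden outbreaks would need the daily records; the average alone cannot distinguish a steady caseload from a single peak.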
Another important dimension is the unit of analysis or type of population
that the data describe. Different units of analysis are appropriate for
different types of data. For example, the implied unit of analysis for
age or gender data is the individual; for measures of income, it may be
the household or family; and for infrastructure or public health measures,
the village. Units of analysis can of course be combined to different levels
of aggregation, such as the Area, the Field, or various socio-economic groupings.
The unit of analysis may also be relevant to technical discussions about
linking different data bases, such as individual birth records and family-level
registration files.
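As a rough sketch of what such linking involves, the following Python example attaches family-level attributes to individual birth records and then aggregates the result to a higher level. All field names (`family_id`, `area`, and so on) and all records are hypothetical; the actual structure of the UNRWA files is not assumed here.

```python
from collections import defaultdict

# Hypothetical family-level registration file, keyed by an assumed
# family identifier (field names are illustrative only).
families = {
    "F1": {"area": "Area A"},
    "F2": {"area": "Area B"},
}

# Hypothetical individual-level birth records carrying the same key.
births = [
    {"person_id": "P1", "family_id": "F1", "birth_year": 1990},
    {"person_id": "P2", "family_id": "F1", "birth_year": 1993},
    {"person_id": "P3", "family_id": "F2", "birth_year": 1991},
    {"person_id": "P4", "family_id": "F9", "birth_year": 1992},  # no matching family
]


def link_births_to_families(births, families):
    """Attach family-level attributes to each individual record.

    Returns (linked, unmatched): records whose family_id was found in
    the family file, and records that could not be linked.
    """
    linked, unmatched = [], []
    for record in births:
        family = families.get(record["family_id"])
        if family is None:
            unmatched.append(record)
        else:
            linked.append({**record, **family})
    return linked, unmatched


def births_per_area(linked):
    """Aggregate linked individual records up to the area level."""
    counts = defaultdict(int)
    for record in linked:
        counts[record["area"]] += 1
    return dict(counts)


linked, unmatched = link_births_to_families(births, families)
area_counts = births_per_area(linked)
print(area_counts, [r["person_id"] for r in unmatched])
```

Keeping the unmatched records separate, rather than silently dropping them, makes it possible to judge how complete the link between the two units of analysis is.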
As we will see in the sections to follow, the UNRWA data currently available
vary along all these dimensions. Some data sources, such as the registration
records and Special Hardship Cases data base, are maintained as primary
data; others, particularly in the areas of health and education, consist
of statistics at various levels of aggregation. Depending on the source
and subject matter, underlying units of analysis may include the individual,
the family, the clinic, or the school.

By relevance we mean the usefulness of the data in terms of their potential for answering a research question.
The comprehensive geographical and thematic scope of UNRWA's data is their
greatest advantage. The UNRWA data represent one of the few possibilities
for comparison of the conditions for Palestinian refugees in different host
countries. The fact that records were systematically collected over time
creates opportunities for time series analysis, and further increases the
value of the data.

The quality of data has two dimensions: reliability and validity. Data with little reliability have limited value for researchers. The reliability of data is determined by how the data are produced, as the term refers to the accuracy of the various operations in this process. Here proper documentation of the procedures for collecting information is of great importance. Data have high reliability if repeated measurements of the same phenomenon provide consistent results.

To provide an accurate representation, data must be valid as well as reliable. Validity means that the data actually measure the concept that they purport to measure. For example, "wage" or "salary" income alone is not always a valid measure of household income, as some families will supplement these earnings with income from home-based businesses. Data can have low reliability and high validity, or vice versa. Also, reliability and validity are not all-or-none properties; both are matters of degree. The validity and reliability of the various UNRWA data will be discussed in more detail in the following sections of this report.

By accessibility we mean the ease with which data can be obtained by a researcher and arrayed in a form suitable for research. Access to data thus has both legal and technical dimensions. Legally, access may be constrained by the need to obtain permissions from the responsible authorities. To accomplish its mandate of providing services to refugees in a politically unstable landscape, UNRWA may have to place constraints on researchers, and the need for permissions may be duly justified. Introducing standardized procedures and forms for applying for permissions may, however, improve access for approved researchers from outside the Agency.
Regarding the technical dimension of accessibility, central storage and
computerization of UNRWA's data would substantially improve access for researchers.
Computerized data bases can be accessible from anywhere through computer
networks. An index of contents (such as lists of tables and variables)
and documentation of data collection methods would further improve access to
data.
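To make the idea of an index of contents concrete, here is a minimal Python sketch of a catalogue listing tables, their units of analysis, and their variables. The table and variable names are invented for illustration and do not describe any existing UNRWA data base.

```python
# Hypothetical index of contents: one entry per table, recording its
# unit of analysis and its variables (all names are invented).
catalogue = [
    {"table": "registration", "unit": "family",
     "variables": ["family_id", "area", "family_size"]},
    {"table": "clinic_visits", "unit": "clinic",
     "variables": ["clinic_id", "area", "month", "visits"]},
    {"table": "school_enrolment", "unit": "school",
     "variables": ["school_id", "year", "pupils"]},
]


def tables_with_variable(catalogue, name):
    """Return the names of all tables that contain the given variable."""
    return [entry["table"] for entry in catalogue if name in entry["variables"]]


def tables_by_unit(catalogue, unit):
    """Return the names of all tables with the given unit of analysis."""
    return [entry["table"] for entry in catalogue if entry["unit"] == unit]


area_tables = tables_with_variable(catalogue, "area")
school_tables = tables_by_unit(catalogue, "school")
print(area_tables, school_tables)
```

Even a simple listing of this kind lets a researcher see which sources share a variable, and so might be compared or linked, before requesting access to the data themselves.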
al@mashriq 960428/960613