Data Collection

Overview of data collection procedures

The AHDSS database contains the data resulting from the exhaustive coverage of demographic events within a geographically defined population situated within the Bushbuckridge sub-district of South Africa. As of June 2018, the study site consists of 31 research villages with a population of 116,549 people living in 22,721 households. The population includes all persons resident in the study site, so no sampling is required. It also includes people linked as temporary migrants to households in the sub-district. This enables, amongst other benefits, computation of the incidence of population events in the de jure population.

Fertility, mortality, and migration data are based on a comprehensive registration system starting with a baseline enumeration of the whole population in 1992. This has been followed by routine updates of vital events involving repeat visits to all households in the population. There were four updates between 1992 and 1998, followed by annual updates from 1999 to present. Variables measured routinely include births, deaths, in- and out-migrations, household relationships, resident status, refugee status, education, and antenatal and delivery health-seeking practices.

During the census update rounds, a trained fieldworker visits a household unit and interviews the most knowledgeable respondent available. Individual-level information is checked and updated for all household members. Any events that have occurred since the previous census update are recorded. Where appropriate, certain questions are directed at specific household members. For example, maternity history or pregnancy outcome information is asked directly of the woman involved. Similarly, a verbal autopsy is conducted with the person most closely involved with the deceased during their terminal illness to establish the most probable cause of death.

Data quality checks include duplicate surveying of a random sample of households and rigorous checking of census forms at field and office levels.

Overview of database structure

The Agincourt database is a relational database that provides a longitudinal representation of population data in the sub-district. The data are stored in Microsoft SQL Server 2005. The historical evolution of the database is as follows: the baseline census was captured in FoxPro in 1993 and converted into Microsoft Access in 1995; the database then followed the upgrades of the Microsoft Access software until 2001, when it was converted into SQL Server 2000. The current relational database model has been in place since 1999.

Overview of database contents

A range of tables store information in the database and are linked together to form the relational structure. The following tables exist:

  • The Observations table records the details of each household interview. Examples of data fields: ‘date of observation’ and ‘name of fieldworker’
  • The Individuals table records the basic information about each individual in the population. Examples of data fields: ‘Name’, ‘Surname’, ‘Date of Birth’, ‘Gender’, ‘Nationality’, and ‘Refugee Status’
  • The Residences table records the episode of an individual in a location. The key data are the date at which a residence starts or ends, and the event that starts an episode (e.g. birth, in-migration, or present at the start of the DSS), and/or the event that ends the episode (e.g. death, out-migration). If an episode has not ended and is still open, then it is marked as ‘current’.
  • The Locations table records the dwelling place at which an individual is located. The residence episode records when a person enters or leaves this place. A location is a physical structure that has a corresponding latitude and longitude.
  • The Memberships table records the episode of an individual joining or leaving a household. The key data are the date at which a membership starts or ends, and the event that starts an episode (e.g. birth, in-migration, or present at the start of the DSS), and/or the event that ends the episode (e.g. death, out-migration). If an episode has not ended and is still open, then it is marked as ‘current’. Included in this table is the relationship of the individual to the household head (e.g. wife, son). If a household head dies or out-migrates, then the membership episode is closed and a new membership episode is opened containing the new relationship to the household head.
  • The Households table records the social unit (i.e., household) of which an individual is a member.
  • The Union Episodes table records the episode of an individual in a union with another individual. This table records the date when a union starts, the event that starts it (e.g., marriage, re-marriage), the date when a union ends, and the event that ends it (e.g., divorce, separation or death). The Union Episodes table links to the Marriage Attributes table, which records details about marriages and informal unions.
  • There are several ‘Event’ tables, which record the details of the demographic events that bring an individual into the database or remove an individual from the database. A few key variables are recorded for each event, a particularly important one being the date of the event. These tables include:
    • Pregnancies
    • Births
    • In-Migrations
    • Out-Migrations
  • A number of status observation tables are also stored in the database. These tables record attributes of either individuals or households and are recorded as cross-sectional data at the time of a census. Most status observations are repeated over time. Other related tables include the Verbal Autopsy table, which records information used to determine cause of death, and the Maternity History table, which records the details of a woman’s fertility outside the study site and observation period.
    • At household level, such tables include:
      • Asset Status
      • Child Care Grants
      • Food Security Status
    • At the individual level, such tables include:
      • Adult Health Status
      • Child Care Grants
      • Child Morbidity
      • Cough Status
      • Education Status
      • ElderHealthCare
      • Fatherhoods
      • Father Support Status
      • Health Care Utilisation
      • IndividualGrantStatus
      • Labour Status
      • Stroke Status
      • Temporary Migrations
      • Vital Documents
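
As a rough illustration of how the episode tables relate to one another, the following Python sketch builds a toy version of the Individuals, Locations, and Residences tables in an in-memory SQLite database and lists the open residence episodes. It is not the AHDSS schema: all table names, column names, and rows are invented for illustration, SQLite stands in for SQL Server, and an open (‘current’) episode is assumed here to be represented by a missing end date.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical, simplified tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE individuals (
            individual_id INTEGER PRIMARY KEY,
            date_of_birth TEXT,
            gender        TEXT
        );
        CREATE TABLE locations (
            location_id INTEGER PRIMARY KEY,
            latitude    REAL,
            longitude   REAL
        );
        CREATE TABLE residences (
            residence_id  INTEGER PRIMARY KEY,
            individual_id INTEGER REFERENCES individuals(individual_id),
            location_id   INTEGER REFERENCES locations(location_id),
            start_date    TEXT,
            start_event   TEXT,
            end_date      TEXT,  -- assumption: NULL while the episode is open
            end_event     TEXT
        );
    """)
    # All rows below are invented sample data, not real AHDSS records.
    conn.executemany("INSERT INTO individuals VALUES (?, ?, ?)", [
        (1, "1990-03-14", "F"),
        (2, "2001-07-02", "M"),
    ])
    conn.executemany("INSERT INTO locations VALUES (?, ?, ?)", [
        (10, 0.0, 0.0),
        (11, 0.1, 0.1),
    ])
    conn.executemany("INSERT INTO residences VALUES (?, ?, ?, ?, ?, ?, ?)", [
        (100, 1, 10, "1992-01-01", "enumeration", "2005-06-30", "out-migration"),
        (101, 1, 11, "2008-02-01", "in-migration", None, None),
        (102, 2, 10, "2001-07-02", "birth", "2010-09-15", "death"),
    ])
    return conn


def current_residents(conn):
    """Return (individual_id, location_id) for every open residence episode."""
    return conn.execute("""
        SELECT r.individual_id, r.location_id
        FROM residences AS r
        JOIN individuals AS i ON i.individual_id = r.individual_id
        JOIN locations  AS l ON l.location_id  = r.location_id
        WHERE r.end_date IS NULL
        ORDER BY r.individual_id
    """).fetchall()
```

Calling `current_residents(build_demo_db())` on this toy data returns only the episode that has no end date, which shows how a single individual can have several residence episodes of which at most one is open.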

Managing “time” in the AHDSS

The date is recorded for all vital events (births, deaths, and migrations) in the database. If the date is estimated, this is indicated in a separate field. Observations are time-stamped with an observation date. This gives the date at which an interview took place, which is the date at which the data were recorded. All events and status observations can be linked to an observation date. Residences and memberships are recorded as episodes with start and end dates. As described above, a residence is the period of time an individual spends located at a specific dwelling. A membership is the period of time that an individual remains a member of a household, i.e., a social unit. The events that start or end a residence or a membership are described above. Status observations are repeated, cross-sectional measures, and the dates of observation are recorded. These observations are repeated at different periodicities in the database, and some have only been captured once.
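
Because residences are stored as episodes with start and end dates, the time each individual contributes to a given period can be derived by clipping each episode to that period. The sketch below shows one way to do this in Python. The function name, the half-open window convention (the window end date is excluded), and the sample dates are assumptions made for illustration and are not part of the AHDSS database.

```python
from datetime import date


def exposure_days(start, end, window_start, window_end):
    """Days an episode overlaps the window [window_start, window_end).

    `end` is None for an episode that is still open; such an episode is
    assumed to run to the end of the window.
    """
    effective_start = max(start, window_start)
    effective_end = window_end if end is None else min(end, window_end)
    # Episodes that lie entirely outside the window contribute zero days.
    return max((effective_end - effective_start).days, 0)


# Invented residence episodes as (start_date, end_date) pairs.
episodes = [
    (date(1992, 1, 1), date(2005, 6, 30)),   # closed during the window
    (date(2008, 2, 1), None),                # opens after the window
    (date(2001, 7, 2), date(2010, 9, 15)),   # spans the whole window
]

# Observation window: calendar year 2005.
window = (date(2005, 1, 1), date(2006, 1, 1))

total_days = sum(exposure_days(s, e, *window) for s, e in episodes)
```

Dividing the number of events observed in the window by the summed exposure gives an incidence rate per person-day, which can then be rescaled to person-years.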

Overview of which census modules were run in which years