Data Collection

The Agincourt database is a relational database that provides a longitudinal representation of the population of the sub-district. The data are stored in Microsoft SQL Server 2005. The database has evolved as follows: the baseline census was captured in FoxPro in 1993 and converted to Microsoft Access in 1995; the database then followed successive upgrades of Microsoft Access until 2001, when it was converted to SQL Server 2000. The current relational database model has been in place since 1999.

Overview of data collection procedures

The AHDSS database houses the data arising from thorough coverage of demographic events within a geographically defined population in the Bushbuckridge sub-district of South Africa. No sampling is needed because the population consists of all the people who live in the research area, including those who are temporarily attached to homes in the sub-district as migrants. This complete coverage makes it possible to calculate how frequently demographic events occur in the population.


  • Fertility, mortality, and migration data are based on a comprehensive registration system starting with a baseline enumeration.
  • This has been followed by a routine update of vital events involving repeat returns to all households in the population.
  • There were four updates between 1992 and 1998 followed by annual updates from 1999 to present.
  • Variables measured routinely include: births, deaths, in- and out-migrations, household relationships, resident status, refugee status, education, antenatal and delivery health-seeking practices.
  • During the census update rounds, a trained fieldworker visits a household unit and interviews the most knowledgeable respondent available.
  • Individual-level information is checked and updated on all household members. Any events that have occurred since the previous census update are recorded.
  • Where appropriate, certain questions are directed at specific household members. For example, maternity history or pregnancy outcome information is asked directly of the woman involved, and a verbal autopsy is conducted with the person most closely involved with the deceased during their terminal illness, to establish the most probable cause of death.
  • Data quality checks include duplicate surveying of a random sample of households and rigorous checking of census forms at field and office levels.
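
The duplicate-survey check in the last point can be illustrated with a short sketch. It is not the AHDSS procedure: the sampling fraction, the function names and the record layout (one dictionary of field values per visit) are all assumptions made for the example.

```python
import random


def pick_recheck_sample(household_ids, fraction, seed=None):
    """Return a random subset of households to be interviewed a second time.

    `fraction` is a hypothetical sampling fraction; the source does not
    state what proportion of households is re-surveyed.
    """
    rng = random.Random(seed)
    ids = list(household_ids)
    k = max(1, round(len(ids) * fraction))
    return sorted(rng.sample(ids, k))


def disagreements(first_visit, second_visit):
    """List (field, first value, second value) for every field that differs."""
    return [
        (field, first_visit.get(field), second_visit.get(field))
        for field in sorted(set(first_visit) | set(second_visit))
        if first_visit.get(field) != second_visit.get(field)
    ]
```

Passing a fixed `seed` makes the draw reproducible, which is useful if the sample must be audited later.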

Overview of database contents

A range of tables store information in the database and are linked together to form the relational structure. The following tables exist:

The Observations table records the details of each household interview. Examples of data fields: ‘date of observation’ and ‘name of fieldworker’.

The Individuals table records the basic information about each individual in the population. Examples of data fields: ‘Name’, ‘Surname’, ‘Date of Birth’, ‘Gender’, ‘Nationality’, and ‘Refugee Status’.

The Residences table records the episode of an individual in a location. The key data are the date at which a residence starts or ends, and the event that starts an episode (e.g. birth, in-migration, or present at the start of the DSS), and/or the event that ends the episode (e.g. death, out-migration). If an episode has not ended and is still open, then it is marked as ‘current’.
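
Because residence is stored as episodes, the time a person spends under observation can be summed directly from the start and end dates. The sketch below shows one way this might be done. The function name, the tuple layout and the use of `None` for a ‘current’ episode are assumptions for illustration, not the AHDSS schema.

```python
from datetime import date


def person_years(episodes, window_start, window_end, days_per_year=365.25):
    """Sum the time that residence episodes contribute inside a window.

    `episodes` is an iterable of (start_date, end_date) pairs. An end_date
    of None stands for an episode that is still open ('current').
    """
    total_days = 0
    for start, end in episodes:
        # Clip each episode to the analysis window.
        clipped_start = max(start, window_start)
        clipped_end = window_end if end is None else min(end, window_end)
        if clipped_end > clipped_start:
            total_days += (clipped_end - clipped_start).days
    return total_days / days_per_year


# Example: three episodes, observed over the calendar year 2000.
episodes = [
    (date(2000, 1, 1), date(2000, 7, 1)),  # ended by an out-migration
    (date(1999, 6, 1), None),              # still current
    (date(2001, 3, 1), None),              # starts after the window
]
exposure = person_years(episodes, date(2000, 1, 1), date(2001, 1, 1))
```

A total of this kind is the usual denominator when the number of events in a period is turned into a rate.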

The Memberships table records the episode of an individual as a member of a household. The key data are the date at which a membership starts or ends, and the event that starts an episode (e.g. birth, in-migration, or present at the start of the DSS), and/or the event that ends the episode (e.g. death, out-migration). If an episode has not ended and is still open, then it is marked as ‘current’. Included in this table is the relationship of the individual to the household head (e.g. wife, son). If a household head dies or out-migrates, then the membership episode is closed and a new membership episode is opened containing the new relationship to the household head.
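
The rule for a change of household head can be sketched as follows. This is one reading of the rule, in which the open episodes of the remaining members are closed and reopened. The class, the field names and the "head change" event label are hypothetical and are not taken from the AHDSS schema. The departing head's own episode is assumed to be closed separately by the death or out-migration event.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Membership:
    individual_id: int
    household_id: int
    relation_to_head: str
    start_date: date
    start_event: str
    end_date: Optional[date] = None   # None means the episode is 'current'
    end_event: Optional[str] = None


def change_head(memberships, household_id, change_date, new_relations,
                event="head change"):
    """Close open episodes in one household and reopen them.

    `new_relations` maps individual_id -> relationship to the new head.
    Individuals missing from the mapping are left untouched.
    """
    reopened = []
    for m in memberships:
        if m.household_id != household_id or m.end_date is not None:
            continue
        if m.individual_id not in new_relations:
            continue
        # Close the old episode on the date of the change.
        m.end_date, m.end_event = change_date, event
        # Open a new episode that carries the new relationship.
        reopened.append(Membership(m.individual_id, household_id,
                                   new_relations[m.individual_id],
                                   change_date, event))
    memberships.extend(reopened)
    return reopened
```

Keeping the closed episode instead of overwriting the relationship preserves the history of who was related to which head, and when.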

The Locations table records the dwelling place at which an individual is located. The residence episode records when a person enters or leaves this place. A location is a physical structure that has a corresponding latitude and longitude.

The Households table records the social unit (i.e., household) of which an individual is a member.

There are several ‘Event’ tables, which record the details of the demographic events that bring an individual into the database or remove an individual from the database. A few key variables are recorded for each event, one of the most important being the date of the event. These tables include Pregnancies, Births, In-Migrations, and Out-Migrations.

The Union Episodes table records the episode of an individual in a union with another individual. This table records the date when a union starts, the event that started it (e.g., marriage, re-marriage), the date when a union ends, and the event that ends it (e.g., divorce, separation or death). The Union Episodes table links to the Marriage Attributes table, which records details about marriages and informal unions.

A number of status observation tables are also stored in the database. These tables record attributes of either individuals or households as cross-sectional data at the time of a census. Most status observations are repeated over time. Other related tables include the Verbal Autopsy table, which records information used to determine the cause of death, and the Maternity History table, which records the details of a woman’s fertility outside the study site and observation period. At the household level, the status observation tables include:

  • Asset Status
  • Child Care Grants
  • Food Security Status

At the individual level, such tables include:

  • Adult Health Status
  • Child Care Grants
  • Child Morbidity
  • Cough Status
  • Education Status
  • Elder Health Care
  • Fatherhoods
  • Father Support Status
  • Health Care Utilisation
  • Individual Grant Status
  • Labour Status
  • Stroke Status
  • Temporary Migrations
  • Vital Documents

Managing “time” in the AHDSS

The date is recorded for all vital events (births, deaths, and migrations) in the database. If a date is estimated, this is indicated in a separate field. Observations are time-stamped with an observation date, which is the date on which the interview took place and the data were recorded. All events and status observations can be linked to an observation date. Residences and memberships are recorded as episodes with start and end dates. Status observations are repeated, cross-sectional measures, and the dates of observation are recorded. These observations are repeated at different intervals, and some have been captured only once.
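
Storing residence as dated episodes means the population present on any given day can be found with a simple date-range query. The sketch below illustrates the idea with an in-memory SQLite database so that it is self-contained. The production database is SQL Server, and the table name, column names and sample rows here are hypothetical. It also assumes that an episode's end date is the first day the person is no longer resident.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE residences (
    individual_id INTEGER,
    location_id   INTEGER,
    start_date    TEXT,   -- ISO 8601 text, e.g. '2001-08-15'
    start_event   TEXT,
    end_date      TEXT,   -- NULL while the episode is 'current'
    end_event     TEXT
);
INSERT INTO residences VALUES
    (1, 100, '1993-01-01', 'enumeration',  NULL,         NULL),
    (2, 100, '1998-04-10', 'birth',        '2003-09-30', 'out-migration'),
    (3, 101, '2005-02-01', 'in-migration', NULL,         NULL);
""")


def residents_on(conn, when):
    """Return the IDs of individuals whose residence episode covers `when`."""
    rows = conn.execute(
        """
        SELECT individual_id
        FROM residences
        WHERE start_date <= ?
          AND (end_date IS NULL OR end_date > ?)
        ORDER BY individual_id
        """,
        (when, when),
    )
    return [row[0] for row in rows]


mid_2000 = residents_on(conn, "2000-06-30")
```

ISO 8601 date strings sort in calendar order, which is why plain text comparison works in this sketch; a native date column would normally be used instead.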