Annual Household Survey Methodology

Research design

For topline impact analysis, we follow a longitudinal research design, under which the same sample of households is followed for five years from baseline. Raising The Village also carries out a heterogeneity analysis by cohort to confirm that we are reaching and impacting the most vulnerable households, and to understand which income groups experience the greatest impact. For this analysis, partner households are divided into four groups based on their household income and production at baseline. To measure the impact of our interventions, the incomes and production of these households at baseline and at graduation are compared with those of counterpart peer groups.

Sampling approach

Random sampling of the target and reserve households is done once at baseline, and the same sample is followed for the entire longitudinal study period.

Sampling Frame: All households listed in the village census.
Sampling Strategy Utilized: Probability sampling
Sampling Method: Stratified random sampling
The Different Strata: Men-headed households (single or joint), women-headed households (single), and youth-headed households (single or joint).

Sample selection

Raising The Village applies a 24/30 sampling approach. Using the village census, households are stratified by household type and then selected at random within each stratum. The number of households drawn depends on village size, and the design is aligned with the village demographics in Uganda’s 2014 Census. The same approach applies to both peer and Raising The Village households:

Villages comprising >100 Households: A sample of 30 households is drawn with a 60/20/20 ratio of Men/Women/Youth Headed Households.
Villages comprising <100 Households: A sample of 24 households is drawn with a 50/25/25 ratio of Men/Women/Youth Headed Households.
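The 24/30 rule above can be sketched in a few lines of Python. This is a minimal illustration, not the production sampling tool: the function names, the stratum labels, and the handling of a village with exactly 100 households (which the text does not specify) are all assumptions.

```python
import random

def stratum_quotas(village_households):
    """Return how many households to draw per stratum for a village."""
    if village_households > 100:
        total, shares = 30, {"men": 60, "women": 20, "youth": 20}
    else:
        # Assumption: the text does not say how a village of exactly
        # 100 households is treated; this sketch uses the smaller design.
        total, shares = 24, {"men": 50, "women": 25, "youth": 25}
    # Shares are whole percentages, so integer arithmetic is exact here.
    return {stratum: total * pct // 100 for stratum, pct in shares.items()}

def draw_sample(households_by_stratum, village_households, seed=None):
    """Randomly select households within each stratum (hypothetical helper)."""
    rng = random.Random(seed)
    sample = {}
    for stratum, quota in stratum_quotas(village_households).items():
        listed = households_by_stratum.get(stratum, [])
        # Assumption: if a stratum is smaller than its quota, take all of it.
        sample[stratum] = rng.sample(listed, min(quota, len(listed)))
    return sample

print(stratum_quotas(150))  # {'men': 18, 'women': 6, 'youth': 6}
print(stratum_quotas(80))   # {'men': 12, 'women': 6, 'youth': 6}
```

Under these ratios a large village yields 18/6/6 households and a small village 12/6/6 across the men-, women- and youth-headed strata.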

When a target household drops out of the study, it is replaced with a household from the reserve pool that belongs to the same stratum and has similar characteristics. This replacement procedure applies to the Year 1, 2, 3, 4 and 5 evaluations.

Data collection and quality assurance

GPS validation is applied across modules: the household GPS is cross-checked against the time a contractor opens and closes each module, providing contextual verification throughout the survey process. Backchecks and callbacks are conducted for 10% of all households surveyed within one week of completion. These random quality checks ensure adherence to protocols and data accuracy.
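As a rough illustration of the 10% backcheck draw described above, the following sketch selects households at random from the list of completed surveys. The function name, the household ID format, and the choice to round the count up are assumptions, not details from the survey protocol.

```python
import math
import random

def select_backchecks(household_ids, share_pct=10, seed=None):
    """Randomly pick a share of surveyed households for backchecks."""
    ids = list(household_ids)
    rng = random.Random(seed)
    # Assumption: round up so that small batches still get at least one check.
    n_checks = math.ceil(len(ids) * share_pct / 100)
    return rng.sample(ids, n_checks)

surveyed = ["HH-%03d" % i for i in range(1, 241)]  # 240 hypothetical IDs
backchecks = select_backchecks(surveyed, seed=42)
print(len(backchecks))  # 24
```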

Data for the Annual Household Survey (AHS) is collected electronically using SurveyCTO, programmed with logical flow, consistency checks, and speed violation tracking. Prior to data collection, the AHS is field-tested, and enumerator feedback is incorporated into the tool. During survey administration, enumerators submit daily reports and the Venn team compiles activity reports to identify errors and inform real-time improvements.

To maintain independence and uniformity, we hire and train independent contractors as enumerators in line with our data privacy and protection protocols. One field supervisor is assigned to every 15 enumerators to ensure oversight and quality. In 2025, we introduced Case Management, where households are created as cases with pre-populated forms. This replaces the use of track sheets and manual ID assignment, streamlining workflow and reducing entry errors.

Data quality audits were also strengthened in 2025 with the addition of audio audits (with respondent consent) and market price caps (informed by market surveys) to standardize and validate reported prices.

Analytical approach

We utilize the difference-in-differences (DID) approach to measure the true impact of our program by comparing changes in outcomes over time between partner communities (treatment group) and peer communities (control group).

To apply this method, we collect baseline data for both control and treatment groups. Baseline activities involve identifying and randomly selecting control and treatment sub-counties. The pre-treatment differences in outcomes across the two groups are captured at the household level, ensuring that our control and treatment groups have similar characteristics. This creates a level playing field for comparison. The treatment group is then exposed to the intervention, after which we analyze the differences in differences between both groups. The impact of the treatment is the difference after intervention (second difference) minus the difference pre-treatment (first difference).
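The calculation described above can be written out as a short sketch. The function name and the income figures are illustrative assumptions, not survey data, and the published estimates come from regression models rather than this simple comparison of group means.

```python
from statistics import mean

def did_estimate(treat_pre, treat_post, control_pre, control_post):
    """Difference-in-differences from four lists of household outcomes."""
    # First difference: gap between the groups before the intervention.
    first_difference = mean(treat_pre) - mean(control_pre)
    # Second difference: gap between the groups after the intervention.
    second_difference = mean(treat_post) - mean(control_post)
    return second_difference - first_difference

# Made-up values for illustration only.
impact = did_estimate(
    treat_pre=[100, 120],     # mean 110
    treat_post=[180, 200],    # mean 190
    control_pre=[100, 110],   # mean 105
    control_post=[120, 130],  # mean 125
)
print(impact)  # 60: (190 - 125) - (110 - 105)
```

Subtracting the pre-treatment gap removes any difference between the groups that already existed at baseline, so only the change attributable to the intervention period remains.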

Data analysis

We use regression analysis to assess impact, implemented in Alteryx workflows, Stata, and Python. Our analysis includes univariate, bivariate and multivariate methods: univariate analysis describes household characteristics, while bivariate and multivariate analyses examine the relationships between key variables and household incomes.
To manage outliers and bring the data closer to a normal distribution, data is sorted in ascending order by household program value. Five percent of the data is then dropped from the analysis (the bottom 1% and the top 4%) for every cohort at the district level, so that comparisons are like for like. The dropped data is also excluded from the heterogeneity analysis. Outliers are managed separately for the control and treatment groups, following the same procedure.
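A minimal sketch of this trimming rule for a single group (one cohort in one district, control or treatment) is shown below. The function name and the rounding rule for groups whose size is not a multiple of 100 are assumptions; the text does not specify how fractional counts are handled.

```python
def trim_outliers(values, bottom_pct=1, top_pct=4):
    """Sort ascending, then drop the bottom 1% and top 4% of observations."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: fractional counts are rounded down.
    drop_bottom = n * bottom_pct // 100
    drop_top = n * top_pct // 100
    return ordered[drop_bottom : n - drop_top]

# 200 hypothetical household program values: 1, 2, ..., 200.
trimmed = trim_outliers(range(1, 201))
print(len(trimmed), trimmed[0], trimmed[-1])  # 190 3 192
```

Because control and treatment are handled separately, the function would be called once per group rather than on the pooled data.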
Findings are published only when they reach statistical significance at the 95% (**) or 99% (***) confidence level, that is, with a p-value equal to or less than 0.05 or 0.01 respectively.
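The star notation above maps to p-value thresholds as in the following sketch. The function name is hypothetical, and returning an empty string for results that miss both thresholds is an assumption about how such findings would be marked.

```python
def significance_stars(p_value):
    """Map a p-value to the star notation used for reporting."""
    if p_value <= 0.01:
        return "***"  # significant at the 99% level
    if p_value <= 0.05:
        return "**"   # significant at the 95% level
    return ""         # does not meet the publication threshold

print(significance_stars(0.003))  # ***
print(significance_stars(0.04))   # **
```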
In 2025, we continue to strengthen our approach by introducing new indicators that expand what we can learn about household outcomes. We have also refined our aggregation methods at a regional level to support greater scalability of topline results as we extend our programs across more districts and regions. These enhancements build on our existing framework, ensuring our analysis keeps pace with the scale and scope of our work.