USAID's Project Design Guidance states that, if an impact evaluation is planned, its design should be summarized in the section of the Project Appraisal Document (PAD) that describes the project's Monitoring and Evaluation Plan and Learning Approach. Early attention to the design of an impact evaluation is consistent with USAID Evaluation Policy requirements for pre-intervention baseline data and for a separate, parallel contract for the impact evaluation.
USAID Evaluation Policy encourages Missions to undertake prospective impact evaluations that involve the identification of a comparison group or area, and the collection of baseline data, prior to the initiation of the project intervention. This type of impact evaluation can potentially be employed whenever an intervention is delivered to some but not all members of a population, for example, some but not all firms engaged in exporting, or some but not all farms that grow a particular crop. This type of design may also be feasible when USAID projects introduce an intervention on a phased basis.
The identification of a valid comparison group is critical for impact evaluations. In principle, the group or area that receives an intervention should be equivalent to the group or area that does not. The more certain we are that groups are equivalent at the start, the more confident we can be in claiming that any post-intervention difference is due to the project being evaluated. For this reason, USAID Evaluation Policy prefers a method for selecting a comparison group called randomized assignment, which is more effective than any other method at ensuring that the groups that do and do not receive an intervention are equivalent on a pre-intervention basis.
Randomized assignment is a method for creating groups that do and do not receive a project intervention. Randomized assignment can be achieved in a number of ways, including flipping a coin, computer-generated assignments, or a lottery in which there are more participants than there are tickets that assign them to the intervention (or treatment) group. With all of these methods, all members of a population (or all members of a representative sample of a large population) have an equal chance of ending up in the intervention (or treatment) group. The resulting treatment and non-treatment (or control) groups are deemed to be functionally equivalent, as all of the other characteristics of population members have been distributed by chance across both groups.
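As an illustration of a computer-generated lottery, the following Python sketch shuffles a list of eligible units and assigns the first part of the shuffled list to the treatment group. It is a minimal example under stated assumptions, not a prescribed USAID procedure: the function name assign_by_lottery, the farm identifiers, the group sizes, and the seed value are all hypothetical. Recording the seed allows the assignment to be reproduced and audited later.

```python
import random


def assign_by_lottery(units, n_treatment, seed=None):
    """Randomly split units into treatment and control groups.

    Every unit has the same chance (n_treatment / len(units)) of
    being placed in the treatment group.
    """
    units = list(units)
    if not 0 <= n_treatment <= len(units):
        raise ValueError("n_treatment must be between 0 and the number of units")

    # A dedicated generator keeps the draw reproducible when a seed is given.
    rng = random.Random(seed)
    shuffled = units[:]
    rng.shuffle(shuffled)

    treatment = shuffled[:n_treatment]
    control = shuffled[n_treatment:]
    return treatment, control


# Hypothetical example: 20 eligible farms, 10 intervention slots.
farms = [f"farm_{i:03d}" for i in range(1, 21)]
treatment, control = assign_by_lottery(farms, n_treatment=10, seed=2024)
print("Treatment:", sorted(treatment))
print("Control:  ", sorted(control))
```

Because the split depends only on the random draw, no characteristic of a farm (size, location, prior yield) can influence which group it joins.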
Prospective impact evaluations that employ randomized assignment are classified as having experimental designs (also called randomized controlled trials, or RCTs). Any other method of assigning members of a population to treatment and comparison groups, no matter how elaborate or carefully developed, involves decisions by evaluators about the basis on which population members will be assigned to the treatment or comparison groups. Evaluator involvement in these decisions automatically steps away from the 'equal chance' proposition and introduces the possibility of bias (either deliberate or inadvertent) in the assignment process. This results in an impact evaluation that is classified as having a quasi-experimental design. Both experimental and quasi-experimental designs can produce credible impact evaluation findings, but there is a difference, and their classifications signal what that difference is. Along this continuum, the preference in USAID's Evaluation Policy is clear:
For impact evaluations, experimental methods generate the strongest evidence. Alternative methods should be utilized only when random assignment strategies are infeasible.
Whenever a prospective impact evaluation involving treatment and comparison groups is being considered, it is wise to undertake a power analysis to ensure that the number of units (people, locations) available for assignment to these groups is adequate to detect important differences between them.
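A rough sense of what such a power analysis involves can be given with a short Python sketch. It uses the standard normal-approximation formula for comparing the means of two equal-sized groups, n per group = 2 * ((z_(1-alpha/2) + z_power) / d)^2, where d is the standardized effect size (the expected difference in means divided by the outcome's standard deviation). The assumptions are significant and should be noted: a two-sided test, equal group sizes, assignment of individual units rather than clusters such as villages, and no attrition. The function name units_per_group and the default values are illustrative only; a real evaluation design would normally rely on a statistician or dedicated software.

```python
import math
from statistics import NormalDist


def units_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate number of units needed in EACH group.

    effect_size: standardized difference in means (difference / std. dev.)
    alpha:       two-sided significance level
    power:       desired probability of detecting a true effect
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power

    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)


# Illustrative effect sizes only; not drawn from any actual project.
for d in (0.2, 0.5, 0.8):
    print(f"effect size {d}: about {units_per_group(d)} units per group")
```

The sketch makes one practical point visible: the smaller the difference the evaluation needs to detect, the larger the number of units that must be available for assignment.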
Prospective impact evaluations, in which equivalent or close to equivalent groups are established prior to an intervention and followed to determine their post-intervention status on outcome or effect measures of interest, are not the only option. Other types of impact evaluations are useful under specific conditions. These include designs that can be used in situations where all members of a population are exposed to a treatment, such as a policy reform, and which in some instances will be undertaken as retrospective impact evaluations. Other impact evaluation designs are intended for use when populations that are known to be different (such as those living above and below a poverty line) are to be compared after the poorer group receives an intervention that is designed to improve their circumstances. These and other impact evaluation designs used in specialized circumstances are described in Impact Evaluation in Practice, in Experimental and Quasi-Experimental Designs for Generalized Causal Inference, and in other publications in this field.
When considering which type of evaluation design to choose, project design teams may find that using an Evaluation Design Decision Tree helps them work their way to the most appropriate option.