
 Step 4: Gathering Credible Evidence

An evaluation should collect information that its primary users will see as credible. In some situations, consulting evaluation specialists may be necessary to ensure that credible information is gathered. Credible evidence strengthens the evaluation and the recommendations that follow from it. Credibility can be improved by using multiple procedures for gathering, interpreting, and analyzing data. Engaging stakeholders in defining and gathering data they find credible increases the chance that the evaluation's conclusions and recommendations will be accepted. Aspects of evidence gathering include indicators, sources, quality, quantity, and logistics.

Types of Credible Evidence

Indicators – indicators address the criteria that will be used to evaluate the program. Indicators can be defined to measure program activities (e.g., participation rates, levels of client satisfaction, capacity to deliver services) and/or program effects (e.g., changes in participant behavior, community norms, health status, quality of life, policies/practices).

Multiple indicators are needed to monitor a program's implementation and its effects, but it is important not to define so many that they detract from the evaluation's goals. A good place to start in developing and defining indicators is the logic model developed in Step 2 of the evaluation.

Sources – sources of evidence can include people, documents, or observations that provide information. More than one source of information can be used to gather evidence for each indicator. Selecting multiple sources of evidence provides an opportunity to include different perspectives and can enhance credibility. The criteria used for selecting sources should be clearly stated so that users can interpret information appropriately. Using both qualitative and quantitative information can increase the chance that the evidence is balanced.

Quality – quality refers to the integrity and appropriateness of the information used in an evaluation. High-quality data are reliable, valid, and informative. Data quality is affected by how well indicators are defined, data collection methods, training of data collectors, data management, instrument design, coding, and source selection, among other factors. It is important to note that all data have limitations, and obtaining quality data involves tradeoffs (e.g., breadth vs. depth). These tradeoffs should be negotiated among stakeholders so that the data meet their threshold for credibility.

Quantity – quantity refers to the amount of evidence gathered. When possible, the quantity of information to be gathered should be decided in advance; for evolving processes, criteria for how much information to gather should be set instead. The quantity of data collected has implications for confidence levels, precision, and the statistical power needed to detect program effects.
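To make the link between quantity, confidence, and precision concrete, the sketch below shows a standard sample-size calculation for estimating a proportion (for example, the percentage of clients reporting satisfaction). The function name and default values are illustrative assumptions, not part of any specific program's evaluation plan; a formal power analysis should be tailored to the evaluation design, ideally with an evaluation specialist.

import math

def sample_size_for_proportion(z=1.96, expected_p=0.5, margin_of_error=0.05):
    # Standard formula n = z^2 * p * (1 - p) / e^2, rounded up to a whole response.
    # z = 1.96 corresponds to a 95% confidence level; p = 0.5 is the most conservative assumption.
    n = (z ** 2) * expected_p * (1 - expected_p) / (margin_of_error ** 2)
    return math.ceil(n)

# Example: 95% confidence, +/- 5 percentage points -> about 385 completed surveys
print(sample_size_for_proportion())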

Logistics – logistics refers to the physical infrastructure, timing, and methods for gathering and handling evidence. Each technique for gathering evidence should be appropriate to the source, the plan of analysis, and the strategy for communicating findings. It is important to consider cultural norms around gathering evidence and to ensure that privacy or confidentiality agreements are in place where necessary.

Example From Innovation Station: Washington State Parent Child Assistance Program (PCAP) 

The AMCHP Innovation Station Best Practice Parent Child Assistance Program (PCAP) is a three-year advocacy/case management model for women who are at risk for substance abuse during pregnancy and for their families. PCAP used several indicators in its evaluation design, including client satisfaction and behavior change in the following areas: alcohol/drug treatment; abstinence from alcohol/drugs; family planning and subsequent births; health and well-being of the target child; family connection with services; and stability indicators such as education, source of income, and employment. The sources of evidence included the participants, the Advocate Client Relationship Inventory (a 27-item instrument), and pre/post-tests. In terms of logistics, PCAP collected data at various times: during the intervention, upon completion of the intervention, and about two and a half years after the intervention. PCAP gathered data using surveys and other techniques.