Originally posted at Development that Works
Last March we published a post (also here) with the results of the first randomized impact evaluation of the One Laptop Per Child (OLPC) model in Peru, which has been widely discussed in the technology and education blogosphere (see for example the Educational Technology Debate posts). Recently, Berk Ozler’s post on the World Bank’s Development Impact blog raised some interesting questions on whether we are learning enough from the OLPC evaluations.
The IDB members of the OLPC evaluation welcome the discussion and think that the author raises important issues. In short, we agree that ideally we should experiment, identify successful programs using solid empirical methods, and then scale up development interventions. In an ideal world this is the
Ozler’s first question is why large-scale OLPC programs have been implemented around the world without solid empirical evidence. Instead of speculating on this issue, we would just like to point out that we know very little about how to improve learning in developing countries.
In fact, this is one of the main conclusions of the recent review of the last two decades of research in this area by Paul Glewwe and co-authors. Hence, policymakers could answer that question by saying that they invested in the OLPC program because they wanted to improve education and, given the information available, this was the most promising program among the potential alternatives. Clearly, this knowledge gap in such an important area is unacceptable, but it is difficult to criticize policymakers in this context. In our work we found that computers are widely believed to help increase learning, and this could explain the popularity of the OLPC model. The head of the program in Peru considered OLPC to be a promising solution and something that could be done in the short run (compared to, for example, upgrading teacher skills and changing their teaching practices). It should be noted that he was very interested in demonstrating results, and for this reason he partnered with the IDB for the evaluation of the project (the IDB did not finance the project, only the evaluation).
The second question in the post is why we did not design an evaluation aimed at testing the effect of alternative “technologies”, or rather different combinations of inputs (e.g., specialized software, training, Internet), instead of just evaluating OLPC. Clearly, multi-treatment evaluations are attractive because they allow a direct comparison across different interventions and also exploit economies of scale and synergies. We agree with the author that a multi-treatment, multi-site efficacy evaluation is ideal, and we would have designed an evaluation along those lines if we were running OLPC and did not face binding budget constraints: multi-treatment, multi-outcome evaluations for heterogeneous groups are very demanding in terms of program implementation, data collection efforts, and costs. But we are not running the program, and we responded to a legitimate question posed by the Peruvian government: is the program, as implemented, working? The evaluation questions were based on the objectives of the OLPC project in Peru and on what the literature had identified as relevant questions. We considered this a first approximation to a very complex issue, and looked at the “theory of change” in order to explain the results.
That is why, although we considered the option of implementing more treatments, we decided to focus our limited resources on trying to provide a solid answer on the impacts of the program and on understanding why those impacts did or did not occur. We made our decision for three reasons.
First, OLPC was a popular but untested program, and hence by estimating its impact we would provide valuable information to the many governments thinking about implementing or expanding it. Second, the expected effects on the two central outcomes (academic achievement and cognitive skills) were modest, and hence we needed a large sample to be able to identify small but meaningful effects. Third, we viewed this evaluation as the first in an ambitious research agenda on technology in education. We therefore preferred to focus our resources in the initial period on exploring the effects of OLPC so as to later tackle more specific questions in this agenda. In fact, in the last two years we have continued our close collaboration with the Peruvian government and with GRADE, trying out some innovations based on the findings of the evaluation, and we plan to continue designing and implementing pilots in the next few years.
Finally, as in other areas, our sense is that the value of experimenting and evaluating is increasingly accepted among decision-makers in technology in education. Still, we need to do a better job of communicating the returns to solid, carefully designed research. Technology is very versatile, and innovations are occurring at a frantic pace. Hopefully, in a few years what is happening in technology in education can be pointed to as a success story in terms of innovation, identification of effective programs, and subsequent scale-up.
We also agree that institutions such as the IDB, World Bank, UNDP and others should invest in “evaluate first, implement later” projects. Funds to finance efficacy evaluations are scarce. We do need to take the next step: we started from an ignorance equilibrium (Pritchett, “It Pays to Be Ignorant”), and we have made significant progress. Today impact evaluations are part of the practice and jargon of development work. But we need to keep moving forward and make sure that the potential of impact evaluations as learning tools is maximized.
That said, we think that the OLPC evaluation, as implemented, has helped to move this agenda forward.