The Story Behind Project Follow Through

Bonnie Grossen, Editor

Project Follow Through (FT) remains today the world’s largest educational experiment. It began in 1967 as part of President Johnson’s ambitious War on Poverty and continued until the summer of 1995, having cost about a billion dollars. Over the first 10 years more than 22 sponsors worked with over 180 sites at a cost of over $500 million in a massive effort to find ways to break the cycle of poverty through improved education.

The noble intent of the fledgling Department of Education (DOE) and the Office of Economic Opportunity was to break the cycle of poverty through better education. Poor academic performance was known to correlate directly with poverty. Poor education then led to less economic opportunity for those children when they became adults, thus ensuring poverty for the next generation. FT planned to evaluate whether the poorest schools in America, both economically and academically impoverished, could be brought up to a level comparable with mainstream America. The actual achievement of the children would be used to determine success.

The architects of various theories and approaches who believed their methods could alleviate the detrimental educational effects of poverty were invited to submit applications to become sponsors of their models. Once the slate of models was selected, parent groups of the targeted schools serving children of poverty could select from among these sponsors one that their school would commit to work with over a period of several years.

The DOE-approved models were developed by academics in education with the exception of one, the Direct Instruction model, which had been developed by an expert Illinois preschool teacher with no formal training in educational methods. The models developed by the academics were similar in many ways. These similarities were particularly apparent when juxtaposed with the model developed by the expert preschool teacher from Illinois. The models developed by the academics consisted largely of general statements of democratic ideals and the philosophies of famous figures, such as John Dewey and Jean Piaget. The expert preschool teacher’s model was a set of lesson plans that he had designed in order to share his expertise with other teachers.

The preschool teacher, Zig Engelmann, had begun developing his model in 1963 as he taught his non-identical twin boys at home, while he was still working for an advertising agency. In the year after the boys learned to count at age 3, Zig taught them multi-digit multiplication, addition of fractions with like and unlike denominators, and basic algebraic concepts, using only 20 minutes a day.

Many parents may have dismissed such an accomplishment as the result of having brilliant children. Zig thought differently; he thought he might be able to accomplish the same results with any child, especially children of poverty. He thought that children of poverty did not learn any differently than his very young boys, whose cognitive growth he had accelerated by providing them with carefully engineered instruction, rather than waiting for them to learn through random experience.

Zig filmed his infant sons doing math problems and showed the home movie to Carl Bereiter at the University of Illinois, where Carl was leading a preschool project to accelerate the cognitive growth of disadvantaged young children. Nothing was working. After seeing Zig’s film, Carl asked Zig whether he could accomplish similar results with other children. Zig said “yes” and got a job working with him. Excerpts from the home movie of Zig working with his twin sons were shown at the 1994 Eugene conference and are included in the Conference ’94 video tape available through ADI. The Conference ’94 tape also includes footage of Zig working with the economically disadvantaged preschool children and comments from those who were there in the early days of Zig’s career and FT.

Carl Bereiter decided to leave Illinois to go to the Ontario Institute for Studies in Education. The preschool project needed a director with faculty rank, a ranking that Zig did not have, in order to continue to receive funding on a grant from the Carnegie Foundation.

Wes Becker, a professor of psychology, saved the preschool by joining it as a co-director. Wes had graduated as a hotshot clinical psychologist from Stanford, having completed the undergraduate and graduate programs in a record six years. Wes had then moved from the orientation of a developmentalist to much the opposite, that of a behaviorist. At the time Wes became familiar with Zig’s work, Wes was doing a demonstration project to show how behavioral principles apply to human subjects. Wes’s demonstration was having difficulties because the instructional program for teaching reading (Sullivan Programmed Phonics) was not working. One of Wes’s graduate students, Charlotte Giovanetti, also worked with Zig in the preschool. She told Wes, “We know how to do that,” and proceeded to develop a small group program for teaching sounds in the Sullivan sequence. It was successful and impressed Wes.

As chance would have it, about the same time that Zig and Carl’s preschool program was looking for a new director, Wes heard Jean Osborn describe the Direct Instruction program used in the preschool at a symposium. Wes personally commented to Jean afterward how taken he was with the careful analysis (building skills on preskills, choice of examples, etc.). That night he was attacked by phone calls, strategically planned, requesting him to replace Carl Bereiter. The callers assured him it would take only a little bit of his time.

So Wes agreed to a partnership that then consumed his life. Only a few months after Wes became involved in the preschool project with Zig, Project FT began. Wes and Zig became the Engelmann-Becker team and joined Project FT under the sponsorship of the University of Illinois in 1967.

Zig began sharing his expertise with other teachers in the form of the Direct Instruction System for Teaching Arithmetic and Reading (DISTAR or Direct Instruction). His phenomenal success started getting attention. Other talented people began working with Zig. Bob Egbert, who for years was the National Director of Project FT, describes a scene from those early days in a letter he wrote to Zig for the 20th anniversary celebration:

The University of Kansas was having its first summer workshop for teachers. Don Bushell had invited Ziggy to do a demonstration lesson. My image of that occasion is still crystal clear. Ziggy was at the front of the large classroom when a half dozen five-year-old children were brought in. They were shy in front of the large audience and had to be encouraged to sit in the semi-circle in front of Ziggy. “How in the world,” I thought, “will this large, imposing man who has not been educated as a teacher cope with this impossible situation?” I need not have been concerned. Within three minutes the excited youngsters, now on the edge of their chairs, were calling out answers individually or in unison, as requested, to the most “difficult” of Ziggy’s challenges and questions. By the end of the demonstration lesson, the children had learned the material that Ziggy taught; they also had learned that they were very smart. They knew this because they could answer all of the questions that Ziggy had assured them were too hard for them! (The full text of Bob Egbert’s letter is in the Fall, 1994 issue of Effective School Practices on pages 20-21.)

Problems began to develop immediately with the University of Illinois’ sponsorship. Illinois allowed no discounts for the large volume printing of materials that were sent to the schools. Furthermore, Illinois would not allow a Direct Instruction teacher training program as part of its undergraduate elementary education program. Teachers learning Direct Instruction could not get college credit toward teacher certification. Wes and Zig began looking for a new sponsor. They sent letters to 13 universities that had publicized an interest in the needs of disadvantaged children, offering their one and a half million dollar per annum grant to a more friendly campus. Only two universities even responded, Temple University in Pennsylvania and the University of Oregon. Being more centrally located, Temple seemed more desirable. But then the faculty of two departments at Temple voted on the question of whether Temple should invite the DI model to join them. The faculty were unanimously opposed.

That left only the University of Oregon in tiny remote Eugene, hours of flying time from all the sites. Bob Mattson and Richard Schminke, Associate Deans of the College of Education, expressed the eagerness of the University to have the Engelmann-Becker model come to Oregon. The DI project staff took a vote on whether to move to Eugene. At this point Zig voted against the move. (He hates to travel.) But he was outvoted. As if on signal, Wes Becker, along with a number of his former students who had started working on the project (Doug Carnine was one of those students), and Zig Engelmann, along with a number of his co-teachers and co-developers, left their homes in Illinois and moved to Eugene, Oregon in 1970.

The Effects of FT

One of the most interesting aspects of FT that is rarely discussed in the technical reports is the way schools selected the models they would implement. The model a school adopted was not selected by teachers, administrators, or central office educrats. Parents selected the model. Large assemblies were held where the sponsors of the various models pitched their model to groups of parents comprising a Parent Advisory Committee (PAC) for the school. Administrators were usually present at these meetings and tried to influence parents’ decisions. Using this selection process, the Direct Instruction model was the most popular model among schools; DI was implemented in more sites during FT than any other model. Yet among educrats, DI was the dark horse. Most educrats’ bets would undoubtedly have been placed on any of the models but the Direct Instruction model. The model developed by the Illinois preschool teacher who didn’t even have a teaching credential, much less a Ph.D. in education, was not expected by many educrats to amount to much, especially since it seemed largely to contradict most of the current thinking. All sponsors were eagerly looking forward to the results.

The U.S. Department of Ed hired two independent agencies to collect data on and evaluate the effects of the various models. The data were evaluated in two primary ways. Each participating school was to be compared with a matched nonparticipating school to see if there were improvements. In reality, it became difficult to find matching schools. Many of the comparison schools were not equivalent on pretest scores to the respective FT schools. These pretest differences were adjusted with covariance statistics. In addition, norm-referenced measures were used to determine if the participating schools had reached the goal of the 50th percentile. This represented a common standard for all schools. Prior scores had indicated that schools with economically disadvantaged students would normally be expected to achieve at only the 20th percentile, without special intervention. The 20th percentile was therefore used as the “expected level” in the evaluation of the results.

The preliminary annual reports of the results were a horrifying surprise to most sponsors. By 1974, when San Diego School District dropped the self-sponsored models they had been using with little success since 1968, the U.S. Department of Ed allowed San Diego only two choices: Direct Instruction or the Kansas Behavioral Analysis model. It was evident by this time that these were the only two models demonstrating any positive results. The results of the evaluation were already moving into policy. This was not well received by the many sponsors of models that were not successful.

Before the final report was even released, the Ford Foundation arranged with Ernest House to do a third evaluation, a critique of the FT evaluation, to discredit the embarrassing results. The critique was published in the Harvard Educational Review and widely disseminated.

Ernest House describes the political context for this third evaluation as follows:

In view of the importance of the FT program and its potential impact on education, a program officer from the Ford Foundation asked Ernest House in the fall of 1976 whether a third-party review of the FT evaluation might be warranted. FT had already received considerable attention, and the findings of the evaluation could affect education for a long time to come. Although the sample was drawn from a non representative group of disadvantaged children, the findings would likely be generalized far beyond the group of children involved. Moreover, while the study had not yet been completed, the evaluation had generated considerable controversy, and most of the sponsors were quite unhappy with preliminary reports. Finally, the evaluation represented the culmination of years of federal policy, stretching back to the evaluation of Head Start. Would this evaluation entail the same difficulties and controversies as previous ones? Would there be lessons to be learned for the future? For these reasons and after examining various documents and talking to major participants in the evaluation, House recommended that a third-party review would be advisable. If such a review could not settle the controversies, it could at least provide another perspective. The evaluation promised to be far too influential on the national scene not to be critically examined. In January 1977 the Ford Foundation awarded a grant to the Center for Instructional Research and Curriculum Evaluation at the University of Illinois to conduct the study, with Ernest House named as project director. House then solicited names of people to serve on the panel from leading authorities in measurement, evaluation, and early-childhood education. The major selection criteria were that panel members have a national reputation in their fields and no significant affiliation with FT. The panelists chosen by this procedure were Gene V. Glass of the University of Colorado, Leslie D. McLean of the Ontario Institute for Studies in Education, and Decker F. Walker of Stanford University. (p. 129, House, Glass, McLean, & Walker, 1978)

The main purpose of House et al.’s critique seemed to be to prevent the FT evaluation results from influencing education policy. House implied that it was even inappropriate to ask “Which model works best?” as the FT evaluation had: “The ultimate question posed in the evaluation was ‘Which model works best?’ rather than such other questions as ‘What makes the models work?’ or ‘How can one make the models work better?’” (p. 131, House, Glass, McLean, & Walker, 1978).

Glass wrote another report for the National Institute of Education (NIE), which convinced them not to disseminate the results of the FT evaluations they had paid 30 to 40 million dollars to have done. The following is an ERIC abstract of Glass’s report to the NIE:

Two questions are addressed in this document: What is worth knowing about Project FT? And, How should the National Institute of Education (NIE) evaluate the FT program? Discussion of the first question focuses on findings of past FT evaluations, problems associated with the use of experimental design and statistics, and prospects for discovering new knowledge about the program. With respect to the second question, it is suggested that NIE should conduct evaluation emphasizing an ethnographic, principally descriptive case-study approach to enable informed choice by those involved in the program. The discussion is based on the following assumptions: (1) Past evaluations of FT have been quantitative, experimental approaches to deriving value judgments; (2) The deficiencies of quantitative, experimental evaluation approaches are so thorough and irreparable as to disqualify their use; (3) There are probably at most a half-dozen important approaches to teaching children, and these are already well-represented in existing FT models; and (4) The audience for FT evaluations is an audience of teachers to whom appeals to the need for accountability for public funds or the rationality of science are largely irrelevant. Appended to the discussion are Cronbach’s 95 theses about the proper roles, methods, and uses of evaluation. Theses running counter to a federal model of program evaluation are asterisked. (Eric Reproduction Service ED244738. Abstract of Glass, G. & Camilli, G., 1981, “FT” Evaluation, National Institute of Education, Washington, DC).



The final Abt report (Bock, Stebbins, & Proper, 1977) showed that, in the aggregate across all the models, FT was a failure: all of the models except one failed to produce the desired results. (The Kansas Behavioral Analysis model also got positive results, but they were not as strong as those of the Direct Instruction model.) However, the FT Project did successfully identify what does work. The only model that brought children close to the 50th percentile in all subject areas was the Direct Instruction model.

These remarkable results were achieved by the Direct Instruction model in spite of the fact that Grand Rapids, MI was included in the analysis. The PAC in Grand Rapids had originally chosen to participate in FT using the Direct Instruction model. A new superintendent to the district later convinced the PAC to reject the model. The Direct Instruction sponsors subsequently withdrew from Grand Rapids; however, the US Office of Education continued to fund the site and continued to categorize it as Direct Instruction. It is probably not irrelevant that at this time Gerald Ford from Michigan was the U.S. President. In any case, because Grand Rapids had received FT funding throughout the evaluation period (1971-1976), they were included in the Abt analysis even though they had not implemented Direct Instruction for several years.

The most popular models were not only unable to demonstrate many positive effects; most of them produced a large number of negative effects. (See articles in this issue for details.)

After the House-Glass critiques were published, Bereiter and Kurland reviewed the FT data once again in 1981-2, responding in detail to each question and issue raised by House-Glass in a comprehensive and very readable report of the results.

In spite of the counterarguments raised by Bereiter and Kurland and others, the House-Glass critique was successful. The results of Project FT were not used to shape education policy. Though much of the House and Glass critiques was based on a rejection of the use of experimental science in education, other critics, who did not reject experimental science, argued that the outcomes valued by the losing approaches had not been measured in the FT evaluation. Though some pleaded for more extensive evaluation studies of multiple outcomes, no further evaluation was funded. The following excerpts from Bob Egbert’s letter to Zig provide his perspective on the evaluation.


No one who was not there during the early years of Head Start and FT can know how much your initiative, intellect and commitment contributed to the development of those programs. You simply shook off criticism and attempts at censorship and moved ahead, because you knew you were right and that what you were doing was important for kids. Lest you think that censorship is too strong a word, let me remind you that many in the early education field did not want your program included in FT. As confirming evidence for my personal experience and memory I cite the Head Start consultant meeting held in, I think, September 1966, in which a group of consultants, by their shrill complaints, stopped the full release of a Head Start Rainbow Series pamphlet which described an approach more direct than the approach favored by mainline early childhood educators: but one that was much less direct than the one you and Carl Bereiter were developing and using. The endorsement of Milton Akers for inclusion of “all” approaches in Head Start and FT Planned Variation made our task much easier. Ziggy, despite what some critics have said, your program’s educational achievement success through the third grade is thoroughly documented in the Abt reports. Your own follow up studies have validated the program’s longer term success. I am completely convinced that more extensive studies of multiple outcomes, which the Department of Education has been unwilling to fund, would provide a great deal more evidence for your program’s success.

After the Abt report in 1977, there was no further independent evaluation of FT. However, the DOE did provide research funds to individual sponsors to do follow-up studies. The Becker and Engelmann article in this issue summarizes the results of the follow-up studies by the Direct Instruction sponsors. Gary Adams’ summary of the various reports of the results of FT provides a discussion of the reasons for the different reports and the consistencies and differences across them. This summary is excerpted from a chapter on Project FT research in a new book summarizing Direct Instruction research (Adams & Engelmann, Direct Instruction Research, Educational Achievement Systems).

FT and Public Policy Today

In responding to the critique by House et al., Wisler, Burns, & Iwamoto summarized the two important findings of Project FT:

With a few exceptions, the models assessed in the national FT evaluation did not overcome the educational disadvantages poor children have. The most notable exception was the Direct Instruction model sponsored by the University of Oregon.

Another lesson of FT is that educational innovations do not always work better than what they replace. Many might say that we do not need an experiment to prove that, but it needs to be mentioned because education has just come through a period in which the not-always-stated assumption was that any change was for the better. The result was a climate in which those responsible for the changes did not worry too much about the consequences. The FT evaluation and other recent evaluations should temper our expectations. (pp. 179-181, Wisler, Burns, & Iwamoto, 1978).

The most expensive educational experiment in the world showed that change alone will not improve education. Yet change for the sake of change is the major theme of the current educational reform effort. Improving education requires more thought than simply making changes.

Perhaps the ultimate irony of the FT evaluation is that the critics advocated extreme caution in adopting any practice as policy in education; they judged the extensive evaluation of the FT Project inadequate. Yet 10 short years later, the models that achieved the worst results, even negative results, are the ones that are, in fact, becoming legislated policy in many states, under new names. Descriptions of each of the models evaluated in FT, excerpted from the Abt report, are included in this issue. Abt Associates ensured that these descriptions were carefully edited and approved by each of the participating sponsors, so they would accurately describe the important features of each of the models. Any reader familiar with current trendy practices that are becoming policy in many areas of North America will easily recognize these practices in the descriptions of models evaluated in Project FT, perhaps under different names.

Curriculum organizations, in particular, are working to get these failed models adopted as public policy. The National Association for the Education of Young Children (NAEYC), for example, advocates for legislative adoption of the failed Open Education model under the new name “developmentally appropriate practices.” This model has been mandated in Kentucky, Oregon, and British Columbia. Oregon and British Columbia have since overturned these mandates. However, the NAEYC effort continues. Several curricular organizations advocate the language experience approach that was the Tucson Early Education Model in FT, under the new name “whole language.”

That these curricular organizations can be so successful in influencing public policy, in spite of a national effort to reach world class standards and the results of scientific research as extensive as that in FT, is alarming. That the major source of scientific knowledge in education, the educational research program of the federal government, is in danger of being cut is alarming.

That the scientific knowledge we have about education needs to be better disseminated is clear. At the very least the models that failed, even to the point of producing lower levels of performance, should not be the educational models being adopted in public policy.

I, personally, would not advocate mandating Direct Instruction, even though it was the clear winner. I don’t think that mandates work very well. But every educator in the country should know that in the history of education, no educational model has ever been documented to achieve such positive results with such consistency across so many variable sites as Direct Instruction. It never happened before FT, and it hasn’t happened since. What Wes, Zig, and their associates accomplished in Project FT should be recognized as one of the most important educational accomplishments in history. Not enough people know this.


Wisler, C., Burns, G. P., Jr., & Iwamoto, D. (1978). FT redux: A response to the critique by House, Glass, McLean, & Walker. Harvard Educational Review, 48(2), 171-185.

House, E., Glass, G., McLean, L., & Walker, D. (1978). No simple answer: Critique of the FT evaluation. Harvard Educational Review, 48(2), 128-160.

Bock, G., Stebbins, L., & Proper, E. (1977). Education as experimentation: A planned variation model (Volume IV-A & B) Effects of follow through models. Washington, D.C.: Abt Associates.