Sponsor Findings From Project Follow Through

Wesley C. Becker and Siegfried Engelmann
University of Oregon

The final report of the National Evaluation of Project Follow Through, a comparative study of different approaches to teaching economically disadvantaged children in the primary grades, shows that the Direct Instruction Model (University of Oregon) was largely successful in helping disadvantaged children catch up with their middle-class peers in academic skills. This demonstration is the first to show that compensatory education can work.

The Direct Instruction Model emphasizes small-group face-to-face instruction by teachers and aides using carefully sequenced lessons in reading, arithmetic, and language. These programs were designed by Siegfried Engelmann using modern behavioral principles and advanced programming strategies (Becker, Engelmann, & Thomas, 1975), and are published by Science Research Associates under the trade name DISTAR. The program directors, Professor Wesley C. Becker and Siegfried Engelmann, attribute its success to the technological details, the highly specific teacher training, and careful monitoring of student progress. The closest rival to the Direct Instruction Model in overall effects was another behaviorally-based program, the University of Kansas Behavior Analysis Model. Child-centered, cognitively focused, and open classroom approaches tended to perform poorly on all measures of academic progress.


The National Evaluation of Follow Through used a planned variation design to provide a broad-range comparison of educational alternatives for teaching the disadvantaged and find out “what works.” Different models of instruction were tested in 139 communities and evaluated for stability of results over successive program years. Model programs were implemented in kindergarten through third grade. The descriptions of the nine major models in the National Evaluation are taken from the Abt Associates report.

The Open Classroom Model (Education Development Center, EDC) which is based on the British Infant School model;

Cognitively-Oriented Curriculum Model (High/Scope Educational Research Foundation) which is based on Piaget’s theories;

The Responsive Education Model (Far West Laboratory for Educational Research) which is based on Glen Nimnict’s work in structuring a teaching environment and uses a variety of techniques;

Bank Street Early Childhood Education Model (Bank Street College of Education) which is concerned with the development of the whole child;

Tucson Early Education Model (TEEM, University of Arizona) which is based on the language-experience approach of Marie Hughes that initially focused on teaching bilingual children;

Language Development (Bilingual) Model (Southwest Educational Development Laboratory-SEDL) which utilizes programmed curricula for bilingual children (and others) focusing on language development;

Parent Education Model (University of Florida) which is based on Ira Gordon’s work in training parents to teach their own children;

Behavior Analysis Model (University of Kansas) which used modern principles of reinforcement and systematic classroom management procedures; and

Direct Instruction Model (University of Oregon).

For each sponsor, children were followed from entry to third grade in 4 to 8 kindergarten-entering sites, and some first-entering sites. Comparison groups were also tested. The evaluation, referred to as “the largest controlled education experiment ever,” included measures of Basic Skills, Cognitive Skills, and Affect.

Basic Skills were based on four subtests of the Metropolitan Achievement Test (MAT)-Word Knowledge, Spelling, Language, and Math Computation.

The Cognitive Skills included MAT Reading, MAT Math Concepts, MAT Math Problem Solving, and the Raven’s Coloured Progressive Matrices.

The Affective Measures consisted of the Coopersmith Self-Esteem Inventory and the Intellectual Achievement Responsibility Scale (IARS). The Coopersmith measures children’s feelings about themselves and school; the IARS measures the degree to which children take responsibility for their successes and failures.


Adjusted Outcomes
Abt Associates used covariance analysis to adjust third-grade scores according to entry differences between experimental and comparison groups. An adjusted difference was defined as educationally significant if the difference between experimental and comparison group was at least one-fourth standard deviation unit in magnitude. This convention was adopted because when dealing with large groups, statistical significance can be very misleading.
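The quarter-standard-deviation criterion can be sketched as a simple check. This is a minimal illustration with invented numbers, not Abt Associates' actual computation; the function name and inputs are hypothetical:

```python
# Hypothetical sketch of the Abt "educationally significant" rule: a
# covariance-adjusted experimental-vs-control difference counts only if
# its magnitude is at least one-fourth of a standard deviation unit.

def educationally_significant(adjusted_diff, sd, threshold=0.25):
    """Return +1, -1, or 0 depending on whether the adjusted difference
    clears the quarter-SD criterion in either direction."""
    effect_size = adjusted_diff / sd
    if effect_size >= threshold:
        return 1      # educationally significant plus outcome
    if effect_size <= -threshold:
        return -1     # educationally significant minus outcome
    return 0          # not educationally significant

# Invented example: a 3-point adjusted gain on a test with a 10-point
# standard deviation is an effect size of 0.30, which clears 0.25.
print(educationally_significant(3.0, 10.0))   # 1
print(educationally_significant(-1.0, 10.0))  # 0
```

The point of the rule, as the text notes, is that with large samples a tiny difference can be statistically significant while being educationally trivial; the quarter-SD floor guards against that.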

Figures 1 to 3 show the performance of the various sponsors on the adjusted outcomes in comparison to control groups. An Index of Significant Outcomes (ISO) is used to show relative effects across models. ISOs are derived by subtracting the number of educationally significant minus outcomes for a sponsor from the number of significant plus outcomes.1 This number (which may be negative) is divided by the total number of comparisons for the model and multiplied by 1000 to remove the decimal point. The result is a number, either positive or negative, that shows both the direction and the consistency of each model’s effects. If the number is positive, the model outperforms the controls; the larger the number, the more consistently it does so. If the number is negative, the control groups outperform the model.
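The ISO arithmetic described above can be written out directly. The counts below are invented for illustration; only the formula comes from the text:

```python
# Index of Significant Outcomes (ISO), as described in the text:
# (significant plus outcomes - significant minus outcomes), divided by
# the total number of comparisons, scaled by 1000.

def iso(plus, minus, total):
    """ISO for one model. Positive means the model outperformed the
    controls; larger magnitude means more consistent effects."""
    return (plus - minus) / total * 1000

# Invented example: 20 significant plus outcomes and 2 significant
# minus outcomes out of 56 comparisons.
print(round(iso(20, 2, 56)))  # 321
```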

Figure 1 compares the performance of different models on Basic Skills. Only three models achieve positive ISO’s. The Direct Instruction Model is more than 270 ISO points above the nearest comparison (Florida). The Direct Instruction Model is about 700 points above the lowest program, EDC’s Open Education Model.

Figure 2 compares models on academic Cognitive-Conceptual Skills.2 Only two models have positive outcomes, and again the Direct Instruction Model is in first place (this time by over 225 ISO points above the second-place finisher and by over 800 ISO points above the lowest program [EDC]). The performance on Cognitive-Conceptual Skills demonstrates that the programs based on cognitive “theories” do not have the technological know-how to achieve positive results with poverty children, and that the behaviorally based Behavior Analysis Model also lacks the technology to teach Cognitive-Conceptual Skills.

Figure 3 compares models on Affective Measures. The Direct Instruction Model achieves the highest positive effect. Behavior Analysis, Parent Education, and SEDL also have positive ISOs. Note that only those models that achieved positive effects on Basic Skills or Cognitive-Conceptual Skills produce positive outcomes on Affective Measures. Note also that the cognitively-oriented programs (with the exception of Parent Education) perform as poorly on Affective Measures as they do on Academic Achievement. The high correlation between academic and affective outcomes suggests a need to re-evaluate some interpretations of what turns kids on and how they learn to feel good about themselves in school.

Grade-Equivalent and Percentile Performance
The Abt IV Report provides performance-level data for four MAT measures: Total Reading, Total Math, Spelling, and Language. Tables 1–4 display percentiles on a one-fourth standard deviation scale. With this display, differences between sponsors of a quarter standard deviation (i.e., an educationally significant difference) are easily detected, while the percentiles provide the “norm reference.” The baseline at the 20th percentile represents the average expectation for disadvantaged children.
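The relationship between quarter-SD steps and percentiles can be sketched with the standard normal distribution. This illustrates the display scale only; it is not the actual MAT norming:

```python
from statistics import NormalDist

# On a normal curve, the 20th percentile (the report's baseline
# expectation for disadvantaged children) sits roughly 0.84 SD below
# the mean; each quarter-SD step upward moves the percentile a lot.
norm = NormalDist()  # standard normal: mean 0, SD 1

baseline_z = norm.inv_cdf(0.20)
print(f"20th percentile z-score: {baseline_z:.2f}")  # about -0.84

# Percentiles reached by climbing from the baseline in quarter-SD steps.
for steps in range(5):
    z = baseline_z + 0.25 * steps
    print(f"+{steps} quarter-SD step(s): {norm.cdf(z) * 100:.0f}th percentile")
```

This is why a model that reaches the 48th or 50th percentile, as reported below, represents roughly a full standard deviation of progress over the 20th-percentile baseline.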

Total Reading (Table 1). The Direct Instruction Model, the only one to show achievement above 3.0 grade level, is about one-half standard deviation above the mean of all other sponsors. It is nearly a quarter-standard deviation above the second-place model, Behavior Analysis.

Total Math (Table 2). The mean grade-equivalent score for the Direct Instruction Model is 3.71, which is at the 48th percentile and one full standard deviation above the average of all other sponsors. The Model is one-half standard deviation above the second-place model, again Behavior Analysis.

Spelling (Table 3). The Direct Instruction Model achieves the 51st percentile and again leads all sponsors. The Behavior Analysis Model, however, is a close second (49th percentile).

Language (Table 4). The Direct Instruction Model performs at the 4.0 grade level, or 50th percentile. It is three-fourths standard deviation above all other models. (No other model scores within one year of the Direct Instruction Model on grade-equivalent score.)

Sponsor Findings
Sponsor-collected data further support the above conclusions:

· A greater measurable and educationally significant benefit is present at the end of third grade for those who begin Direct Instruction in kindergarten than for those who begin in first grade (Becker and Engelmann, 1978; Gersten, Darch & Gleason, xx).

· Significant gains in IQ are found, which are largely maintained through third grade. Students entering the program with IQs over 111 do not lose ground during the Follow Through years, though one might expect some regression toward the mean. The low-IQ children, on the other hand, display appreciable gains, even after the entry IQ has been corrected for regression artifacts. Students with IQs below 71 gain 17 points in the entering-kindergarten sample and 9.4 points in the entering-first-grade sample; gains for the children entering with IQs in the 71-90 range are 15.6 and 9.2 points, respectively (Gersten, Becker, Heiry & White, 1984).

· Studies of low-IQ students (under 80) show the program is clearly effective with students who have a higher probability of failure. As indicated in Figures 3 and 4, these students gain nearly as much each year in reading (decoding) and math as the rest of our students with higher IQs: more than a year per year on WRAT (Wide Range Achievement Test) Reading and a year per year on MAT (Metropolitan Achievement Test) Total Math (Gersten et al., 1984).

· High school follow-up studies of Direct Instruction and comparison students were carried out in five districts. All the significant differences favored Direct Instruction students: five on academic measures, three on attendance, two on college acceptance and three on reduced retention rates (Gersten and Keating, 1987).

· The model generalizes across both time and populations. The Department of Education has a Joint Dissemination Review Panel that validates educational programs as exemplary and qualifies them for national dissemination. During the 1980-81 school year, the last of the 12 Direct Instruction Follow Through projects were submitted for validation. Of the 12 districts, 11 had 8 to 10 years of data on successive groups of children. The schools sampled a full range of students: large cities (New York; San Diego; Washington, D.C.); middle-sized cities (Flint, MI; Dayton, OH; E. St. Louis, IL); rural white communities (Flippin, AR; Smithville, TN); a rural Black community (Williamsburg, SC); Mexican American communities (Uvalde, TX; E. Las Vegas, NM); and an American Indian community (Cherokee, NC). One hundred percent of the projects were certified as exemplary in reading and mathematics for the primary grades, thus providing replication over 8 to 10 years and in a dozen quite diverse communities.

· Research on implementation found consistent moderate-to-high relationships between observed level of model implementation and classroom achievement gains in reading. At least for highly structured models of instruction, degree of implementation can be measured in a reliable and valid fashion (Gersten, Carnine, Zoref, & Cronin, 1986).

Two conclusions seem of special interest, especially in view of the wave of programs recently initiated in major urban areas to improve the teaching of basic skills. The first is that teachers at first may react negatively to, or be confused by, intensive, structured, in-class training (or technical assistance). Yet ultimately at least half of the teachers found this to be one of the most positive features of the intervention.

The other key finding is that many teachers altered their reactions to structured educational models after they saw the effects of this program with their students on a day-to-day basis. Often this transformation took many months. At the beginning teachers were far from enthusiastic about the program and tended to feel that too much time was devoted to academics. Not enough was set aside for “fun” or creative activities. Yet their strong support by the end of the second year was unequivocal. From teacher interview data collected over two years, there can only be one main explanation for this, namely, the effect of the Direct Instruction Model on student performance. Time and again the teachers marveled at the new academic skills their pupils demonstrated. Teachers reported anecdotal evidence of growth well before the standardized achievement tests were administered (Cronin, 1980).

Implications of the Direct Instruction Findings

The Follow Through data and our extensive experience in the field attempting to generate changes in school systems permit tentative answers to a number of major issues in the field today.

Will Money and Comprehensive Services Do the Job?

Each of the sponsors in Follow Through had about the same amount of money to provide comprehensive services and an educational program. Most sponsors had two aides in most classrooms, and spent about $350 per child above basic school support on the educational component. The Abt data provide a convincing demonstration that money, good will, people, materials, the Hawthorne effect, health programs, dental programs, and hot lunches do not cause gains in achievement. All Follow Through sponsors had these things, and most failed to do the job in basic instruction.

Does Individualization Require Many Approaches?

The programs that failed the most in terms of educational achievements were those oriented to individual needs in instruction. The popular belief that it is necessary to teach different students in different ways is, for the most part, a fiction. The requirements for sequencing an instructional program are determined by what is to be taught, not who. In the DISTAR programs used by the Direct Instruction Model, each child faces the same sequence of tasks and the same teaching strategies. What is individualized is entry level, when corrections are used, reinforcement procedures, and number of practice trials to mastery.

Is Self-Directed Learning Best?

A common assumption arising from dominant subjective education philosophies is that self-directed learning is the only meaningful learning. Direct Instruction is said to produce isolated rote learning, not “meaningful” learning. The Follow Through results obviously demonstrate such an assumption to be false. The students performing best on all measures of higher cognitive processes were from the Direct Instruction Model. The assumption about the value of self-directed learning probably arises from observing young children (as Piaget did) interacting with the physical environment. The physical environment directly reinforces and punishes different responses. However, there is no way a child can learn the arbitrary conventions of a language system without someone who knows that system providing systematic teaching (including modeling of appropriate language usage). In addition, there can be no question that knowledgeable adults can organize and sequence experiences that will teach concepts and problem-solving skills better than children can on their own.

Why Is Improvement in Reading Comprehension Hard to Achieve?

The Abt IV Report notes that successful outcomes were harder to come by in reading comprehension than in other skill areas. Only the Direct Instruction program made significant and sustained gains in this area. Even then, we only reached the 40th percentile on MAT Reading. Becker (1977) analyzed the Follow Through data and other data on reading, and concluded that schools are not designed to teach the English language to “poor kids” (i.e., to children whose parents, on the average, are less well-versed in standard English). Schools are basically designed for white, middle-class children, and leave largely to parents the teaching of a most basic building block of intelligent behavior, namely words and their referents.

Why Do Economically Disadvantaged Students Continue to Do Poorly in School?

In general, economically disadvantaged students come to school with less of the knowledge relevant to succeeding in school. Thus, teaching these students requires teachers with different attitudes and skills, and more patience than is typically required. Colleges of education and schools are not organized or administered to develop and support teachers with these attributes. To coin a malapropism, “there is a way, but no will.” Students from low-income families do not need to fail in schools. They can be taught.

In summary, through the careful design of curricula, classroom procedures, and training procedures, the DI Follow Through Model was able to achieve a major goal of compensatory education: improving the academic performance of economically disadvantaged children to (or near) median national levels. Only one other major model in the Follow Through experiment (the University of Kansas Behavior Analysis Model) came close to matching this achievement. The DI Model also performed best on measures of affective outcomes, such as self-esteem. Follow-up studies, through primary and secondary levels, show strong continuing effects in terms of academic performance at the primary level, and better attendance, fewer grade retentions, and increased college acceptance at the high school level.

The Communities

The communities which have used the Direct Instruction Model are Providence, RI, Brooklyn, NY (P.S. 137), Washington D.C., Cherokee, NC, Williamsburg County, SC, Dayton, OH, E. St. Louis, IL, Flint, MI, Grand Rapids*, MI, West Iron County*, MI, Smithville, TN, Tupelo, MS, Racine*, WI, Todd County, SD, Rosebud Tribe, SD, Flippin, AR, Uvalde, TX, Dimmitt*, TX, E. Las Vegas, NM.

*No longer in Follow Through.

(Figures showing yearly gains, K-3, on WRAT reading and 1-3 on MAT Total Math for students according to IQ blocks could not be reproduced from the article clearly in electronic format.)

1The Abt analysis provides two comparisons for each measure: one with a local control group and the other with a pooled national control group. A comparison was counted plus if either comparison was plus, and minus if either was minus. Use of alternative decision rules would not change the relative rankings of models.

2The Raven’s Coloured Progressive Matrices result is not included with the data graphed because it is not an academic skill. Only 3 of 27 comparisons for all nine sponsors showed a positive outcome on the Raven’s, suggesting that this test does not reflect what was being taught by sponsors. Direct Instruction shows a negative ISO on this measure, but would still rank first if it were included.


References

Abt Associates. (1977). Education as experimentation: A planned variation model (Vol. IV). Cambridge, MA: Author.

Becker, W.C., Engelmann, S., & Thomas, D.R. (1975). Teaching 2: Cognitive Learning and Instruction. Chicago: Science Research Associates.

Becker, W.C., & Engelmann, S. (1976). Analysis of achievement data on six cohorts of low income children from 20 school districts in the University of Oregon Direct Instruction Follow Through Model (Technical Report #76-1). Eugene, OR: University of Oregon, Office of Education, Follow Through Project.

Becker, W., & Engelmann, S. (1978). Analysis of achievement data on six cohorts of low income children from 20 school districts in the University of Oregon Direct Instruction Follow Through Model (Technical Report #78-1). Eugene, OR: University of Oregon, Office of Education, Follow Through Project.

Bereiter, C. (1967). Acceleration of intellectual development in early childhood. Final Report Project No. 2129, Contract No. OE 4-10-008. Urbana, IL: University of Illinois, College of Education.

Cronin, D. P. (1980). Implementation study, year 2, Instructional staff interviews. Los Altos, CA: Emrick.

Engelmann, S. (1967). Teaching formal operations to preschool children. Ontario Journal of Educational Research, 9 (3), 193-207.

Engelmann, S. (1968). The effectiveness of direct verbal instruction on IQ performance and achievement in reading and arithmetic. In J. Hellmuth (Ed.), Disadvantaged Child, Vol. 3. New York: Bruner Mazel.

Gersten, R., Becker, W., Heiry, T., & White. (1984). Entry IQ and yearly academic growth in children in Direct Instruction programs: A longitudinal study of low SES children. Educational Evaluation and Policy Analysis, 6(2), 109-121.

Gersten, R., & Keating, T. (1987). Improving high school performance of “at risk” students: A study of long-term benefits of direct instruction. Educational Leadership, 44(6), 28–31.

Nero and Associates, Inc. (1975). Follow Through: A description of Follow Through sponsor implementation processes. Portland, OR: Author.

McLaughlin, M.W. (1975). Evaluation and reform. Cambridge, MA: Ballinger Publishing Co.

Weisberg, H.I. Short-term cognitive effects of Head Start programs: A report of the third year of planned variation, 1971–72. Cambridge, MA: Huron Institute.
