Why Education Experts Resist Effective Practices (And What It Would Take to Make Education More Like Medicine)


by Douglas Carnine



In perhaps no other profession is there as much disputation as in education. Phonics or whole language? Calculators or no calculators? Tracked or mixed-ability classrooms? Should teachers lecture or “facilitate”? Ought education be content-centered or child-centered? Do high-stakes exams produce real gains or merely promote “teaching to the test”? Which is the most effective reform: Reducing class size? Expanding pre-school? Inducing competition through vouchers? Paying teachers for performance?

And on and on and on. Within each debate, moreover, we regularly hear each faction citing boatloads of “studies” that supposedly support its position. Just think how often “research shows” is used to introduce a statement that winds up being chiefly about ideology, hunch or preference.

In other professions, such as medicine, scientific research is taken seriously, because it usually brings clarity and progress. We come close to resolving vast disputes, and answering complex questions, with the aid of rigorous, controlled studies of cause and effect. Yet so much of what passes for education research serves to confuse at least as much as it clarifies. The education field tends to rely heavily on qualitative studies, sometimes proclaiming open hostility towards modern statistical research methods. Even when the research is clear on a subject—such as how to teach first-graders to read—educators often willfully ignore the results when they don’t fit their ideological preferences.

To Professor Douglas Carnine of the University of Oregon, this is symptomatic of a field that has not yet matured into a true profession. In education, research standards have yet to be standardized, peer reviews are porous, and practitioners tend to be influenced more by philosophy than evidence. In this insightful paper, Doug examines several instances where educators either have introduced reforms without testing them first, or ignored (or deprecated) research when it did not yield the results they wanted.

After describing assorted hijinks in math and reading instruction, Doug devotes considerable space to examining what educators did with the results of Project Follow Through, one of the largest education experiments ever undertaken. This study compared constructivist education models with those based on direct instruction. One might have expected that, when the results showed that direct instruction models produced better outcomes, these models would have been embraced by the profession. Instead, many education experts discouraged their use.

Carnine compares the current state of the education field with medicine and other professions in the early part of the 20th century, and suggests that education will undergo its transformation to a full profession only when outside pressures force it to.

He knows the field well, as Director of the National Center to Improve the Tools of Educators, which works with publishers to incorporate research-based practices into education materials and with legislative, business, community, and union groups to help them understand the importance of research-based tools. Doug can be phoned at 541-683-7543, e-mailed at dcarnine@oregon.uoregon.edu, and written the old-fashioned way at 85 Lincoln St., Eugene, OR 97401.

The Thomas B. Fordham Foundation is a private foundation that supports research, publications, and action projects in elementary/secondary education reform at the national level and in the Dayton area. Further information can be obtained at our web site (www.edexcellence.net) or by writing us at 1627 K Street, NW, Suite 600, Washington, DC 20006. (We can also be e-mailed through our web site.) This report is available in full on the Foundation’s web site, and hard copies can be obtained by calling 1-888-TBF-7474 (single copies are free). The Foundation is neither connected with nor sponsored by Fordham University.



Education school professors in general and curriculum and instruction experts in particular are major forces in dictating the “what” and “how” of American education. They typically control pre-service teacher preparation, the continued professional development of experienced teachers, the curricular content and pedagogy used in schools, the instructional philosophy and methods employed in classrooms, and the policies espoused by state and national curriculum organizations.

Although they wield immense power over what actually happens in U.S. classrooms, these professors are senior members of a field that lacks many crucial features of a fully developed profession. In education, the judgments of “experts” frequently appear to be unconstrained and sometimes altogether unaffected by objective research. Many of these experts are so captivated by romantic ideas about learning or so blinded by ideology that they have closed their minds to the results of rigorous experiments. Until education becomes the kind of profession that reveres evidence, we should not be surprised to find its experts dispensing unproven methods, endlessly flitting from one fad to another. The greatest victims of these fads are the very students who are most at risk.

The first section of this essay provides examples from reading and math curricula. The middle section describes how experts have, for ideological reasons, shunned some solutions that do display robust evidence of efficacy. The following sections briefly examine how public impatience has forced other professions to “grow up” and accept accountability and scientific evidence. The paper concludes with a plea to hasten education’s metamorphosis into a mature profession.


Embracing Teaching Methods that Don’t Work

The reaction of a large number of education experts to converging scientific evidence about how children learn to read illustrates the basic problem. Data strongly support the explicit teaching of phonemic awareness, the alphabetic principle, and phonics, which is often combined with extensive practice with phonic readers. These are the cornerstones of successful beginning reading for young children, particularly at-risk youngsters. The findings of the National Reading Panel, established by Congress and jointly convened by the Department of Education and the Department of Health and Human Services, confirm the importance of these practices. Congress asked the panel to evaluate existing research on the most effective approaches for teaching children how to read. In its February 1999 Progress Report, the panel wrote,

[A]dvances in research are beginning to provide hope that educators may soon be guided by scientifically sound information. A growing number of works, for example, are now suggesting that students need to master phonics skills in order to read well. Among them are Learning to Read by Jeanne Chall and Beginning to Read: Thinking and Learning about Print by Marilyn Adams. As Adams, a senior scientist at Bolt Beranek and Newman, Inc., writes, “[It] has been proven beyond any shade of doubt that skillful readers process virtually each and every word and letter of text as they read. This is extremely counter-intuitive. For sure, skillful readers neither look nor feel as if that’s what they do. But that’s because they do it so quickly and effortlessly.”1

Even the popular media have recognized this converging body of research. As James Collins wrote in Time magazine in October 1997: “After reviewing the arguments mustered by the phonics and whole-language proponents, can we make a judgment as to who is right? Yes. The value of explicit, systematic phonics instruction has been well established. Hundreds of studies from a variety of fields support this conclusion. Indeed, the evidence is so strong that if the subject under discussion were, say, the treatment of the mumps, there would be no discussion.”2 Yet in the face of such overwhelming evidence, the whole-language approach, rather than the phonics approach, dominated American primary classrooms during the 1990s. Who supports whole language? As Nicholas Lemann wrote in the Atlantic Monthly in 1997, “Support for it is limited to an enclosed community of devotees, including teachers, education school professors, textbook publishers, bilingual educators, and teacher trainers. Virtually no one in the wider public seems to be actively promoting whole language. No politicians are crusading for it. Of the major teachers’ unions, the American Federation of Teachers (AFT) is a wholehearted opponent and the National Education Association (NEA) is neutral. No independent scientific researchers trumpet whole language’s virtues. The balance of parental pressure is not in favor of whole language.”3

This phenomenon is not just the story of reading. Math education experts also live in an enclosed community. In 1989, the National Council of Teachers of Mathematics (NCTM) developed academic content standards that have since been adopted by most states and today drive classroom practice in thousands of schools. The standards not only specified what children were to learn, but how teachers were to teach. According to the NCTM, these standards were designed to “ensure that the public is protected from shoddy products,” yet no effort was made by the NCTM to determine whether the standards themselves were based on evidence. Indeed, the document setting them forth also urged that the standards be tested, recommending “the establishment of some pilot school mathematics program based on these standards to demonstrate that all students—including women and underserved minorities—can reach a satisfactory level of mathematics achievement.”4 There’s nothing wrong with testing the NCTM approach to math education. But should NCTM’s standards become the coin of the realm before they have proven their efficacy in rigorous experimental settings?

What is striking about the math episode is the NCTM’s inconsistent stance toward evidence. At one point there seems to be a reverence for evidence. “It seems reasonable that anyone developing products for use in mathematics classrooms should document how the materials are related to current conceptions of what content is important to teach and should present evidence about their effectiveness,” wrote the NCTM experts.5 The NCTM pointed to the Food and Drug Administration (FDA) as a model for what it was doing in creating content standards.

Yet it is impossible to imagine the FDA approving a drug—indeed, urging its widespread use—and later proposing “the establishment of some pilot … program” to see whether the drug helps or harms those to whom it is given. The FDA uses the most reliable kind of research to identify what works: dividing a population into two identical groups and randomly assigning treatment to one group, with the other group serving as a control. Properly done, the “patients” don’t know which group they’re in and neither do the scientists dispensing the medications and placebos. (This is known as a “double blind” experiment.) Such research is virtually unknown in education.

The resistance of education experts to evidence is so puzzling that it is worth closely investigating what educators say about research. In 1995, the Research Advisory Committee of the NCTM expressed its disdain for the kind of research that the FDA routinely conducts: “The question ‘Is Curriculum A better than Curriculum B?’ is not a good research question, because it is not readily answerable.” In fact, that is exactly the kind of research question that teachers, parents, and the broader public want to see answered. This kind of research is not impossible, though it is more complicated to undertake than other kinds of research—particularly the qualitative research that most education experts seem to prefer. (The role of qualitative research is discussed later in this essay.)

For some education professors, the problem with experimental research runs deeper. One prominent member of the field, Gene Glass, a former president of the American Educational Research Association, introduced an electronic discussion forum on research priorities with the following remarks: “Some people expect educational research to be like a group of engineers working on the fastest, cheapest, and safest way of traveling to Chicago, when in fact it is a bunch of people arguing about whether to go to Chicago or St. Louis.”6

With research understood in this way, it should not be surprising to find that the education profession has little by way of a solid knowledge base on which to rest its practices. But if we don’t know what works, how are teachers to know how to respond in a sure and confident way to the challenges they face? Hospitalized some months ago with a pulmonary embolism, Diane Ravitch, former assistant secretary of the U.S. Department of Education, looked up at the doctors treating her in the intensive care unit and imagined for an instant that she was being treated by education experts rather than physicians. As she recounts:

My new specialists began to argue over whether anything was actually wrong with me. A few thought that I had a problem, but others scoffed and said that such an analysis was tantamount to “blaming the victim.” . . .

Among the raucous crowd of education experts, there was no agreement, no common set of standards for diagnosing my problem. They could not agree on what was wrong with me, perhaps because they did not agree on standards for good health. Some maintained that it was wrong to stigmatize people who were short of breath and had a really sore leg; perhaps it was a challenge for me to breathe and to walk, but who was to say that the behaviors I exhibited were inappropriate or inferior compared to what most people did?

A few researchers continued to insist that something was wrong with me; one even pulled out the results of my CAT-scan and sonogram. But the rest ridiculed the tests, pointing out that they represented only a snapshot of my actual condition and were therefore completely unreliable, as compared to longitudinal data (which of course was unavailable).

. . . The assembled authorities could not agree on what to do to make me better. Each had his own favorite cure, and each pulled out a tall stack of research studies to support his proposals. One group urged a regimen of bed rest, but another said I needed vigorous exercise. . . . One recommended Drug X, but another recommended Drug Not-X. Another said that it was up to me to decide how to cure myself, based on my own priorities about what was important to me.

Just when I thought I had heard everything, a group of newly minted doctors of education told me that my body would heal itself by its own natural mechanisms, and that I did not need any treatment at all.7

This may read like caricature, yet it is clear that many education experts have not embraced the use of rigorous scientific research to identify effective methods. But this is not the only thing that affects their judgments. In other cases, what prevents them from being guided by scientific findings is a misunderstanding of the inherent limits of descriptive or qualitative research. Such research has its place. It can aid, for example, in the understanding of a complex problem and can be used to formulate hypotheses that can be formally evaluated (in an experiment with control groups, for instance). But such research cannot provide reliable information about the relative effectiveness of a treatment, of “Drug X” vs. “Drug Not-X.”

Despite this simple fact of logic, many education experts assume that descriptive research will determine the relative effectiveness of various practices. Claims made by two national organizations of mathematics educators illustrate the problem. In a letter to the president of the California State Board of Education, the American Educational Research Association’s Special Interest Group for Research in Mathematics Education wrote, “[D]ata from the large-scale NAEP tests tell us that children in the middle grades do well in solving one-step story problems but are unable to solve two-step story problems. A qualitative study, involving observations and interviews with children, can provide us with information about why this is the case and how instructional programs can be changed to improve this situation”8 (emphasis added). In another letter to the same board, Judith T. Sowder, editor of the NCTM’s Journal for Research in Mathematics Education, wrote that “by in-depth study of children’s thinking we have been able to overcome some of our past instructional mistakes and design curricula that allows (sic) students to form robust mathematical concepts”9 (emphasis added).

Both statements illustrate a serious reasoning fallacy, one that is pandemic in education: deriving an ‘ought’ from an ‘is.’ A richly evocative description of what a problem is does not logically imply what the solution to that problem ought to be. The viability of a solution depends on its being compared to other options.

What is clear from these examples is that lack of evidence does not deter widespread acceptance of untested innovations in education; indeed, a pedagogical method can even be embraced in the face of contradictory evidence. Conversely, the evidence for an instructional approach may be overwhelmingly positive, yet there is no guarantee that it will be adopted. The case of Direct Instruction is a prime example.


A Large-scale Education Experiment

In the annals of education research, one project stands out above all others. Project Follow Through was probably the largest education experiment ever conducted in the United States. It was a longitudinal study of more than twenty different approaches to teaching economically disadvantaged K-3 students. The experiment lasted from 1967 to 1976, although Follow Through continued as a federal program until 1995. Project Follow Through included more than 70,000 students in more than 180 schools, and yearly data on 10,000 children were used for the study. The project evaluated education models falling into two broad categories: those based on child-directed construction of meaning and knowledge, and those based on direct teaching of academic and cognitive skills.

The battle between these two basic approaches to teaching has divided educators for generations. Each is rooted in its own distinctive philosophy of how children learn. Schools that have implemented the child-centered approach (sometimes called “constructivist”) have a very different look and feel from schools that have opted for the more traditional, teacher-directed approach (often called “direct instruction” in its most structured form).

First graders in a constructivist reading classroom might be found scattered around the room; some children are walking around, some are talking, some painting, others watching a video, some looking through a book, and one or two reading with the teacher. The teacher uses a book that is not specifically designed to be read using phonics skills, and, when a child misses a word, the teacher will let the mistake go by so long as the meaning is preserved to some degree (for instance, if a child reads “horse” instead of “pony”). If a child is stuck on a word, the teacher encourages her to guess, to read to the end of the sentence and then return to the word, to look at the picture on the page, and, possibly, to look at the first letter of the word.

In a direct instruction classroom, some children are at their desks writing or reading phonics-based books. The rest of the youngsters are sitting with the teacher. The teacher asks them to sound out challenging words before reading the story. When the children read the story, the teacher has them sound out the words if they make mistakes.

In the category of child-directed education, four major models were analyzed in Project Follow Through:

Constructivism/Discovery Learning: The Responsive Education Model, sponsored by the Far West Laboratory and originated by Glen Nimnicht. The child’s own interests determine where and when he works. The goal is to build an environment that is responsive to the child so that he can learn from it.
Whole Language: The Tucson Early Education Model (TEEM), developed by Marie Hughes and sponsored by the University of Arizona. Teachers elaborate on the child’s present experiences and interests to teach intellectual processes such as comparing, recalling, looking, and relationships. Child-directed choices are important to this model; the content is less important.
Developmentally Appropriate Practices: The Cognitively Oriented Curriculum, sponsored by the High/Scope Educational Research Foundation and developed by David Weikart. The model builds on Piaget’s concern with the underlying cognitive processes that allow one to learn on one’s own. Children are encouraged to schedule their own activities, develop plans, choose whom to work with, etc. The teacher provides choices in ways that foster development of positive self-concept. The teacher demonstrates language by labeling what is going on, providing interpretations, and explaining causes.
Open Education Model: The Education Development Center (EDC) sponsored a model derived from the British Infant School and focused on building the child’s responsibility for his own learning. Reading and writing are not taught directly, but through stimulating the desire to communicate. Flexible schedules, child-directed choices, and a focus on intense personal involvement characterize this model.

The major skills-oriented, teacher-directed model tested in Project Follow Through was Direct Instruction, sponsored by the University of Oregon and developed by Siegfried Engelmann and Wes Becker. It emphasizes the use of small group, face-to-face instruction by teachers and aides using carefully sequenced lessons in reading, mathematics, and language in kindergarten and first grade. (Lessons in later grades are more complicated.) A variety of manuals, observation tools, and child assessment measures have been developed to provide quality control for training procedures, teaching processes, and children’s academic progress. Key assumptions of the model are: (1) that all children can be taught (and that this is the teacher’s responsibility); (2) that low-performing students must be taught more, not less, in order to catch up; and (3) that the task of teaching more requires careful use of educational technology and time. (The author of this report was involved with the Direct Instruction Follow Through Project at the University of Oregon.)

Data for the big Follow Through evaluation were gathered and analyzed by two independent organizations—Stanford Research Institute and Abt Associates.10 Students taught according to the different models were compared with a control group (and, implicitly, with each other) on three types of measures: basic, cognitive, and affective.

Mean percentile scores on the four Metropolitan Achievement Test categories—Total Reading, Math, Spelling, and Language—appear in Figure 1. Figure 1 also shows the average achievement of disadvantaged children without any special help, which at that time was at about the 20th percentile.

In only one approach, the Direct Instruction (DI) model, were participating students near or at national norms in math and language and close to national norms in reading. Students in all four of the other Follow Through approaches—discovery learning, language experience, developmentally appropriate practices, and open education—often performed worse than the control group. This poor performance came in spite of tens of thousands of additional dollars provided for each classroom each year.

Researchers noted that DI students performed well not only on measures of basic skills but also in more advanced skills such as reading comprehension and math problem solving. Furthermore, DI students’ scores were quite high in the affective domain, suggesting that building academic competence promotes self-esteem, not vice versa.11 This last result especially surprised the Abt researchers, who wrote:

The performance of Follow Through children in Direct Instruction sites on the affective measures is an unexpected result. The Direct Instruction model does not explicitly emphasize affective outcomes of instruction, but the sponsor has asserted that they will be consequences of effective teaching. Critics of the model have predicted that the emphasis on tightly controlled instruction might discourage children from freely expressing themselves, and thus inhibit the development of self-esteem and other affective skills. In fact, this is not the case.12

An analysis of the Follow Through parent data found moderate to high parental involvement in all the DI school districts.13 Compared to the parents of students from schools being served by other Follow Through models, parents of DI students more frequently felt that their schools had appreciably improved their children’s academic achievement. This parental perception corresponded with the actual standardized test scores of the Direct Instruction students.

These data were collected and analyzed by impartial organizations. The developers of the DI model conducted a number of supplementary studies, which had similarly promising results.

Significant IQ gains were found in students who participated in the program. Those entering kindergarten with low IQs (below 71) gained 17 points, while students entering first grade with low IQs gained 9.4 points. Children with entering IQs in the 71-90 range gained 15.6 points in kindergarten and 9.2 points in first grade.
Longitudinal studies were undertaken using the high school records of students who had received Direct Instruction through the end of third grade as well as the records of a comparison group of students who did not receive Direct Instruction. Researchers looked at test scores, attendance, college acceptances, and retention. When academic performance was the measure, the Direct Instruction students outperformed the control group in the five comparisons whose results were statistically significant. The comparisons favored Direct Instruction students on the other measures as well (attendance, college acceptances, and retention) in all studies with statistically significant results.14

Additional research showed that the DI model worked in a wide range of communities. Direct Instruction Follow Through sites were located in large cities (New York, San Diego, Washington, D.C.); mid-sized cities (Flint, Michigan; Dayton, Ohio; East St. Louis, Illinois); rural white communities (Flippin, Arkansas; Smithville, Tennessee); a rural black community (Williamsburg, South Carolina); Latino communities (Uvalde, Texas; E. Las Vegas, New Mexico); and a Native American community (Cherokee, North Carolina).

More than two decades later, a 1999 report funded by some of the nation’s leading education organizations confirmed the efficacy of Direct Instruction. Researchers at the American Institutes for Research who performed the analysis for the Educators’ Guide to Schoolwide Reform found that only three of the 24 schoolwide reform models they examined could present solid evidence of positive effects on student achievement. Direct Instruction was one of the three.15


Direct Instruction after Project Follow Through

Before Project Follow Through, constructivist approaches to teaching and learning were extremely popular. One might have expected that the news from Project Follow Through would have caused educators to set aside such methods and embrace Direct Instruction instead. But this did not happen. To the contrary.

Even before the findings from Project Follow Through were officially released, the Ford Foundation commissioned a critique of it. One of the authors of that study, the aforementioned Gene Glass, wrote an additional critique of Follow Through that was published by the federal government’s National Institute of Education. This report suggested that the NIE conduct an evaluation emphasizing an ethnographic or descriptive case-study approach because “the audience for Follow Through evaluations is an audience of teachers that doesn’t need statistical finding of experiments to decide how best to teach children. They decide such matters on the basis of complicated public and private understandings, beliefs, motives, and wishes.”16

After the results of the Follow Through study were in, the sponsors of the different programs submitted their models to the Department of Education’s Joint Dissemination Review Panel. Evidently the Panel did not value the differences in effectiveness found by the big national study of Follow Through; all of the programs—both successful and failed—were recommended for dissemination to school districts. According to Cathy Watkins, a professor of education at Cal State-Stanislaus, “A program could be judged effective if it had a positive impact on individuals other than students. As a result, programs that had failed to improve academic achievement in Follow Through were rated as ‘exemplary and effective.’ ”17 The Direct Instruction model was not specially promoted or encouraged in any way. In fact, extra federal dollars were directed toward the less effective models in an effort to improve their results.

During the 1980s and early 1990s, schools that attempted to use Direct Instruction (originally known as DISTAR)—particularly in the early grades, when DI is especially effective—were often discouraged by members of education organizations. Many experts were convinced that the program’s heavy academic emphasis was “developmentally inappropriate” for young children and might “hinder children’s development of interpersonal understanding and their broader socio-cognitive and moral development.”18 “DI is the answer only if we want our children to swallow whole whatever they are told and focus more on consumption than citizenship,” argued Lawrence Schweinhart of the High/Scope Educational Research Foundation.19 (High/Scope had developed one of the constructivist models.)

Faced with the evidence of Direct Instruction’s effectiveness, some experts still advocated methods that had not proved effective in Project Follow Through. “The kind of learning DISTAR tries to promote can be more solidly elicited by the child doing things,” argued Harriet Egertson, an early childhood specialist at the Nebraska Department of Education. “The adult’s responsibility is to engage the child in what he or she is doing, to take every opportunity to make their experience meaningful. DISTAR isn’t connected to anything. If you use mathematics in context, such as measuring out spoons of sugar in a cooking class, the notion of addition comes alive for the child. The concept becomes embedded in the action and it sticks.”20

Tufts University professor of child development David Elkind argued that, while Direct Instruction is harmful for all children, it

is even worse for young disadvantaged children, because it imprints them with a rote-learning style that could be damaging later on. As Piaget pointed out, children learn by manipulating their environment, and a healthy early education program structures the child’s environment to make the most of that fact. DISTAR, on the other hand, structures the child and constrains his learning style.21

The natural-learning view that underlies the other four Follow Through models described above is enormously appealing to educators and to many psychologists. The dominance of this view can be traced back to Jean-Jacques Rousseau, who glorified the natural at the expense of the man-made, and argued that education should not be structured but should emerge from the natural inclinations of the child. German educators developed kindergartens based on the notion of natural learning. This romantic notion of learning has become doctrinal in many schools of education and child-development centers, and has closed the minds of many experts to actual research findings about effective approaches to educating children.22 This is a classic case of an immature profession, one that lacks a solid scientific base and has less respect for evidence than for opinion and ideology.


Learning from Other Professions

Education could benefit from examining the history of some other professions. Medicine, pharmacology, accounting, actuarial sciences, and seafaring have all evolved into mature professions. According to Theodore M. Porter, a history professor at the University of California at Los Angeles, an immature profession is characterized by expertise based on the subjective judgments of the individual professional, trust based on personal contact rather than quantification, and autonomy allowed by expertise and trust, which staves off standardized procedures based on research findings that use control groups.23

A mature profession, by contrast, is characterized by a shift from judgments of individual experts to judgments constrained by quantified data that can be inspected by a broad audience, less emphasis on personal trust and more on objectivity, and a greater role for standardized measures and procedures informed by scientific investigations that use control groups.

For the most part, education has yet to attain a mature state. Education experts routinely make decisions in subjective fashion, eschewing quantitative measures and ignoring research findings. The influence of these experts affects all the players in the education world.

Below is a description that could very well describe the field of education:

It is hard to conceive of a less scientific enterprise among human endeavors. Virtually anything that could be thought up for treatment was tried out at one time or another, and, once tried, lasted decades or even centuries before being given up. It was, in retrospect, the most frivolous and irresponsible kind of human experimentation, based on nothing but trial and error, and usually resulting in precisely that sequence.24

Yet this passage was not written about American education today. Rather, it was written about pre-modern medicine by the late Dr. Lewis Thomas (1979), former president of the Memorial Sloan-Kettering Cancer Center. Medicine has matured. Education has not. The excerpt continues:

Bleeding, purging, cupping, the administration of infusions of every known plant, solutions of every known metal, most of these based on the weirdest imaginings about the cause of disease, concocted out of nothing but thin air—this was the heritage of medicine up until a little over a century ago. It is astounding that the profession survived so long, and got away with so much with so little outcry. Almost everyone seems to have been taken in.25

Education has not yet developed into a mature profession. What might cause it to? Based on the experience of other fields, it seems likely that intense and sustained outside pressure will be needed. Dogma does not destroy itself, nor does an immature profession drive out dogma.

The metamorphosis is often triggered by a catalyst, such as pressure from groups that are adversely affected by the poor quality of service provided by a profession. The public’s revulsion at the Titanic’s sinking, for example, served as a catalyst for the metamorphosis of seafaring. In the early 1900s, sea captains could sail pretty much where they pleased, and safety was not a priority. The 1913 International Convention for Safety of Life at Sea, convened after the sinking of the Titanic, quickly made rules that are still models for good practice in seafaring.

The metamorphosis of medicine took more than a century. As the historian Theodore Porter explains:

In its pre-metamorphosis stage, medicine was practiced by members of an elite who refused . . . to place the superior claims of character and breeding on an equal footing with those of scientific merit. . . . These gentlemen practitioners opposed specialization, and even resisted the use of instruments. The stethoscope was acceptable, because it was audible only to them, but devices that could be read out in numbers or, still worse, left a written trace, were a threat to the intimate knowledge of the attending physician.26

External pressure on medicine came from life insurance companies that demanded quantitative measures of the health of applicants and from workers who did not trust “company doctors.” The Food and Drug Administration, whose authority over drug safety was expanded in 1938 during the New Deal, initially accepted both opinions from clinical specialists and findings from experimental research when determining whether drugs did more good than harm. However, the thalidomide disaster led to the Kefauver-Harris Amendments of 1962, which required drugs thereafter to be proven effective and safe before they could be prescribed, with little attention paid to the opinions of clinical specialists. (Medical interventions and intervention devices, such as coronary stents, are subject to similar reviews of safety and efficacy.)

The catalyst that transformed accounting in the United States was the Great Depression. To restore investor confidence, the government promulgated reporting rules to guard against fraud, creating the Securities and Exchange Commission.

In general, it appears that a profession is not apt to mature without external pressure and the attendant conflict. Metamorphosis begins when the profession determines that this is its likeliest path to survival, respect, and prosperity. Porter writes that the American Institute of Accountants established its own standards to fend off an imminent bureaucratic intervention.27 External pressures had become so great that outsiders threatened to take over and control the profession via legislation and regulation. There are signs today that this is beginning to happen in education.


Making Education a Mature Profession

The best way for a profession to ensure its continued autonomy is to adopt methods that ensure the safety and efficacy of its practices. The profession can thereby deter extensive meddling by outsiders. The public trusts quantified data because procedures for coming up with numbers reduce subjective decision-making. Standardized procedures also are more open to public inspection and legal review.

American education is under intense pressure to produce better results. The increasing importance of education to the economic well-being of individuals and nations will continue feeding this pressure. In the past—and still today—the profession has tended to respond to such pressures by offering untested but appealing nostrums and innovations that do not improve academic achievement. At one time or another, such practices have typified every profession, from medicine to accounting to seafaring. In each case, groups adversely affected by the poor quality of service have exerted pressures on the profession to incorporate a more scientific methodology.

These pressures to mature are inevitable in education as well. Its experts should hasten the process by abandoning ideology and embracing evidence. Findings from carefully controlled experimental evaluations must trump dogma. Expert judgments should be built on objective data that can be inspected by a broad audience, not on wishful thinking. Only when the profession embraces scientific methods for determining efficacy and accepts accountability for results will education acquire the status—and the rewards—of a mature profession.



1 National Reading Panel Progress Report, 22 February 1999. <www.nationalreadingpanel.org>
2 James Collins, “How Johnny Should Read,” Time, 27 October 1997, 81.
3 Nicholas Lemann, “The Reading Wars,” Atlantic Monthly (November 1997), 133-134.
4 National Council of Teachers of Mathematics, Curriculum and evaluation standards for school mathematics (Reston, VA: Author, 1989), 253.
5 Ibid., 2.
6 Gene Glass, “Research news and comment-a conversation about educational research priorities: A message to Riley,” Educational Researcher 22, no. 6 (August-September 1993), 17-21.
7 Diane Ravitch, “What if Research Really Mattered?” Education Week, 16 December 1998.
8 Personal communication with California State Board of Education.
9 Personal communication with California State Board of Education.
10 L. Stebbins, ed., “Education as experimentation: A planned variation model,” in An Evaluation of Follow Through III, A (Cambridge, MA: Abt Associates, 1976), and L. Stebbins, et al., “Education as experimentation: A planned variation model,” in An Evaluation of Follow Through IV, A-D (Cambridge, MA: Abt Associates, 1977).
11 Stebbins et al., 1977.
12 Abt Associates, “Education as experimentation: A planned variation model,” An Evaluation of Follow Through IV, B (Cambridge, MA: Author, 1977), 73.
13 Walter Haney, A Technical History of the National Follow Through Evaluation (Cambridge, MA: Huron Institute, August 1977).
14 R. Gersten and T. Keating, “Improving high school performance of ‘at risk’ students: A study of long-term benefits of direct instruction,” Educational Leadership 44, no. 6 (1987), 28-31.
15 Although the data supporting Direct Instruction are quite strong, it is important to note that the model is demanding to implement and results from a poor implementation may be poor.
16 Gene Glass and G. Camilli, “FT Evaluation” (National Institute of Education, ERIC document ED244738), as cited in Nina H. Shokraii, “Why Congress Should Overhaul the Federal Regional Education Laboratories,” Heritage Foundation Backgrounder, no. 1200 (Washington, DC: The Heritage Foundation, 1998).
17 Cathy L. Watkins, “Follow Through: Why Didn’t We,” Effective School Practices, 15, no. 1 (Winter 1995-96), 5.
18 Rheta DeVries, Halcyon Reese-Learned, and Pamela Morgan, “Sociomoral development in direct instruction, eclectic, and constructivist kindergartens: a study of children’s enacted interpersonal understanding,” Early Childhood Research Quarterly 6, no. 4, 473-517, as cited in Denny Taylor, Beginning to Read and the Spin Doctors of Science (Urbana, IL: National Council of Teachers of English, 1998), 231.
19 Lawrence Schweinhart, “Back to School,” letter appearing in National Review, 20 July 1998.
20 As quoted in Ellen Ruppel Shell, “Now, which kind of preschool,” Psychology Today (December 1989).
21 Ibid.
22 E.D. Hirsch, Jr., “Reality’s revenge: Research and ideology,” American Educator (Fall 1996), excerpted from E.D. Hirsch, Jr., The Schools We Need and Why We Don’t Have Them (New York, NY: Doubleday, 1996). An interesting perspective on this topic can be found in an unpublished paper by Thomas D. Cook of Northwestern University called “Considering the Major Arguments against Random Assignment: An Analysis of the Intellectual Culture Surrounding Evaluation in American Schools of Education.”
23 Theodore M. Porter, Trust in Numbers: The Pursuit of Objectivity in Science and Public Life (Princeton, NJ: Princeton University Press, 1996).
24 Lewis Thomas, “Medical Lessons from History,” in The Medusa and the Snail: More Notes of a Biology Watcher (New York: Viking Press, 1979), 159.
25 Ibid., 159-160.
26 Porter, 202.
27 Ibid., 93.








Chester E. Finn, Jr., President
Thomas B. Fordham Foundation
Washington, DC
April 2000