Mathematically Correct
It's a weighty responsibility to decide what to speak of here, given the huge importance of California K-12 education, both here and in the nation. I sense a determination among Californians, both liberals and conservatives, to get education right this time around, to rise above politics, partisanship, and ideology and achieve a first-class improvement in the schools. If my e-mail is any indication, there's a lot of educational ferment going on in this state.
With all that energy and public attention, intelligence, determination and passion being directed to education, I believe that you are going to make significant improvements. It won't happen overnight, because people don't change overnight, and kids don't get educated overnight. But there is going to be palpable improvement. The climate of opinion is changing towards getting results, and towards imposing consequences on all parties to education, including students. The late lamented Al Shanker said over and over in recent years that student learning will not rise until American students believe in their hearts that there will be serious consequences for not learning. Some of you are rightly insisting that students should be sent that message of tough love.
When I talked to Mr. Lucia about the context in which I would be speaking, he mentioned that California law requires education policy to be research-based. That's the theme I shall focus on.
The enormous problem faced in basing policy on research is that it is almost impossible to make educational policy that is not based on research. Almost every educational practice that has ever been pursued has been supported with data by somebody. I don't know a single failed policy, ranging from the naturalistic teaching of reading, to the open classroom, to the teaching of abstract set-theory in third-grade math, that hasn't been research-based. Experts have advocated almost every conceivable practice short of inflicting permanent bodily harm.
So we need to discriminate between reliable and unreliable research. And of course my recommendation is going to be that only reliable research should guide policy. Now it is possible to give some rules of thumb for determining scientific reliability, but there is no formula adequate to all situations. The distinguished sociologist of science Stephen Cole, in his Harvard Press book Making Science, has found a continuous spectrum of reliability in most of the natural and social sciences. At the core of each discipline, there develops a consensus of the learned, and this consensus is highly dependable. It is close enough to being right that you can bet your life and your children's lives on that core. But out at the edge, on the frontier of the discipline, there is a lot of disagreement, and we can't tell for sure which rival theory is right. When lawmakers say that education policy should be based on research, the spirit of the law implies reliable, consensus research. Any other interpretation would mean, and has meant, carrying out unwarranted human experimentation on our own children.
If this distinction between core and non-core research is rightly understood, and if its implications are followed in California, then I think the days of faddism, guruism, partisanship, and unwarranted experimentation may be numbered. I'm not saying that research can decide the aims of education. In a democracy, those are decided by the people. But core science can determine how best to achieve them. Take reading. As a people we have decided that we want all our children to read well. Mainstream research has been saying for some years that a naturalistic approach cannot achieve that goal for all children. The reasons why that core research was not heeded are a subject for intellectual and social history, some of which I traced in my recent book, The Schools We Need & Why We Don't Have Them.
I was forced to conclude that in the field of psychology, which is the key field for education research, much of what is accepted within the educational community has been required to conform to a so-called "constructivist" ideology that does not represent the consensus in mainstream psychology, and is almost certainly incorrect. One distinguished psychologist who receives grants from the education division of the National Science Foundation (NSF) expressed dismay at the ideological, anti-empirical sermons, as he called them, which he hears at the education division of NSF meetings in psychology.
Insistence upon ideological conformity makes for unreliable science. It hinders the best research from getting disseminated to the education world -- to journalists, policy makers, publishers, teachers, and administrators. As a result, there is an information gap regarding the findings of mainstream psychology as applied to education.
This is a situation that is reminiscent of what happened to biology in the Soviet Union under Lysenkoism, which is a theory that bears similarities to constructivism. In Stalin's day, Lysenko was the powerful bureaucrat-scientist who controlled Soviet biological research, and declined to fund any that didn't conform to the received ideology, which consisted in the view that nurture can transform nature. During the Lysenko period, the dominance of this ideology over disinterested research not only retarded Soviet biology, it caused mass starvation. There are analogies lurking in that history. Over the door of every board of education should be posted the watchword: "Remember Lysenko."
Let me illustrate with one recent incident. The premier journal of educational research is Educational Researcher. Recently, an article was submitted that refuted the claims of situated learning. (Situated learning is the supposed scientific basis of such teaching methods as project learning, integrated learning, and thematic learning.) The article also refuted the claims of constructivism, which is a supposedly scientific foundation for such teaching methods as inquiry learning, discovery learning and hands-on learning. After a so-called peer review, Educational Researcher turned down the article, and agreed to print only a section of its critique of situated learning. This decision would have been unremarkable except that the three authors of the article happened to be among the most distinguished cognitive scientists in the world: John Anderson and two colleagues at Carnegie Mellon, Lynn Reder and Herb Simon. The latter happens also to be a Nobel prize winner.
No knowledgeable and disinterested person should doubt that Anderson, Reder, and Simon are far more likely than their journal reviewers to be expressing the consensus view at the core of mainstream psychology. It is safe to bet that they are much more likely to be right than the peer reviewers chosen by Educational Researcher. This is a rather clear example of how educational Lysenkoism closes off important and sometimes critical sources of scientific information.
Research can't flourish under such intellectual conformity. It's our collective duty to make sure that journalists, educators, and policy makers have access to the best information from mainstream science. If scientific information had been allowed to flow more freely during the past two decades, the school scene would have a different face than it does now. California math and reading scores would almost certainly be higher.
Over the past decades, educational Lysenkoism has created a conflict between the conclusions promulgated widely in education and those that are accepted in mainstream psychology. Of several such conflicts I shall choose three of the most important -- testing, math, and early education. I intend to be blunt, since forthrightness will be more useful to you than tact. I won't revisit reading research, since the Board and its advisors have already had the benefit of first-class scientific advice, and the Board has acted accordingly, having made policy that is consistent with what is agreed on by such top researchers as Adams, Foorman, and Stanovich.
In each of the three cases, I shall briefly outline the conflicts between educational Lysenkoism and mainstream science, and then I'll list the names of a few highly-regarded scientists whom you could consult with confidence. In order to make my comments as useful as possible, I will go ahead and make some informed predictions about what those top researchers would tell you, and I will leave a copy of this presentation behind for your use. I got the names by a very simple device. I asked a number of highly reputed scientists which colleagues they considered to be the most authoritative persons in their field, and I found there was wide agreement about those names. As Cole points out, we need to depend ultimately on the consensus views of scientists who are regarded as tops in their fields by other scientists. Should doubt arise as to who those persons are, one should ask for guidance from the National Academy of Sciences. It does not make sense to depend any longer on the guru-principle.
It is a paradox that I should be offering advice about assessment here in California, where you have the recognized dean of the subject, Professor Lee Cronbach of Stanford. As you know, there is a rage in education circles for so-called performance assessments. I'd like to know what Professor Cronbach would have to say to you about the widespread rush to use these expensive and undependable modes of testing. Despite the tendency of a large body of psychometric research, the current position of the educational community is that performance tests are superior to multiple-choice tests. Educators and state legislatures have hastened to mandate these hugely expensive and unreliable instruments as high-stakes, summative assessments. Recently, The Wall Street Journal printed a front-page article about the confusing consequences of performance testing in Michigan. Before California joins the crowd, I would advise you to consult Professor Cronbach if you haven't already, or other top researchers like Samuel Messick, Robert Lind, and Eva Baker. The answers you get from them will be reliable ones.
They are likely to tell you that performance assessments are good for classroom use. And they would probably concede that a writing sample should be a component of any writing test, if only to make sure that the right message about doing lots of writing in school shall be sent out to students and teachers. But core research says that performance assessments are the least reliable and the most expensive tests that exist. Top scientists in the field would advise you against using end-of-year performance tests, if your aim is to use assessments that are accurate, dependable, and reasonably-priced. Specialists will also tell you that almost all the nasty things said about multiple choice tests are incorrect.
This example illustrates an important principle about reliability. Scientific consensus is not just a matter of counting heads. If you counted all the experts who have gotten on board the performance-test bandwagon, they would outnumber by far the toilers in the psychometric vineyards who publish meticulous articles in the best journals. Counting heads is not the way to determine a scientific consensus. The number of people who believe in flying saucers is greater than the total number of astrophysicists in the world.
I am making the perhaps disagreeable point that science is an elitist subject, and ought to be so. The consensus that counts is the consensus of the learned. That kind of consensus is determined by disinterested, high-quality peer review in high-quality journals. In cases of disagreement, that's why you should stick to people like Professor Cronbach. In the end, of course, only evidence and argument count in science. But there is evidence and evidence, argument and argument. It is an uncomfortable thing to say, but the average quality and reliability of science in the best educational journals is below the quality and reliability of science in the best mainstream journals. We laypersons cannot judge the quality of research. Figures don't lie, but how do we know which figures are accurate, complete, and rightly interpreted? Our only recourse is to depend on the reputations of the most highly-regarded journals and scientists. Sensible persons would not quickly challenge Lee Cronbach any more than they would challenge a Nobelist like Herb Simon. Such highly-regarded sources are not always right, but they are far more likely to be right. The consensus of the learned in first-rate scientific work is one of the closest connections we have with the reality principle.
Let me turn to math education. I read a recent report in Education Week which stated that there were two rival math groups in California vying for your approval. On the one side there is what Education Week called the "reform" group who want to put in place the standards of the National Council of Teachers of Mathematics (NCTM), and on the other, the so-called "anti-reform" group that calls those standards variously "fuzzy math" and "whole math." I thought that the tone of the Ed Week report was typical of current educational reporting in that the NCTM approach, which reflects the dominant view among educators, was labeled "reform" while the dissident group that is trying to effect change was labeled "anti-reform." That kind of ideological bias in reporting is characteristic of the education world, and it well illustrates the need for constant vigilance.
To this Board I hardly need to restate the details of the math debate. The NCTM group stresses conceptual understanding over mindless drill and practice, while the dissident group stresses the need for drill and practice leading to mastery. To resolve the issue, which researchers should you listen to? Here are three suggestions: John Anderson, David Geary, and Robert Siegler -- three highly distinguished scientists in the psychology of math education. What are they likely to tell you? I believe you will get strong agreement from them on the following points: that varied and repeated practice leading to rapid recall and automaticity is necessary to higher-order problem-solving skills in both mathematics and the sciences.
They would probably explain to you that lack of automaticity places limits on the mind's channel capacity for higher-order problem-solving skills. They would tell you that only intelligently directed and repeated practice, leading to fast, automatic recall of math facts and to facility in computation and algebraic manipulation, can lead to effective real-world problem solving. Anderson, Geary, and Siegler would provide you with reliable facts, figures, and documentation to support their position, and these data would come not just from isolated lab experiments, but also from large-scale classroom results. If these top scientists agreed on all these points, that is the consensus you should trust, no matter how many pronouncements to the contrary might be made by national educational bodies.
Speaking of national educational bodies brings me to my third and last example of conflict between educational research and mainstream research. To my mind, it is the most fateful conflict of all, since it touches on the general quality of our educational system, and its ability to realize the dream of the common school, that is, the dream of providing genuine equality of educational opportunity to all students regardless of their backgrounds. There is a national body called the National Association for the Education of Young Children, NAEYC. It withholds its approval from schools and preschools that fail to follow what it calls "Developmentally Appropriate Practice." In its policy statements, it considers it developmentally inappropriate for a whole class to listen to a teacher as a group, or for children to learn academic topics that are deemed too challenging, too advanced, or too ... inappropriate. I have heard NAEYC experts state that the Eiffel Tower is developmentally inappropriate, and also James Monroe, though not James Madison.
Who are top researchers to whom you might turn to ask whether this position is sound? What does consensus mainstream science say about the appropriateness of giving young children challenging academic instruction in preschool through third grade? Two top scientists from California would be Rochel Gelman and James Stigler at UCLA, or, looking east, Kevin Miller at the University of Illinois. Or, looking west, Sandra Scarr would fly in from Hawaii to advise you if you beckoned. There are many other names. Any scientist who has kept up with this field would tell you that there is no foundation in fact or in desirable practice for withholding challenging content from young children.
In my recent book I discussed this discrepancy between the romantic doctrines of the NAEYC and the findings of mainstream research. Since the book appeared, the Carnegie Foundation has issued a report called "Years of Promise," which also shows that the dominant ideas about developmental appropriateness are not science but ideology. The overwhelming evidence against the positions of the NAEYC recently caused that body subtly to revise its guidelines. But the revision is skin deep, and doesn't openly admit that a retreat has occurred, and even that slight shift has not filtered down to experts in early childhood education, who still pronounce on "Developmentally Appropriate Practice."
What advice would the scientists give? They would certainly reject many of the still current positions of the NAEYC which still powerfully dominate the education world. These researchers would encourage you to create challenging, content-rich academic programs for all young children. They would say that if programs like Head Start were fortified with coherent goals and academically rich content, and were followed by coherent goals and academically rich content in kindergarten, first, and second grades, such a policy would enable students to overcome many of our current educational defects and inequities.
I have saved this supremely important topic for my final example of the pervasive conflict between science and educational ideology. The doctrine of Developmentally Appropriate Practice is drummed into almost all teachers who take early education courses. The intention is to ensure caring treatment for young children, yet the ultimate effect of the doctrine is to cause social harm. To withhold demanding content from young children between preschool and third grade has an effect which is quite different from the one intended. It leaves advantaged children (who get knowledge at home) with boring pablum, and it condemns disadvantaged children to a permanent educational handicap that grows worse over time. We know that early education can overcome many of these deficits, and we also know that what is called Developmentally Appropriate Practice cannot.
This doctrine and the practices that stem from it are largely responsible for the educational inadequacies that led to the recent controversy over Ebonics. It wasn't a difference between Black and White English that ultimately created desperation in Oakland. It was a much more general educational failure. Much-needed content and language skills are not being taught to all our children at an early age. William Julius Wilson makes this point in his recent book on the urban ghetto, When Work Disappears. Disadvantaged children need precisely the sort of learning that is falsely called developmentally inappropriate. Not just Black children are being penalized by withholding the early knowledge needed for educational success. The withholding of an academic and verbal focus in early education generally handicaps all children, especially disadvantaged ones. As the late, great James Coleman showed, it is ineffective early schooling coupled with economic class, not with race or ethnicity, that causes the academic achievement gap.
I think you would get consensus from mainstream science on the following prediction: that if we bring all children to readiness in the early grades, then the achievement of excellence and equity in later grades will begin to be possible. If I were a member of your Board, I would begin to shift rather large resources, but only into academically effective, really effective, very early education. In due course, such a policy would loosen up a lot of remedial money that could be spent on improving very early education still more. Overcoming the inadequacies of early education is the most effective way of preventing the inadequacies that exist at 12th grade.
Yet the phrase "Developmentally Appropriate Practice" has been very effective politically. It has played on our love and solicitude for young children. It is used as a kind of conversation stopper. If one is told that an educational recommendation is "developmentally inappropriate," one is supposed to retreat and remove the offending item from the early curriculum. But this retreat has to stop. We must stand up to unsupported rhetorical bullying, and rely on the people who know the research. To cave in to intimidating rhetoric is to harm our children, not help them. The romantic doctrine of NAEYC is wasting minds and perpetuating social inequities.
Let me close with some graphs from a recent issue of American Educator, put out by the American Federation of Teachers, Al Shanker's union. First, here's a picture of the cover. It illustrates developmentally inappropriate practice at work. I'll read what this enthusiastic first grader is saying.
"My name is Jose Castro-Rodriguez. I'm in the first grade, and right now we're learning about Ancient Egypt -- about the sarcophagus -- that's what they put the mummies in: and how they got the bodies ready to be mummies and which body parts they put into the canopic jars -- they threw away the brain because they thought the heart did the thinking: and how they had to make sure no one finds out where the mummies were, because you're not supposed to mess with dead people: and how they used an ostrich feather to measure the heart, and if it was little that meant you had been good and could go to the next life: and about the different Egyptian gods. And we've been learning about King Tut...
I also know a lot about the Aztecs. Do you want me to tell you about that, too?"
Now, here's a page with some graphs. The editor, Liz McPike of AFT tried to show graphically here some of the educational benefits of being developmentally inappropriate. I'm not using these graphs, by the way, to recommend the Core Knowledge sequence, since you probably know that I am prejudiced in its favor. I'm simply recommending that you demand an early curriculum that is equally rich, and effective, and equally inappropriate. One of the greatest services we can provide to our children would be to start inducing self doubt in those early-childhood experts who have been wielding the word "inappropriate" like a battle-ax.
[Note: These figures were approximated from an earlier document]
The graph on the right shows the progress that is made by disadvantaged children who start off below the district norm, but who in a couple of years of having a rich, coherent, and inappropriate curriculum, catch up with and exceed their advantaged peers who are following the typically fragmented, incoherent, and "appropriate" sort.
The graph on the left indicates the same sort of equity and excellence effects in a different way. You see on the vertical the percentage of students who are on free and reduced lunch. On the horizontal is the percentage of students who score above average. Each dot represents the performance of a school. This is a district that has just one developmentally inappropriate school. You can guess which one it is. In the other schools, academic achievement is inexorably and precisely linked to social class. Both of these graphs illustrate that a rich and coherent early curriculum improves performance for all, but improves it most for disadvantaged students, thus narrowing the equity gap -- just as Coleman predicted in his massive study "Equality of Educational Opportunity." In short, if you want to achieve excellence and equity, you have to withstand the accusation of being developmentally inappropriate.
But true to my theme, let me close by cautioning you to look with just as cold an eye on my data as you would those of any other educational guru. What do reliable experts say about these assertions? Is this a practice that has proved itself on a large scale? What is the consensus of top-notch researchers in the field?
Those should be the constant questions asked in research-based policy. I hope I have helped clarify what the term "research-based" ought to mean in practice. I earnestly hope you will follow this great principle. A lot of people will be watching what you do with intense interest. You are off to a fine beginning in the field of reading. And if reliable research becomes your guide in other domains as well, it may later be said that this was the Board that put an end to the era of fad and failure. That's your great opportunity. I hope you seize it.
Thank you.