Testing Student Learning, Evaluating Teaching Effectiveness

by Williamson M. Evers, Herbert J. Walberg


This book takes a hard look at the professional, technical, and public policy issues surrounding student achievement and teacher effectiveness—and shows how testing and accountability can play a vital role in improving American schools.

About the Author

Williamson M. Evers is a research fellow at the Hoover Institution and a member of the Institution's Koret Task Force on K–12 Education. He specializes in research on education policy, especially as it pertains to curriculum, teaching, testing, and accountability from kindergarten through high school. From July to December 2003, he served as senior adviser for education to Administrator L. Paul Bremer of the Coalition Provisional Authority in Iraq. Herbert J. Walberg, a distinguished visiting fellow at the Hoover Institution and a member of its Task Force on K–12 Education, is University Scholar and emeritus professor of education and psychology at the University of Illinois at Chicago. His research focuses on educational productivity and human accomplishments.

Read an Excerpt

By Williamson M. Evers, Herbert J. Walberg

Examinations for Educational Productivity

Herbert J. Walberg

This chapter addresses three questions: (1) Where do U.S. schools stand on international examinations relative to those in other affluent countries? (2) Why do they do so poorly at such great cost? (3) How can examinations help? I argue that objective examinations, though imperfect, are reasonable measures of important results of schooling. Though they may not tell the whole story, they can be readily employed to discover effective practices, to improve accountability, and to evaluate choice experiments. Other examinations, such as portfolios and laboratory exercises, are appropriate for assessing students' classroom work but have proven costly and impractical for evaluating schools and districts. In any case, it is important to employ value-added measures to assess the contributions of schools, programs, and staff.

Examinations can be keys to improving the productivity of U.S. schools. Educational policy makers can employ them to evaluate educational organizations, policies, and programs to determine which are most effective and efficient. Value-added analyses of examination results offer a way to achieve their policy and accountability purposes.

In a free society, however, consumer choice would seem to offer the ultimate and best accountability. Since private and public scholarships are unlikely to predominate soon, examinations can serve to help evaluate various means and degrees of enlarging choice and competition in the educational systems.

Where Do U.S. Schools Stand?

It is increasingly well-known that our secondary school students score poorly on objective examinations compared with those in other economically advanced countries. By such standards, our high school students have long done poorly in these subjects, although primary school students have scored nearer to the average. These differences suggest that our students make poor progress during the school years. But how much worse is their progress relative to that of students in other countries? My report for the Thomas B. Fordham Foundation took up this question and compiled all recent achievement comparisons.

The report compared advanced countries that are members of the Organization for Economic Cooperation and Development in North America, the Pacific Rim, and Western Europe. Among schools in OECD countries, those in the United States made the smallest achievement gains. The longer U.S. students were in school, the further they fell behind students in the other countries. Yet per-student expenditures on U.S. schools are among the very highest. More specifically:

1. In reading, science, and mathematics through eighth grade, U.S. schools ranked last in four of five comparisons of achievement progress. In the fifth case, they ranked second to last.

2. Between eighth grade and the final year of secondary school, U.S. schools slipped further behind those in other countries.

3. Because they made the least progress, U.S. secondary schools ranked last in mathematics attainment and second to last in science — far from the goal to be first in the world by the year 2000, set by the fifty governors and endorsed by Congress and the 1996 presidential candidates.

4. U.S. per-student spending (adjusted for purchasing power) on primary and secondary schools was third-highest among more than twenty advanced countries.

5. Unlike in the past, more secondary school students remain in school on average in comparable countries than in the United States. Thus, their superior gains do not depend merely on student selectivity or higher dropout rates.

6. Because they made the poorest progress and ranked in the highest category of expenditures, U.S. schools, by internationally agreed-upon standards, are the least productive among those in comparable economically advanced countries.

Value-Added Comparisons

These conclusions are based on the most recent, largest, and most rigorous international achievement surveys. Unlike other reports, the conclusions concern the value added largely by schools as indexed by progress made by students during the school years.

Value-added scores are particularly important in evaluating schools. Consider the case of reading. Until children start school at about age six, families, media, and other agencies — rather than schools — are the chief sources of influence on vocabulary and comprehension. For this reason, children start school with widely varying degrees of preparation. Some parents but not others, for example, teach their children to read before first grade. The big education question is: How much progress do students make after they start school?

Static comparisons of schools (employed in the past) are less useful for this purpose because students' test scores are partly determined by their experiences before they begin school, attributable to parental efforts, socioeconomic status, and related factors. Thus, gains in achievement during the school years are better indexes of schools' contributions to learning than scores at a single point in time.
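The contrast between a static comparison and a gain comparison can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two schools, their scores, and the use of a simple difference between exit and entry scores as the gain. The international surveys discussed here rest on far more careful measurement.

```python
from dataclasses import dataclass


@dataclass
class SchoolScores:
    """Mean scores for one school's students (hypothetical data)."""
    name: str
    entry_score: float  # mean score when students start school
    exit_score: float   # mean score of the same students some years later


def value_added(school: SchoolScores) -> float:
    """Gain during the school years: exit score minus entry score."""
    return school.exit_score - school.entry_score


# Invented numbers: School A enrolls better-prepared students,
# School B enrolls less-prepared students but adds more during schooling.
schools = [
    SchoolScores("School A", entry_score=60.0, exit_score=70.0),
    SchoolScores("School B", entry_score=40.0, exit_score=58.0),
]

# A static comparison looks only at the score at a single point in time.
best_static = max(schools, key=lambda s: s.exit_score)

# A value-added comparison looks at progress made while in school.
best_gain = max(schools, key=value_added)

print("Highest exit score:", best_static.name)
print("Largest gain:", best_gain.name)
```

In this invented case the static comparison favors School A, while the gain comparison favors School B, whose students started further behind but progressed more.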

Gains, progress, and value added — terms used synonymously here — are particularly important for policy. They allow predictions of eventual attainments. Policies that do not add satisfactory value may be revised. Units of the system, such as primary and secondary schools, may be separately evaluated by measuring students' progress while under their responsibility. In addition, many economists, psychologists, and others believe incentives influence performance. For this reason, principals should give merit raises for recent progress rather than for degrees and years of experience, for which most teachers are paid. If carrots and sticks were employed in education, value-added progress would be one reasonable indicator of teaching merit.

Educational policy makers increasingly recognize the usefulness of value-added indicators. Internationally, the OECD pioneered the use of value-added indicators in the 1995 edition of Education at a Glance and has employed them in subsequent reports. Similarly, Dallas, Texas, and Tennessee are employing value-added indicators and incentives to increase school productivity. Other cities and states, such as Chicago and Virginia, employ static indicators to assign schools to probation and, in cases of failure to progress, eventual extinction. Such systems can single out schools that serve poor children even when those schools are not ineffective. A fairer and more efficient evaluation system would employ value-added indicators as at least one consideration in evaluating schools.

Why Do U.S. Schools Do So Poorly?

Several problems appear to account for poor productivity of U.S. public schools. After reviewing these, we can consider how effective practices, better accountability, and enlarged choice together with objective examinations may help solve them.

Lack of State Standards

Unlike many other countries, the U.S. education system has neither an education ministry nor well-defined national goals, curriculum, or testing system. The U.S. system leaves states largely responsible for providing schools, but states leave varying amounts of discretion to local boards. What is taught in classrooms, in turn, is highly variable even within the same school and district. For these reasons, a teacher in any grade cannot depend on what the teacher in the previous grade has taught. The lack of coordination across grades and subjects is especially harmful to children who move, particularly if they also are poor.

Lack of standards means that state and local boards can hardly assess progress made by districts, schools, and teachers. To the extent that curriculum and goals vary, it is difficult to compare schools, which makes accountability for results nearly impossible.

Centralized Finance and Control

Despite the lack of uniform standards and accountability, the governance and funding of public schools have become more centralized in the last half-century, leading to other kinds of inefficiency. States have increasingly assumed responsibility for educational finance, goals, and operations. They paid ever-larger shares of school costs, but the higher the state's share, the worse the state's achievement, despite vast increases in inflation-adjusted per-student spending. Higher state shares make local school boards and administrators less accountable to local citizens since they need not justify expenditures as carefully. California's tie for last place in recent national reading assessments may be attributable to whole-language teaching and highly centralized state funding rather than the greater local control and accountability afforded by local funding.

Larger state shares also entail increased regulation, reporting, bureaucracy, and distraction from learning. Much energy goes into the question of who governs — the federal government, the state, the local district, the school, or the teacher. It is nearly impossible to affix responsibility for results.

Schools and school districts, moreover, have increasingly consolidated into larger units that achieve less. Over the course of a recent fifty-year period, average school enrollments in the United States multiplied by a factor of five, even though large schools tend to be more bureaucratic, more impersonal, and less humane. Large middle and junior high schools tend to departmentalize and employ specialized teachers and ancillary staff who confine themselves to their specialties rather than imparting a broad view of knowledge. The teachers in large, departmentalized schools tend to know their students much less well than teachers who have the same students for most subjects for nearly the whole day.

About a half-century ago, there were 115,000 U.S. school districts; now there are about 15,000, the largest of which tend to be least effective. The reasons for their inefficiency are best seen in New York and other large cities that have up to 900 schools. In such huge districts, school board members can hardly name the schools, let alone hold them accountable.

On the other hand, small adjacent public school districts and private schools within districts give rise to incentives that cause all schools to compete and raise their productivity, that is, raise achievement and student retention while lowering costs. Choice plans that allow students to cross school and district boundaries may also prove to increase competition and productivity. Choice among schools, nonetheless, is severely constrained, which helps account for poor U.S. productivity.

Lack of Board Accountability

School boards frequently split into factions, and few members have extensive board, business, or education experience. Often serving limited terms, they seem more interested in personnel and ideological issues than in whether the schools are achieving results. Assessing learning progress, moreover, requires some mastery of educational productivity research, psychometrics, and statistics, just as assessing businesses' progress requires accounting and other skills. Few board members or educational administrators have mastered such skills. Instead, they take up such fads as Ebonics, whole language, authentic tests, and bilingual education, the success of which remains undemonstrated in randomized experiments or statistically controlled research.

Unaccountable Management

Public schools are government-subsidized quasi-monopolies. They are unchallenged by entrepreneurial leadership and the incentives, efficiency, and consumer appeal provided by market competition. With legislators and school boards often under their thumbs, teachers' unions and administrators can exploit forced-choice customers in service of their interests in minimizing workload and maximizing pay and perquisites.

In particular, teachers' unions — few call them professional associations — have actually done well for their members. In college, education majors typically have scored worst or near worst on ability tests among undergraduate majors. Yet as teachers, they have a 180-day school year — the shortest among teachers in industrialized countries (and much less than the 220 or so days most salaried U.S. professionals normally work). In large cities and elsewhere, according to contract, many teachers are in school only about six hours daily. Some grade papers in the evening, but many professionals take work home. In addition, teachers have little accountability, nearly inviolable tenure, and early and generous pensions that increasingly threaten city and state budgets.

Teachers' unions have done better for themselves than for their members. During the last half-century when membership in private-sector unions declined, teachers' unions increased their membership. They contracted for expensive smaller classes, which do little for learning. With fixed budgets, smaller classes actually mean lower teacher salaries because costs must be spread among more teachers. Thus, smaller classes, which increase the number of teachers, indirectly result in an increase in union membership, central coffers, and legislative influence.
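The fixed-budget arithmetic behind this point can be worked through in a short sketch. The budget, enrollment, and class sizes are invented for illustration, and the sketch assumes the whole budget goes to teacher salaries, with one teacher per class.

```python
def average_salary(budget: float, students: int, class_size: int) -> float:
    """Average teacher salary when a fixed budget pays one teacher per class.

    Assumes, for illustration only, that the entire budget is the salary pool.
    """
    teachers = students / class_size
    return budget / teachers


# Hypothetical district: a fixed salary budget and a fixed enrollment.
budget = 3_000_000.0
students = 1_000

salary_25 = average_salary(budget, students, class_size=25)  # 40 teachers
salary_20 = average_salary(budget, students, class_size=20)  # 50 teachers

print(f"Class size 25: average salary {salary_25:,.0f}")
print(f"Class size 20: average salary {salary_20:,.0f}")
```

Under these assumed figures, cutting class size from 25 to 20 raises the number of teachers from 40 to 50 and lowers the average salary from 75,000 to 60,000.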

Teachers' unions are understandably acting in their own interests of maximizing their benefits while reducing their efforts. It is school boards and state legislators that have been remiss in failing to provide effective management, informed stewardship, and accountability to citizens who pay the bills. School boards have hardly been a match for nationally organized unions that can bring to negotiations strong, narrow self-interests, statistical research, and specialized expertise.

Harvard and University of Chicago economists Caroline Hoxby and Samuel Peltzman showed that teachers' union success was associated with worse results for students. Their analyses showed that the sharp rise in teachers' union membership and militancy for the period 1971–1991 not only increased per-student costs dramatically but also increased dropout rates and adversely affected examination scores in the forty-eight states surveyed. As teachers' unions grew in membership, income, and power, they gained greater influence over state legislatures, which, in turn, increasingly usurped local control and left the schools increasingly ineffective and unaccountable to local taxpayers.

Lack of Incentives

American schools provide little incentive for educators and students to attain higher standards. A 1996 Public Agenda national survey of high school students showed that three-fourths believe stiffer examinations and graduation requirements would make students pay more attention to their studies. Three-fourths also said students who have not mastered English should not graduate, and a similar percentage said schools should promote only students who master the material. Almost two-thirds reported they could do much better in school if they tried. Nearly 80 percent said students would learn more if schools made sure they were on time and did their homework. More than 70 percent said schools should require after-school classes for those earning Ds and Fs. In these respects, many teacher educators differ sharply from students and the public. A 1997 Public Agenda survey of education professors showed that 64 percent think schools should avoid competition. More favored giving grades for team efforts than for individual accomplishments.

Teacher educators also differ from employers and other professions on preferred ways of measuring standards or even employing such measures at all. Many employers use standardized multiple-choice examinations with job candidates. So do selective colleges and graduate and professional schools with candidates for admission. Such examinations are required in law, medicine, and other fields for licensing because they are objective and reliable. In the case of teachers, academic mastery (as indicated by objective examination results and completion of rigorous courses) influences their students' achievement. Yet, 78 percent of teacher educators wanted less reliance on objective examinations.

Because of such views, schools — the very institutions that should academically prepare youth for doing well in adult life — make little use of high-stakes examinations and effective incentives for accomplishments. School boards and administrators, for example, rarely measure and reward teachers' individual performance. Unions prevail in contracts that require paying public school teachers according to their degrees and years of experience, neither of which affects how much their students learn. After decades of declining union membership in other sectors, schools remain one of the few institutions that provide no merit incentives for their workforce.

The Social Promotion Disincentive

Examinations can allow educators to employ sticks as well as carrots. Consider the case of social promotion. Perhaps because the U.S. school system lacks accountability and incentives, students are usually promoted from one grade to the next whether or not they have mastered the subject matter. Promoting failed students, however, does harm in several ways. It wrongly informs them that they have learned what they need to know. It robs them of motivation. Why study if you know you will be promoted and graduate?


Table of Contents


Introduction and Overview Williamson M. Evers and Herbert J. Walberg
Part One: Setting the Stage
1. Examinations for Educational Productivity Herbert J. Walberg
2. Why Testing Experts Hate Testing Richard P. Phelps
Part Two: Constructive Uses of Tests
3. Early Reading Assessment Barbara R. Foorman, Jack M. Fletcher, and David J. Francis
4. Science and Mathematics Testing: What's Right and Wrong with the NAEP and the TIMSS? Stan Metzenberg
5. Telling Lessons from the TIMSS Videotape Alan R. Siegel
Part Three: Constructive Tests for Accountability
6. Portfolio Assessment and Education Reform Brian Stecher
7. Using Performance Assessment for Accountability Purposes William A. Mehrens
Part Four: State Testing Policies
8. Learning from Kentucky's Failed Accountability System George K. Cunningham
9. Accountability Works in Texas Darvin M. Winick and Sandy Kress
Appendix: Conference Agenda

