Standardized tests’ Holy Grail
The much-maligned multiple-choice test, the crux of California’s and other states’ accountability exams, will be replaced partly, if not entirely, by more complex, lengthier and probably more costly state tests. As part of the Race to the Top program, Education Secretary Arne Duncan has set aside $350 million to pay for the development of new standardized tests, plus high school measures of career and college readiness, over the next four to five years.
Duncan and President Obama, who has derided “fill-in-a-bubble” standardized tests, expect that the new “performance assessments,” along with the common-core standards to which they’ll be aligned, will guide teachers’ instruction and improve student results. Skeptics – primarily defenders of current versions such as California’s STAR tests – doubt whether the next-generation tests can deliver what Duncan is demanding.
Stanford Education Professor Linda Darling-Hammond, author of the new book “The Flat World and Education,” has no doubt that they can. She is leading a consortium of groups, backed by a majority of states, that is in the running for one of two $160 million grants that Duncan will award this year. At a public briefing in Washington on Tuesday, Darling-Hammond and others affiliated with SCOPE – Stanford Center for Opportunity Policy in Education – will present seven research papers that will support their effort.
In an overview paper she co-authored, “Beyond Basic Skills: The Role of Performance Assessment in Achieving 21st Century Standards of Learning,” Darling-Hammond wrote that standardized tests developed to comply with No Child Left Behind measure “mostly lower-level skills, such as the recall or recognition of information.” Particularly in lower-achieving schools, these tests have led to a narrowing of the curriculum, while, at higher-achieving schools, they have “placed a glass ceiling over more advanced students, who are unable to demonstrate the depth and breadth of their abilities on such exams.” And, because tests tend to influence what teachers teach, multiple-choice tests have discouraged teachers “from having students conduct experiments, make oral presentations, write extensively, and do other sorts of intellectually challenging activities.”
That’s quite an indictment. Defenders of California’s STAR program and other states’ tests argue that multiple-choice questions have gotten a bad rap. If written well, they can measure a deep understanding of a subject. But there’s also no denying that this time of year, in the weeks preceding STAR tests, many schools are slaves to weeks of rote test preparation. Particularly for low-achieving students, April is the cruelest month.
The alternative is performance assessments, which require students to construct their own responses to questions. These can take the form of supplying short phrases or sentences in answer to questions, writing essays or conducting complex and time-consuming activities, such as a lab experiment. “By tapping into students’ advanced thinking skills and abilities to explain their thinking, performance assessments yield a more complete picture of students’ strengths and weaknesses,” Darling-Hammond wrote.
Duncan must agree. The regulations for competing for the assessment grants lean toward performance measures, both for formative assessments that will measure students’ progress during the year and for end-of-year statewide accountability tests.
But performance assessments face big challenges if they’re going to be used for high-stakes tests whose results will determine which schools are failing and, in many states, how teachers will be evaluated and paid. Not only are Obama and Duncan not backing away from accountability under No Child Left Behind, but they want a longitudinal growth model that can measure individual students’ improvement over the course of a year and from year to year. They also want tests that offer valid state comparisons.
Performance assessments face obstacles of cost, reliability and testing time. Short constructed-response questions, requiring students to fill in a phrase or, with a math problem, to show their work, can probably be handled without too much extra time or money through the use of computer-assisted technology. But more in-depth items, from essays to experiments, will take much longer to administer and score, and will be daunting to score uniformly across districts, not to mention states.
Darling-Hammond’s paper gives numerous examples of performance-type questions and assessments used in states and abroad. But even in her prototype high-quality assessment of the future, multiple-choice items make up half of the questions in math and English-language arts – a nod to the time and cost challenges. Multiple choice is not likely to go away entirely; it will just share the stage.
One skeptic of performance assessments for high-stakes accountability purposes is Doug McRae, a retired publisher for the testing division of McGraw-Hill. (The Educated Guess readers may recall that McRae first drew attention to methodology problems with the state’s selection of persistently low-performing schools.) You can read his three-page critique of Darling-Hammond’s overview paper here. McRae concludes that it’s an open question whether performance assessment methodology realistically can be used in high-stakes testing. Relying on it may be a “Trojan horse” that undermines the Obama administration’s accountability goals.
Whichever consortium wins the grant will write the assessments to common-core standards. If California or any other state rejects the adoption of common core, it would also, in effect, be rejecting a switch to new federal assessments.