Expansion of test for special ed students distorts latest API results
The 2011 California Academic Performance Index (API) results were released yesterday, and revealed slow but steady gains compared to previous years’ results. However, changes in California’s Standardized Testing and Reporting (STAR) program have led to artificial increases in API gains over the past four years. Sorting out how much of the “good news” announced yesterday is based on true academic achievement gain made by California kids and how much may be due to changes in the STAR program is not a trivial issue.
The numbers. If we record statewide API results for the last four years by grade span, including the results announced yesterday, we find the following (see API chart):
Overall, the gains for 2011 appear to be marginally less than the reasonably steady gains recorded in 2008, 2009, and 2010. The multiyear gain scores from 2007 to 2011 are more noteworthy: 47 points for grades 2-6 and 58 points for grades 7-8. The question is: Are these multiyear gains true achievement gains?
The program changes. In 2008, California began introducing the California Modified Assessments (CMAs) to the STAR program for selected Special Education students. The CMAs have been phased in over four years – for grades 3-5 in 2008, grades 6-8 in 2009, and grades 9-11 over two years in 2010 and 2011. The CMAs are easier tests than the mainstream STAR California Standards Tests (CSTs), designed for Special Education students who score very low on the mainstream CSTs. Special Education students who scored Far Below Basic or Below Basic on a STAR CST in the previous year are eligible to take a CMA, with final determination for which test to take left to the student’s Individualized Education Program team. The CMAs were initiated in response to a federal flexibility program, and were to be targeted for no more than 2 percent of overall enrollment, or roughly 20 percent of Special Education enrollment.
As documented in a TOPed post, the phase-in of the CMAs over the past four years has affected the STAR statewide summary data reported by the State Superintendent. Because students who score Below Basic or Far Below Basic have been systematically removed from the STAR CST program, the percentages of students scoring Proficient or Advanced on CSTs have been artificially increased by an estimated 26 percent over four years. In addition to this accuracy problem for STAR summary data, the TOPed post documented that CMA usage has grown far more rapidly than anticipated and has far outstripped the planned 2 percent of overall enrollment, or 20 percent of Special Education enrollment.
For grades 3-8, where CMAs have now been used for several years, CMAs were administered to 5.2 percent of the total number of students tested in grades 3-8 in 2011, and more CMAs than CSTs were administered to Special Education students. Finally, when one looks at CMA usage by district, one finds a wide discrepancy among districts in the percentage of Special Education students administered the CMAs – from less than 20 percent for some districts to almost 70 percent for other good-sized districts. For a sampling of wide differences, see the chart for Santa Clara County.
The crux of the issue for the 2011 API results is that the introduction of easier tests for a significant portion of Special Education students generates inflated API multiyear scores. If the 2 percent limit originally envisioned had been strictly applied to API calculations, the artificial increases in API scores would likely not be a major factor. However, with CMA usage running 2 to 3 times higher than anticipated, the artificial increases in API scores are quite notable. The CMA also becomes a potential way for local districts to game the API system: testing more Special Education students with CMAs artificially increases a district's API scores.
The Big Picture. I discussed the larger policy implications of introducing CMAs to California’s STAR statewide assessment system in a TOPed commentary posted earlier this year. In that commentary, I noted that the issue is not the idea of a modified assessment – I agree that CMAs yield better data for selected Special Education students than the counterpart CSTs. The issues raised here are (1) How should CMA scores be treated for API calculations? and (2) Are CMAs being overused and/or abused by local districts to artificially raise API scores?
The API Inflation Effect. How much are the API scores inflated? It is relatively easy to estimate. A year ago I computed the API inflation effect for the 2010 API scores; it was in the 35 to 40 percent range for grades 3-8. Using the same methodology, I’ve computed the inflation effect for the 2011 API scores due to the introduction of the CMAs over the past four years. Since the CMA performance levels for the grade 9-11 tests have not yet been set, the inflation effect can be estimated only for grades 3-8 at this time. The results for the 2011 API inflation effect are (see CMA inflation chart):
The bottom line here is that the statewide API increases reported for the past four years for grades 2-6 have been inflated by 42.4 percent, and the statewide API increases reported for the past three years for grades 7-8 have been inflated by 34.4 percent. Stated another way, more than a quarter of the statewide gains claimed by the SPI/CDE over the past 3-4 years have been artificial gains due to changes in the STAR program (i.e., introduction of CMAs) rather than true achievement gains.
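The relationship between an "inflation" percentage and the share of reported gains that are artificial can be illustrated with simple arithmetic. The sketch below is a hypothetical illustration only: the 33-point "true" gain for grades 2-6 is a value back-calculated from the 47-point reported gain and the 42.4 percent figure above, not a number drawn from the underlying STAR data.

```python
# Hypothetical illustration of the inflation-effect arithmetic.
# The "true" gain of 33 points is an assumption back-calculated from
# the reported numbers, not the actual CMA-adjusted STAR figure.

def inflation_effect(reported_gain, true_gain):
    """Percent by which the reported multiyear API gain exceeds the
    estimated true (adjusted) gain."""
    artificial = reported_gain - true_gain
    return 100 * artificial / true_gain

def artificial_share(reported_gain, true_gain):
    """Fraction of the reported gain that is artificial."""
    return (reported_gain - true_gain) / reported_gain

# A 47-point reported gain (grades 2-6, 2007-2011) with a hypothetical
# true gain of 33 points reproduces the 42.4 percent inflation figure:
print(round(inflation_effect(47, 33), 1))   # -> 42.4
print(round(artificial_share(47, 33), 3))   # -> 0.298
```

Note the two ways of stating the same fact: a gain "inflated by 42.4 percent" relative to the true gain corresponds to roughly 30 percent of the *reported* gain being artificial, which is why the text can describe it as "more than a quarter" of the claimed gains.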
Counterarguments/Remedies. When I raised the issue of inflated API statewide gain results last year, CDE staff presented two counterarguments:
- Initially, CDE staff claimed 2010 API results were not inflated; instead, API scores in previous years were deflated due to the administration of inappropriate tests to selected Special Education students. In other words, the previous years’ CST scores were not valid while the current year’s CMA scores are valid. However, even if this argument is true (and I do not challenge it from an individual student point of view), the gains in API scores over multiple years will be inflated unless either the “deflated” API scores from previous years or the “inflated” API scores from this year are adjusted so that there are apples-to-apples comparisons to calculate gains.
- CDE staff claimed that appropriate adjustments are made in each year’s Base API to Growth API calculations to take into account the replacement of scores from the more rigorous CSTs by scores from the less rigorous CMAs. However, my analysis does not address one-year Base-to-Growth API results; it addresses multiyear API trend data, which those adjustments do not cover. Other technical adjustments made during the API calculation process likewise do not address the distortion that CMAs have introduced into multiyear API gain data.
There are relatively simple remedies for the problem that California is reporting inflated API multiyear gain data. Adjusting CMA performance level scores downward for API calculations is one solution that has precedent in the API system. However, remedies need to be vetted by California’s API advisory committee and then presented to the State Board of Education for approval before they can be executed. So far, neither CDE staff nor the SPI nor the SBE has shown the leadership needed to fully vet the issues involved with the introduction of CMAs to California’s statewide STAR assessment system and to consider appropriate remedies.
Doug McRae is a retired educational measurement specialist living in Monterey. In his 40 years in the K-12 testing business, he has served as an educational testing company executive in charge of design and development of K-12 tests widely used across the US, as well as an adviser on the initial design and development of California’s STAR assessment system. He has a Ph.D. in Quantitative Psychology from the University of North Carolina, Chapel Hill.