The students’ improved grasp of mathematical concepts stunned Walter
Stroup, the University of Texas at Austin professor behind the program.
But at the end of the year, students’ scores on the state’s standardized
TAKS tests had increased only marginally, in contrast to what Mr. Stroup
had seen in the classroom.
A similar dynamic showed up in a comparison of the students’ scores on
midyear benchmark tests and what they received on their end-of-year
exams. Standardized test scores the previous year were better predictors
of their scores the next year than the benchmark test they had taken a
few months earlier.
Now, in studies that threaten to shake the foundation of high-stakes
test-based accountability, Mr. Stroup and two other researchers said
they believe they have found the reason: a glitch embedded in the DNA of
the state exams that, as a result of a statistical method used to
assemble them, suggests they are virtually useless at measuring the
effects of classroom instruction.
Pearson, which has a five-year, $468 million contract to create the
state’s tests through 2015, uses “item response theory” to devise
standardized exams, as other testing companies do. Using I.R.T.,
developers select questions based on a model that correlates students’
ability with the probability that they will get a question right.
That produces a test that Mr. Stroup said is more sensitive to how
students rank against one another than to what they have learned. That design
flaw also explains why Richardson students’ scores on the previous
year’s TAKS test were a better predictor of performance on the next
year’s TAKS test than the benchmark exams were, he said. The benchmark
exams were developed by the district, the TAKS by the testing company.
Mr. Stroup, who is preparing to submit the findings to multiple research
journals, presented them in June at a meeting of the Texas House Public
Education Committee. He said he was aware of their implications for a
widely used and accepted method of developing tests, and for how the
state evaluates public schools.
“I’ve thought about being wrong,” Mr. Stroup said. “I’d love if everyone
could say, ‘You are wrong, everything’s fine.’ But these are hundreds
and hundreds of numbers that we’ve run now.”
Gloria Zyskowski, the deputy associate commissioner who handles
assessments at the Texas Education Agency, said in a statement that the
agency needed more time to review the findings. But she said that Mr.
Stroup’s comments in June reflected “fundamental misunderstandings”
about test development and that there was no evidence of a flaw in the
test.
After a lengthy back and forth at the meeting, the committee’s chairman,
Rob Eissler, suggested a “battle of the bands”: a hearing where the
test vendors and researchers would trade questions. Mr. Eissler, Republican
of The Woodlands, said recently that he found Mr. Stroup’s research
“very interesting” and that he was weighing another hearing.
Mr. Stroup’s research comes as opposition to high-stakes standardized
testing in Texas is creating an alliance between parents, educators and
school leaders who wonder how the tests affect classroom instruction and
small-government conservatives who question the expense and bureaucracy
they impose.
This is not the first time the use of standardized test scores in Texas has
been questioned. In 2009, the state implemented the Texas Projection
Measure, a formula that critics said allowed schools to count as passing
students who had not passed. After outcry from lawmakers, the state dropped
the measure in 2011.
State Representative Scott Hochberg, Democrat of Houston, led the charge
against the measure and has since proposed legislation aimed at
reforming the role of standardized testing, citing data showing that a
student’s test score in one year strongly predicted the score in the
next.
“I have for a long time said that the accountability system doesn’t give
us all the information that the numbers are used to generate,” Mr.
Hochberg said, adding that basing accountability “more on the kid’s
history than the specifics of what happened in the classroom that year
may make us feel good but it doesn’t give us any true information.”