Sunday, August 04, 2013

October Surprise? The “Texas Miracle” & Its Critics: Part III: Rand Researchers’ Cheatin’ Hearts

By Nicholas Stix
November 6, 2000
Toogood Reports

[Previously, in this series:

Part I: “The Texas Testing Controversy”; and

Part II: “‘Dr. Spin’ Flips the Rand Report.”]


On October 24th, the Rand Corporation released its now notorious report questioning the gains that Texas public school children had made, as measured by the Texas Assessment of Academic Skills (TAAS). The report suggested that the “Texas Miracle,” as Gov. George W. Bush had dubbed it, was closer to a hoax.

So far, I have examined the spin-doctoring campaign that the Rand report’s lead researcher, Dr. Stephen Klein, whom I dubbed “Dr. Spin,” engaged in with the help of mainstream news organizations.

While littering the 18-page “report” with suggestions and innuendoes, and playing word games, all to insinuate the existence of a regime of test fraud in Texas, Klein failed to offer one shred of evidence that Texas educators or officials were engaging in test fraud. In my last column, I showed what a real report on test fraud looks like — the 66-page, December, 1999 Stancik Report on massive test fraud in New York City’s public schools.

The Klein Report claims to have two foundations: a study of students’ performance on math tests that Klein and his research team had administered in 20 Texas schools in 1996, and a comparison of Texas and national scores on the National Assessment of Educational Progress (NAEP) during the same period.

When Stephen Klein spoke to Washington Post reporter John Mintz in April — when his research team’s study, according to Rand’s official story, had just begun — he alluded to his team’s study of Texas students’ performance on math tests as the basis for his suspicions: “We knew something strange was going on.”

But in the report itself, Klein denies the validity of his team’s math test results:

“Because our set of schools was small and not representative of the state, we decided to explore statewide patterns of achievement on TAAS and on NAEP.”

In simple English, the first study was a botch: “small and not representative.” So, how does a botched, 1996 study suddenly become proof that “something strange was going on?” When a Washington Post reporter is on the other end of the telephone line.

And how does a botched study become the basis for a second study trying to prove the same thing? When the White House is in danger of being lost.

In the case of the National Assessment of Educational Progress (NAEP), Klein isn’t on quite such shaky ground. Texas eighth-graders did not blow away students across the country on the NAEP to the degree that the TAAS figures say they did, nor did Texas fourth-graders taking the NAEP reading test.

Those test groups of Texas students made some progress on the NAEP, but so did students everywhere else. And the racial gap that supposedly had been almost erased on the TAAS, persisted among Texas students on the NAEP.

However, on the NAEP, Texas fourth-graders of all races DID blow away their peers nationally on the math section, posting gains that were double those of children outside of Texas. And it was the math scores that had aroused Klein’s suspicions. And yet, he turns this praiseworthy fact into an indictment: “The main finding is that over a four-year period, the average test score gains on the NAEP in Texas exceeded those of the nation in only one of the three comparisons, namely fourth grade math.”

The NAEP is so important because it is the closest thing we have to a national exam; it is given voluntarily in 44 states, and as the Klein Report notes, is considered the “gold standard” of testing. (The NAEP’s nickname is “the nation’s report card.”) Meanwhile, the various state examinations are all different, and thus not comparable with one another, or with the NAEP.

As influential, neo-conservative education scholar Chester E. Finn Jr. pointed out in an October 27 New York Times op-ed essay in defense of the TAAS,

“Mr. Klein neglected to provide some vital context. It is normal for state tests to show better results than national ones. There are straightforward reasons for this: State tests are more narrowly designed; they test more basic skills; they intentionally align themselves to the state standards and curriculums (which national tests do not); and they provide more incentives, like grade promotion, for students to do well.”

Klein chooses to emphasize that gains were less for eighth-graders, but neglects to mention that in an intensive academic environment, it is normal for gains to be larger, the younger the students. The older students are, the harder it is for them to catch up.

Klein’s method is one of suggestion, innuendo, and word games. He claims to smell smoke, and insists that the odor is proof that there is a fire, but in fact, he is the source of the smoke.

Klein writes,

“For example, the media have reported concerns about excessive teaching to the test, and there is some empirical support for these criticisms.”

As I showed in my last column, Klein himself was the source of the media reports.

John Mintz of the Washington Post reported such “concerns” on April 21: “He believes that without meaning to, Texas officials design TAAS tests so they’re vulnerable to Texas teachers’ coaching. He also thinks that kids who ‘prepped’ for TAAS not only didn’t get a deep understanding of the subject, but also weren’t helped to pass non-TAAS tests.”

Klein’s deeply problematic charge that the “TAAS tests ... are vulnerable to Texas teachers’ coaching” implies that a legitimate test is impervious to coaching, and thus, that “coaching” is a form of cheating. Both implications are false, and thus misleading. Indeed, the Klein Report claims that the scores by the coached students are “inflated.”

Klein Report:

“However, some educators and analysts have raised questions about the validity of these gains and the possible negative consequences of high-stakes accountability systems, particularly for low-income and minority students.”

“Raised questions”? “Possible negative consequences”? And why “particularly for low-income and minority students”?

The point of a narrow, scientific study is not to RAISE questions, but to ANSWER them, and to determine not whether a policy has POSSIBLE, but whether it has REAL, negative consequences.

The heightened concern for “low-income and minority students” — a redundant phrase, since in the antiversity and in edworld, “low-income” is a euphemism for “minority” — is a cliche, like the phrase, “for the children.” (Actually, when it comes to education writing, “conservatives” and “liberals” alike profess exaggerated concern for black and Hispanic students. Apparently, nobody gives a hoot about white or Asian students.)

And the hits just keep on coming:

“There are also concerns that score trends may be biased by a variety of formal and informal policies and practices.”

“Concerns”? “May be”? Again, Klein provides no evidence of “bias,” which in the context of testing is a euphemism for “fraudulent,” “illegitimate,” “invalid.”

“Another concern is inappropriate test preparation practices, including outright cheating. There have been documented cases of cheating across the nation, including in Texas. If widespread, these behaviors could substantially distort inferences from test score gains.”

Klein is trying, through verbal sleight of hand, to go from “documented cases of cheating” outside of Texas, to the implication that, in the absence of any such documentation, widespread cheating in Texas has distorted TAAS test scores. “If ... could” is in the realm of possibility, not reality. IF I were seven feet tall, I COULD play in the NBA.

Although Klein mentions “documented cases” in Texas, he gives no specifics. That is because the cases were so limited that they fail to bolster his case.

In New York, massive test fraud was uncovered, despite a lack of any pressure by anti-testing zealots to uncover the scandal.

Conversely, in spite of massive scrutiny by anti-testing zealots and the mainstream media, no major testing fraud scandal was ever uncovered in Texas. [N.S. 2007: I should have used “hostility” instead of “scrutiny,” since had the zealots and the media not been so lazy, they would have uncovered the real scandal.]

Stephen Klein then builds on his previous imaginary proof, upping the ante in the next paragraph to “the pressure to raise scores may be felt most intensely in the lowest-scoring schools, which typically have large populations of low-income and minority students.”

In other words, ‘I didn’t prove my earlier point about widespread cheating, and now I’m going to build on that, by insinuating that the unproven or non-existent cheating’s greatest prevalence is in black and Hispanic schools.’

Word games:

“Evidence regarding the validity of score gains on the TAAS can be obtained by investigating the degree to which these gains are also present on other measures of these same general skills.”

Wrong again! Substitute for “evidence,” “suspicion.”

Seeing a discrepancy between the same group’s scores on DIFFERENT exams is a reason for investigating IN SEARCH OF evidence of test fraud. Large discrepancies by the same group on the SAME test would arouse very strong suspicions, but even then, one would still have to prove one’s case. The fact that the tests (TAAS and NAEP) are QUITE DIFFERENT makes the case for test fraud harder, not easier, to prove.

To my knowledge, in the early 1970s, multiculturalists began abusing the pseudo-scientific theory of “disparate impact,” which is central to affirmative action (aka “diversity”), to gut scientific standards of evidence in the social sciences and the already intellectually dubious field of education.

In the politically correct antiversity and in edworld, the mere existence of a statistical disparity between an officially defined “oppressor group” (white, heterosexual, able-bodied males) and an officially defined “oppressed group” (anyone else) has become accepted as “proof” of discrimination. Requesting evidence that the “disparate impact” in question is in fact due to white racism, as opposed to other factors (black racism, parental neglect, incompetent teachers, etc.), has long resulted in questioners being derided as “racists,” and fired from their jobs. That is why one “goes along” in today’s academic world.

More word games: In his report, Klein always uses “coaching,” “prepping,” and “intensive preparation” in a pejorative fashion, suggesting they are forms of cheating. Part of this web of suggestion is his reference to the use of older TAAS tests, and drilling for the TAAS using questions similar to those on the exam. And then there is the criticism, pervasive today in edworld (especially in the Anti-Testing Brigade) and the mainstream media, of “teaching to the test.”

Call it “coaching,” “prepping,” or “teaching to the test” — not only is this practice not a form of cheating which inflates test scores, but it is every teacher’s duty, a duty I felt acutely during seven years as a college instructor teaching remedial and allegedly college-level courses.

And it is standard practice to use questions similar to those that will be on the exam, culled from past exams. A case of fraud or “bias” would obtain only if a teacher used the questions from the actual test his students were about to take.

To read Stephen Klein, you’d never know that there exists a multi-billion-dollar learning and test preparation industry that middle-class and well-to-do students of all colors and ability levels routinely use, often in addition to private tutors, to increase their test scores.

Apparently, only poor “students of color” are to be denied such advantages. (Oops, there I go racially pandering! There’s just no escaping it!)

And the commonsensical “tricks” students learn through their TAAS preparation — such as reading the test question before reading the passage it tests — are things all students should know. Besides, they are nothing, compared to the sorts of tricks that journalists and education scholars routinely depend on.

Finally, the report does not cite a single sample question from the TAAS. Many journalists did give examples of these questions, and used them to offer more scientifically-grounded explanations of the TAAS scores than Stephen Klein did.

At the same time, journalists Klein had spoken to gave — much like Klein himself — analyses of the Klein Report that went far beyond its actual contents. And the anti-Bush newspaper articles almost always contained quotes from the same anti-testing zealots — e.g., Angela Valenzuela of the University of Texas and Linda McNeil of Rice University — whom Klein cited approvingly in his report, even though in it, he adamantly denied being from the anti-testing camp.

Note too that the anti-Bush articles that immediately followed the release of the Klein Report combined analyses of the report’s criticisms of TAAS (saying literally what the report only insinuated) with additional criticisms of TAAS by Valenzuela, McNeil, and other, unnamed sources, that bore no relation to the report, which often were irrelevant to it, and which in some cases flat-out contradicted Klein’s criticisms.

However, a reader who had not studied the Klein Report, and who had a limited knowledge of the world of testing, would be hard-pressed to cut through the thicket of propaganda.
