April 12, 2012

Caution for data misuse by Jing Liu

Creative Commons image by Flickr user billaday

The No Child Left Behind Act (NCLB) has triggered increased attention to the uses of performance data related to student test scores, graduation rates, and other indicators of school and teacher quality. As misuse of data emerges, we as future researchers should be skeptical of policy reports and related studies until we have analyzed the data ourselves. We should also avoid making the same errors in our own research.

Because a single indicator of student performance can carry important consequences, the pressure to produce positive data may lead to counterproductive and destructive results. This principle is known as Campbell’s Law, after the social psychologist, evaluator, methodologist, and philosopher of science Donald Campbell. He phrased it as “the more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it was intended to monitor” (Campbell, 1979)[1]. One obvious example of Campbell’s Law in public education is that states and schools manipulate data to gain awards and to avoid punishments. For instance, schools have reassigned low-performing students to new schools just before a standardized test so that they would not weigh down the scores (Snell, 2005). At the same time, organizations advocating school choice also manipulate data to support their claims about private schooling (Ladner & Burke, 2010).

To avoid misusing data, one should be aware of Simpson’s Paradox. Simply put, a trend that appears in each of several groups can reverse when the groups are combined. Here I use data on Pennsylvania public high school graduates from the 2008-09 and 2009-10 school years to illustrate Simpson’s Paradox. In total, there were 130,647 and 131,343 public school graduates in the 2008-09 and 2009-10 school years, respectively. Table 1 shows male and female graduates by race in 2008-09 and 2009-10 and the change between the two years.

Table 1. Public school graduates by gender and race in the 2008-09 and 2009-10 school years[2]

It is clear that the total number of Pennsylvania public school graduates increased from 2008-09 to 2009-10. However, when we look at the graduates by gender and race, the data can be interpreted differently.

Figure 1. Change in total number of Pennsylvania public school graduates by gender and race between AY2009 and AY2010

Figure 1 displays the change in the number of Pennsylvania public school graduates by race and gender between the 2008-09 and 2009-10 school years. The picture is not as simple as the increase in the total suggests. There is considerable variation among racial groups and between genders. Every racial group except Whites saw growth in graduates between the two school years, though at different levels. Among the groups that grew, Black or African American graduates increased more than any other racial group for both males and females, with males increasing more than females.

Therefore, further analysis is needed to explain the changes in the number of graduates. For example, does the increase in Black graduates stem from improved achievement or from a larger population? Why did the number of White graduates decrease while every other racial group increased? Even this simple graph shows that the Pennsylvania Department of Education should consider additional data and factors before explaining the trend or drawing any conclusions.
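The reversal that defines Simpson’s Paradox can be made concrete with a small sketch. The counts below are hypothetical and are not taken from the Pennsylvania data; the group labels, the `counts` dictionary, and the helper functions are assumptions made only for illustration. In this made-up example the proficiency rate rises in each group from one year to the next, yet the combined rate falls because the mix of students changes.

```python
# Hypothetical counts, NOT from the Pennsylvania data:
# year -> group -> (students tested, students proficient)
counts = {
    "year1": {"A": (800, 640), "B": (200, 80)},
    "year2": {"A": (400, 340), "B": (600, 300)},
}


def group_rates(year):
    """Proficiency rate within each group for the given year."""
    return {
        group: proficient / tested
        for group, (tested, proficient) in counts[year].items()
    }


def overall_rate(year):
    """Proficiency rate for all groups combined in the given year."""
    tested = sum(t for t, _ in counts[year].values())
    proficient = sum(p for _, p in counts[year].values())
    return proficient / tested


for year in ("year1", "year2"):
    rates = ", ".join(f"{g}: {r:.0%}" for g, r in group_rates(year).items())
    print(f"{year}  by group: {rates}  combined: {overall_rate(year):.0%}")
```

Both groups improve, but group B, which has the lower rate, makes up a much larger share of test-takers in the second year, so the combined rate drops. A report that cites only the combined figure, or only the group figures, would tell half the story.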

There are many other examples of misleading comparisons of education data that do not help answer the questions posed. The first question to ask when making comparisons is: are the groups comparable? If a test is taken by a sample or by only part of the student population, as with NAEP (National Assessment of Educational Progress) and the SAT, comparisons among states lack validity if they neglect the different demographic backgrounds of test-takers. The same principle applies to comparisons between state and national assessments. False comparisons also arise when achievement gaps are discussed. States are working hard to narrow the achievement gap among students of different races, as required by NCLB. Even so, reports indicating significant reductions in the achievement gap cannot be trusted without careful examination. For example, if student proficiency is measured by percentage change instead of scale score, groups cannot be compared with each other without knowing the total population of each group. Similarly, we cannot compare the percentage change in students' yearly progress if the cut scores change from year to year. Further, a graduation rate is not a credible indicator of educational quality when school districts calculate it from Grade 12 enrollment alone, without taking Grade 9 enrollment into account.
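The graduation-rate point comes down to the choice of denominator, which a short sketch can show. All numbers here are hypothetical, and the `graduation_rate` helper is an illustrative name, not an official formula; the sketch also ignores transfers in and out of the cohort, which a real cohort rate would need to adjust for.

```python
def graduation_rate(graduates, base_enrollment):
    """Share of the chosen base population that graduated."""
    return graduates / base_enrollment


# Hypothetical figures for one class of students in one district.
grade9_enrollment = 1000   # students who started Grade 9 together
grade12_enrollment = 700   # of those, students still enrolled in Grade 12
graduates = 630            # students who received a diploma

# Rate based only on students who reached Grade 12.
grade12_based = graduation_rate(graduates, grade12_enrollment)

# Rate based on the cohort that entered Grade 9.
cohort_based = graduation_rate(graduates, grade9_enrollment)

print(f"Grade 12 base: {grade12_based:.0%}")
print(f"Grade 9 cohort base: {cohort_based:.0%}")
```

The same 630 graduates give a much higher rate when the students who left between Grade 9 and Grade 12 are dropped from the denominator, which is why the base population should always be stated alongside the rate.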

There are many more misuses of data in public education than those discussed above. We should keep in mind that it is essential to understand the data before reaching conclusions.

References

Campbell, D. (1979). Assessing the impact of planned social change. Evaluation and Program Planning, 2(1), 67-90. doi: 10.1016/0149-7189(79)90048-X.

Snell, L. (2005). How schools cheat. Reason Online. Retrieved from http://reason.com/archives/2005/06/01/how-schools-cheat/singlepage.

Ladner, M. & Burke, L. (2010). Closing the racial achievement gap: learning from Florida’s reforms. Retrieved from http://pablo.www.nhclc.orgwww.nhclc.org/files/nhclc/Heritage%20Foundation%20Research%20Paper%20English.pdf.

Footnotes

[1] The concept was originally proposed by Campbell in a 1975 paper, which was reprinted with minor revisions and additions in Evaluation and Program Planning in 1979.

[2] http://www.portal.state.pa.us/portal/server.pt/community/graduates/7426