Thursday, April 12, 2012

The Cultural Bias of Intelligence Tests and its Effects on Special Education Placement


     Standardized testing is one of the most consequential aspects of a school-aged child’s life in the United States. Based on the results, students are assigned to remediation, special education services, or gifted programs. Unfortunately for culturally and linguistically diverse students, most standardized test questions used in the United States today are biased toward white, middle-class students: they rely on background information and references to American life that students from other cultures cannot easily understand. Poor scores on these assessments may cause a student to be incorrectly classified as a special education student when, in fact, the student is very bright and is only unable to correctly answer questions based on American culture rather than on his or her own, very different, background.
     Today, schools in the United States are more racially and culturally diverse than ever before. As of 2005, almost 50% of students in the U.S. could be classified as “culturally different” (Ford, 2010, p. 51). According to a 2010 report published by the Institute of Education Sciences, the Hispanic population is predicted to grow at a faster rate than most other groups in the United States in the coming years (Utley, Obiakor, & Bakken, 2011, p. 5). In addition, the growth rate of Caucasians is expected to be slower than that of all other races.
     In the years to come, classrooms in the United States will only continue to grow more diverse. Unfortunately, culturally diverse students have much less school success than the dominant population (Ford, 2010, p. 50). One major problem with the ever-changing cultural climate of classrooms in the United States is that culturally and linguistically diverse (CLD) students typically score lower than Caucasians on educational assessments. While the cultural makeup of the student population continues to change, teachers and assessment tools in the United States have not. Around 85% of educators are female, and the majority of teachers as a whole (83%) are white (Ford, 2010, p. 51). In addition, the tests used to assess children for special education and gifted education programs are created for white, middle-class students. The cultural atmosphere of the United States is changing dramatically, while intelligence tests and other high-stakes assessment practices continue to focus primarily on Caucasian students.
     Often, in order to answer certain questions correctly, an individual must have specific culturally based information or knowledge. Unfortunately, many bilingual students do not have this specific cultural information, which makes it hard for them to answer test questions (Baca & Cervantes, 1989, p. 165). Culturally and linguistically diverse students are at a great disadvantage when it comes to traditional assessments, because those assessments are extremely biased against students who are culturally and linguistically diverse or who come from different economic groups (p. 165). Students with limited English proficiency are also often misplaced when it comes to special education services (p. 165). Hispanic students are overrepresented in the learning disabled population. Educators seem to have trouble determining whether a student’s problems stem from an underlying disability or simply from his or her lack of English proficiency (p. 165). In addition, the characteristics of acculturation can be confused with signs of a disability (p. 165).
     There are four types of bias related to the differential performance between and among members of cultural groups (Ford & Whiting, n.d.). The first is bias in construct validity, which exists when a test is shown to measure different hypothetical constructs for members of one group than for members of another (Ford & Whiting, n.d.). The second is bias in content validity, which exists when an item or subscale is more difficult for members of one group than for members of another, even though the ability level of both groups is generally equal. Three examples of content bias are items that ask for information that minority persons have not had an equal opportunity to learn, items that are scored inappropriately because the maker of the test decided on only one correct answer, and items that penalize minorities for giving an answer that would have been correct in their own culture (Ford & Whiting, n.d.). The third is bias in item selection, which occurs when the items and tasks selected are based on the experiences and language of the dominant group. It resembles bias in content validity but is more concerned with the appropriateness of individual test items. The fourth is bias in predictive or criterion-related validity, which occurs when the answers to items and tasks require prior cultural knowledge of the dominant group (Ford & Whiting, n.d.).
     Intelligence tests contain examples of all four types of bias. Unfortunately, many people who are not familiar with the purpose and limitations of testing think of IQ tests as tests of innate ability (Ford, 2004, p. v). Therefore, when certain groups of people score lower than others, people unfamiliar with intelligence tests may consider those groups to be genetically inferior or to have lesser intelligence due to heredity (p. v). This school of thought ignores many factors that affect the results of intelligence tests, including the environment, the level and quality of education, and the opportunity to learn (p. v).
     Another school of thought holds that intelligence tests measure unlearned abilities; therefore, a person who scores low on an intelligence test is assumed to have inferior cognitive ability and potential (Ford, 2004, p. v). This belief is common among people who are not trained in testing and assessment, people who believe “intelligence is fixed, innate and unchangeable,” and people who believe that intelligence tests are “comprehensive, exact, and precise measures of intelligence” (p. v). Fagan (2000) hypothesized that the lack of intelligence tests that are fair across different cultural groups originates from a theoretical bias to associate the IQ score with intelligence rather than with knowledge (Fagan, 2008, p. vi).
     There is no consensus in education regarding why diverse students score lower on intelligence tests than white students (Ford, 2004, p. vii). The debate surrounding the performance of minority students has two major sides (Ford, 2004, p. vi). One group of scholars believes that the low intelligence test performance of minorities is due to the cultural deprivation and economic disadvantage that minorities experience (p. vi). They believe that culturally diverse students are inferior to the norm (p. vi). On the other side of the debate, scholars believe that minority students are culturally different but not disadvantaged. These scholars believe that culture affects test performance, but they do not see low scores as evidence of inferiority (p. vi).
     Examiners who assess culturally and linguistically diverse children are often uncertain about which tests provide the most reliable, valid, and unbiased results (Edwards, 2006, p. 246). Historical data show a significant discrepancy in the intelligence test performance of different cultural groups (p. 246). Edwards (2006) reveals that “on average, when adjusted for differences in socio-economic status, individuals of Asian descent scored higher than those of European descent, individuals of African descent scored lower than those of European descent, and individuals of Hispanic descent scored somewhere in between the latter two groups on tests of intelligence” (p. 246).
     Since the great debate over cultural bias in standardized tests in the 1970s, test developers have tried to decrease or eliminate cultural bias in assessments (Ford, 2004, p. vi). Some scholars argue that test bias no longer exists because of the changes test developers have made (p. vi). Others contend that tests can never be free of cultural bias because they are developed by people and reflect the test developers’ culture or cultures (p. vi). Absolute fairness is impossible to attain because tests are not perfectly reliable or valid in any particular context (p. vi).
     Currently, most intelligence test makers do not release statistics about the difference in performance between ethnic groups (Edwards, 2006, p. 246). By withholding them, the test developers are attempting to avoid controversy about the differences between ethnic groups and to appear socially sensitive (p. 246). The lack of data makes it hard for educators and other test users to make informed and fair decisions about which test would be the most effective for a given student (p. 246). The data that are available suggest that the results of biased intelligence tests lead to a disproportionate representation of culturally and linguistically diverse students in special education programs (p. 247). Limiting or avoiding this overrepresentation would require data showing which intelligence tests assess minority groups most fairly and reliably (p. 247).
     In 1998, Jensen reported that his meta-analysis showed an average IQ “range of 10-20 points between ethnic groups on different IQ tests” (Edwards, 2006, p. 247). Tests vary in the size of the discrepancy between ethnic groups: one test may have an average difference of 10 points, while another may have a difference closer to 20 (p. 247). If test producers provided data about these IQ discrepancies, test users could make informed decisions about which intelligence test will provide the most accurate results. Fairer and more reliable results would most likely lessen the disproportionate representation of minority groups in special education programs (p. 247).
     Wasserman and Becker (2000) reviewed studies on the WISC-III, Stanford-Binet IV, and the Woodcock-Johnson Tests of Cognitive Ability that used samples corresponding to key demographic variables. They found that the mean discrepancies, in favor of whites, between standard scores for matched samples of African American and Caucasian groups were as follows: WISC-III = 11.0; Stanford-Binet IV = 8.1; and Woodcock-Johnson Tests of Cognitive Ability = 11.7. These considerable average score differences imply that when these tests are used to refer students to gifted programs, fewer culturally and linguistically diverse children may be identified as meeting the criteria for giftedness (Ford, 2004, p. 8).
     High-stakes testing is widespread in the United States; Lamb (1993) observed that test scores in student files create the basis for high-stakes decisions (Ford, 2004, p. 5). Hilliard (1991), Korchin (1980), Olmedo (1981), and others argue that standardized tests have helped perpetuate the political, social, and economic barriers that diverse groups face (Ford, 2004, p. 5). Donna Ford (2004), in her paper “Intelligence Testing and Cultural Diversity: Concerns, Cautions, and Considerations,” notes that “when tests are used for selecting and screening, the potential for denying diverse groups access to educational opportunities, such as gifted education programs, due to bias is great” (p. 5). Many scholars believe that intelligence tests contain cultural bias in favor of middle-class Caucasian groups because they assess knowledge and content normed for white, middle-class students and use language and situations that are often unfamiliar to culturally and linguistically diverse students (p. 6).
     Debates about intelligence and intelligence testing are common in education, particularly in special education and gifted education. These programs rely heavily on assessments such as intelligence tests to decide which students to place in a program and what services those students need in order to be successful (Ford, 2004, p. 2).
     Oliver Edwards (2006) writes in his article “Special Education Disproportionality and the Influence of Intelligence Test Selection,” published in the Journal of Intellectual & Developmental Disability, that “for a test to have equitable effects, examiners need to interpret them not only in light of their statistical properties, but also in light of the consequences of test score use” (p. 247). Biased intelligence test scores can be detrimental to minority students. The interpretation of these scores is a double-edged sword, affecting entrance into gifted education programs and special education programs alike. Low test scores can prevent minority students from being identified as gifted and entering gifted education programs (Ford, 2004, p. vi). Depending on the student, low scores can also result in the student being identified as learning disabled, mentally retarded, or similar; such a label will likely follow the student for the duration of his or her education (p. vi).
     Being mistakenly assigned to special education services has many consequences, both present and future. A stigma goes along with the label “special education student.” This stigma can follow students throughout their lives, influencing how teachers and classmates treat them and damaging their self-image and self-esteem (Gay, 2002, p. 615).
     Intelligence tests reveal information about the test taker’s educational attainment, social judgment, reasoning, and comprehension. Because intelligence test scores are considered to accurately reveal differences in these four areas and are used to make important educational decisions, it is important to turn to research on the cultural accuracy of each test to help guide decisions about which intelligence test to use (Edwards, 2006, p. 247). The intelligence test most likely to decrease the overrepresentation of culturally and linguistically diverse students in special education programs is the one that produces the smallest average score variation between different ethnic groups.
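
     As a rough illustration of that selection rule, the short Python sketch below ranks the three tests by the mean discrepancies that Wasserman and Becker (2000) reported. The function and variable names are hypothetical and chosen only for this example. The figures cover a single matched comparison (African American and Caucasian samples), so the ranking should be read as one input to a decision, not as proof that any test is free of bias.

```python
# Mean discrepancies in standard-score points (favoring white students)
# between matched African American and Caucasian samples, as reported by
# Wasserman and Becker (2000) and cited earlier in this post.
mean_discrepancy = {
    "WISC-III": 11.0,
    "Stanford-Binet IV": 8.1,
    "Woodcock-Johnson Tests of Cognitive Ability": 11.7,
}


def rank_by_discrepancy(discrepancies):
    """Return (test, gap) pairs sorted from the smallest gap to the largest."""
    return sorted(discrepancies.items(), key=lambda item: item[1])


ranking = rank_by_discrepancy(mean_discrepancy)
for test, gap in ranking:
    print(f"{test}: {gap:.1f} points")

# The first entry is the test with the smallest reported gap.
smallest_gap_test = ranking[0][0]
```

     With only these three reported values, the rule points to the Stanford-Binet IV. If developers released discrepancy data for more tests and more groups, the same comparison could be extended by adding entries to the dictionary.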
     Disproportionality in special education is not a new problem. Despite the efforts of teachers and school districts, and despite legislation such as the Individuals with Disabilities Education Act (IDEA) and the Education for All Handicapped Children Act, minorities are unfortunately still overrepresented in special education and underrepresented in gifted and talented programs. Educators and other test users must prioritize intelligence tests that have the lowest discrepancy between cultural groups, so that the scores of culturally and linguistically diverse students are accurate and fairly reflect those students’ needs regarding special education and gifted education programs. Educators must never assume that everyone shares the same opportunities and experiences, yet intelligence tests are rooted in this very assumption.
     Overall, assessments used in schools can be made more culturally fair and valid in the following ways: administering tests in the primary language of the person taking the exam; having interpreters translate test questions; reviewing tests and eliminating questions on which groups perform very differently; eliminating items that may be offensive to certain groups; keeping in mind the background of the test taker when examining his or her answers; never assuming that everyone shares the same opportunities and experiences; never basing decisions about a person on one test score, but instead collecting multiple sources of data; and, when a whole group scores low on a test, considering that the test may be the problem.
     I believe that intelligence cannot be tested independently of the culture that gives rise to the test. I believe that many tests are geared toward a certain group of people or a certain culture and are biased as a result. Depending on each student’s strengths and what he or she has been exposed to, a student may score much higher on the test than a person of a different background. I think that students need to be evaluated for gifted and special education programs based on the results of multiple assessments, rather than on a single intelligence test that may yield different results for different cultural groups. I also believe that teacher input should carry weight when students are considered for these programs. A student’s teacher is a great source of information about that student’s individual strengths and weaknesses.
     In addition, I think it would be advantageous not only to use multiple assessment measures but also to reevaluate students in special education programs several times throughout the year, to see whether they still require special education services or whether there was an error in the original testing. Culturally and linguistically diverse students, in particular, must be reevaluated to determine whether they have been placed correctly. Some diverse students who should have tested into the gifted and talented program at their school may not have; others may be receiving special education services that they do not require. Human and assessment error is an unfortunate part of life, and it is an educator’s responsibility to make sure that any error that has occurred is rectified.
     I agree that cultural bias may be lessened in intelligence tests but cannot be fully removed. A test will always reflect the culture of the person who created it. Ford and Whiting (n.d.), for example, state that even with the best intentions to create tests that have little to no bias, “human error, stereotypes and prejudice undermine test administrations, interpretation, and use.” Culturally diverse students are, more often than not, affected by this bias.
     In the future, I hope that test developers realize that withholding statistical information about the mean scores of different cultural groups does more harm than good. I understand that the developers are trying to be socially conscious. However, that data could spark future research on test bias so that fewer culturally and linguistically diverse students are incorrectly placed in a program, or denied the chance to participate in one, because of faulty test scores. Releasing this data would also help test users choose an assessment that has been proven to produce scores with fewer discrepancies between different cultural groups.

References
Baca, L., & Cervantes, H. T. (1989). The bilingual special education interface (2nd ed.). Columbus, OH: Merrill.
Edwards, O. (2006). Special Education Disproportionality and the Influence of Intelligence Test Selection. Journal of Intellectual & Developmental Disability, 31(4), 246-248. Retrieved April 12, 2012, from the Ebscohost database.
Fagan, J. (n.d.). A valid, culture-fair test of intelligence. Headquarters, Department of the Army. Retrieved April 12, 2012, from www.hqda.army.mil/ari/pdf/TR_1225.pdf
Ford, D. (2010). Culturally Responsive Classrooms: Affirming Culturally Different Gifted Students. Gifted Child Today, 33, 50-53. Retrieved April 1, 2012, from the Ebscohost database.
Ford, D. (2004). Intelligence testing and cultural diversity: Concerns, cautions, and considerations. National Research Center on the Gifted and Talented, 1, 1-71. Retrieved April 2, 2012, from the Educational Resources Information Center database.
Ford, D., & Whiting, G. (n.d.). Cultural bias in testing. Education.com. Retrieved April 20, 2012, from http://www.education.com/reference/article/cultural-bias-in-testing/
Gay, G. (2002). Culturally responsive teaching in special education for ethnically diverse students: Setting the stage. Qualitative Studies in Education, 15(5), 613-629. Retrieved March 6, 2012, from http://www.cehd.umn.edu/
Lopez, R. (1997). The practical impact of current research and issues in intelligence test interpretation and use for multicultural populations. School Psychology Review, 26(2), 249-254. Retrieved April 22, 2012, from the Ebscohost database.
Skiba, R., Simmons, A., Ritter, S., Gibb, A., Rausch, K., Cuadrado, J., et al. (2008). Achieving Equity In Special Education: History, Status, and Current Challenges. Council for Exceptional Children, 74, 264-288. Retrieved April 16, 2012, from the Educational Resources Information Center database.

