English Graduation Thesis

Source: Internet  Author: huier  Date: 2007-05-15
What Does Language Testing Have to Offer?

University of California, Los Angeles

Advances in language testing in the past decade have occurred in three areas: (a) the development of a theoretical view that considers language ability to be multicomponential and recognizes the influence of the test method and test taker characteristics on test performance, (b) applications of more sophisticated measurement and statistical tools, and (c) the development of “communicative” language tests that incorporate principles of “communicative” language teaching. After reviewing these advances, this paper describes an interactional model of language test performance that includes two components, language ability and test method. Language ability consists of language knowledge and metacognitive strategies, whereas test method includes characteristics of the environment, rubric, input, expected response, and relationship between input and expected response. Two aspects of authenticity are derived from this model. The situational authenticity of a given test task depends on the relationship between its test method characteristics and the features of a specific language use situation, while its interactional authenticity pertains to the degree to which it invokes the test taker’s language ability. The application of this definition of authenticity to test development is discussed.

Since 1989, four papers reviewing the state of the art in the field of language testing have appeared (Alderson, 1991; Bachman, 1990a; Skehan, 1988, 1989, 1991). All four have argued that language testing has come of age as a discipline in its own right within applied linguistics and have presented substantial evidence, I believe, in support of this assertion. A common theme in all these articles is that the field of language testing has much to offer in terms of theoretical, methodological, and practical accomplishments to its sister disciplines in applied linguistics.
Since these papers provide excellent critical surveys and discussions of the field of language testing, I will simply summarize some of the common themes in these reviews in Part 1 of this paper in order to whet the appetite of readers who may be interested in knowing the issues and problems of current interest to language testers. These articles are nontechnical and accessible to those who are not themselves language testing specialists. Furthermore, Skehan (1991) and Alderson (1991) appear in collections of papers from recent conferences that focus on current issues in language testing. These collections include a wide variety of topics of current interest within language testing, discussed from many perspectives, and thus constitute major contributions to the literature on language testing.

The purpose of this paper is to address a question that is, I believe, implicit in all of the review articles mentioned above: What does language testing have to offer to researchers and practitioners in other areas of applied linguistics, particularly in language learning and language teaching? These reviews discuss several specific areas in which valuable contributions can be expected (e.g., program evaluation, second language acquisition, classroom learning, research methodology). Part 2 of this paper focuses on two recent developments in language testing, discussing their potential contributions to language learning and language teaching. I argue first that a theoretical model of second language ability that has emerged on the basis of research in language testing can be useful for both researchers and practitioners in language learning and language teaching.
Specifically, I believe it provides a basis both for conceptualizing second language abilities whose acquisition is the object of considerable research and instructional effort, and for designing language tests for use both in instructional settings and for research in language learning and language teaching. Second, I will describe an approach to characterizing the authenticity of a language task which I believe can help us to better understand the nature of the tasks we set, either for students in instructional programs or for subjects in language learning research, and which can thus aid in the design and development of tasks that are more useful for these purposes.

PART 1: LANGUAGE TESTING IN THE 1990s

In echoing Alderson’s (1991) title, I acknowledge the commonalities among the review articles mentioned above in the themes they discuss and the issues they raise. While each review emphasizes specific areas, all approach the task with essentially the same rhetorical organization: a review of the achievements in language testing, or lack thereof, over the past decade; a discussion of areas of likely continued development; and suggestions of areas in need of increased emphasis to assure developments in the future. Both Alderson and Skehan argue that while language testing has made progress in some areas, on the whole “there has been relatively little progress in language testing until recently” (Skehan, 1991, p. 3). Skehan discusses the contextual factors (theory, practical considerations, and human considerations) that have influenced language testing in terms of whether these factors act as “forces for conservatism” or “forces for change” (p. 3). The former, he argues, “all have the consequence of retarding change, reducing openness, and generally justifying inaction in testing” (p. 3), while the latter are “pressures which are likely to bring about more beneficial outcomes” (p. 7).
All of the reviews present essentially optimistic views of where language testing is going and what it has to offer other areas of applied linguistics. I will group the common themes of these reviews into the general areas of (a) theoretical issues and their implications for practical application, (b) methodological advances, and (c) language test development.

THEORETICAL ISSUES

One of the major preoccupations of language testers in the past decade has been investigating the nature of language proficiency. In 1980 the “unitary competence hypothesis” (Oller, 1979), which claimed that language proficiency consists of a single, global ability, was widely accepted. By 1983 this view of language proficiency had been challenged by several empirical studies and abandoned by its chief proponent (Oller, 1983). The unitary trait view has been replaced, through both empirical research and theorizing, by the view that language proficiency is multicomponential, consisting of a number of interrelated specific abilities as well as a general ability or set of general strategies or procedures. Skehan and Alderson both suggest that the model of language test performance proposed by Bachman (1990b) represents progress in this area, since it includes both components of language ability and characteristics of test methods, thereby making it possible “to make statements about actual performance as well as underlying abilities” (Skehan, 1991, p. 9). At the same time, Skehan correctly points out that as research progresses, this model will be modified and eventually superseded. Both Alderson and Skehan indicate that an area where further progress is needed is in the application of theoretical models of language proficiency to the design and development of language tests. Alderson, for example, states that “we need to be concerned not only with . . .
the nature of language proficiency, but also with language learning and the design and researching of achievement tests; not only with testers, and the problems of our professionalism, but also with testees, with students, and their interests, perspectives and insights” (Alderson, 1991, p. 5).

A second area of research and progress is in our understanding of the effects of the method of testing on test performance. A number of empirical studies conducted in the 1980s clearly demonstrated that the kind of test tasks used can affect test performance as much as the abilities we want to measure (e.g., Bachman & Palmer, 1981, 1982, 1988; Clifford, 1981; Shohamy, 1983, 1984). Other studies demonstrated that the topical content of test tasks can affect performance (e.g., Alderson & Urquhart, 1985; Erickson & Molloy, 1983). Results of these studies have stimulated a renewed interest in the investigation of test content. And here the results have been mixed. Alderson and colleagues (Alderson, 1986, 1990; Alderson & Lukmani, 1986; Alderson, Henning, & Lukmani, 1987) have been investigating (a) the extent to which “experts” agree in their judgments about what specific skills EFL reading test items measure, and at what levels, and (b) whether these expert judgments about ability levels are related to the difficulty of items. Their results indicate, first, that these experts, who included test designers assessing the content of their own tests, do not agree and, second, that there is virtually no relationship between judgments of the levels of ability tested and empirical item difficulty.
Bachman and colleagues, on the other hand (Bachman, Davidson, Lynch, & Ryan, 1989; Bachman, Davidson, & Milanovic, 1991; Bachman, Davidson, Ryan, & Choi, in press), have found that by using a content-rating instrument based on a taxonomy of test method characteristics (Bachman, 1990b) and by training raters, a high degree of agreement among raters can be obtained, and such content ratings are related to item difficulty and item discrimination. In my view, these results are not inconsistent. The research of Alderson and colleagues presents, I believe, a sobering picture of actual practice in the design and development of language tests: Test designers and experts in the field disagree about what language tests measure, and neither the designers nor the experts have a clear sense of the levels of ability measured by their tests. This research uncovers a potentially serious problem in the way language testers practice their trade. Bachman’s research, on the other hand, presents what can be accomplished in a highly controlled situation, and provides one approach to solving this problem. Thus, an important area for future research in the years to come will be in the refinement of approaches to the analysis of test method characteristics, of which content is a substantial component, and the investigation of how specific characteristics of test method affect test performance. Progress will be realized in the area of language testing practice when insights from this area of research inform the design and development of language tests.
The research on test content analysis that has been conducted by the University of Cambridge Local Examinations Syndicate, and the incorporation of that research into the design and development of EFL tests, is illustrative of this kind of integrated approach (Bachman et al., 1991).

The 1980s saw a wealth of research into the characteristics of test takers and how these are related to test performance, generally under the rubric of investigations into potential sources of test bias; I can do little more than list these here. A number of studies have shown differences in test performance across different cultural, linguistic or ethnic groups (e.g., Alderman & Holland, 1981; Chen & Henning, 1985; Politzer & McGroarty, 1985; Swinton & Powers, 1980; Zeidner, 1986), while others have found differential performance between sexes (e.g., Farhady, 1982; Zeidner, 1987). Other studies have found relationships between field dependence and test performance (e.g., Chapelle, 1988; Chapelle & Roberts, 1986; Hansen, 1984; Hansen & Stansfield, 1981; Stansfield & Hansen, 1983). Such studies demonstrate the effects of various test taker characteristics on test performance, and suggest that such characteristics need to be considered both in the design of language tests and in the interpretation of test scores. To date, however, no clear direction has emerged to suggest how such considerations translate into testing practice. Two issues that need to be resolved in this regard are (a) whether and how we assess the specific characteristics of a given group of test takers, and (b) whether and how we can incorporate such information into the way we design language tests. Do we treat these characteristics as sources of test bias and seek ways to somehow “correct” for this in the way we write and select test items, for example? Or, if many of these characteristics are known to also influence language learning, do we reconsider our definition of language ability?
The investigation of test taker characteristics and their effects on language test performance also has implications for research in second language acquisition (SLA), and represents what Bachman (1989) has called an “interface” between SLA and language testing research.

METHODOLOGICAL ADVANCES

Many of the developments mentioned above (changes in the way we view language ability, the effects of test method and test taker characteristics) have been facilitated by advances in the tools that are available for test analysis. These advances have been in three areas: psychometrics, statistical analysis, and qualitative approaches to the description of test performance. The 1980s saw the application of several modern psychometric tools to language testing: item response theory (IRT), generalizability theory (G theory), criterion-referenced (CR) measurement, and the Mantel-Haenszel procedure. As these tools are fairly technical, I will simply refer readers to discussions of them: IRT (Henning, 1987), G theory (Bachman, 1990b; Bolus, Hinofotis, & Bailey, 1982), CR measurement (Bachman, 1990b; Hudson & Lynch, 1984), Mantel-Haenszel (Ryan & Bachman, in press). The application of IRT to language tests has brought with it advances in computer-adaptive language testing, which promises to make language tests more efficient and adaptable to individual test takers, and thus potentially more useful in the types of information they provide (e.g., Tung, 1986), but which also presents a challenge not to complacently continue using familiar testing techniques simply because they can be administered easily via computer (Canale, 1986). Alderson (1988a) and the papers in Stansfield (1986) provide extensive discussions of the applications of computers to language testing.

The major advance in the area of statistical analysis has been the application of structural equation modeling to language testing research.
(Relatively nontechnical discussions of structural equation modeling can be found in Long, 1983a, 1983b.) The use of confirmatory factor analysis was instrumental in demonstrating the untenability of the unitary trait hypothesis, and this type of analysis, in conjunction with the multitrait/multimethod research design, continues to be a productive approach to the process of construct validation. Structural equation modeling has also facilitated the investigation of relationships between language test performance and test taker characteristics (e.g., Fouly, 1985; Purcell, 1983) and different types of language instruction (e.g., Sang, Schmitz, Vollmer, Baumert, & Roeder, 1986).

A third methodological advance has been in the use of introspection to investigate the processes or strategies that test takers employ in attempting to complete test tasks. Studies using this approach have demonstrated that test takers use a variety of strategies in solving language test tasks (e.g., Alderson, 1988c; Cohen, 1984) and that these strategies are related to test performance (e.g., Anderson, Cohen, Perkins, & Bachman, 1991; Nevo, 1989).

Perhaps the single most important theoretical development in language testing in the 1980s was the realization that a language test score represents a complexity of multiple influences. As both Alderson and Skehan point out, this advance has been spurred on, to a considerable extent, by the application of the methodological tools discussed above. But, as Alderson (1991) notes, “the use of more sophisticated techniques reveals how complex responses to test items can be and therefore how complex a test score can be” (p. 12).
Thus, one legacy of the 1980s is that we now know that a language test score cannot be interpreted simplistically as an indicator of the particular language ability we want to measure; it is also affected to some extent by the characteristics and content of the test tasks, the characteristics of the test taker, and the strategies the test taker employs in attempting to complete the test task. What makes the interpretation of test scores particularly difficult is that these factors undoubtedly interact with each other. The particular strategy adopted by a given test taker, for example, is likely to be a function of both the characteristics of the test task and the test taker’s personal characteristics. This realization clearly indicates that we need to consider very carefully the interpretations and uses we make of language test scores and thus should sound a note of caution to language testing practitioners. At the same time, our expanded knowledge of the complexity of language test performance, along with the methodological tools now at our disposal, provides a basis for designing and developing language tests that are potentially more suitable for specific groups of test takers and more useful for their intended purposes.

ADVANCES IN LANGUAGE TEST DEVELOPMENT