Developing and Testing Alternative Benchmarks of Lexical Sophistication: L2 Lexical Frequency, Semantic Context, and Word Recognition Indices
Vanderbilt, Katia
Abstract
Research has traditionally used first language (L1) English linguistic norms as a benchmark to assess second language (L2) production (Cook, 1992) and to select experimental stimuli in bilingual studies (Vaid & Meuter, 2017). Despite the immense contribution of this approach, L1 benchmarks may not fully represent the linguistic experience of L2 users, and they might limit our understanding of multicompetence, the state of knowing multiple languages (Cook, 1991; Klein, 1998; Vaid & Meuter, 2017). A few attempts have been made to develop indices that more closely represent L2 linguistic experience (e.g., Monteiro et al., 2020; Naismith et al., 2018), but researchers have been slow to respond to the need for more L2 benchmarks. The primary aim of this dissertation is to help address this gap by developing lexical benchmarks based on L2 corpora and on L2 behavioral data collected for this dissertation. The corpus-based benchmarks included L2 lexical frequency indices, L2 range indices, and L2 semantic context indices based on Latent Semantic Analysis (LSA) and Word to Vector (Word2vec) computational methods. The benchmarks based on behavioral data included L2 word recognition indices from a word naming task performed by bilinguals studying in the United States (N = 94). These benchmarks were validated against psycholinguistic data on L2 lexical processing and human judgments of L2 writing proficiency. The results suggested that the L2 benchmarks successfully predicted L2 writing quality and L2 word processing and, in some cases, were more predictive than L1 benchmarks. Analysis of individual output also suggested that the L2 benchmarks provide frequency and word recognition information that may be unique to L2 users.
