Subtitle-based word frequencies as the best estimate of reading behavior: the case of Greek.

Research paper by Maria Dimitropoulou, Jon Andoni Duñabeitia, Alberto Avilés, José Corral, Manuel Carreiras

Indexed on: 01 Jan '10 | Published on: 01 Jan '10 | Published in: Frontiers in Psychology


Previous evidence has shown that word frequencies calculated from corpora based on film and television subtitles can readily account for reading performance, since the language used in subtitles closely approximates everyday language. The present study examines this issue in a society with increased exposure to subtitle reading. We compiled SUBTLEX-GR, a subtitle-based corpus consisting of more than 27 million Modern Greek words, and tested to what extent subtitle-based frequency estimates and those taken from a written corpus of Modern Greek account for the lexical decision performance of young Greek adults who are exposed to subtitle reading on a daily basis. Results showed that SUBTLEX-GR frequency estimates effectively accounted for participants' reading performance in two different visual word recognition experiments. More importantly, different analyses showed that frequencies estimated from a subtitle corpus explained the obtained results significantly better than traditional frequencies derived from written corpora.