While a study of high-frequency verbs makes it possible to access a
wide range of phraseological units, it is not the ideal way of accessing
the more routine aspects of phraseology. These are best accessed via
automatic extraction and analysis of recurrent combinations, i.e. continuous
strings of words occurring more than once in identical form (Altenberg
1998).
We will use this method to extract continuous recurrent n-grams (i.e. sequences
of n orthographic words) from native and learner corpora of English
essays. This method has very strong heuristic power as it is not based
on a pre-established list of phraseological units. The downside is that
some of the units brought up by the method are not interesting to analyse
from a phraseological point of view. We will disregard fragments, such
as because of the, one of the, there is a, which are highly frequent
recurrent combinations simply by virtue of the fact that the words that
compose them are highly frequent, and focus on formulae, or formulaic
expressions, i.e. «multi-word units that perform pragmatic and/or
discourse structuring functions» (De Cock 1998).
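
The extraction step described above can be illustrated with a minimal sketch. It is not the tool used in the studies cited. It assumes that an orthographic word is a lowercased run of letters and apostrophes, that sentence boundaries are ignored, and that "recurrent" means occurring at least twice. The function name recurrent_ngrams, its parameters and the sample text are hypothetical.

```python
from collections import Counter
import re


def recurrent_ngrams(text, n, min_freq=2):
    """Return continuous n-grams occurring at least min_freq times in identical form."""
    # Assumption: a word is a run of letters/apostrophes; case and punctuation are discarded.
    words = re.findall(r"[a-z']+", text.lower())
    # Slide a window of n words over the text and count each sequence.
    counts = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    # Keep only the sequences that recur.
    return {" ".join(gram): freq for gram, freq in counts.items() if freq >= min_freq}


# Invented sample text, for illustration only.
sample = ("I think that one of the reasons is cost. "
          "One of the problems is that I think that it is unfair.")
print(recurrent_ngrams(sample, 3))
```

On this sample the sketch returns both a fragment (one of the) and a formula (I think that). Frequency alone does not separate the two, so the fragments still have to be set aside in a later step.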
Using this approach, we will be able to establish whether, as claimed by Kjellmer
(1991: 124), learners’ «building material is individualized
bricks rather than prefabricated sections». A number of studies
that have used this approach show that this statement needs to be qualified.
Milton (1998) shows that learners seem to rely on a smaller repertoire
of prefabs (prefab types), some of which are shared with native speakers,
others learner-idiosyncratic, which are used with very high frequencies
of occurrence (prefab tokens). Among the prefabs which are overused
by Chinese-speaking learners of English, he notes a high frequency of
connectors (first of all, on the other hand), especially metaphorical
ones such as in a nutshell, a phenomenon which he interprets as being
teaching-induced (‘NNSs are using those expressions which they
have been instructed to use’) rather than transfer-related. De
Cock’s (2000) results, on the other hand, show that French-speaking
learners overuse the least salient (i.e. most transparent) word combinations,
such as we can say that, I think that, a phenomenon which is clearly
not teaching-induced. The results here are therefore also highly inconclusive.
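
The distinction between prefab types and prefab tokens can be made concrete with a short sketch. The counts below are invented for illustration and are not taken from Milton (1998) or De Cock (2000). The function name prefab_profile is likewise hypothetical.

```python
def prefab_profile(counts):
    """Return (types, tokens, tokens per type) for a prefab frequency table."""
    types = len(counts)            # number of distinct prefabs
    tokens = sum(counts.values())  # total occurrences of all prefabs
    return types, tokens, tokens / types


# Invented toy frequencies, NOT corpus data.
learner = {"on the other hand": 12, "first of all": 9, "in a nutshell": 6}
native = {"on the other hand": 3, "first of all": 2, "in a nutshell": 1,
          "as a result": 2, "in other words": 2}

print(prefab_profile(learner))  # (3, 27, 9.0)
print(prefab_profile(native))   # (5, 10, 2.0)
```

In this toy case the learner table has fewer types but far more tokens per type, which is the pattern Milton describes. With real corpora the token counts would first have to be normalised for corpus size.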