Collocation



 
 
Within the area of corpus linguistics, collocation is defined as a sequence of words or terms which co-occur more often than would be expected by chance.

Collocation comprises the restrictions on how words can be used together, for example which prepositions are used with particular verbs, or which verbs and nouns are used together. Collocations are examples of lexical units. Collocations should not be confused with idioms.

Collocation extraction is a task in computational linguistics that extracts collocations automatically from a corpus using a computer.











Common features

Non-substitutability: We cannot substitute a word in a collocation with a related word. For example, we cannot say yellow wine instead of white wine although both yellow and white are the names of colours.

Non-modifiability: We cannot modify a collocation or apply syntactic transformations.

Expanded definition

If the expression is heard often, transmitting itself memetically, the words become 'glued' together in our minds. 'Crystal clear', 'middle management', 'nuclear family', and 'cosmetic surgery' are examples of collocated pairs of words. Some words are often found together because they make up a compound noun, for example 'riding boots' or 'motor cyclist'.

Collocations can be in a syntactic relation (such as verb-object: 'make' and 'decision'), lexical relation (such as antonymy), or they can be in no linguistically defined relation. Knowledge of collocations is vital for the competent use of a language: a grammatically correct sentence will stand out as 'awkward' if collocational preferences are violated. This makes collocation an interesting area for language teaching.

Corpus linguists specify a Key Word in Context (KWIC) and identify the words immediately surrounding it. This gives an idea of the way words are used.
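
The KWIC idea can be illustrated with a minimal Python sketch. The function name kwic, the window parameter, and the sample sentence are assumptions made for illustration only; real concordancers also handle tokenisation, punctuation, and lemmatisation, which are ignored here.

```python
def kwic(tokens, keyword, window=2):
    """Return (left context, node, right context) for every hit of keyword.

    `window` is the number of tokens kept on each side of the node word
    (an illustrative parameter, not a standard).
    """
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append((left, token, right))
    return hits


# Toy example: whitespace tokenisation of an invented sentence.
sample = "we must make a decision soon because to make a decision late is costly"
for left, node, right in kwic(sample.split(), "decision"):
    print(f"{' '.join(left):>12} [{node}] {' '.join(right)}")
```

Lining the hits up on the node word, as the print statement does, makes recurring neighbours such as 'make' easy to spot by eye.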

The processing of collocations involves a number of parameters, the most important of which is the measure of association, which evaluates whether the co-occurrence is purely by chance or statistically significant. Due to the non-random nature of language, most collocations are classed as significant, and the association scores are simply used to rank the results. Commonly used measures of association include mutual information, t scores, and log-likelihood.
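
As a rough sketch of how the three measures named above might be computed for an adjacent word pair, the following Python function builds a 2x2 contingency table from bigram counts. It assumes the commonly used formulations (pointwise mutual information as log2 of observed over expected frequency, the t score as (O - E) / sqrt(O), and the log-likelihood ratio G2 = 2 * sum of O * ln(O / E) over the four cells); other variants exist, and the function name and toy token list are illustrative only.

```python
import math


def association_scores(tokens, w1, w2):
    """Score the adjacent pair (w1, w2) with PMI, t score and log-likelihood."""
    pairs = list(zip(tokens, tokens[1:]))          # all adjacent bigrams
    n = len(pairs)
    o11 = sum(1 for a, b in pairs if a == w1 and b == w2)
    r1 = sum(1 for a, _ in pairs if a == w1)       # w1 in first position
    c1 = sum(1 for _, b in pairs if b == w2)       # w2 in second position
    if o11 == 0:
        raise ValueError("the pair never occurs in the token list")

    # Observed and expected counts for the four contingency-table cells.
    observed = [o11, r1 - o11, c1 - o11, n - r1 - c1 + o11]
    row_totals = [r1, r1, n - r1, n - r1]
    col_totals = [c1, n - c1, c1, n - c1]
    expected = [r * c / n for r, c in zip(row_totals, col_totals)]

    e11 = expected[0]
    pmi = math.log2(o11 / e11)
    t_score = (o11 - e11) / math.sqrt(o11)
    # Cells with zero observations contribute nothing to the sum.
    g2 = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    return {"pmi": pmi, "t_score": t_score, "log_likelihood": g2}


# Toy data, far too small for meaningful statistics.
toy_tokens = ["a", "b", "a", "b", "c", "d", "c", "d"]
print(association_scores(toy_tokens, "a", "b"))
```

In practice the scores are computed for every candidate pair in a large corpus and used to rank the candidates, in line with the ranking use described above.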

Rather than select a single definition, Gledhill proposes that collocation involves at least three different perspectives: (i) co-occurrence, a statistical view, which sees collocation as the recurrent appearance in a text of a node and its collocates; (ii) construction, which sees collocation either as a correlation between a lexeme and a lexical-grammatical pattern, or as a relation between a base and its collocative partners; and (iii) expression, a pragmatic view of collocation as a conventional unit of expression, regardless of form. These different perspectives contrast with the usual way of presenting collocation in phraseological studies. Traditionally, collocation is explained in terms of all three perspectives at once, in a continuum:

‘Free Combination’ ↔ ‘Bound Collocation’ ↔ ‘Frozen Idiom’


External links