Identifying Multi-Word Units in Context

Philip, Gill (2005) Identifying Multi-Word Units in Context. [Preprint]

Abstract

Far from being linguistic anomalies, multi-word expressions abound in natural language, yet their identification is surprisingly problematic. The same combination of words can occur as a compositional, fully lexical string or as a delexicalised multi-word unit (MWU). How can these different manifestations of a series of words be distinguished one from the other? To exacerbate the problem, the creativity of language users results in the appearance of non-canonical forms of MWUs. How can these innovative uses be retrieved so that they can be incorporated into a comprehensive analysis of the MWU under study? This paper sets forth procedures for retrieving non-canonical variants from large general reference corpora, and addresses the disambiguation of compositional and non-compositional multi-word strings from a collocational standpoint.

Document type
Preprint
Creators
Philip, Gill
Keywords
canonical, variation; non-compositional; salience
Deposit date
12 Sep 2005
Last modified
16 May 2011 11:42
