Computational Linguistics - How Important Is Semantics? Compared to What?


Semantics has to do with meaning, the nature of which has been the subject of much philosophical debate: what is it exactly and how can it be represented? This essay is concerned with semantics from the perspective of Computational Linguistics, which is essentially concerned with building or attempting to build computational models of natural language. These Natural Language Processing or Understanding systems have a wide range of practical applications as well as providing insight into human understanding and perception.

        The prominence of semantics in the title can be taken as suggesting that it is indeed highly relevant to Computational Linguistics. However, it is also implied, quite correctly, that semantics is not the only relevant area of interest. Natural Language systems usually have a number of different components, and briefly outlining them here will help to put semantics in context.

Phonetics is concerned with the analysis of spoken language. It is generally considered to be a specialized area of research and many centres for Computational Linguistics deal mainly with written language.

Morphology involves the analysis of the composition (and also the meaning) of individual words. It is often at this stage that syntactic categories are assigned to words, since the interpretation of affixes may depend on the category of a word. For example, drinks could be either a plural noun or a third person singular verb.
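The dependence of affix interpretation on word category can be illustrated with a minimal sketch. The lexicon entries, category labels and the function name analyse are assumptions made for illustration only; a real morphological analyser would cover far more affixes and spelling rules.

```python
# Toy lexicon: stem -> set of syntactic categories (hypothetical entries).
LEXICON = {
    "drink": {"N", "V"},
    "orbit": {"N", "V"},
    "sun": {"N"},
}

# How the suffix "-s" is read depends on the category of the stem.
SUFFIX_S = {
    "N": "plural",
    "V": "third person singular present",
}


def analyse(word):
    """Return every (stem, category, features) reading of a word."""
    readings = []
    # Bare stem: the word is listed in the lexicon as it stands.
    for category in sorted(LEXICON.get(word, ())):
        features = "singular" if category == "N" else "base form"
        readings.append((word, category, features))
    # Stem + "-s": strip the suffix and interpret it per category.
    if word.endswith("s"):
        stem = word[:-1]
        for category in sorted(LEXICON.get(stem, ())):
            readings.append((stem, category, SUFFIX_S[category]))
    return readings


for reading in analyse("drinks"):
    print(reading)
```

Both readings of drinks are returned, and it is left to later stages (syntax, semantics) to choose between them.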

Syntactic analysis imposes structure on a flat string of words according to the grammatical categories of the words. The resulting structure is referred to as a parse. Ambiguity is a major problem for parsing and for Natural Language Processing in general, since it leads to multiple parses of a single sentence, many of which can later be rejected.
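How a single sentence can receive more than one parse may be seen in the following sketch, which counts parses with a small bottom-up chart (CKY-style) procedure. The grammar, the lexicon and the example sentence "I saw the man with the telescope" are not taken from this essay; they are assumptions chosen to keep the illustration short.

```python
from collections import defaultdict

# Toy grammar in binary form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # the PP modifies the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # the PP modifies the noun phrase
    ("PP", "P", "NP"),
]

# Toy lexicon: word -> categories (hypothetical entries).
LEXICAL = {
    "I": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "with": ["P"],
    "telescope": ["N"],
}


def count_parses(words, root="S"):
    """Count the distinct parse trees of category `root` over `words`."""
    n = len(words)
    # chart[(i, j)][X] = number of trees of category X spanning words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for category in LEXICAL.get(word, []):
            chart[(i, i + 1)][category] += 1
    # Build longer spans from shorter ones (bottom-up).
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for parent, left, right in BINARY_RULES:
                    chart[(i, j)][parent] += (
                        chart[(i, k)][left] * chart[(k, j)][right]
                    )
    return chart[(0, n)][root]


print(count_parses("I saw the man with the telescope".split()))
```

Under this toy grammar the prepositional phrase can attach either to the verb phrase or to the noun phrase, so the sentence receives two parses, whereas "I saw the man" receives one.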

Semantics is generally concerned with assigning meaning to the structures created by the syntactic parse. If no meaning can be assigned to the structure it is nonsensical and can be rejected; for example: "Colourless green ideas sleep furiously" (Chomsky 1957) is syntactically well-formed but semantically ill-formed.
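One simple way in which a system might reject such a sentence is by checking semantic features of the subject against what the verb requires. The sketch below assumes this feature-checking approach, and the feature names and table entries are hypothetical; it is not presented as the method of any particular system.

```python
# Hypothetical semantic features for a handful of nouns.
NOUN_FEATURES = {
    "ideas": {"abstract"},
    "children": {"physical", "animate"},
}

# What each verb demands of its subject (an assumption of this sketch).
SUBJECT_REQUIREMENT = {
    "sleep": "animate",
}


def subject_is_acceptable(subject, verb):
    """True if the subject noun carries the feature the verb requires."""
    required = SUBJECT_REQUIREMENT.get(verb)
    if required is None:
        return True  # the verb places no restriction on its subject
    return required in NOUN_FEATURES.get(subject, set())


print(subject_is_acceptable("ideas", "sleep"))
print(subject_is_acceptable("children", "sleep"))
```

Here "ideas sleep" fails the check because ideas lacks the animate feature, while "children sleep" passes.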

Pragmatics can crudely be defined as an inferential process which relates a sentence to the context in which it occurs, in order to understand the conveyed meaning as opposed to the truth-conditional meaning. Both semantics and pragmatics need recourse to knowledge about the world or the domain being modelled. The difficulties that this requirement poses will be discussed below.

Although the processes introduced above are often seen as constituting modules, which implies that they are discrete and self-contained, this perhaps reflects the computational requirements of a system rather than reality, since the boundaries are often fuzzy.

        In order to measure the importance of semantics in comparison to the other linguistic components of an NLP system, we can look at what a system which stops short of incorporating semantic processes is capable of, in comparison to one which does exploit semantics. Secondly, to demonstrate the limitations of semantics, we can briefly consider how pragmatics could extend the applications of a system.

        Syntax and parsing is the most mature field of study in Computational Linguistics. Parsing uses a grammar with rules and a parser which matches those rules against the sentence in order to impose structure on it. The two basic strategies are top-down and bottom-up, either of which can be combined with techniques for choosing between alternative paths. The resulting structure provides a basis for compositional semantic analysis. Not all systems carry out syntactic parsing. Direct semantic parsing is, however, computationally expensive, since the semantic component has to choose its own constituents and a significant amount of inferencing is invoked. Furthermore, it is not always possible to extract meaning without referring to grammatical facts.
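The top-down strategy can be sketched as a small recursive-descent parser with backtracking. The grammar, the lexicon and the function names below are assumptions made for illustration; the grammar deliberately avoids left-recursive rules, which would cause this naive strategy to loop.

```python
# Toy grammar: category -> list of alternative right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
}

# Toy lexicon: word -> category (hypothetical entries).
LEXICON = {"the": "Det", "sun": "N", "earth": "N", "orbits": "V"}


def expand(symbol, words, pos):
    """Yield (tree, next_pos) for each way `symbol` can start at `pos`."""
    if symbol in GRAMMAR:
        # Top-down step: predict each rule for the symbol and try it.
        for rhs in GRAMMAR[symbol]:
            for children, end in expand_sequence(rhs, words, pos):
                yield (symbol, *children), end
    elif pos < len(words) and LEXICON.get(words[pos]) == symbol:
        # Lexical category: match it against the next word.
        yield (symbol, words[pos]), pos + 1


def expand_sequence(symbols, words, pos):
    """Yield (list_of_trees, next_pos) for a sequence of symbols."""
    if not symbols:
        yield [], pos
        return
    for tree, mid in expand(symbols[0], words, pos):
        for rest, end in expand_sequence(symbols[1:], words, mid):
            yield [tree] + rest, end


def parse(sentence):
    """Return all complete parses of the sentence as nested tuples."""
    words = sentence.lower().split()
    return [tree for tree, end in expand("S", words, 0) if end == len(words)]


for tree in parse("The sun orbits the earth"):
    print(tree)
```

The parser starts from S and works down towards the words. The alternative VP rule with no object is also tried, but it leaves words unconsumed and so is discarded. The nested tuple that results is the kind of structure on which a compositional semantic analysis could then operate.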


        (1) The sun orbits the earth

        In (1) syntactic facts produce the correct interpretation, in which the sun revolves around the planet earth, despite this seeming semantically anomalous. Syntactic processing can deal with important linguistic generalisations about word order and about number and case agreement.

        Lexical and structural ambiguity and the problems they cause are linked to the issue of knowledge representation. Lexical ambiguities arise when alternate meanings can be assigned to a word and structural ambiguities arise when there is more than one structure which can be assigned to a sentence.

        (2) The results are represented on the table

...
