What Is Word Sense Disambiguation (WSD) in NLP

What is WSD

Resolving lexical ambiguity is the central goal of Word Sense Disambiguation (WSD), a long-standing problem in Natural Language Processing (NLP). WSD is essentially a classification task: given an occurrence of a word in context, determine which of the word’s possible senses or meanings is intended in that particular phrase or text.

WSD was identified as a distinct problem in the history of NLP as early as 1955, motivated initially by the requirements of machine translation (MT). Early researchers such as Bar-Hillel regarded it as a major roadblock to the development of general-purpose MT.

Types of lexical ambiguity

WSD addresses the ambiguity that arises when a single word form has several meanings. The following categories of lexical ambiguity are commonly distinguished:

Homonymy: Words with the same form (spelling or sound) but unrelated meanings. Examples include ‘bat’ (implement) and ‘bat’ (animal), or ‘light’ (as opposed to dark) and ‘light’ (as opposed to heavy). The word “bank” as a financial institution versus a river bank is the stock example of homonymy. Note, however, that what matters is whether ordinary speakers perceive the senses as unrelated, not whether they are historically related.

Polysemy: Related meanings of the same word. Examples include the senses of “title” (name/heading, legal ownership, or document), or the distinction between “bank” (a financial institution) and “bank” (in a casino). Because the meaning distinctions in polysemy are usually subtler than in homonymy, it is generally considered more troublesome for NLP.

Categorial Ambiguity: Where a word can be used as different parts of speech, each with its own meanings. ‘Book’, for instance, may be a noun (physical object) or a verb (to register charges). Syntactic information (part-of-speech tagging) alone is frequently sufficient to resolve categorial ambiguity, whereas resolving homonymy and polysemy usually takes more than syntax.

WSD concentrates on resolving ambiguity within the same part of speech, whereas POS tagging distinguishes between uses such as the noun “bank” and the verb “to bank”. The majority of WSD systems assume that POS tagging has already been performed; the short sketch below illustrates this division of labour.
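
Here is a small sketch using NLTK’s off-the-shelf tagger (an illustrative choice; it assumes NLTK and its default tagger data are installed). POS tagging alone is expected to separate verb ‘book’ from noun ‘book’, but it cannot separate the financial ‘bank’ from the river ‘bank’, which are both nouns:

```python
# Assumes: pip install nltk, plus nltk.download('averaged_perceptron_tagger')
import nltk

# Categorial ambiguity: 'book' is expected to be tagged as a verb (VB)
# after the modal 'will', and as a noun (NN) after the determiner 'a'.
print(nltk.pos_tag("I will book a room".split()))
print(nltk.pos_tag("She gave me a good book".split()))

# Homonymy/polysemy: both uses of 'bank' are nouns, so the tags are
# identical and WSD proper has to take over from here.
print(nltk.pos_tag("the bank approved the loan".split()))
print(nltk.pos_tag("the river bank was muddy".split()))
```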

To perform WSD, systems require a predetermined inventory of possible word senses. The choice of sense inventory has a considerable impact on the nature and difficulty of the WSD task. Inventories may be obtained from:

Dictionaries: Much early work used machine-readable dictionaries, which assign sense numbers to words; Longman and Collins COBUILD are two examples. Lesk’s method and other WSD algorithms frequently employ dictionary definitions as clues.

Thesauri: These use semantic categorisations such as those of Roget’s Thesaurus.

WordNet: A popular resource for English WSD. The database is organised into sets of synonyms (synsets) connected by relations such as hyponymy and hypernymy (IS-A), and it aims to capture conceptual semantic knowledge. SemCor is a large corpus annotated with WordNet senses. Some researchers consider WordNet’s sense granularity too fine. A short sketch of browsing WordNet’s inventory follows.
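
The sketch below browses WordNet’s inventory for ‘bank’ through NLTK (the specific sense number ‘bank.n.02’ assumes the standard WordNet 3.0 numbering that NLTK ships):

```python
# Assumes: pip install nltk, plus nltk.download('wordnet')
from nltk.corpus import wordnet as wn

# List a few candidate senses (synsets) of 'bank' with their glosses.
for synset in wn.synsets("bank")[:3]:
    print(synset.name(), "-", synset.definition())

# Walk one step up the hypernym (IS-A) hierarchy for the
# financial-institution sense.
depository = wn.synset("bank.n.02")
print(depository.hypernyms())  # expected: [Synset('financial_institution.n.01')]
```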

Ad hoc / Specialized Inventories: Sense inventories can also be defined manually for particular applications or tasks. For example, Medical Subject Headings (MeSH) can serve as the inventory for automated indexing of medical documents.

Translation Equivalents: For machine translation systems, the set of possible translations in a target language can serve as the sense inventory. Exploiting such translation ambiguities has been shown to improve MT.

OntoNotes: Another important corpus resource with sense annotations.

The main source of evidence for disambiguation is the context in which a word occurrence appears. The words closest to the target word are generally the most predictive; words farther away in the sentence, paragraph, or document can also help, but their predictive strength decreases with distance. The type of syntactic relation between words is another important cue.

How to Perform WSD

WSD may be approached in a variety of ways, from earlier rule-based and knowledge-based systems to contemporary statistical and machine learning techniques:

Knowledge-Based / Dictionary-Based: These techniques make use of external lexical resources. Lesk’s algorithm, for instance, compares the dictionary definitions of the target word’s candidate senses with the words in its context and selects the sense with the greatest overlap, as in the sketch below.
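
Here is a minimal sketch of simplified Lesk over WordNet glosses, assuming NLTK and its WordNet data are installed (a real implementation would also remove stopwords and weight the overlap):

```python
# Assumes: pip install nltk, plus nltk.download('wordnet')
from nltk.corpus import wordnet as wn

def simplified_lesk(word, context_tokens):
    """Pick the sense whose gloss and examples overlap most with the context."""
    context = {t.lower() for t in context_tokens}
    best_sense, best_overlap = None, -1
    for sense in wn.synsets(word):
        # Sense signature: words of the gloss plus its example sentences.
        signature = set(sense.definition().lower().split())
        for example in sense.examples():
            signature.update(example.lower().split())
        overlap = len(signature & context)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

tokens = "I went to the bank to deposit money into my account".split()
print(simplified_lesk("bank", tokens))
# Expected to prefer the financial-institution sense of 'bank'.
```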

Supervised Methods: These train a classifier on a corpus of occurrences of ambiguous words that have been manually tagged with their senses. Features for classification can include the lexical context, POS tags of adjacent words, syntactic relations, and topic or domain information. Supervised algorithms need a substantial quantity of training data, ideally for every polysemous word. Modern supervised systems often use nearest-neighbour lookup over contextual embeddings; a simpler feature-based classifier is sketched below.
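
The sketch below trains a per-word classifier with bag-of-context-words features using scikit-learn; the tiny labelled corpus is invented for illustration, whereas a real system would train on a sense-annotated resource such as SemCor:

```python
# Assumes: pip install scikit-learn
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each training example is the context around one occurrence of 'bank',
# manually tagged with its sense.
contexts = [
    "deposit money in my savings account",
    "the loan officer approved the mortgage",
    "fishing along the muddy river edge",
    "the boat drifted toward the grassy shore",
]
senses = ["FINANCE", "FINANCE", "RIVER", "RIVER"]

# Bag-of-words features over the context, fed to a linear classifier.
clf = make_pipeline(CountVectorizer(), LogisticRegression())
clf.fit(contexts, senses)

print(clf.predict(["she withdrew cash from her account"]))
# Expected: ['FINANCE'], driven by the shared feature 'account'.
```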

Unsupervised Methods (Sense Induction): Rather than relying on a predefined sense inventory, these approaches automatically identify, or ‘induce’, word senses by clustering the contexts in which a word appears. The clusters themselves play the role of senses; no predetermined labels are assigned. A minimal clustering sketch follows.
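
In the sketch below (again with invented contexts), each occurrence’s context is vectorised and clustered, so each cluster stands for one induced sense:

```python
# Assumes: pip install scikit-learn
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Unlabelled contexts of 'bank'; no sense inventory is consulted.
contexts = [
    "deposit money in a savings account",
    "interest rates set by the central institution",
    "fishing from the muddy river edge",
    "the flood washed over the grassy shore",
]

X = TfidfVectorizer().fit_transform(contexts)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # e.g. [0 0 1 1]: each cluster is one induced sense
```

Note that the number of senses (here, two clusters) must itself be chosen or estimated, which is one of the open problems of sense induction.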

Semi-supervised / Lightly Supervised Methods: These combine aspects of supervised and unsupervised learning, typically using a small amount of labelled data together with a large quantity of unlabelled data. Bootstrapping methods start from a handful of tagged examples and iteratively grow the labelled set, as in the one-step sketch below.
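
The following sketch shows one bootstrapping step in the style of Yarowsky’s algorithm: train on a few seed examples, then add only those unlabelled contexts the model labels with high confidence (the data and the 0.6 threshold are invented for illustration):

```python
# Assumes: pip install scikit-learn
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

seeds = ["deposit money in my savings account",
         "fishing along the muddy river edge"]
seed_labels = ["FINANCE", "RIVER"]
unlabelled = ["interest on the savings account",
              "the boat drifted toward the river shore"]

clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(seeds, seed_labels)

# Keep only the confident predictions as new 'labelled' examples.
for text, probs in zip(unlabelled, clf.predict_proba(unlabelled)):
    if probs.max() > 0.6:
        seeds.append(text)
        seed_labels.append(clf.classes_[probs.argmax()])

# The classifier would then be retrained on the enlarged seed set and
# the step repeated until no confident additions remain.
```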

WSD Heuristics

In WSD, a number of heuristics are frequently used:

One Sense Per Discourse: The observation that a word typically keeps the same sense throughout a single text or discourse. This works better for coarse-grained or unrelated senses (see the sketch after this list).

One Sense Per Collocation: The observation that words collocating near a target word offer reliable and strong clues to its sense.
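
As a small sketch of the first heuristic, disambiguate each occurrence of a word in a document independently and then assign the majority sense to all of them; `disambiguate` here stands in for any per-occurrence method, such as the Lesk sketch above:

```python
from collections import Counter

def one_sense_per_discourse(occurrence_contexts, word, disambiguate):
    """Apply one sense per discourse by majority vote over occurrences.

    occurrence_contexts: one token list per occurrence of `word`
    disambiguate: any per-occurrence WSD function (word, tokens) -> sense
    """
    votes = [disambiguate(word, tokens) for tokens in occurrence_contexts]
    # Every occurrence in the discourse receives the majority sense.
    majority_sense = Counter(votes).most_common(1)[0][0]
    return [majority_sense] * len(occurrence_contexts)
```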

Applications of WSD

WSD is usually not an end-user application in itself, but it is a crucial enabling step for other NLP tasks. Among the applications are:

Information Retrieval (IR): WSD can help map query and document words to their specific senses, improving search relevance; distinguishing the different meanings of “tank” is one example.

Machine Translation (MT): As noted above, different senses often require different translations, so WSD can improve MT quality.

Intelligent Dictionaries and Thesauri: By determining the contextually appropriate sense, these tools can offer definitions and synonyms relevant to the material being read.

Named Entity Disambiguation: WSD shares many characteristics and methods with the task of determining which real-world entity a name refers to (for example, “Madison” as a person or a city).

Ambiguous Abbreviations and Acronyms: Resolving ambiguous acronyms such as “IRA” is a closely related task.

Determining and agreeing upon the boundaries of word senses is a major problem in WSD: different dictionaries and resources can divide senses in significantly different ways. The granularity of sense distinctions (fine-grained vs. coarse-grained) presents additional difficulties, since even human judges may disagree on fine distinctions. Research is ongoing into evaluating WSD systems, particularly across different sense inventories or granularity levels. Shared tasks such as Senseval provide standard evaluation benchmarks. Artificial ambiguous words, or “pseudo-words”, can be created to generate large amounts of training and test data. Human performance provides an upper bound on WSD accuracy.
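
A pseudo-word is made by conflating two unrelated words into one artificial ambiguous token, with the original word kept as the gold sense label. The sketch below builds such examples (the word pair and the token name are arbitrary illustrative choices):

```python
def make_pseudoword_examples(sentences, word_a, word_b, pseudo="banana_door"):
    """Replace word_a/word_b with a pseudo-word; the original is the gold label."""
    examples = []
    for sentence in sentences:
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            if token in (word_a, word_b):
                context = tokens[:i] + [pseudo] + tokens[i + 1:]
                examples.append((" ".join(context), token))
    return examples

sents = ["she peeled the banana slowly", "he closed the door quietly"]
print(make_pseudoword_examples(sents, "banana", "door"))
# [('she peeled the banana_door slowly', 'banana'),
#  ('he closed the banana_door quietly', 'door')]
```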
