Understanding Dependency Grammar: NLP Sentence Structure

Dependency Grammar and Dependency Parsing

What is Dependency Grammar?

Dependency grammar is an important framework for describing the syntactic structure of sentences. Unlike constituent-based formalisms, dependency grammar has no phrasal nodes: the structure consists of lexical items (words) connected by binary, asymmetric dependency relations. These relations, or edges, are usually labelled with syntactic functions such as subject (SUBJ), object (OBJ), or modifier (MOD). Each relation holds between a head and its dependent: the head is the governing word, and the dependent modifies it. The tensed verb is typically taken as the head of the sentence, with every other word either depending on it directly or connected to it through a chain of dependencies.
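
To make head-dependent relations concrete, the short sketch below prints the dependencies an off-the-shelf parser assigns to a sentence. The use of spaCy, its small English model en_core_web_sm, and the example sentence are assumptions of this illustration, not tools discussed in the article.

```python
# A minimal illustration (not from the original article): printing the
# head-dependent relations that spaCy's parser assigns to a sentence.
# Assumes spaCy and the "en_core_web_sm" model are installed.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The cat chased the mouse")

for token in doc:
    # token.dep_ is the relation label (e.g. nsubj, dobj, det);
    # token.head is the word this token depends on.
    print(f"{token.text:<8} --{token.dep_}--> {token.head.text}")
```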

A dependency structure can be illustrated as a directed graph connecting the words of the surface sentence. Formally, it can be described as a labelled, directed tree whose nodes are the tokens of the input sentence and whose arcs are labelled, directed relations between them. In a dependency tree, the root is the only node with no incoming arc; every other node has exactly one incoming arc, and every vertex is reachable from the root by a unique path. Projectivity, a constraint based on the linear order of the words, is a crucial notion in dependency analysis: a dependency graph is projective if, with the words in their linear order, all edges can be drawn above the sentence without crossing. Equivalently, a word and its descendants form a contiguous span. Non-projective trees, which are less important for English but essential for many other languages, can be produced by graph-based parsers.
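
As a concrete illustration of projectivity, here is a minimal sketch (not from the article) that checks whether a tree, given as a list of head indices, is projective by testing that every word between a head and its dependent is a descendant of that head. The encoding (1-based word indices, head 0 for the artificial root) and the toy trees are assumptions of the example.

```python
# A minimal projectivity check, assuming the tree is given as a list
# `heads` where heads[i] is the head of word i+1 (words numbered from 1;
# head 0 is the artificial root).
def is_projective(heads):
    """Return True if the dependency tree is projective.

    An arc (h, d) is projective if every word strictly between h and d
    is a descendant of h; a tree is projective if all its arcs are.
    """
    n = len(heads)

    def dominates(h, d):
        # Follow head pointers upward from d; return True if we reach h.
        while d != 0:
            if d == h:
                return True
            d = heads[d - 1]
        return h == 0

    for d in range(1, n + 1):
        h = heads[d - 1]
        for k in range(min(h, d) + 1, max(h, d)):
            if not dominates(h, k):
                return False
    return True

# Toy examples (hypothetical trees, not from the article):
print(is_projective([2, 0, 2, 3]))     # no crossing arcs -> True
print(is_projective([3, 3, 0, 3, 1]))  # arc from word 1 to word 5 spans non-descendants -> False
```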

Meaning-Text Theory and Functional Generative Description are two theoretical frameworks built on dependency grammar. It is especially popular for treebank annotation of languages with free or variable word order, such as Czech and Turkish. One of dependency grammar’s main advantages is that it handles parsing problems with lexical information rather than a complex superstructure of phrase types: it works directly in terms of dependencies between words. Head-dependent relations encode important information immediately and often serve as a reliable approximation of the semantic link between predicates and their arguments.

Dependency grammar has a far longer history than constituency or phrase structure grammars. It originated with the ancient grammarians and was given its modern formal statement in Tesnière’s 20th-century work. The first computational work on dependency parsing emerged from early machine translation efforts.

Dependency grammars concentrate on how words relate to one another, explicitly representing head-dependent relations and functional categories, whereas phrase structure grammars emphasize how words combine into constituents, explicitly representing phrases and structural categories. Phrase structure grammars recognize dependency relations only implicitly, and more recent frameworks incorporate elements of both. Treebanks annotated in one formalism can sometimes be converted to the other, and dependency structures can be derived from lexicalized constituent trees.

What is Dependency Parsing?

Dependency parsing is the process of deriving the syntactic structure of an input sentence by determining the syntactic head of each word. It amounts to finding one or more well-formed trees for a given sentence, producing a dependency graph whose nodes are the words of the sentence and whose arcs are binary head-dependent relations.

Dependency parsing relies heavily on dependency treebanks. They serve as the gold standard for evaluation and provide the training data from which supervised machine learning techniques learn scoring functions over candidate syntactic analyses.
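
Dependency treebanks are commonly distributed in the CoNLL-U format used by the Universal Dependencies project (the format is an assumption of this example; the article does not name one). A minimal reader sketch:

```python
# A minimal sketch (not from the article) of reading a dependency treebank
# in CoNLL-U format, the 10-column format used by Universal Dependencies.
# Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
def read_conllu(path):
    """Yield one sentence at a time as a list of (form, head, deprel) triples."""
    sentence = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:                       # blank line ends a sentence
                if sentence:
                    yield sentence
                    sentence = []
                continue
            if line.startswith("#"):           # comment lines (sent_id, text, ...)
                continue
            cols = line.split("\t")
            if "-" in cols[0] or "." in cols[0]:
                continue                       # skip multiword tokens / empty nodes
            form, head, deprel = cols[1], int(cols[6]), cols[7]
            sentence.append((form, head, deprel))
    if sentence:
        yield sentence
```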

Dependency parsing techniques

Dependency parsing techniques fall into two major families:

Transition-based parsing

  • Utilizes the notion of shift-reduce parsing.
  • Uses a buffer of tokens still to be parsed and a stack on which the analysis is constructed.
  • Applies a series of transition operations, such as SHIFT, LEFTARC, and RIGHTARC, to build the syntactic structure incrementally (a minimal sketch of these operations follows this list).
  • At each stage, the optimal transition is selected using a classifier (such as logistic regression, support vector machines, or neural networks).
  • The classifier often uses the following features: modifier counts, distance between items, word/part-of-speech of stack and buffer elements, and combinations of these.
  • The core algorithm is greedy and stack-based.
  • Transition-based approaches may struggle when heads are far from their dependents or when sentences are long.
  • In general, however, they are faster than graph-based techniques.
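
The sketch below (not from the article) shows the arc-standard variant of these transitions on a toy sentence. In a real parser a trained classifier chooses each transition; here the transition sequence is hard-coded purely to illustrate SHIFT, LEFTARC, and RIGHTARC.

```python
# A minimal arc-standard sketch. In a real transition-based parser a
# trained classifier chooses each transition; here the sequence is
# hard-coded for illustration only.
class ArcStandardState:
    def __init__(self, words):
        self.stack = [0]                        # 0 is the artificial ROOT
        self.buffer = list(range(1, len(words) + 1))
        self.words = ["ROOT"] + list(words)
        self.arcs = []                          # (head, dependent, label) triples

    def shift(self):
        # Move the next buffer token onto the stack.
        self.stack.append(self.buffer.pop(0))

    def left_arc(self, label):
        # Second-from-top of the stack becomes a dependent of the top.
        dep = self.stack.pop(-2)
        self.arcs.append((self.stack[-1], dep, label))

    def right_arc(self, label):
        # Top of the stack becomes a dependent of the item below it.
        dep = self.stack.pop()
        self.arcs.append((self.stack[-1], dep, label))

state = ArcStandardState(["She", "saw", "him"])
state.shift()                 # stack: [ROOT, She]        buffer: [saw, him]
state.shift()                 # stack: [ROOT, She, saw]   buffer: [him]
state.left_arc("nsubj")       # She <- saw
state.shift()                 # stack: [ROOT, saw, him]
state.right_arc("obj")        # saw -> him
state.right_arc("root")       # ROOT -> saw

for head, dep, label in state.arcs:
    print(f"{state.words[head]} -{label}-> {state.words[dep]}")
```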

Graph-based parsing

  • Scores whole dependency trees rather than relying on greedy local decisions.
  • Searches the space of possible trees for a given sentence to find the tree that maximizes the score.
  • Frequently employs an edge-factored score that decomposes over individual dependency arcs (a minimal sketch follows this list).
  • Parsing then becomes the problem of finding the highest-scoring spanning tree in a graph containing every possible labelled, directed arc between words.
  • Able to yield trees that are not projective.
  • More accurate overall, especially for lengthy sentences, than transition-based parsers.
  • The features are comparable to those used in transition-based approaches, although each feature is restricted to a single arc.
  • Algorithms such as the structured perceptron can be used to learn.
  • Graph-based parsing typically runs in cubic time and is therefore slower. Dependency representations are easier to handle than phrase structure for global discriminative models.
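
Below is a minimal sketch of edge-factored scoring (not from the article): the tree score decomposes into per-arc scores, and each word simply picks its highest-scoring head. The toy scores are made up, and this greedy decoder, unlike the Chu-Liu/Edmonds maximum-spanning-tree algorithm a real graph-based parser would use, does not guarantee a cycle-free tree.

```python
# Edge-factored scoring sketch: score(tree) decomposes into a sum of
# per-arc scores score(head, dep). A toy scorer stands in for a model
# learned from a treebank. Each non-root word greedily picks its best
# head, which (unlike a maximum spanning tree algorithm) may yield cycles.
words = ["ROOT", "She", "saw", "him"]

def arc_score(head, dep):
    # Hypothetical hand-set scores; a real parser learns these.
    scores = {(0, 2): 10, (2, 1): 9, (2, 3): 8, (1, 2): 3, (3, 2): 2}
    return scores.get((head, dep), 1)

heads = {}
for dep in range(1, len(words)):
    heads[dep] = max(
        (h for h in range(len(words)) if h != dep),
        key=lambda h: arc_score(h, dep),
    )

for dep, head in heads.items():
    print(f"{words[head]} -> {words[dep]}  (score {arc_score(head, dep)})")
```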

Dependency parsers are evaluated by comparing their output with gold-standard dependency treebanks. The usual metrics are attachment scores: the unlabelled attachment score (UAS) is the percentage of words attached to the correct head, and the labelled attachment score (LAS) is the percentage of words attached to the correct head with the correct relation label.
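
A small sketch of how these attachment scores can be computed, assuming each analysis is represented as one (head, label) pair per word; the gold and predicted analyses below are hypothetical.

```python
# Attachment-score evaluation sketch (data is hypothetical).
def attachment_scores(gold, predicted):
    """Return (UAS, LAS): unlabelled / labelled attachment scores."""
    assert len(gold) == len(predicted)
    correct_head = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    correct_both = sum(g == p for g, p in zip(gold, predicted))
    return correct_head / len(gold), correct_both / len(gold)

gold      = [(2, "nsubj"), (0, "root"), (2, "obj")]
predicted = [(2, "nsubj"), (0, "root"), (1, "obj")]
uas, las = attachment_scores(gold, predicted)
print(f"UAS = {uas:.2f}, LAS = {las:.2f}")   # UAS = 0.67, LAS = 0.67
```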

Dependency parsing directly supports downstream language processing tasks such as information extraction, semantic parsing, question answering, and machine translation. For example, parsers trained on treebanks such as the Penn Treebank (for English) and the Arabic Treebank (for Arabic) can be used in syntax-based machine translation. Dependency parsing captures syntactic relations that are useful for answering questions and retrieving information.
