Parts-Of-Speech Tagging Using Maximum Entropy Model.

Date of Submission

December 2010

Date of Award

Winter 12-12-2011

Institute Name (Publisher)

Indian Statistical Institute

Document Type

Master's Dissertation

Degree Name

Master of Technology

Subject Name

Computer Science


Computer Vision and Pattern Recognition Unit (CVPR-Kolkata)


Garain, Utpal (CVPR-Kolkata; ISI)

Abstract (Summary of the Work)

Many different researchers, using a wide variety of techniques, have examined the task of Part-of-Speech (POS) tagging. The task itself consists of assigning basic grammatical word classes such as verb, noun and adjective to individual words, and is a fundamental step in many Natural Language Processing (NLP) tasks. The tags it assigns are used in other processing tasks such as chunking and parsing, as well as in more complex tasks such as question answering and automatic summarization systems. Maximum Entropy modeling is one of the techniques that have been used to perform POS tagging, and gives state-of-the-art accuracy.

We aim to find better ways to perform POS tagging on unknown words. We will use an existing Maximum Entropy POS tagger that already performs at state-of-the-art level, and implement additional new features in order to increase its accuracy. These features will be able to represent real values in any range greater than zero, rather than the binary 0 or 1 that Maximum Entropy modeling systems have used in the past. The features themselves will encapsulate information found in the context around a word, as observed for unknown words. For example, if we find an unknown word in the test data, it may still appear many times in a much larger unannotated corpus. By looking at the surrounding words in these contexts, we can formulate an idea of what POS tag should be assigned. This can be seen in the sentence below:

The frub house is up on the hill

Here, frub is the unknown word, and as humans we could conclude that it is an adjective or a noun. This is because it sits between a determiner and a noun, which is a position often assumed by words with these two tags. Also, if we can find the word frub in other places, then we can get an even better, more reliable idea of what its correct tag should be.
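The intuition above can be sketched in code: find the contexts in which the unknown word occurs in a large unannotated corpus, then count which tags fill those same contexts in annotated data. This is only a hypothetical illustration of the idea, assuming simple (previous word, next word) contexts; it is not the dissertation's actual feature set, and all names here are invented for the example.

```python
from collections import Counter

def context_tag_votes(unknown, unannotated_sents, tagged_sents):
    """Collect tag 'votes' for an unknown word from its corpus contexts.

    unannotated_sents: list of token lists from a large raw corpus.
    tagged_sents: list of [(word, tag), ...] sentences from annotated data.
    """
    # Gather the (previous word, next word) contexts of the unknown word,
    # e.g. ("the", "house") for "the frub house".
    contexts = set()
    for sent in unannotated_sents:
        for i, w in enumerate(sent):
            if w == unknown and 0 < i < len(sent) - 1:
                contexts.add((sent[i - 1], sent[i + 1]))
    # Count the tags of known words that appear in those same contexts.
    votes = Counter()
    for sent in tagged_sents:
        for i in range(1, len(sent) - 1):
            ctx = (sent[i - 1][0], sent[i + 1][0])
            if ctx in contexts:
                votes[sent[i][1]] += 1
    return votes
```

For the example sentence, words like big (adjective) and club (noun) occurring between the and house in annotated data would contribute votes for those two tags, matching the reasoning a human applies to frub.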
This is what the large unannotated corpus gives us: a number of examples of how and where unknown words are used. We should also note that we do not need to know the correct POS tags for the and house. We can determine, simply from the words themselves, that frub is occupying a position that is also taken up by words such as big or club, these being examples of adjectives and nouns respectively. Also, the fact that the word the precedes our unknown word tells us a lot by itself, as this is an extremely common word that exists with only one tag. Our aim, then, is to take this intuitive reasoning for determining the correct tag for an unknown word, and create features that aid the Maximum Entropy model in doing the same.

We will begin by describing the previous work that has taken place on the task of POS tagging, including the corpora that are used and the techniques that have been applied to the task. This will continue on to particular methods that have attempted to better classify unknown words, and then to the statistical machine learner that we will be using: Maximum Entropy modeling. This will be followed by an extensive description of the experiments we performed, and of the alterations to the Maximum Entropy features and calculations that were required to achieve the best performance. We then proceed to a thorough analysis and discussion of the results we attained, and finally to further applications, uses of, and improvements to the methods described.
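To see where such features plug in, here is a minimal sketch of the conditional Maximum Entropy model P(tag | context) = exp(Σᵢ λᵢ fᵢ(tag, context)) / Z(context), in which the feature functions fᵢ may return any real value ≥ 0 (for instance a corpus-derived context score) instead of only 0 or 1. This is an illustrative sketch of the standard MaxEnt formulation, not the tagger's actual implementation; weights and feature functions are assumed inputs.

```python
import math

def maxent_prob(tag, context, weights, features, tagset):
    """P(tag | context) under a log-linear (Maximum Entropy) model.

    weights: list of learned lambda values, one per feature.
    features: list of functions f(tag, context) -> real value >= 0.
    tagset: all candidate tags, needed for the normalising constant.
    """
    def score(t):
        # Unnormalised score: exp of the weighted feature sum.
        return math.exp(sum(w * f(t, context)
                            for w, f in zip(weights, features)))
    z = sum(score(t) for t in tagset)  # normalising constant Z(context)
    return score(tag) / z
```

Because the exponent is a weighted sum, a real-valued feature such as a context vote count simply scales its weight's contribution, which is what allows graded corpus evidence to influence the tag distribution.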


ProQuest Collection ID:

Control Number


Creative Commons License

Creative Commons Attribution 4.0 International License
This work is licensed under a Creative Commons Attribution 4.0 International License.

