Indian Statistical Institute

Doctoral Thesis

Doctor of Philosophy

Computer Science


Computer Vision and Pattern Recognition Unit (CVPR-Kolkata)


Chaudhuri, Bidyut Baran (CVPR-Kolkata; ISI)

A distinctive intelligent trait of human beings is the ability to carry out meaningful communication through language. The communication may be direct, as in spoken conversation, or indirect, as in written form, through the audio-visual media, etc. Linguistic ability in humans has fascinated scholars ever since man first learnt to use language. Linguistics, the branch of study concerned with the nature of human linguistic communication, is perhaps as old as language itself. The invention of the computer added a new dimension to linguistics. Making the computer emulate human linguistic behaviour was taken up as a challenge by computer scientists. Branches of study like Cybernetics, Machine Translation (MT) from one language to another, Natural Language Generation, and Natural Language Processing/Understanding (NLP/NLU) developed as a result. During the mid-1950s, a branch of modern linguistics known as Computational Linguistics (CL) came into existence. CL is dedicated mostly to developing more realistic models of cognition. MT, NLP, NLU, etc. were adopted by the artificial intelligence (AI) community of computer scientists as domains of research towards making computers more intelligent. AI workers along these lines had to work in close association with computational linguists. There is, however, one major distinction between the viewpoints of CL and NLP/NLU. CL is more concerned with gaining insight into why and how human beings display the cognitive behaviour seen around us. NLP is more concerned with making the computer emulate such behaviour by implementing the linguistic theories as computer programs, albeit only for a workable subset of the target language. The present work is primarily implementation motivated. However, at times certain linguistic conjectures have also been suggested.

ELIZA of Weizenbaum [174] was an early attempt towards making the computer behave intelligently. However, the treatment was quite superficial.
Chomsky [37, 38] first proposed a formal mathematical theory of human cognitive behaviour. Subsequent refinement of the original Chomskian principles led to a major sub-discipline called Government and Binding (GB). At present, there are many other important computational linguistic theories, some quite close to and some quite distant from GB theories. The Generalized Phrase Structure Grammar (GPSG) of Gazdar et al. [70], the Tree Adjoining Grammar (TAG) of Joshi [85], the Functional Unification Grammar (FUG) of Kay [100, 109] and the Lexical Functional Grammar (LFG) of Kaplan et al. [88] are some of the modern formalisms. In the implementation scenario, the Augmented Transition Network of Woods [181, 183], the Marcus Parser of Marcus [121, 122], the Slot and Modifier approach of McCord [126] and the Definite Clause Grammar of Pereira [138] are a few important syntactic techniques. The GPSG, the FUG, the TAG and the LFG formalisms also come with associated implementation semantics. The mathematical properties of linguistic theories have been analyzed by Perrault [140]. Kay [110], Kaplan et al. [87] and Koskenniemi [116, 117, 118] have proposed important formalisms for morphological processing at the lexical level. There are quite a few reported works on the application of the above principles in lexicon design, as discussed later.

The history of Indian linguistics dates back to more than a thousand years before Christ. Grammarians like Panini, Katyayana, Patanjali, etc.


Mathematics Commons