Date of Submission
6-11-2026
Date of Award
6-19-2026
Institute Name (Publisher)
Indian Statistical Institute
Document Type
Master's Dissertation
Degree Name
Master of Technology
Subject Name
Computer Science
Department
Computer Vision and Pattern Recognition Unit (CVPR-Kolkata)
Supervisor
Bhattacharya, Ujjwal
Abstract (Summary of the Work)
In this work I build a system that recognizes isolated American Sign Language (ASL) words, and I use it to ask one fairly direct question: when training data is scarce, is it better to look at the video pixels or at the geometry of the signer’s body? To find out, I train two very different models on exactly the same clips. The first is appearance-based. Every frame goes through standard preprocessing and a ResNet50 backbone pre-trained on ImageNet, which turns it into a 2048-dimensional feature vector, and a bidirectional LSTM then reads the resulting sequence over time. The second model never sees a pixel. It works only on MediaPipe keypoints, the tracked coordinates of the body and the two hands, and feeds them to a Transformer encoder. Both models therefore have to learn the same two things, the shape of the hands in each frame and the way those shapes move across frames, and both are trained, validated and tested under one identical protocol. What I care about throughout is a recognizer that is accurate but still light enough to be useful in practice, so that it could eventually make communication a little easier between people who sign and people who do not.
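As a rough illustration of the setup described above, the following Python sketch shows the per-clip input shape each model's sequence stage would receive and one way to fix a single train/validation/test split that both models share. Only the 2048-dimensional ResNet50 feature size comes from the abstract. The landmark counts are the standard MediaPipe Pose (33) and MediaPipe Hands (21 per hand) outputs, while the choice of three coordinates per landmark, the split fractions, the seed, and the names `input_shapes` and `shared_split` are assumptions made for this example, not details taken from the dissertation.

```python
import random

RESNET50_FEATURE_DIM = 2048   # stated in the abstract
POSE_LANDMARKS = 33           # MediaPipe Pose landmark count
HAND_LANDMARKS = 21           # MediaPipe Hands landmark count, per hand
COORDS_PER_LANDMARK = 3       # assumption: (x, y, z) per landmark


def input_shapes(num_frames):
    """Return the (frames, features) shape each sequence model would read."""
    keypoint_dim = (POSE_LANDMARKS + 2 * HAND_LANDMARKS) * COORDS_PER_LANDMARK
    return {
        "appearance": (num_frames, RESNET50_FEATURE_DIM),  # BiLSTM input
        "keypoints": (num_frames, keypoint_dim),           # Transformer input
    }


def shared_split(clip_ids, val_frac=0.15, test_frac=0.15, seed=0):
    """Split clip IDs once so both models use identical subsets.

    The fractions and the seed are hypothetical defaults.
    """
    ids = sorted(clip_ids)            # independent of the order IDs arrive in
    random.Random(seed).shuffle(ids)  # deterministic for a fixed seed
    n_test = int(len(ids) * test_frac)
    n_val = int(len(ids) * val_frac)
    return {
        "test": ids[:n_test],
        "val": ids[n_test:n_test + n_val],
        "train": ids[n_test + n_val:],
    }


if __name__ == "__main__":
    print(input_shapes(30))
    split = shared_split([f"clip_{i:03d}" for i in range(100)])
    print({name: len(ids) for name, ids in split.items()})
```

Computing the split once from the sorted clip IDs and reusing it for both models is what keeps the comparison fair in this sketch: any difference in accuracy then comes from the input representation and the architecture, not from which clips each model happened to see.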
Control Number
CS2426
DOI
https://dspace.isical.ac.in/items/f6544d28-6b9d-48ef-a19e-ab4d50891463
DSpace Identifier
http://hdl.handle.net/10263/7738
Recommended Citation
Soni, Saurabh Kumar, "American Sign Language Recognition and Analysis Using Deep Learning" (2026). Master’s Dissertations. 462.
https://digitalcommons.isical.ac.in/masters-dissertations/462