ISI Digital Commons. Copyright (c) 2023 Indian Statistical Institute. All rights reserved.
https://digitalcommons.isical.ac.in
Recent documents in ISI Digital Commons. en-us. Fri, 08 Dec 2023 02:18:59 PST. 3600. In Silico Identification of Disease Genes using Microarray Data and Protein-Protein Interaction Networks.
https://digitalcommons.isical.ac.in/masters-dissertations/199
https://digitalcommons.isical.ac.in/masters-dissertations/199Thu, 30 Nov 2023 01:06:44 PST
One of the important problems in functional genomics is how to select disease genes. In this regard, the paper presents a new similarity measure to compute the functional similarity between two genes, based on the information in protein-protein interaction networks. A new gene selection algorithm is introduced to identify disease genes, judiciously integrating the information of gene expression profiles and protein-protein interaction networks. The proposed algorithm selects a set of genes from microarray data as disease genes by maximizing the relevance and functional similarity of the selected genes. The performance of the proposed algorithm, along with a comparison with other related methods, is demonstrated on a colorectal cancer data set.
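The stated criterion (maximize both relevance and functional similarity of the selected set) can be sketched as a greedy loop. The `relevance` and `similarity` inputs below are hypothetical stand-ins for the dissertation's expression-based relevance and PPI-based similarity measures:

```python
def select_disease_genes(genes, relevance, similarity, k):
    """Greedy sketch: pick the most relevant gene first, then repeatedly
    add the candidate maximizing relevance plus average functional
    similarity to the genes already selected."""
    selected = [max(genes, key=lambda g: relevance[g])]
    candidates = set(genes) - set(selected)
    while len(selected) < k and candidates:
        def score(g):
            sim = sum(similarity[g][s] for s in selected) / len(selected)
            return relevance[g] + sim
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

Note that, unlike redundancy-minimizing schemes such as mRMR, similarity is rewarded here rather than penalized, mirroring the stated objective.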
]]>
Ekta Saha. Design of an Efficient Content-Based Image Retrieval (CBIR) System.
https://digitalcommons.isical.ac.in/masters-dissertations/198
https://digitalcommons.isical.ac.in/masters-dissertations/198Thu, 30 Nov 2023 01:06:44 PST
This thesis reports a study on an accurate and fast Content-Based Image Retrieval (CBIR) system, a class of system widely used for the representation, storage and retrieval of images. In the proposed method, effort has been made to select discriminating features for image representation. Relevant data structures have been proposed for the image database so that efficient retrieval can be effected. Performance evaluation has been carried out over a number of binary images in terms of space and time complexities. Finally, a comparison has been made with some of the existing CBIR systems.
]]>
Debi Prasad Sahoo. SAT Solver Based Multi Cycle Droop Fault Testing.
https://digitalcommons.isical.ac.in/masters-dissertations/196
https://digitalcommons.isical.ac.in/masters-dissertations/196Thu, 30 Nov 2023 01:06:43 PST
Driven by Moore's Law for more than four decades, the complexity and scale of VLSI integration has reached unforeseen heights today. The increased density of switching devices and rising frequencies have led to large power consumption per unit area. Due to the high frequency of operation and the inductive effects of the power grid lines, a noticeable power drop occurs when logic gates within close physical proximity of each other switch simultaneously. This drop, known as droop, propagates along the power supply lines, decaying exponentially with spatial and temporal distance from its origin. It manifests a few clock cycles later, in the form of a reduced power drop at a neighboring via, giving rise to the possibility of timing faults at some gates near that via. Such faults are known as multi-cycle droop faults (MDF). In this dissertation, a new approach is taken towards modeling these faults in combinational circuits, using the concepts of Boolean Satisfiability to provide more flexibility and efficiency in test generation for their detection. Finally, a prototype algorithm to generate test vectors for multi-cycle droop faults in full-scan circuits is presented and discussed.
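The satisfiability idea behind such test generation can be illustrated with a toy miter: a test vector for a fault is exactly a satisfying assignment of good(x) XOR faulty(x). The circuit and the stuck-at-style fault below are invented for illustration (a real MDF model also captures timing across cycles), and plain enumeration stands in for a SAT solver:

```python
from itertools import product

def good(a, b, c):
    # fault-free circuit: y = (a AND b) OR c
    return (a & b) | c

def faulty(a, b, c):
    # hypothetical fault: the AND gate output forced to 0 (a crude
    # stand-in for a droop-induced failure at that gate)
    g = 0
    return g | c

def find_test_vector():
    """Miter formulation: any assignment where good and faulty outputs
    differ detects the fault. A SAT solver searches this space
    symbolically; we enumerate since the example is tiny."""
    for a, b, c in product((0, 1), repeat=3):
        if good(a, b, c) ^ faulty(a, b, c):
            return (a, b, c)
    return None
```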
]]>
Santanu Bhowmick. Query Processing in Dynamic Federation for Loosely-Coupled and Tightly-Coupled Systems.
https://digitalcommons.isical.ac.in/masters-dissertations/197
https://digitalcommons.isical.ac.in/masters-dissertations/197Thu, 30 Nov 2023 01:06:43 PST
A federated database system (FDBS) is a collection of autonomous databases where each member has a local query processing facility for its own users; in addition, the members can serve global queries that are decomposed and distributed among the different members of the federation. Heimbigner & McLeod [1] considered a federated architecture as a loosely coupled structure which supports neither interdatabase dependencies nor a global schema. Sheth & Larson [2], on the other hand, described the federated system with much broader capabilities. Depending on the degree of autonomy present among the component databases, they classified FDBSs as loosely-coupled and tightly-coupled systems. A tightly coupled system would definitely specify interdatabase relationships and would have a global schema. Larson et al. [5] identified the conflicts present among the participating databases. Besides identifying conflicts, Ozsu and Valduriez [3] also described the different stages of integration in a heterogeneous environment. However, all the proposals and research efforts referred to so far assume the existence of all the component databases before the federation is built. Bagchi [4] proposed a dynamic environment of data federation in which the component databases may join the federation or withdraw from it without affecting the transactions currently running. Component databases not involved in this dynamic change should continue to process their queries. While each component has its own local queries, the components should maintain a federation to model an organisational setup and thus provide a structure for shared access. So a global query to the federated structure should be decomposed and distributed among the component databases. During processing, a participating database should not differentiate between a query local to itself and a sub-query assigned to it by a decomposed global query.
A dynamic federation demands that a federated structure grow and shrink with the addition or deletion of a database. Some applications need such a structure.

1.2 Motivation

We normally form a single database for an enterprise (an enterprise is a reasonably self-contained commercial, scientific or other organisation). In a relational system we can update the schema by adding new attributes to a relation or by forming a new relation according to the requirement. Updates also include deletion of attributes or relations. The above considerations, which can be modelled by a single database, have the following limitations:

1. A large enterprise in general consists of a number of sub-enterprises. Though these sub-enterprises contain overlapping information (some relations are fully or partially common, in the sense that they refer to the same set of attributes), each such sub-enterprise may be large enough to be considered a separate database.

2. Each sub-enterprise has users who are interested only in information related to that sub-enterprise and may not be interested in others.

3. Each database may need local autonomy for efficient maintenance and query processing.

4. Sub-enterprises should act independently of each other, so that an error/crash in one does not hinder local operations in other sub-enterprises.
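The transparency requirement (a member processes a shipped sub-query exactly like a local one, and members may come and go between queries) can be sketched in a few lines. The class names and the predicate-style query interface below are illustrative, not from the dissertation:

```python
class ComponentDB:
    """A member database: it serves its own local queries and any
    sub-query handed to it by the federation, without distinguishing
    between the two."""
    def __init__(self, name, rows):
        self.name, self.rows = name, rows

    def run(self, predicate):
        return [r for r in self.rows if predicate(r)]

def federated_query(members, predicate):
    """Decompose a global query: ship the predicate to every current
    member and merge the partial results. Members can join or withdraw
    between queries without affecting the others."""
    results = []
    for db in list(members):
        results.extend(db.run(predicate))
    return results
```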
]]>
Rana Aich. Trust Based Event Model.
https://digitalcommons.isical.ac.in/masters-dissertations/194
https://digitalcommons.isical.ac.in/masters-dissertations/194Thu, 30 Nov 2023 01:06:42 PST
Over the last decade, information generation and distribution has gone through a complete overhaul. Information generation is no longer concentrated in the hands of a few power centres and has expanded into a social platform. This growth has been primarily spearheaded by citizen reporting, smart mobile devices and social media, where information is generated by the people, for the people and about the people. This dissertation work attempts to model such a scenario, where the temporal factor and the trustworthiness of such information are of utmost importance.
]]>
Sachu Thomas Isaac. Effect of Circuit Structure on Path Delay Fault Testability in VLSI Design.
https://digitalcommons.isical.ac.in/masters-dissertations/195
https://digitalcommons.isical.ac.in/masters-dissertations/195Thu, 30 Nov 2023 01:06:42 PST
Failures that cause logic circuits to malfunction at the desired clock rate, and thus violate timing specifications, are currently receiving much attention. Such failures are modeled as delay faults, which facilitate delay testing. Since design for testability is the approach followed nowadays, synthesis for path delay fault (PDF) testability has been studied in much depth. Local transformation is one such approach. A new approach for applying local transformations is considered here. This approach can be used along with the existing local transformation approaches to get better results. Additionally, a software implementation of the proposed algorithm is also considered here, and the experimental results are quite attractive in comparison to existing local transformation algorithms.
]]>
Biplab Sarkar. Categorization of Images Using Content-Based Features: A Data Mining Approach.
https://digitalcommons.isical.ac.in/masters-dissertations/193
https://digitalcommons.isical.ac.in/masters-dissertations/193Thu, 30 Nov 2023 01:06:41 PST
Images are being extensively used in every sphere of our life. Apart from the overwhelming influence of television, common people look for images in newspapers, advertisements, item catalogues, entertainment, education, architecture, painting and many other areas. Professionals use images in criminology (e.g., fingerprint identification, face recognition), medicine (e.g., case-based diagnosis from radiographs or scan data), education (e.g., searching for material in a library), fashion design, historical archiving, fine arts and so on. In most cases the problem is to find a desired image from a large collection or, in other words, to retrieve images similar to the image at hand from the large number available in some collection. Image search and retrieval has been a field of very active research since the 1970s. However, the field has seen steady exponential growth in recent years as a result of an unparalleled increase in the volume of digital images. Thousands of images are generated every day for different applications. These images are either stored in a local database or are available from remote ones. Thus a huge amount of information is out there and can easily be accessed through the World Wide Web. Professionals of various fields intend to access and utilize these images for their purposes. However, we cannot access or make use of this information unless it is properly organized for efficient browsing and retrieval, because searching for and locating a desired image in a varied and large collection usually results in total frustration. Two major research communities, namely Database Management and Computer Vision, are putting considerable effort towards the solution of this problem. Accordingly, two major approaches have emerged: one text based and the other visual based. Early systems of image retrieval exploited the capabilities of text-based Database Management Systems.
Images are first manually annotated using a set of keywords that best describe the content of the image. Images are indexed and arranged using these keywords; finally, images are retrieved based on a text-based query. Major research in this direction includes data modeling, indexing structures, multi-dimensional indexing, efficient searching, and query design and evaluation. However, these text-based image retrieval techniques face two major problems: labor intensiveness and annotation impreciseness. When an image collection is large, an enormous number of man-hours is required to annotate the images manually. The problem became more and more acute from the early 1990s, when the World Wide Web allowed access to remotely placed image databases. The second problem is more crucial and is due to the semantics of image content. Because of the rich content of images and the subjectivity of human perception, the same image may be perceived differently by different persons. As a result, the same image may be annotated with different sets of keywords by different persons. Thus image annotation in general is neither unique nor adequate, and hence affects the performance of an image retrieval system to a large extent. This led to the development and flourishing of the alternate approach, namely the Content Based Image Retrieval (CBIR) system.

1.2 What is CBIR?

Content-based image retrieval (CBIR), also known as query by image content (QBIC) and content-based visual information retrieval (CBVIR), is the application of computer vision to the image retrieval problem, that is, the problem of searching for digital images in large databases. The term CBIR seems to have originated in 1992, when it was used by T. Kato to describe experiments into automatic retrieval of images from a database, based on the colors and shapes present. Since then, the term has been used to describe the process of retrieving desired images from a large collection on the basis of syntactical image features.
The techniques, tools and algorithms that are used originate from fields such as statistics, pattern recognition, signal processing, and computer vision. In CBIR systems the term "content-based" means that the search will analyze the actual contents of the image. The term 'content' in this context might refer to colors, shapes, textures, or any other information that can be derived from the image itself. Without the ability to examine image content, searches must rely on metadata such as captions or keywords, which may be laborious or expensive to produce. There is growing interest in CBIR because of the limitations inherent in metadata-based systems, as well as the large range of possible uses for efficient image retrieval. Textual information about images can be easily searched using existing technology, but this requires humans to personally describe every image in the database. This is impractical for very large databases, or for images that are generated automatically, e.g. from surveillance cameras. It is also possible to miss images whose descriptions use different synonyms. Systems based on categorizing images into semantic classes like "cat" as a subclass of "animal" avoid this problem but still face the same scaling issues.
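A minimal sketch of "content" as a feature: a quantized color histogram with histogram intersection as the similarity measure, one of the simplest CBIR feature/similarity pairs (a generic illustration, not the specific method of any system discussed here):

```python
def color_histogram(pixels, bins=4):
    """Quantize each RGB channel into `bins` ranges, count pixels per
    (r, g, b) bin, and normalize so histograms of different-sized
    images are comparable."""
    step = 256 // bins
    hist = {}
    for r, g, b in pixels:
        key = (r // step, g // step, b // step)
        hist[key] = hist.get(key, 0) + 1
    n = len(pixels)
    return {k: v / n for k, v in hist.items()}

def histogram_intersection(h1, h2):
    # similarity in [0, 1]; 1 means identical color distributions
    return sum(min(h1.get(k, 0), h2.get(k, 0)) for k in set(h1) | set(h2))
```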
]]>
Aditya N. Quantum Secret Sharing in a Distributed Quantum Network.
https://digitalcommons.isical.ac.in/masters-dissertations/192
https://digitalcommons.isical.ac.in/masters-dissertations/192Thu, 30 Nov 2023 01:06:41 PST
Quantum computation and quantum information is the study of information processing tasks that can be accomplished using quantum mechanical systems. Like many simple but profound ideas, it was a long time before anybody thought of doing information processing using quantum mechanical systems. The story began at the turn of the twentieth century, when a revolution was underway in science. Several problems had arisen in physics, and to explain them the modern theory of quantum mechanics was introduced. Since then quantum mechanics has been an indispensable part of science, and has been applied with enormous success to everything under and inside the Sun, including the structure of the atom, superconductors, the structure of DNA, and the elementary particles of Nature. What is quantum mechanics? In a word, quantum mechanics is a mathematical framework for the construction of physical theories. For example, quantum electrodynamics can describe the interaction of atoms and light with great accuracy. Quantum electrodynamics is built within the framework of quantum mechanics. A particular physical theory like quantum electrodynamics is related to quantum mechanics just as a specific application software is related to a computer operating system. The rules of quantum mechanics are simple, but even experts find them counterintuitive. One of the major goals of quantum information and quantum computation is to develop tools which sharpen our intuition about quantum mechanics. In the early 1980s, interest arose in whether it is possible to signal faster than light using quantum mechanics, which is a big no-no according to Einstein's theory of relativity. This problem has a nice implication for another famous problem of quantum mechanics: can we clone an unknown quantum state? If the answer is yes, then it is possible to signal faster than light! Fortunately, it was proved that an unknown quantum state cannot be cloned in general, a landmark result of quantum mechanics which effectively supports Einstein's theory of relativity. Another related historical strand contributing to the development of quantum computation and quantum information is the interest, dating to the 1970s, in obtaining complete control over single quantum systems. Since the 1970s many techniques for controlling single quantum states have been developed. Quantum computation and quantum information naturally fit into this program. Despite this intense interest, efforts to build quantum information processing systems have resulted in modest success to date. Small quantum computers, capable of doing dozens of operations on a few qubits (a qubit being the state of a two-level quantum mechanical system), represent the current state of the art of practical quantum computing. Experimental prototypes of quantum cryptography have been demonstrated and have reached the level of real-world application. So far we have been talking about the rules and power of quantum mechanics. But what does this have to do with computer science? Let us now turn our attention to computer science, another triumph of the twentieth century. The modern incarnation of computer science was announced by Alan Turing in a remarkable paper in 1936. Turing developed a model of computation known as the Turing machine, the abstract forerunner of the programmable computer.
]]>
Partha Mukhopadhyay. Integer Linear Programming Based Scheduling for H.264 Video Decoding in Multi Core Processor.
https://digitalcommons.isical.ac.in/masters-dissertations/191
https://digitalcommons.isical.ac.in/masters-dissertations/191Thu, 30 Nov 2023 01:06:40 PST
Demand for high-quality video-based technologies is increasing, with high-definition televisions, video streaming through the internet and many other applications. The compression ratios of previous standards are not enough for these upcoming technologies. The latest video compression standard, the ITU-T recommended H.264/AVC (also known as ISO/IEC 14496 (MPEG-4) Part 10 for Advanced Video Coding), is expected to become the video standard of choice in the coming years for its higher compression ratio and use of more efficient technologies. H.264/AVC is an open, licensed standard that supports the most efficient video compression techniques available today. The average bit rate reduction of an H.264 encoder is 80% compared to the Motion JPEG format and 50% compared to the MPEG-4 Part 2 standard, without compromising image quality. This means much less network bandwidth and storage space are required for a video file; or, put another way, much higher video quality can be achieved for a given bit rate.

1.1 Motivation

H.264/AVC is very appropriate for applications like multimedia streaming, high quality video broadcasting, and video storage on optical and magnetic discs. But these applications require high-speed encoding and decoding of video data. The H.264/AVC encoder and decoder both have a sequential, data-dependent flow of execution. This property makes it difficult to leverage the potential performance gain that could be achieved by the use of emerging many-core processors. Dedicated silicon implementations of H.264/AVC codecs are presently available which can perform 30 fps encode/decode for 1080p video sequences. Hardware implementation for each new video compression standard on each platform is costly. That is why we need a parallel software implementation of an H.264/AVC codec that can perform as efficiently as the hardware implementation and can run on different hardware platforms.
If the hardware platform changes, the software implementation needs a smaller amount of change than a dedicated silicon implementation.

1.2 Scope

In this project, we consider an H.264 decoder and explore the possibilities of parallelism. There are a couple of reasons behind taking up the decoder (and not the encoder) in this parallelization effort. Firstly, the encoding problem is a natively parallel one and hence lends itself more naturally to a parallelized execution environment, and there are already numerous successful attempts in this direction; the decoding algorithm, however, poses certain challenges to parallelization. Secondly, there is a decoding step inside the encoder as well, and therefore any success in parallelizing the decoder would naturally expedite the encoder too. The key task in parallelizing the H.264 decoder is to find a scheduler which can distribute the decoder's flow of execution across several cores efficiently. The scheduler must consider data dependency issues as well as inter-communication and synchronization between the processing units. There are many proposed schedulers for this core allocation, using different strategies. The performance of these schedulers depends on the scalability and hardware utilization they can achieve. Our objective is to find a scheduler which can allocate the processor cores for decoding in the most efficient way, so that the time to decode is minimized.
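The dissertation formulates core allocation as an integer linear program; as a simpler baseline, a greedy list scheduler conveys what any such scheduler must respect: data dependencies and per-core availability. Task names and costs below are invented, and inter-core communication cost is ignored for brevity:

```python
def list_schedule(tasks, deps, cost, n_cores):
    """Greedy list scheduling: repeatedly take a ready task (all
    dependencies finished) and place it on the core that lets it
    start earliest. Returns the (task, core, start) order and the
    overall makespan."""
    finish = {}                  # task -> finish time
    core_free = [0] * n_cores    # when each core next becomes idle
    done, order = set(), []
    while len(done) < len(tasks):
        ready = [t for t in tasks if t not in done
                 and all(d in done for d in deps.get(t, []))]
        t = ready[0]
        dep_ready = max((finish[d] for d in deps.get(t, [])), default=0)
        core = min(range(n_cores), key=lambda c: max(core_free[c], dep_ready))
        start = max(core_free[core], dep_ready)
        finish[t] = start + cost[t]
        core_free[core] = finish[t]
        done.add(t)
        order.append((t, core, start))
    return order, max(finish.values())
```

An ILP formulation would instead search over all assignments for a provably minimal makespan, at much higher solve cost.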
]]>
Somabrata Pramanik. Color Image Compression Using Wavelets.
https://digitalcommons.isical.ac.in/masters-dissertations/190
https://digitalcommons.isical.ac.in/masters-dissertations/190Thu, 30 Nov 2023 01:06:40 PST
1.1 Image Compression

Image compression refers to a process in which the amount of data used to represent an image is reduced to meet a bit rate requirement, while the quality of the reconstructed image satisfies a requirement for a certain application and the complexity of the computation involved is affordable for the application. Over the past few years the world has witnessed a growing demand for visual-based information and communications applications. As the demand for new applications and higher quality for existing applications continues to rise, the transmission and storage of visual information becomes a critical issue. The reason is that higher image or video quality requires a larger volume of information, whereas transmission media have a finite and limited bandwidth. To illustrate this problem, consider a typical 512x512 gray level (8-bit) image. This image has 2,097,152 bits. Using a 64 Kbit/s communication channel, it would take about 33 seconds to transmit the image. Whereas this might be acceptable for one-time transmission of a single image, it would definitely not be acceptable for teleconference applications, where some form of continuous motion is required. To store a digital version of a 90-minute black and white movie, at 30 frames/sec, with each frame having 512x512x8 bits, would require about 3.4e+11 bits, over 42 Gbytes. So efficient image compression algorithms are desirable.
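The arithmetic above can be checked directly:

```python
# Single 512x512, 8-bit frame
frame_bits = 512 * 512 * 8            # 2,097,152 bits
seconds = frame_bits / 64_000         # ~32.8 s over a 64 Kbit/s channel

# 90-minute movie at 30 frames/sec
movie_bits = frame_bits * 30 * 60 * 90
movie_gbytes = movie_bits / 8 / 1e9   # ~42.5 GB, matching "over 42 Gbytes"
```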
]]>
Barindra Nath Dutta. The Bibliographic Citation Recommendation Problem.
https://digitalcommons.isical.ac.in/masters-dissertations/188
https://digitalcommons.isical.ac.in/masters-dissertations/188Thu, 30 Nov 2023 01:06:39 PST
An essential step in authoring a research paper is the inclusion of appropriate references or citations. Incorporating relevant references increases the academic weight of the paper by presenting links to similar contributions and highlighting the novelty of the work under discussion. This task is becoming increasingly demanding with the increase in the volume of published work. Recommender systems for bibliographic citations aim to ease the burden on the author by suggesting possible references, both globally and contextually. More than 200 papers have been published in the last two decades exploring various approaches to the problem. In spite of this, no definitive results are available about which approaches work best. Conflicting reports have been published regarding the relative effectiveness of content-based and collaborative filtering based techniques. Arguably the most important reason for this lack of consensus is the dearth of standardized test collections and evaluation protocols, such as those provided by TREC-like forums, forcing researchers to use their own data sets for experiments, a practice that makes objective comparison of techniques a near impossibility. The recent publication of "CiteseerX: A scholarly big data set" makes available the raw material for addressing the problem, pending making it into a standard test-evaluation framework. We discuss in this report our efforts in designing a test collection with a well-defined evaluation protocol by solving problems with the data set and supplementing it with standard queries and their relevance judgments. We also report the performance of some standard recommendation approaches on our test setup.
]]>
Kunal Ray. Finding 3D Structure of Proteins Using Characteristics of Short Sequences.
https://digitalcommons.isical.ac.in/masters-dissertations/189
https://digitalcommons.isical.ac.in/masters-dissertations/189Thu, 30 Nov 2023 01:06:39 PST
1.1 Introduction

Proteins are the most structurally complex macromolecules known. They are long chains of molecules. They can be regarded as necklaces of 20 different amino acids that are arranged in different orders to make chains of up to thousands of amino acids. The result is an extreme variety of proteins, each type with its own unique structure and function. In order to carry out its function, each protein must take a particular shape, known as its fold. When a protein is put into a solvent, within a very short time it takes a particular 3D shape. This self-assembling process is called folding. Sometimes proteins misfold (i.e. do not fold correctly) and they can aggregate. Aggregation of misfolded proteins is believed to be the cause of some disorders such as Alzheimer's disease, Parkinson's disease, prion diseases (e.g., mad cow disease) and some cancers. The diverse range of diseases that result from protein misfolding has made this a subject of intense investigation: learning how proteins fold will teach us how to design protein-sized nano-machines that can do similar tasks, and it will help us to prevent or reverse diseases in which proteins have departed from the correct folding route. However, it is very time consuming to find the 3D structure of a protein using X-ray crystallography or Nuclear Magnetic Resonance (NMR) imaging. Hence, researchers are working on computational methods for protein fold prediction. In this thesis we propose some methods to predict the 3D structures of proteins from their amino acid sequences, exploiting statistical information available in proteins with known 3D structures. In particular we made the following contributions.

1. We proposed a mechanism for the generation of a self-organizing map for structures, called the Structural Self-Organizing Map (SSOM). This method can be applied in areas other than protein folding also.

2. We proposed a modified form of mountain clustering, called the Structural Mountain Clustering Method (SMCM), that is very effective for the problem under study and is simpler.

3. The Structural Self-Organizing Map is then augmented by two subclustering methods, resulting in two schemes for building block generation.

4. We applied these three new methods to find representative hexamers from a given database and compared the performance of the proposed schemes to an existing method.

5. We then used the extracted hexamers to reconstruct some proteins. The results are quite good.
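Mountain clustering in its original Yager-Filev form (which the SMCM above modifies for structural data) scores candidate centers by summed Gaussian "mountains" and peels off peaks one by one. A generic sketch on 2D points, not the structural variant itself:

```python
import math

def mountain_clustering(points, grid, n_centers, alpha=1.0, beta=1.0):
    """Score each candidate grid point by a 'mountain' of Gaussian
    contributions from the data, take the highest point as a center,
    subtract that center's mountain, and repeat."""
    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    m = {v: sum(math.exp(-alpha * d2(x, v)) for x in points) for v in grid}
    centers = []
    for _ in range(n_centers):
        peak = max(m, key=m.get)
        pk = m[peak]
        centers.append(peak)
        for v in m:                  # deflate so the next peak differs
            m[v] -= pk * math.exp(-beta * d2(v, peak))
    return centers
```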
]]>
Sudeepta Kumar Ojha. Polynomials Having Sparse Multiples.
https://digitalcommons.isical.ac.in/masters-dissertations/187
https://digitalcommons.isical.ac.in/masters-dissertations/187Thu, 30 Nov 2023 01:06:38 PST
Stream ciphers form an important class of secret-key encryption schemes. They are widely used in applications since they present many advantages: they are usually faster than common block ciphers and they have less complex hardware circuitry. Moreover, their use is particularly well suited when errors may occur during the transmission, because they avoid error propagation. In a binary additive stream cipher the ciphertext is obtained by adding bitwise the plaintext to a pseudorandom sequence s, called the key stream (or the running-key). The running-key is produced by a pseudorandom generator whose initialization is the secret key shared by the users. Most attacks on such ciphers therefore consist in recovering the initialization of the pseudorandom generator from the knowledge of a few ciphertext bits (or of some bits of the running-key in known-plaintext attacks). Linear feedback shift registers (LFSRs) are the basic components of most keystream generators, since they are appropriate for hardware implementations, produce sequences with good statistical properties and can be easily analyzed. A Linear Feedback Shift Register (LFSR) is a system which generates a pseudorandom bit sequence using a binary recurrence relation of the form

a_n = c_1*a_{n-1} + c_2*a_{n-2} + ... + c_{k-1}*a_{n-k+1} + c_k*a_{n-k}   (1.1)

where c_k = 1 and, for 1 <= i < k, c_i ∈ {0, 1}. The length of an LFSR corresponds to the order k of the linear recurrence relation used. The number of taps t of an LFSR is the number of non-zero bits in {c_1, c_2, ..., c_k}.
The successive bits of the LFSR are emitted using the chosen recurrence relation after initialising the seed (a_0, a_1, a_2, ..., a_{k-1}) of the LFSR. The recurrence (1.1) is related to the following polynomial over GF(2):

C(x) = 1 + c_1*x + c_2*x^2 + ... + c_k*x^k   (1.2)

(1.2) is called the connection polynomial of the LFSR. The LFSR-generated sequence of the linear recurrence relation (lrr) related to a connection polynomial is the same as the one for the lrr corresponding to a polynomial multiple of the connection polynomial. In stream-cipher systems, the key-stream is usually generated by combining the outputs of more than one LFSR using a nonlinear Boolean function. This arrangement significantly increases the robustness of the system against possible attacks. The keystream is bitwise XORed with the message bitstream to produce the cipher. The decryption machinery is identical to the encryption machinery (see Figure 1.1). In such a system, n bits from n different LFSRs are generated at each clock. These n bits are the input to the Boolean function F(X_1, X_2, X_3, ..., X_n). The output of the Boolean function F is the key-stream K. The cipher stream C is the XOR of K and the message stream M, i.e., C = K ⊕ M. Consider a connection polynomial of degree d:

x^d + a_{d-1}*x^{d-1} + a_{d-2}*x^{d-2} + ... + a_1*x + 1

where a_i ∈ {0, 1} for 1 <= i <= d-1. We write the connection polynomial of size d with the least significant bit starting from the right-hand side and the most significant bit at the leftmost position. There is a tap at the i-th position if and only if a_i = 1.
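Recurrence (1.1) translates directly into code, with `taps` giving the indices i where c_i = 1. The example below uses the primitive connection polynomial 1 + x + x^3:

```python
def lfsr(seed, taps, n):
    """Generate n bits from a Fibonacci LFSR.
    seed: initial register contents (a_0, ..., a_{k-1});
    taps: indices i with c_i = 1 in a_n = sum_i c_i * a_{n-i} (mod 2)."""
    state = list(seed)
    k = len(state)
    out = []
    for _ in range(n):
        out.append(state[0])
        new = 0
        for i in taps:
            new ^= state[k - i]   # the a_{n-i} term of the recurrence
        state = state[1:] + [new]
    return out
```

For k = 3 and taps {1, 3}, the register cycles through all 2^3 - 1 = 7 non-zero states before repeating, the maximal period. XORing the output with a message stream gives the additive cipher C = K ⊕ M described above.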
]]>
Suresh Kumar R. Learning Conoidal Structures in Connectionist Framework.
https://digitalcommons.isical.ac.in/masters-dissertations/186
https://digitalcommons.isical.ac.in/masters-dissertations/186Thu, 30 Nov 2023 01:06:38 PST
A two-layer neural network model is designed which accepts image coordinates as input and adaptively learns the parametric form of conoidal shapes (lines/circles/ellipses). It provides an efficient representation of visual information embedded in the connection weights and the parameters of the processing elements. It not only reduces the large space requirements of the classical Hough transform, but also represents parameters with high precision, even in the presence of noise. The performance of the methodology is compared with other existing algorithms and has been found to excel over those algorithms in many cases.
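For contrast, here is the classical Hough transform whose space cost the learned model avoids: each point votes in a discretized (rho, theta) accumulator, all of which must be stored. A line-detection sketch (the abstract's method is different; this only illustrates the baseline):

```python
import math

def hough_peak(points, n_theta=180):
    """Vote in (rho, theta) space; the strongest cell corresponds to
    the best-fit line rho = x*cos(theta) + y*sin(theta). The full
    accumulator is the memory cost a learned parametric model avoids."""
    acc = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            acc[(rho, t)] = acc.get((rho, t), 0) + 1
    return max(acc, key=acc.get)
```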
]]>
Anirban Das. Heuristic Algorithms for Determining Feasible Routing Order in Non Sliceable Floorplans.
https://digitalcommons.isical.ac.in/masters-dissertations/185
https://digitalcommons.isical.ac.in/masters-dissertations/185Thu, 30 Nov 2023 01:06:38 PST
In this dissertation we make an experimental study of the algorithms suggested in CSB913 for determining a feasible channel routing order in non-sliceable VLSI floorplans by reserving channels. Since the problem is NP-complete, a comparative study of the different heuristics based on time complexity and quality of the solution has been done.
]]>
Rajan GangadharanRecommender System for Bibliographic Citations.
https://digitalcommons.isical.ac.in/masters-dissertations/183
https://digitalcommons.isical.ac.in/masters-dissertations/183Thu, 30 Nov 2023 01:06:37 PST
While writing research papers, we wish to find the best possible references for what we have written. Finding them manually is both time consuming and difficult. A citation recommender system takes a research paper draft as input and outputs citation recommendations. The recommender's job is challenging, as a recommendation should be relevant not only to the paper in general but also to the local context of the passage being composed.

1.1 Introduction to Recommender Systems

Recommender systems is one of the fields that has grown in parallel with the web. It is also a field that grew out of necessity, as the amount of information available on the web has become increasingly enormous. John Naisbitt once said: "We are drowning in information but starved for knowledge." [9] So it is important to have good technologies that can translate information into knowledge. One such technology that has become successful is recommender systems. M. Deshpande and G. Karypis defined recommender systems as "a personalized information filtering technology used to either predict whether a particular user will like a particular item (prediction problem) or to identify a set of N items that will be of interest to a certain user (top-N recommendation problem)" [3].

There are many approaches to building recommender systems. These approaches are typically classified as follows:

• Content-based: Recommendations are selected based on the target user's previously liked content.
• Collaborative filtering: Recommendations are selected based on items liked by other users with similar tastes and preferences.
• Hybrid approaches: These combine collaborative filtering and content-based methods.

1.2 Introduction to Citation Recommender Systems

Current citation recommender systems can broadly be classified into three categories.

• The first category of recommenders tries to complete the citation list of an input text. Here, some of the citations are already specified by the author. For example, McNee et al. proposed an approach using collaborative filtering that falls into this category. Their algorithm analyses the citation graph and builds ratings. The details of this algorithm are discussed in the next chapter [10].
• The second category of recommenders receives just a text as input and generates recommendations from it. For example, Strohman et al. used a two-step recommendation algorithm: they first generated a candidate list of recommendations using the content and the citation graph, and in the second step they ranked these recommendations [14].
• In the third category of recommenders, placeholders, i.e. places where citations should be added, are also specified in the text. For example, He et al. proposed an approach which produces recommendations for the specified locations [4].

Our recommender falls into the third category.
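As an illustrative sketch (not from the dissertation), content-based candidate scoring can be done by cosine similarity between bag-of-words vectors of the draft and the candidate papers; the paper ids and texts below are hypothetical:

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(draft: str, candidates: dict, top_n: int = 2):
    """Rank candidate papers by textual similarity to the draft."""
    q = Counter(draft.lower().split())
    scored = [(cosine(q, Counter(text.lower().split())), pid)
              for pid, text in candidates.items()]
    return [pid for _, pid in sorted(scored, reverse=True)[:top_n]]

papers = {
    "p1": "collaborative filtering for citation recommendation",
    "p2": "image segmentation with graph cuts",
    "p3": "citation graph analysis and paper recommendation",
}
print(recommend("citation recommendation using the citation graph", papers))
# → ['p3', 'p1']
```

Real systems in the categories above additionally exploit the citation graph and, in the third category, the local context around each placeholder, rather than the draft as a whole.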
]]>
Jayadev DasikaAlgorithms on Geometric Graphs.
https://digitalcommons.isical.ac.in/masters-dissertations/184
https://digitalcommons.isical.ac.in/masters-dissertations/184Thu, 30 Nov 2023 01:06:37 PST
Geometric intersection graphs are intensively studied both for their practical motivations and their interesting theoretical properties. Map labelling, frequency allocation in wireless networks, and resource allocation in line networks are some of the areas where geometric intersection graphs play an important role in formulating problems. Here two types of problems are usually considered: (i) characterization problems, and (ii) solving some useful optimization problems. In the characterization problem, given an arbitrary graph, one needs to check whether it belongs to the intersection graphs of a desired type of objects. The second kind of problem deals with designing efficient algorithms for solving some useful optimization problems for an intersection graph of a known type of objects. It is to be noted that several practically useful optimization problems, for example finding the largest clique, minimum vertex cover, maximum independent set, etc., are NP-hard for general graphs. There are some problems for which getting an efficient approximation algorithm with a good approximation factor is also very difficult. In this area of research, the geometric properties of the intersecting objects are used to design efficient algorithms for these optimization problems. The characterization problem is important in the sense that, for the intersection graphs of some types of objects, efficient algorithms are sometimes already available for solving the desired optimization problem.

The simplest type of geometric intersection graph is the interval graph, which is obtained from the overlap information of a set of intervals on the real line. The characterization problem can be easily solved in O(|V| + |E|) time by showing that the graph is chordal and its complement is a comparability graph [Gol04]. All the standard graph-theoretic optimization problems, for example finding the minimum vertex cover, maximum independent set, largest clique, minimum clique cover, minimum coloring, etc., can be solved in polynomial time for interval graphs [Gol04]. Any graph G = (V, E) can be represented as the intersection graph of a set of axis-parallel boxes in some dimension. The boxicity of a graph with n nodes is the minimum dimension d such that the given graph can be represented as the intersection graph of n axis-parallel boxes in dimension d. A graph has boxicity at most one if and only if it is an interval graph. Many optimization problems can be solved or approximated more efficiently for graphs with bounded boxicity. For instance, the maximum clique of the intersection graph of axis-parallel rectangles in 2D can be computed in O(n log n) time using a plane-sweep strategy [NB95]. The maximum independent set of a rectangle intersection graph is extensively used in map labelling. The maximum independent set for equal-height rectangle intersection graphs is known to admit a PTAS, and a 2-factor approximation algorithm is very easy to obtain in O(n log n) time [AvKS98]. In Chapter 4 we show that the piercing set problem for bounded-height rectangles is fixed-parameter tractable.

A graph G = (V, E) is said to be a disk graph if it is obtained from the intersection of a set of disks. Unit disk graphs play an important role in formulating different important problems in mobile ad hoc networks. In a mobile network, the base stations can be viewed as nodes of a unit disk graph, since the range of each base station is the same. Different problems on such a network can be formulated as graph-theoretic problems on the unit disk graph. Recognizing whether an arbitrary graph is a unit disk graph is NP-complete [BK98]. The maximum clique can be computed in polynomial time for unit disk graphs [CCJ90]. In Chapter 3 we propose a PTAS for the maximum independent set of a unit disk graph. A 3-factor approximation algorithm for the minimum clique cover of a unit disk graph is also described in that chapter. We also propose a 4-factor approximation algorithm for the minimum piercing set of points for a set of unit disks distributed randomly on the plane; here the piercing points can be chosen to be any points on the plane. In the discrete piercing set problem, a point set P is given and the unit circles are all centered at the points of P; the objective is to choose a minimum set of points in P that pierces all the circles. We propose a 15-factor approximation algorithm for this problem.
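As an illustration of why interval graphs are so tractable (a sketch added here, not part of the dissertation), the maximum independent set of a set of intervals can be found by the classic greedy rule of repeatedly taking the interval with the earliest right endpoint:

```python
def max_independent_intervals(intervals):
    """Maximum independent set of open intervals (no two chosen
    intervals overlap): sort by right endpoint, then greedily keep
    each interval that starts after the last chosen one ends."""
    chosen = []
    last_end = float("-inf")
    for left, right in sorted(intervals, key=lambda iv: iv[1]):
        if left > last_end:      # compatible with everything chosen so far
            chosen.append((left, right))
            last_end = right
        # else: overlaps a chosen interval, skip it
    return chosen

print(max_independent_intervals([(1, 4), (3, 5), (0, 6), (5, 7), (8, 9)]))
# → [(1, 4), (5, 7), (8, 9)]
```

The greedy choice is safe because an interval finishing earliest can always be exchanged into an optimal solution, which is exactly the kind of geometric argument unavailable for general graphs.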
]]>
Minati DeDetection and Classification of Psoriasis in Histopathology Images.
https://digitalcommons.isical.ac.in/masters-dissertations/182
https://digitalcommons.isical.ac.in/masters-dissertations/182Thu, 30 Nov 2023 01:06:36 PST
Recent advances in imaging techniques have led to better visual representation of the internals of our body for clinical analysis and medical intervention; however, the task is tedious and subject to interpreter variability. An automated quantitative analysis of the images would not only relieve us of the human effort, but also considerably reduce the inaccuracies involved. The current work explores techniques of image processing and analysis to extract vital information from psoriasis histopathology images, and discusses a method of classifying these into diseased and non-diseased classes.
]]>
Kushal SenVoronoi Game on Graphs.
https://digitalcommons.isical.ac.in/masters-dissertations/181
https://digitalcommons.isical.ac.in/masters-dissertations/181Thu, 30 Nov 2023 01:06:36 PST
The Voronoi game is a geometric model of a competitive facility location problem, where each market player comes up with a set of possible locations for placing its facilities. The objective of each player is to maximize the region it occupies on the underlying space. In this thesis we consider the one-round Voronoi game with two players. Here the underlying space is a road network, modeled by a graph embedded in R2. In this game each player places a set of facilities and the underlying graph is subdivided according to the nearest-neighbour rule. The player that dominates the maximum region of the graph wins the game. This thesis mainly deals with the problem of determining optimal strategies for the players. We characterize the optimal facility locations of the second player given a placement by the first player. Using this result we design a polynomial-time algorithm for determining the optimal strategy of the second player on trees. We also show that the same problem is NP-hard when the underlying space is a general graph. Moreover, we present a 1.58-factor approximation algorithm for the above-mentioned problem. We then turn to the optimal strategy of the first player: we give a lower bound on the optimal payoff of the first player, discuss the first player's optimal strategy for the (1, 1) and (2, 1) games on tree networks, and characterize the optimal facility locations of the first player for the (1, 1) game on general graphs.
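The nearest-neighbour subdivision underlying the payoff can be sketched as follows (an illustration, not the thesis algorithm; for simplicity this counts dominated vertices via multi-source Dijkstra rather than measuring dominated edge length of the embedded graph, and ties count for neither player):

```python
import heapq

def multi_source_dist(adj, sources):
    """Shortest distance from every vertex to its nearest source.
    adj maps vertex -> list of (neighbour, edge_weight)."""
    dist = {v: float("inf") for v in adj}
    for s in sources:
        dist[s] = 0
    pq = [(0, s) for s in sources]
    heapq.heapify(pq)
    while pq:
        d, v = heapq.heappop(pq)
        if d > dist[v]:
            continue              # stale queue entry
        for u, w in adj[v]:
            if d + w < dist[u]:
                dist[u] = d + w
                heapq.heappush(pq, (d + w, u))
    return dist

def voronoi_payoff(adj, fac1, fac2):
    """Vertices dominated by each player under the nearest-neighbour rule."""
    d1 = multi_source_dist(adj, fac1)
    d2 = multi_source_dist(adj, fac2)
    p1 = sum(1 for v in adj if d1[v] < d2[v])
    p2 = sum(1 for v in adj if d2[v] < d1[v])
    return p1, p2

# Path graph a-b-c-d-e with unit edges; player 1 at a, player 2 at e
adj = {
    "a": [("b", 1)], "b": [("a", 1), ("c", 1)],
    "c": [("b", 1), ("d", 1)], "d": [("c", 1), ("e", 1)],
    "e": [("d", 1)],
}
print(voronoi_payoff(adj, ["a"], ["e"]))
# → (2, 2)  (vertex c is equidistant, so dominated by neither)
```

Evaluating a candidate placement is thus cheap; the hardness lies in searching over the second player's possible placements.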
]]>
Sayan BandyapadhyayInteractive Co-Segmentation Using Histogram Matching and Bipartite Graph Construction.
https://digitalcommons.isical.ac.in/masters-dissertations/179
https://digitalcommons.isical.ac.in/masters-dissertations/179Thu, 30 Nov 2023 01:06:35 PST
Co-segmentation is defined as jointly partitioning multiple images containing the same or similar objects of interest into a foreground and a complementary background. In this thesis a new interactive co-segmentation method is proposed, using a global energy function and a local smoothness energy function with the help of histogram matching. The global scribbled energy uses the histograms of the regions in the images to be co-segmented and of the user-scribbled images to estimate the probability of each region belonging either to the foreground or to the background. The local smoothness energy function helps in estimating the probability of regions having similar colour appearance. To further improve the quality of the segmentation, a bipartite graph is constructed over the segments. The algorithm has been evaluated on the iCoseg and MSRC benchmark data sets, and the experimental results compare favourably with many state-of-the-art unsupervised co-segmentation and supervised interactive co-segmentation methods.
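The histogram-matching step can be illustrated with a toy sketch (added here for clarity, not the thesis implementation; the pixel values are invented and a single grey channel stands in for colour): a region is scored against foreground and background scribbles by intersecting normalized histograms.

```python
def normalized_hist(values, bins=8, vmax=256):
    """Histogram of pixel intensities in a region, normalized to sum to 1."""
    h = [0] * bins
    for v in values:
        h[min(v * bins // vmax, bins - 1)] += 1
    total = sum(h)
    return [c / total for c in h]

def hist_intersection(h1, h2):
    """Histogram intersection in [0, 1]; 1 means identical distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))

region      = [10, 12, 200, 210, 215]       # pixels of a candidate region
scribble_fg = [8, 11, 205, 220, 198]        # pixels under a foreground scribble
scribble_bg = [100, 110, 120, 130, 140]     # pixels under a background scribble

fg_score = hist_intersection(normalized_hist(region), normalized_hist(scribble_fg))
bg_score = hist_intersection(normalized_hist(region), normalized_hist(scribble_bg))
print(fg_score > bg_score)  # region's colours match the foreground scribble better
```

Such per-region scores play the role of the unary (global) energy; a pairwise smoothness term then encourages similarly coloured neighbouring regions to take the same label.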
]]>
Harsh Bhandari