Conference Articles

NLP-IISERB@Simpletext2022: To Explore the Performance of BM25 and Transformer Based Frameworks for Automatic Simplification of Scientific Texts

Sourav Saha, Indian Statistical Institute, Kolkata
Dwaipayan Roy, Indian Institute of Science Education and Research Kolkata
B. Yuvaraj Goud, Indian Institute of Science Education and Research Bhopal
Chethan S. Reddy, Indian Institute of Science Education and Research Bhopal
Tanmay Basu, Indian Institute of Science Education and Research Bhopal

Document Type

Conference Article

Publication Title

CEUR Workshop Proceedings

Abstract

The CLEF SimpleText 2022 lab focuses on developing effective systems to identify relevant passages from a given set of scientific articles. The lab organized three tasks this year. Task 1 focuses on retrieving passages from the given data for a query text. These passages can be complex and hence require further simplification, which is carried out in Tasks 2 and 3. The BioNLP research group at the Indian Institute of Science Education and Research Bhopal (IISERB), in collaboration with two information retrieval research groups at IISER Kolkata and ISI Kolkata, participated only in Task 1 of this challenge and submitted three runs using three different retrieval models. The paper explores the performance of these retrieval models for the given task. For our first run, we used a standard BM25 model to identify 1000 relevant passages for each query, ranked by the similarity scores that the BM25 model generated. For our second run, we used a re-ranking method based on BERT (Bidirectional Encoder Representations from Transformers), called Mono-BERT, to re-rank the 1000 passages retrieved by our first run for each query. For our third run, we used MonoT5, a re-ranking method based on a pre-trained sequence-to-sequence model, to reorder the 1000 passages retrieved by the Mono-BERT model for each query. As the official results of this task have not yet been announced, we cannot formally evaluate the performance of our submissions. However, we manually checked the retrieved results of many queries for each run, and these checks indicate that performance improved from run 1 to run 2 and further to run 3.
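
The retrieve-then-re-rank pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the BM25 parameters `k1=1.2` and `b=0.75`, the Lucene-style IDF variant, and the function names `bm25_rank` and `rerank` are all assumptions made for illustration. The `scorer` argument is a hypothetical stand-in for a neural relevance model such as Mono-BERT or MonoT5, which the sketch does not load.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the abstract does not
    # describe the preprocessing that was actually used.
    return text.lower().split()


def bm25_rank(query, passages, k1=1.2, b=0.75, top_k=1000):
    """First stage: score every passage with BM25 and keep the top_k.

    Returns a list of (passage_index, score) pairs, best first.
    """
    docs = [tokenize(p) for p in passages]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n if n else 0.0

    # Document frequency: number of passages containing each term.
    df = Counter()
    for d in docs:
        df.update(set(d))

    query_terms = set(tokenize(query))
    scored = []
    for idx, d in enumerate(docs):
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Lucene-style IDF, which is always non-negative.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        scored.append((idx, score))

    # Sort by descending score; break ties by original position.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


def rerank(query, passages, candidates, scorer):
    """Second stage: reorder first-stage candidates with another scorer.

    `scorer(query, passage_text)` is a hypothetical callable standing in for
    a neural re-ranker; any function returning a float works here.
    """
    rescored = [(idx, scorer(query, passages[idx])) for idx, _ in candidates]
    rescored.sort(key=lambda pair: (-pair[1], pair[0]))
    return rescored
```

In this sketch a second and third run would each call `rerank` on the previous stage's output with a different `scorer`, mirroring the order described in the abstract: BM25 first, then one re-ranker, then another.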

First Page

2852

Last Page

2857

Publication Date

1-1-2022

Recommended Citation

Saha, Sourav; Roy, Dwaipayan; Goud, B. Yuvaraj; Reddy, Chethan S.; and Basu, Tanmay, "NLP-IISERB@Simpletext2022: To Explore the Performance of BM25 and Transformer Based Frameworks for Automatic Simplification of Scientific Texts" (2022). Conference Articles. 466.
https://digitalcommons.isical.ac.in/conf-articles/466

This document is currently not available here.
