Overview of the HASOC Subtracks at FIRE 2023: Hate Speech and Offensive Content Identification in Assamese, Bengali, Bodo, Gujarati and Sinhala
Document Type
Conference Article
Publication Title
ACM International Conference Proceeding Series
Abstract
The evaluation of content moderation systems requires reliable benchmark data. This task becomes particularly formidable for low-resource languages, where obtaining or curating such data poses significant challenges. Addressing this issue, HASOC 2023 organised various shared tasks focused on identifying offensive content in low-resource languages. This paper reports on tasks for hate speech detection in several Indo-Aryan languages (Assamese, Bengali, Gujarati, and Sinhala) as well as a Sino-Tibetan language, Bodo, for which limited linguistic resources currently exist. The shared task involved the compilation of multiple datasets. In total, nearly 200 runs were submitted by more than 30 teams, which are presented and analysed in this report.
First Page
13
Last Page
15
DOI
10.1145/3632754.3633278
Publication Date
12-15-2023
Recommended Citation
Ranasinghe, Tharindu; Ghosh, Koyel; Pal, Aditya Shankar; Senapati, Apurbalal; Dmonte, Alphaeus Eric; Zampieri, Marcos; Modha, Sandip; and Satapara, Shrey, "Overview of the HASOC Subtracks at FIRE 2023: Hate Speech and Offensive Content Identification in Assamese, Bengali, Bodo, Gujarati and Sinhala" (2023). Conference Articles. 496.
https://digitalcommons.isical.ac.in/conf-articles/496