Lexical Reranking of Semantic Retrieval (LeSeR) for Regulatory Question Answering

Accepted at RegNLP @ COLING 2025 | Secured 4th position in the workshop. Our system for the COLING 2025 RegNLP RIRAG challenge focused on advanced information retrieval and answer generation in regulatory domains. We combined embedding models (Stella, BGE, CDE, Mpnet) with fine-tuning and reranking to retrieve relevant documents. Our novel approach, LeSeR, achieved strong results with a Recall@10 of 0.8201 and a MAP@10 of 0.6655 (a minimal retrieve-then-rerank sketch follows this entry). This work demonstrates the potential of NLP techniques in regulatory applications, particularly for retrieval-augmented generation systems, and identifies areas for future improvement in robustness and domain adaptation.

December 2024 · Jebish Purbey, Drishti Sharma, Siddhant Gupta, Khawaja Murad, Siddartha Pullakhandam, Ram Mohan Rao Kadiyala
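For readers curious how a LeSeR-style pipeline fits together, here is a minimal retrieve-then-rerank sketch. It is an illustration only: the off-the-shelf all-mpnet-base-v2 encoder stands in for the fine-tuned Stella/BGE/CDE/Mpnet models, BM25 is assumed as the lexical reranker, and the example passages and candidate depth are invented; the paper's exact scoring and fusion may differ.

```python
# Hypothetical LeSeR-style pipeline: semantic retrieval followed by lexical reranking.
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util

# Toy regulatory corpus and query (illustrative only).
passages = [
    "A licensed firm must maintain adequate capital resources at all times.",
    "Client money must be held in segregated accounts with approved banks.",
    "Outsourcing arrangements require prior notification to the regulator.",
]
query = "What are the capital requirements for licensed firms?"

# Stage 1: dense (semantic) retrieval with a sentence-embedding model.
encoder = SentenceTransformer("all-mpnet-base-v2")  # stand-in for the fine-tuned encoders
doc_emb = encoder.encode(passages, convert_to_tensor=True)
q_emb = encoder.encode(query, convert_to_tensor=True)
sem_scores = util.cos_sim(q_emb, doc_emb)[0]
top_idx = sem_scores.topk(k=min(3, len(passages))).indices.tolist()

# Stage 2: lexical reranking of the semantic candidates (BM25 assumed here).
candidates = [passages[i] for i in top_idx]
bm25 = BM25Okapi([p.lower().split() for p in candidates])
lex_scores = bm25.get_scores(query.lower().split())
reranked = [c for _, c in sorted(zip(lex_scores, candidates), reverse=True)]
print(reranked[0])
```

In this sketch the dense stage narrows the corpus to a small candidate pool and the lexical stage reorders it, which is the general retrieve-then-rerank pattern the entry describes.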

Multilingual Hate Speech Detection and Target Identification in Devanagari-Scripted Languages

Accepted at Chipsal @ COLING 2025. This study addresses hate speech detection and target identification in Devanagari-scripted languages (Hindi, Marathi, Nepali, Bhojpuri, Sanskrit). Subtask B focuses on detecting hate speech, while Subtask C identifies its specific targets, such as individuals or communities. The proposed MultilingualRobertaClass model, based on the ia-multilingual-transliterated-roberta transformer, uses contextualized embeddings for multilingual and transliterated contexts. It achieved 88.40% accuracy in Subtask B and 66.11% in Subtask C on the test set; a hedged classifier sketch follows this entry.

December 2024 · Siddhant Gupta, Siddh Singhal, Azmine Toushik Wasi
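Below is a minimal sketch of a Subtask B-style classifier: a standard sequence-classification head on top of a multilingual transliterated RoBERTa encoder. The Hugging Face Hub id, the binary label mapping, and the example sentence are assumptions, and the fine-tuning step that makes the predictions meaningful is omitted.

```python
# Hypothetical Subtask B classifier sketch: encoder + sequence-classification head.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoint = "ibm/ia-multilingual-transliterated-roberta"  # assumed Hub id for the base encoder
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# The classification head is freshly initialised; it must be fine-tuned on the
# Subtask B training data before its outputs are meaningful.
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

texts = ["यह एक उदाहरण वाक्य है।"]  # Devanagari-scripted input (illustrative)
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**batch).logits
pred = logits.argmax(dim=-1)  # 0 = non-hate, 1 = hate (label mapping assumed)
print(pred.tolist())
```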

Sequential Learning for Claim Verification and Explanation Generation in Financial Domains

Accepted at FINLP-FNP-LLMFinLegal @ COLING 2025 | Secured 3rd position in the workshop. Our system for the COLING 2025 FMD challenge focused on detecting financial misinformation using large language models (Qwen, Mistral, Gemma-2) combined with pre-processing and sequential learning. It not only classified fraudulent content with an F1-score of 0.8283 but also generated clear explanations, achieving a ROUGE-1 score of 0.7253 (see the sketch after this entry). This work demonstrates the potential of LLMs in combating financial misinformation and improving transparency, and highlights areas for future enhancements in robustness and domain adaptation.

December 2024 · Jebish Purbey, Siddhant Gupta, Nikhil Manali, Siddartha Pullakhandam, Drishti Sharma, Ashay Srivastava, Ram Mohan Rao Kadiyala
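The sketch below illustrates only the two-step inference flow (verify the claim, then explain the verdict), using zero-shot prompting for brevity. The actual system fine-tunes Qwen, Mistral, and Gemma-2 with pre-processing and sequential learning; the checkpoint, prompts, label set, and example claim here are assumptions.

```python
# Hypothetical two-step inference: claim verification, then explanation generation.
from transformers import pipeline

generator = pipeline("text-generation", model="Qwen/Qwen2.5-7B-Instruct")  # assumed checkpoint

claim = "Company X guarantees 40% monthly returns with zero risk."  # invented example

# Step 1: claim verification (label prediction).
label_prompt = f"Classify the following financial claim as True or False.\nClaim: {claim}\nLabel:"
label = generator(label_prompt, max_new_tokens=5)[0]["generated_text"][len(label_prompt):].strip()

# Step 2: explanation generation conditioned on the claim and the predicted label.
expl_prompt = f"Claim: {claim}\nLabel: {label}\nExplain briefly why this label is correct:"
explanation = generator(expl_prompt, max_new_tokens=120)[0]["generated_text"][len(expl_prompt):].strip()
print(label, explanation, sep="\n")
```

Conditioning the explanation step on the predicted label is one simple way to realise the sequential, classify-then-explain structure the entry describes.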