Domain-adaptative Continual Learning for Low-resource Tasks: Evaluation on Nepali

Published as a workshop paper at CHiPSAL, COLING 2025, 2025

In this work, we evaluate the feasibility of domain-adaptive pre-training in a low-resource setting, namely the Nepali language. We use synthetic data to continue training Llama 3 8B to adapt it to the Nepali language in a 4-bit QLoRA setting. We evaluate the adapted model on its performance, forgetting, and knowledge acquisition. We compare the base model and the final model on their Nepali generation abilities and their performance on popular benchmarks, and run case studies to probe their linguistic knowledge in Nepali.
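For readers curious how such a setup typically looks in practice, below is a minimal sketch (not the paper's actual training code) of 4-bit QLoRA continued pre-training of Llama 3 8B on a Nepali text corpus with Hugging Face `transformers`, `peft`, and `bitsandbytes`. The dataset file, LoRA rank, target modules, and hyperparameters are illustrative assumptions, not values taken from the paper.

```python
# Hedged sketch: 4-bit QLoRA continued pre-training of Llama 3 8B on Nepali text.
# File names and hyperparameters below are illustrative, not the authors' settings.
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig,
                          TrainingArguments, Trainer, DataCollatorForLanguageModeling)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from datasets import load_dataset

model_id = "meta-llama/Meta-Llama-3-8B"

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Trainable low-rank adapters on the attention projections; rank/alpha are assumptions.
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)

# Hypothetical synthetic Nepali corpus stored as plain text, one document per line.
dataset = load_dataset("text", data_files={"train": "nepali_synthetic.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama3-nepali-qlora",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        learning_rate=2e-4,
        num_train_epochs=1,
        bf16=True,
        logging_steps=50,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Because only the LoRA adapters are updated while the 4-bit base weights stay frozen, this style of continued pre-training fits on a single consumer GPU, which is what makes domain adaptation attractive for low-resource languages like Nepali.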

Recommended citation: Duwal, S., Prasai, S., & Manandhar, S. (2024). Domain-adaptative Continual Learning for Low-resource Tasks: Evaluation on Nepali. arXiv preprint arXiv:2412.13860.