In recent years, the development of large pretrained language models has been a major advancement in the field of artificial intelligence. These models, initially trained on general domain data, are fine-tuned for specific applications. However, as these models grow in size, the traditional approach of full fine-tuning, which involves training all the models’ parameters, is becoming impractical. To address this challenge, various Parameter-Efficient Fine-Tuning (PEFT) methods have been introduced. Among them, Low-Rank Adaptation (LoRA) is an approach that maintains the pretrained model weights frozen and incorporates compact low-rank decomposition matrices into each layer of the model. This technique significantly reduces the requirement for computational resources, notably the number of GPUs needed. Furthermore, models fine-tuned using LoRA can be deployed in parallel in a production environment with remarkable efficiency, making them highly suitable for real-world applications.