---
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
library_name: peft
datasets:
- sanaa-11/math-dataset
language:
- fr
---

# Model Card for LLaMA 3.1 Fine-Tuned Model

## Model Details

### Model Description

- **Developed by**: Sanaa Abril
- **Model Type**: Fine-tuned causal language model
- **Language(s) (NLP)**: French
- **License**:
- **Finetuned from model**: Meta LLaMA 3.1 8B Instruct

### Model Sources

- **Repository**: https://huggingface.co./sanaa-11/mathematic-exercice-generator/tree/main

## Uses

### Direct Use

- **Primary Application**: Generating math exercises in French, tailored to Moroccan students, from a given lesson topic and difficulty level.
- **Example Use Case**: Educators can input lesson topics to generate corresponding exercises for classroom use or online learning platforms.

### Downstream Use

- **Potential Applications**: The model can be extended or adapted to create exercises in other languages or for different educational levels.

### Out-of-Scope Use

- **Not Suitable For**: High-stakes assessments. Generated exercises require validation by subject matter experts before any such use.

## Bias, Risks, and Limitations

- **Bias**: The model may inherit biases from its training data and generate exercises that reflect unintended cultural or linguistic biases.
- **Risks**: Generated exercises may be mathematically incorrect or misaligned with the intended curriculum.
- **Limitations**: Accuracy and relevance may degrade for exercises outside the training domain or for advanced mathematical topics not covered during fine-tuning.

### Recommendations

- **For Educators**: Review generated exercises for correctness and relevance before using them in a classroom setting.
- **For Developers**: Fine-tune the model further or adjust the training data to mitigate biases and improve the quality of the generated content.

## How to Get Started with the Model

Use the following code to load the base model, attach the fine-tuned adapter, and load the tokenizer:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel, PeftConfig
import torch

# Base model name
model_name = "meta-llama/Meta-Llama-3.1-8B-Instruct"

# Load the base model without overriding rope_scaling
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",               # Adjust based on your environment
    offload_folder="./offload_dir",  # Folder for CPU/disk offloading if necessary
    torch_dtype=torch.float16,       # Use float16 for better performance on compatible hardware
    revision="main"                  # Specify the correct revision if needed
)

# Load the adapter configuration
config = PeftConfig.from_pretrained("sanaa-11/mathematic-exercice-generator")

# Load the adapter weights into the model
model = PeftModel.from_pretrained(model, "sanaa-11/mathematic-exercice-generator")

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
```
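If you do not plan to train the adapter further, the LoRA weights can optionally be merged into the base model so that inference runs without the PEFT wrapper. This is a minimal, optional sketch using `peft`'s `merge_and_unload`; skip it if you want to keep the adapter separate.

```python
# Optional: fold the LoRA adapter weights into the base model.
# After this call, `model` behaves like a plain transformers model
# with the fine-tuned weights baked in, which simplifies inference.
model = model.merge_and_unload()
```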
Then generate an exercise. The snippet below calls `generate` several times and appends each continuation to the running text:

```python
generated_text = ""
prompt = "Fournis un exercice basé sur la vie réelle, de difficulté moyenne, de niveau 2ème année collège, sur les fractions."

for _ in range(5):
    inputs = tokenizer(prompt + generated_text, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_length=1065,        # caps the total length (prompt + continuation) in tokens
        temperature=0.7,
        top_p=0.9,
        num_beams=5,
        repetition_penalty=1.2,
        no_repeat_ngram_size=2,
        pad_token_id=tokenizer.eos_token_id,
        early_stopping=False
    )
    # Decode only the newly generated tokens, not the prompt that was fed in
    new_text = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    generated_text += new_text
    print(new_text)
```

## Training Details

### Training Data

- **Dataset**: The model was fine-tuned on a custom dataset of approximately 3.6K rows of math exercises, lesson content, and solutions, written in French and designed for Moroccan students.

### Training Procedure

#### Preprocessing

- **Data Cleaning**: Text normalization, tokenization, and padding were applied to prepare the data.
- **Tokenization**: The base model's tokenizer, loaded through Hugging Face's `AutoTokenizer`, was used to process the French text data.

#### Training Hyperparameters

- **Training Regime**: The model was fine-tuned with 4-bit quantization and QLoRA to fit the limited GPU and RAM of a Kaggle environment (an illustrative configuration sketch is provided near the end of this card).
- **Batch Size**: 1 (with 8 gradient accumulation steps)
- **Number of Epochs**: 8
- **Learning Rate**: 5e-5

## Evaluation

### Testing Data, Factors & Metrics

**Testing Data**

- A held-out 10% of the dataset was reserved for evaluation.

**Factors**

- **Complexity of Generated Exercises**: Exercises were evaluated on their complexity relative to the intended difficulty level.

**Metrics**

- **Training Loss**: The loss measured on the training set during training.
- **Validation Loss**: The loss measured on the validation set during training.

**Results**

- **Training and Validation Loss**: The model was evaluated on training and validation loss over 8 epochs. Performance improved markedly after the first few epochs, with a steady decrease in both losses. The final validation loss was 0.154888, indicating a good fit to the validation data without significant overfitting.

### Summary

**Model Examination**

- The model showed a consistent reduction in both training and validation loss across the training epochs, suggesting effective learning and generalization from the provided dataset.

## Environmental Impact

**Carbon Emissions**

- **Hardware Type**: Tesla T4 GPU
- **Hours Used**: 12
- **Cloud Provider**: Kaggle
- **Carbon Emitted**: Not measured; it can be estimated with the Machine Learning Impact calculator (Lacoste et al., 2019).

## Technical Specifications

**Model Architecture and Objective**

- The model is based on the LLaMA 3.1 architecture, fine-tuned to generate French text for educational purposes, specifically math exercises.

**Compute Infrastructure**

- The model was trained on Kaggle's free-tier environment, using a single Tesla T4 GPU.

**Hardware**

- **GPU**: Tesla T4 with 16 GB of GPU memory

**Software**

- **Transformers Version**: 4.44.0
- **PEFT Version**: 0.12.0

## Citation

**BibTeX**:

```bibtex
@misc{abril_2024_math_exercise_generator,
  author    = {Sanaa Abril},
  title     = {Fine-Tuned LLaMA 3.1 for Generating Math Exercises},
  year      = {2024},
  publisher = {Hugging Face},
  note      = {\url{https://huggingface.co./sanaa-11/mathematic-exercice-generator}}
}
```

**APA**:

Abril, S. (2024). *Fine-Tuned LLaMA 3.1 for Generating Math Exercises*. Hugging Face. https://huggingface.co./sanaa-11/mathematic-exercice-generator
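For reference, the QLoRA setup summarized under Training Hyperparameters roughly corresponds to the configuration sketched below. This is an illustrative sketch rather than the exact training script: the LoRA rank, alpha, dropout, and `target_modules` values are assumptions, while the quantization approach, batch size, gradient accumulation, number of epochs, and learning rate follow the values reported in this card.

```python
import torch
from transformers import (AutoModelForCausalLM, BitsAndBytesConfig,
                          TrainingArguments)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "meta-llama/Meta-Llama-3.1-8B-Instruct"

# 4-bit (QLoRA-style) quantization so the 8B model fits on a single Tesla T4
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter; rank, alpha, dropout, and target modules are assumptions
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Hyperparameters reported in this card
training_args = TrainingArguments(
    output_dir="./llama31-math-exercise-generator",  # hypothetical output path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=8,
    learning_rate=5e-5,
    fp16=True,
    logging_steps=50,
)
```

A `Trainer` (or `trl`'s `SFTTrainer`) would then be built from these arguments, the PEFT-wrapped model, and the tokenized dataset.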
## More Information

- For further details or questions, feel free to reach out to the model card authors.

## Model Card Authors

- **Sanaa Abril** - sanaa.abril@gmail.com

## Framework versions

- **Transformers**: 4.44.0
- **PEFT**: 0.12.0