sanaa-11 committed on
Commit 5f2f211
Parent: 7427532

Update README.md

Files changed (1)
  1. README.md +6 -4
README.md CHANGED
@@ -3,6 +3,8 @@ base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
 library_name: peft
 datasets:
 - sanaa-11/math-dataset
+ language:
+ - fr
 ---
 # Model Card for LLaMA 3.1 Fine-Tuned Model
 
@@ -91,7 +93,7 @@ for _ in range(5):
 ## Training Details
 
 ### Training Data
- - **Dataset**: The model was fine-tuned on a custom dataset consisting of 11,106 rows of math exercises, lesson content, and solutions, specifically designed for Moroccan students in French.
+ - **Dataset**: The model was fine-tuned on a custom dataset consisting of 3.6K rows of math exercises, lesson content, and solutions, specifically designed for Moroccan students in French.
 
 ### Training Procedure
 
@@ -102,9 +104,9 @@ for _ in range(5):
 ### Training Hyperparameters
 - **Training Regime**: The model was fine-tuned using 4-bit quantization with QLoRA to optimize GPU and RAM usage. The training was performed on a Kaggle environment with limited resources.
 - **Batch Size**: 1 (with gradient accumulation steps of 8)
- - **Number of Epochs**: 10
+ - **Number of Epochs**: 8
 - **Learning Rate**: 5e-5
- - **Optimizer**: AdamW
+
 
 ## Evaluation
 
@@ -126,7 +128,7 @@ for _ in range(5):
 ### Summary
 **Model Examination**
 - The model demonstrated a consistent reduction in both training and validation loss across the training epochs, suggesting effective learning and generalization from the provided dataset.
- - While F1 score and perplexity were not used in this evaluation, the training and validation losses provide a strong indication of the model's performance and its potential for generating accurate and relevant math exercises.
+
 
 ## Environmental Impact
 **Carbon Emissions**
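
For context on the hyperparameters this commit touches, below is a minimal QLoRA fine-tuning sketch that matches the values stated in the card: 4-bit quantization, batch size 1 with 8 gradient-accumulation steps (effective batch size 8), learning rate 5e-5, and the updated 8 epochs. The LoRA rank, alpha, dropout, NF4 quantization type, and output directory are illustrative assumptions; the card does not specify them, and this is not the author's actual training script.

```python
# Minimal QLoRA setup sketch matching the card's stated hyperparameters.
# LoRA rank/alpha/dropout and the NF4 settings are assumptions, not from the card.
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base = "meta-llama/Meta-Llama-3.1-8B-Instruct"

# Load the dataset referenced in the card's metadata.
dataset = load_dataset("sanaa-11/math-dataset")

# 4-bit quantization (QLoRA) so the 8B base model fits in limited GPU memory,
# as on the Kaggle environment the card mentions.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",          # assumed quantization type
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Attach LoRA adapters; these values are hypothetical placeholders.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Hyperparameters as stated in the card after this commit:
# batch size 1, gradient accumulation 8, LR 5e-5, 8 epochs.
args = TrainingArguments(
    output_dir="llama31-math-qlora",    # assumed output path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-5,
    num_train_epochs=8,
)
```

Gradient accumulation over 8 micro-batches gives an effective batch size of 8 while only a single example resides in GPU memory at a time, which is why this combination is common on resource-limited environments like the Kaggle setup the card describes.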