huihui-ai committed on
Commit
d51e71c
1 Parent(s): 4b9b4b2

Update README.md

Files changed (1)
  1. README.md +13 -0
README.md CHANGED
@@ -96,3 +96,16 @@ while True:
  print(f"Qwen: {response}")
 
  ```
+
+ ## Evaluations
+ The following results were re-evaluated; each score is the average for that benchmark.
+
+ | Benchmark   | Qwen2.5-7B-Instruct | Qwen2.5-7B-Instruct-abliterated-v2 | Qwen2.5-7B-Instruct-abliterated |
+ |-------------|---------------------|------------------------------------|---------------------------------|
+ | IF_Eval     | 76.44               | 77.82                              | 76.49                           |
+ | MMLU Pro    | 43.12               | 42.03                              | 41.71                           |
+ | TruthfulQA  | 62.46               | 57.81                              | 64.92                           |
+ | BBH         | 53.92               | 53.01                              | 52.77                           |
+ | GPQA        | 31.91               | 32.17                              | 31.97                           |
+
+ The evaluation script is in this repository at /eval.sh, or click [here](https://huggingface.co/huihui-ai/Qwen2.5-7B-Instruct-abliterated-v2/blob/main/eval.sh) to view it.
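
The table added in this commit is easier to compare when read as differences from the base model. The following is a minimal sketch in Python that uses only the scores from the table; the `scores` dictionary and the `deltas_vs_base` helper are illustrative names, not part of the repository or of eval.sh.

```python
# Scores copied from the Evaluations table, in column order:
# (Qwen2.5-7B-Instruct, abliterated-v2, abliterated)
scores = {
    "IF_Eval": (76.44, 77.82, 76.49),
    "MMLU Pro": (43.12, 42.03, 41.71),
    "TruthfulQA": (62.46, 57.81, 64.92),
    "BBH": (53.92, 53.01, 52.77),
    "GPQA": (31.91, 32.17, 31.97),
}


def deltas_vs_base(table):
    """Return {benchmark: (v2 - base, v1 - base)}, rounded to 2 decimals."""
    return {
        bench: (round(v2 - base, 2), round(v1 - base, 2))
        for bench, (base, v2, v1) in table.items()
    }


# Print signed differences so gains and losses are visible at a glance.
for bench, (d_v2, d_v1) in deltas_vs_base(scores).items():
    print(f"{bench:<11} abliterated-v2: {d_v2:+.2f}  abliterated: {d_v1:+.2f}")
```

On these numbers, the largest movement in either direction is on TruthfulQA; the other benchmarks differ from the base model by less than two points.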