huihui-ai/Qwen2.5-7B-Instruct-abliterated-v2

huihui-ai committed 3 days ago

Commit d51e71c (1 parent: 4b9b4b2)

Update README.md

Files changed (1)

README.md +13 -0


@@ -96,3 +96,16 @@ while True:
     print(f"Qwen: {response}")
 ```
+## Evaluations
+All models were re-evaluated; each score below is the average for that benchmark.
+| Benchmark   | Qwen2.5-7B-Instruct | Qwen2.5-7B-Instruct-abliterated-v2 | Qwen2.5-7B-Instruct-abliterated |
+|-------------|---------------------|------------------------------------|---------------------------------|
+| IF_Eval     | 76.44               | 77.82                              | 76.49                           |
+| MMLU Pro    | 43.12               | 42.03                              | 41.71                           |
+| TruthfulQA  | 62.46               | 57.81                              | 64.92                           |
+| BBH         | 53.92               | 53.01                              | 52.77                           |
+| GPQA        | 31.91               | 32.17                              | 31.97                           |
+The evaluation script is [eval.sh](https://huggingface.co/huihui-ai/Qwen2.5-7B-Instruct-abliterated-v2/blob/main/eval.sh), in the root of this repository.
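
To make the table easier to read, the per-benchmark difference from the base Qwen2.5-7B-Instruct model can be computed directly from the scores above. The following is a minimal illustrative sketch and is not part of the commit or of eval.sh. The names `SCORES` and `deltas_vs_base` are hypothetical, and the numbers are copied verbatim from the table.

```python
# Scores copied from the Evaluations table, ordered as:
# (Qwen2.5-7B-Instruct, abliterated-v2, abliterated)
SCORES = {
    "IF_Eval": (76.44, 77.82, 76.49),
    "MMLU Pro": (43.12, 42.03, 41.71),
    "TruthfulQA": (62.46, 57.81, 64.92),
    "BBH": (53.92, 53.01, 52.77),
    "GPQA": (31.91, 32.17, 31.97),
}


def deltas_vs_base(scores):
    """Return {benchmark: (v2 - base, v1 - base)}, rounded to 2 decimals."""
    return {
        name: (round(v2 - base, 2), round(v1 - base, 2))
        for name, (base, v2, v1) in scores.items()
    }


if __name__ == "__main__":
    # Print one row per benchmark with signed differences from the base model.
    for name, (d_v2, d_v1) in deltas_vs_base(SCORES).items():
        print(f"{name:<11} v2: {d_v2:+.2f}  v1: {d_v1:+.2f}")
```

A positive value means the abliterated model scored higher than the base model on that benchmark. Whether any of these gaps is significant cannot be judged from the table alone, since it reports only averages.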