---
license: apache-2.0
datasets:
- Intel/orca_dpo_pairs
- Locutusque/Hercules-v3.0
language:
- en
tags:
- conversational
inference:
  parameters:
    do_sample: true
    temperature: 0.8
    top_p: 0.95
    top_k: 40
    min_new_tokens: 2
    max_new_tokens: 250
    repetition_penalty: 1.1
---
# NeuralReyna-Mini-1.8B-v0.2

## Description

This model takes aloobun/Reyna-Mini-1.8B-v0.2 and further fine-tunes it with DPO on the Intel/orca_dpo_pairs dataset.

It has capabilities in coding, math, science, roleplay, and function calling, and was trained using OpenAI's ChatML prompt format.
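Since the model expects ChatML-formatted input, a prompt can be assembled as follows. This is a minimal sketch of the format only; the example messages are illustrative, and in practice `tokenizer.apply_chat_template` from `transformers` builds the same string for you:

```python
def chatml_prompt(messages):
    """Wrap each turn in ChatML <|im_start|>role ... <|im_end|> markers."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # End with an open assistant turn to cue the model to respond.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2 + 2?"},
])
print(prompt)
```

The trailing open `<|im_start|>assistant` turn is what signals the model to generate its reply; generation should stop at the next `<|im_end|>` token.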
## Evaluation

GPT4ALL:

| Tasks         | Version | Filter | n-shot | Metric   | Value  | Stderr   |
|---------------|---------|--------|--------|----------|--------|----------|
| arc_challenge | 1       | none   | 0      | acc      | 0.3208 | ± 0.0136 |
|               |         | none   | 0      | acc_norm | 0.3336 | ± 0.0138 |
| arc_easy      | 1       | none   | 0      | acc      | 0.6035 | ± 0.0100 |
|               |         | none   | 0      | acc_norm | 0.5833 | ± 0.0101 |
| boolq         | 2       | none   | 0      | acc      | 0.6526 | ± 0.0083 |
| hellaswag     | 1       | none   | 0      | acc      | 0.4556 | ± 0.0050 |
|               |         | none   | 0      | acc_norm | 0.6076 | ± 0.0049 |
| openbookqa    | 1       | none   | 0      | acc      | 0.2600 | ± 0.0196 |
|               |         | none   | 0      | acc_norm | 0.3460 | ± 0.0213 |
| piqa          | 1       | none   | 0      | acc      | 0.7236 | ± 0.0104 |
|               |         | none   | 0      | acc_norm | 0.7307 | ± 0.0104 |
| winogrande    | 1       | none   | 0      | acc      | 0.6062 | ± 0.0137 |
## Disclaimer

This model may have overfitted to the DPO training data and may not perform well on out-of-distribution inputs.
## Contributions
Thanks to @aloobun and @Locutusque for their contributions to this model.