---
license: apache-2.0
language:
- en
tags:
- merge
---
<!-- header start -->
<!-- 200823 -->

<div style="width: auto; margin-left: auto; margin-right: auto">
<img src="https://github.com/janhq/jan/assets/89722390/35daac7d-b895-487c-a6ac-6663daaad78e" alt="Jan banner" style="width: 100%; min-width: 400px; display: block; margin: auto;">
</div>

<p align="center">
    <a href="https://jan.ai/">Jan</a
> 
    - <a href="https://discord.gg/AsJ8krTT3N">Discord</a>
</p>
<!-- header end -->

# Model Description
This model uses the `Slerp` merge method to combine two of the best-performing models on the [Open LLM Leaderboard](https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard) as of December 14th:
1. [viethq188/LeoScorpius-7B-Chat-DPO](https://huggingface.co./viethq188/LeoScorpius-7B-Chat-DPO)
2. [GreenNode/GreenNodeLM-7B-v1olet](https://huggingface.co./GreenNode/GreenNodeLM-7B-v1olet)

- Base model: [GreenNode/GreenNodeLM-7B-v1olet](https://huggingface.co./GreenNode/GreenNodeLM-7B-v1olet)
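
For context, SLERP (spherical linear interpolation) blends each pair of weight tensors along the arc between them rather than along a straight line, which preserves the scale of the interpolated weights better than a plain average. Roughly (glossing over mergekit's fallback to linear interpolation for near-parallel tensors), for an interpolation factor $t \in [0, 1]$ and angle $\theta$ between the flattened tensors $v_1$ and $v_2$:

$$
\operatorname{slerp}(v_1, v_2; t) = \frac{\sin\big((1-t)\theta\big)}{\sin\theta}\, v_1 + \frac{\sin(t\theta)}{\sin\theta}\, v_2,
\qquad
\cos\theta = \frac{v_1 \cdot v_2}{\lVert v_1 \rVert\,\lVert v_2 \rVert}
$$

The per-filter `t` values in the config below set how far each tensor group sits between the two parent models.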

The YAML config file for this merge is below:

```yaml
slices:
  - sources:
      - model: viethq188/LeoScorpius-7B-Chat-DPO
        layer_range: [0, 32] 
      - model: GreenNode/GreenNodeLM-7B-v1olet
        layer_range: [0, 32]
merge_method: slerp
base_model: GreenNode/GreenNodeLM-7B-v1olet
parameters:
  t:
    - filter: lm_head 
      value: [0.55]
    - filter: embed_tokens
      value: [0.7]
    - filter: self_attn
      value: [0.65, 0.35]
    - filter: mlp
      value:  [0.35, 0.65]
    - filter: layernorm
      value: [0.4, 0.6]
    - filter: modelnorm
      value: [0.6]
    - value: 0.5 # fallback for rest of tensors
dtype: bfloat16
```
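
As a minimal sketch, not necessarily the exact invocation used for this release, a config like the one above can be run through [mergekit](https://github.com/cg123/mergekit)'s Python API; the file names and options here are illustrative assumptions:

```python
# Sketch: running a SLERP merge from the YAML above via mergekit's Python API.
# File names and option values are assumptions for illustration only.
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("trinity-v1.yaml", "r", encoding="utf-8") as fp:
    config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    config,
    out_path="./trinity-v1",   # directory for the merged model
    options=MergeOptions(
        cuda=False,            # set True to merge on GPU
        copy_tokenizer=True,   # carry over the base model's tokenizer
        lazy_unpickle=True,    # stream tensors to reduce peak RAM
    ),
)
```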

Thank you [Undi95](https://huggingface.co./Undi95) for the secret sauce and [Charles Goddard](https://huggingface.co./chargoddard) for mergekit.

# Prompt template

The model works best with the following prompt format:

```
{system_message}
### Instruction:
{prompt}

### Response:

```
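
As a minimal sketch of that format with Hugging Face Transformers, assuming the repo id `jan-hq/trinity-v1` (inferred from the leaderboard link further down this card):

```python
# Sketch: filling in the template above and generating with Transformers.
# The repo id is inferred from the leaderboard link in this card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jan-hq/trinity-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = (
    "You are a helpful assistant.\n"               # {system_message}
    "### Instruction:\n"
    "Summarize what a SLERP model merge does.\n"   # {prompt}
    "\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens after the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```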

# Run this model
You can run this model using [Jan Desktop](https://jan.ai/) on macOS, Windows, or Linux.

Jan is an open-source ChatGPT alternative that is:

- 💻 **100% offline on your machine**: your conversations remain confidential and visible only to you.
- 🗂️ **An Open File Format**: conversations and model settings stay on your computer and can be exported or deleted at any time.
- 🌐 **OpenAI Compatible**: a local server on port `1337` exposes OpenAI-compatible endpoints (see the sketch below).
- 🌍 **Open Source & Free**: we build in public; check out our [GitHub](https://github.com/janhq).

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65713d70f56f9538679e5a56/r7VmEBLGXpPLTu2MImM7S.png)
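
A hedged sketch of calling the OpenAI-compatible endpoint mentioned above (the model id is an assumption; use whatever id your Jan install lists for this model):

```python
# Sketch: querying Jan's local OpenAI-compatible server on port 1337.
# The model id below is an assumption; use the id your Jan install shows.
import requests

resp = requests.post(
    "http://localhost:1337/v1/chat/completions",
    json={
        "model": "trinity-v1",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello from the local server!"},
        ],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```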

# About Jan
Jan believes in the need for an open-source AI ecosystem and is building the infrastructure and tooling that let open-source AI compete on a level playing field with proprietary offerings.

Jan's long-term vision is to build a cognitive framework for future robots that serve as practical, useful assistants for humans and businesses in everyday life.

# Jan Model Merger
This is a test project for merging models.

# Open LLM Leaderboard Evaluation Results

Detailed results can be found [here](https://huggingface.co./datasets/open-llm-leaderboard/details_jan-hq__trinity-v1).

| Metric              | Value |
|---------------------|-------|
| Avg.                | 74.8  |
| ARC (25-shot)       | 72.27 |
| HellaSwag (10-shot) | 88.36 |
| MMLU (5-shot)       | 65.2  |
| TruthfulQA (0-shot) | 69.31 |
| Winogrande (5-shot) | 82    |
| GSM8K (5-shot)      | 71.65 |

# Acknowledgements
- [mergekit](https://github.com/cg123/mergekit)
- [DARE](https://github.com/yule-BUAA/MergeLM/blob/main/README.md)
- [SLERP](https://github.com/Digitous/LLM-SLERP-Merge)
- [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)