Remember when @Microsoft released the Phi-3 models...
Yup, the ones that had Llama 3 8B beat on MMLU using just 3.8B parameters!
Now they are on the LMSYS Chatbot Arena Leaderboard!
Phi-3 Medium (14B) ranks near GPT-3.5-Turbo-0613, but behind Llama 3 8B.
Phi-3 Small (7B) is close to Llama-2-70B and the Mistral fine-tunes.
What about Phi-3 Mini (3.8B), which was giving Llama 3 8B a run for its money on MMLU? It gets an arena score of 1037 (#73) against Llama 3 8B's 1153 (#22).
Looks like there is a trade-off here between perplexity and inherent knowledge!
And Microsoft picked knowledge at the cost of higher perplexity.
Now I am even more intrigued: what is @Meta feeding its Llamas?
Leaderboard: https://chat.lmsys.org/?leaderboard