Suggestion: Support for Multi-Task Fine-Tuning Format

#2
by ramosma0111 - opened

I've noticed that Qwen-Audio cannot handle multiple questions in a single prompt. For instance, when asked, "What does this audio say? And what is the emotion of this sound?", the model answers only the first question and ignores the second. This likely stems from the fine-tuning data format, which does not seem to cover prompts that combine several tasks.
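To make the suggestion concrete, here is a rough sketch of how single-task annotations for one clip could be merged into a multi-question training sample. Everything in it is an assumption for illustration: the `merge_tasks` helper, the `audio`/`prompt`/`response` field names, the file name, and the example answers are all hypothetical and are not Qwen-Audio's actual fine-tuning format.

```python
def merge_tasks(audio_path, qa_pairs):
    """Combine several single-task (question, answer) pairs for one audio
    clip into a single multi-question sample (hypothetical format)."""
    if not qa_pairs:
        raise ValueError("need at least one (question, answer) pair")
    questions = [q.strip() for q, _ in qa_pairs]
    answers = [a.strip() for _, a in qa_pairs]
    # All questions go into one prompt, in order.
    prompt = " ".join(questions)
    # Answers are numbered so each one maps back to its question.
    response = " ".join(f"{i}. {a}" for i, a in enumerate(answers, start=1))
    return {"audio": audio_path, "prompt": prompt, "response": response}


# Hypothetical annotations for a single clip.
sample = merge_tasks(
    "clip_001.wav",
    [
        ("What does this audio say?", "Hello there."),
        ("What is the emotion of this sound?", "The speaker sounds happy."),
    ],
)
print(sample["prompt"])
print(sample["response"])
```

Mixing samples like this into the fine-tuning data alongside the existing single-question ones might teach the model to answer every question in the prompt, though I have not tested this.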
