Ostixe360
/

lp-music-caps

music-captioning

Inference Endpoints

Model card Files Files and versions Community

lp-music-caps / README.md

Ostixe360's picture

Update README.md

ced41e9 verified 5 months ago

|

history blame contribute delete

No virus

2.18 kB

	---
	license: mit
	datasets:
	- seungheondoh/LP-MusicCaps-MSD
	- seungheondoh/LP-MusicCaps-MC
	language:
	- en
	metrics:
	- bleu
	- bertscore
	tags:
	- music
	- music-captioning
	---

	# LP-MusicCaps-HF

	This is the LP-MusicCaps model but loadable by the hf library directly

	# Original Model Card

	- Repository: [LP-MusicCaps repository](https://github.com/seungheondoh/lp-music-caps)
	- Paper: [ArXiv](https://arxiv.org/abs/2307.16372)

	# :sound: LP-MusicCaps: LLM-Based Pseudo Music Captioning

	[![Demo Video](https://i.imgur.com/cgi8NsD.jpg)](https://youtu.be/ezwYVaiC-AM)

	This is a implementation of [LP-MusicCaps: LLM-Based Pseudo Music Captioning](#). This project aims to generate captions for music. 1) Tag-to-Caption: Using existing tags, We leverage the power of OpenAI's GPT-3.5 Turbo API to generate high-quality and contextually relevant captions based on music tag. 2) Audio-to-Caption: Using music-audio and pseudo caption pairs, we train a cross-model encoder-decoder model for end-to-end music captioning

	> [LP-MusicCaps: LLM-Based Pseudo Music Captioning](#)
	> SeungHeon Doh, Keunwoo Choi, Jongpil Lee, Juhan Nam
	> To appear ISMIR 2023


	## TL;DR


	<p align = "center">
	<img src = "https://i.imgur.com/2LC0nT1.png">
	</p>

	- [1.Tag-to-Caption: LLM Captioning](https://github.com/seungheondoh/lp-music-caps/tree/main/lpmc/llm_captioning): Generate caption from given tag input.
	- [2.Pretrain Music Captioning Model](https://github.com/seungheondoh/lp-music-caps/tree/main/lpmc/music_captioning): Generate pseudo caption from given audio.
	- [3.Transfer Music Captioning Model](https://github.com/seungheondoh/lp-music-caps/tree/main/lpmc/music_captioning/transfer.py): Generate human level caption from given audio.

	## Open Source Material

	- [pre-trained models](https://huggingface.co./seungheondoh/lp-music-caps)
	- [music-pseudo caption dataset](https://huggingface.co./datasets/seungheondoh/LP-MusicCaps-MSD)
	- [demo](https://huggingface.co./spaces/seungheondoh/LP-Music-Caps-demo)

	are available online for future research. example of dataset in [notebook](https://github.com/seungheondoh/lp-music-caps/blob/main/notebook/Dataset.ipynb)