---
title: Merged Model Finder Api
emoji: 📊
colorFrom: yellow
colorTo: gray
sdk: docker
pinned: false
license: apache-2.0
---

# HF Merged Models Finder

## Launch the API

To start the API you can use Docker:

```bash
docker build -t merged-models-api . && docker run -dp 127.0.0.1:3000:7860 merged-models-api
```

You can then go to `localhost:3000` to see the API docs.

## Populate the DB

To store the relations between the models (merged models and their base models) we use a lightweight graph database: [Cog DB](https://github.com/arun1729/cog)

The database files are saved in the repo.

To populate the database you can run the following command. Make sure you have created a pyenv environment and installed the required packages. This will fetch the data from the Hugging Face Hub (this takes quite some time).

```bash
pip install -r requirements.txt && python -m app.populate_db
```

## Improvements

- Currently, all the data is hosted in a folder on the repository. This is not scalable and not easy to maintain. An external, more production-ready graph database (like Neo4j) would be preferable. This could easily allow us to handle 100k+ merged models.
- With 100k+ models, we would have to change how the endpoint that returns all the merged models and the one that returns all the base models work. The API would need pagination (on the UI, the models would be fetched while scrolling) and a way to search (for instance, when a user starts typing a model id).
- In this API, we assume the data on the HF Hub is not changing, so keeping the DB up to date is a challenge. One way to address it is a process that runs every few minutes, lists the Y most recently created models (using `lastModified` as the sorting key when fetching the models), and adds them to the graph if they are not already there.
- Fetching the data through the HF Hub API is very slow. Most of the time is spent fetching either the `README.md` or the `mergekit_config.yaml` file. We could improve this with a more asynchronous or parallelized approach.
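
As an illustration of the last point, here is a minimal sketch of a parallelized fetch using a thread pool from the standard library. It is not the code in `app.populate_db`: the names `fetch_many` and `fetch_one` are hypothetical, and the sketch assumes the per-model download can be wrapped in a single function that takes a model id and returns the file content.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Optional


def fetch_many(
    model_ids: Iterable[str],
    fetch_one: Callable[[str], str],
    max_workers: int = 16,
) -> Dict[str, Optional[str]]:
    """Run `fetch_one` for every model id in a thread pool.

    Returns a mapping model_id -> fetched text, or None when the fetch raised.
    `fetch_one` is a hypothetical callable supplied by the caller.
    """
    results: Dict[str, Optional[str]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all downloads up front; the pool caps how many run at once.
        futures = {pool.submit(fetch_one, mid): mid for mid in model_ids}
        for future in as_completed(futures):
            mid = futures[future]
            try:
                results[mid] = future.result()
            except Exception:
                # A single failing repo (missing file, network error)
                # should not abort the whole run.
                results[mid] = None
    return results


if __name__ == "__main__":
    # Stand-in for a real download, so the sketch runs without network access.
    def fake_fetch(model_id: str) -> str:
        if model_id == "org/broken":
            raise FileNotFoundError(model_id)
        return f"config for {model_id}"

    print(fetch_many(["org/a", "org/b", "org/broken"], fake_fetch, max_workers=4))
```

Threads fit here because the work is dominated by waiting on network responses. In practice, `fetch_one` could wrap a real download, for example `huggingface_hub.hf_hub_download` followed by reading the returned file path, and `max_workers` would need to be tuned to stay within the Hub's rate limits.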