sugatoray's Collections
AV LLMs
Updated 8 days ago
A collection of Audio, Video and Visual LLMs.
myshell-ai/OpenVoice • Text-to-Speech • Updated Apr 24 • 384
🤗 OpenVoice • Running • 937
dataautogpt3/ProteusV0.3 • Text-to-Image • Updated Feb 12 • 37.4k • 89
ByteDance/SDXL-Lightning • Text-to-Image • Updated Apr 3 • 76.2k • 1.88k
openai/whisper-large-v3 • Automatic Speech Recognition • Updated Aug 12 • 4.23M • 3.43k
stabilityai/TripoSR • Image-to-3D • Updated Aug 9 • 23.9k • 445
Efficient-Large-Model/VILA-7b • Text Generation • Updated Mar 4 • 890 • 25
google/paligemma-3b-pt-896 • Image-Text-to-Text • Updated Jul 19 • 70.2k • 106
microsoft/Phi-3-vision-128k-instruct • Text Generation • Updated Aug 20 • 114k • 894
stabilityai/stable-audio-open-1.0 • Text-to-Audio • Updated Jul 31 • 19.8k • 866
OpenVLA: An Open-Source Vision-Language-Action Model • Paper • 2406.09246 • Published Jun 13 • 36
aiola/whisper-medusa-v1 • Updated Aug 3 • 502 • 172
merve/idefics3llama-vqav2 • Updated 14 days ago • 8
black-forest-labs/FLUX.1-schnell • Text-to-Image • Updated Aug 16 • 1.01M • 2.37k
😻 Llama3.1 S V0.2 Checkpoint 2024 08 20 • Running on Zero • 98
gpt-omni/mini-omni • Text-to-Speech • Updated 21 days ago • 4 • 355
fishaudio/fish-speech-1.4 • Text-to-Speech • Updated 1 day ago • 5.11k • 349
📲🫴🏻👁 Tonic's GOT OCR • Running on Zero • 133 • GOT - OCR (from: UCAS, Beijing)
stepfun-ai/GOT-OCR2_0 • Image-Text-to-Text • Updated 8 days ago • 134k • 699
apple/coreml-sam2-large • Mask Generation • Updated 12 days ago • 104 • 13
coreml-projects/sam-2-studio • Updated 12 days ago • 11
mistralai/Pixtral-12B-2409 • Updated 8 days ago • 9 • 312