---
pipeline_tag: text-to-image
tags:
- Any2Any
- Image+Text-to-Text
---


**Lumina-mGPT** is a family of multimodal autoregressive models that handle a variety of vision and language tasks and particularly excel at generating flexible, photorealistic images from text descriptions.

[![Lumina-mGPT](https://img.shields.io/badge/Paper-Lumina--mGPT-2b9348.svg?logo=arXiv)](https://arxiv.org/abs/2408.02657)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6358a167f56b03ec9147074d/hgaCZdtmdlCDcZ8tb4Rme.png)

# Usage
The implementation of Lumina-mGPT, along with sampling code, is available in our [GitHub repository](https://github.com/Alpha-VLLM/Lumina-mGPT).