Furkan Gözükara
MonsterMMORPG's activity
Where To Download And Install
You can download our APP from here : https://www.patreon.com/posts/110613301
1-Click to install on Windows, RunPod and Massed Compute
The official APP, where you can try it, is here: fancyfeast/joy-caption-alpha-one
It Has The Following Features
Auto downloads meta-llama/Meta-Llama-3.1-8B into your Hugging Face cache folder and the other necessary models into the installation folder
Use 4-bit quantization: uses 8.5 GB VRAM total (see the loading sketch after this feature list)
Overwrite existing caption file
Append new caption to existing caption
Remove newlines from generated captions
Cut off at last complete sentence
Discard repeating sentences
Don’t save processed image
Caption Prefix
Caption Suffix
Custom System Prompt (Optional)
Input Folder for Batch Processing
Output Folder for Batch Processing (Optional)
Fully supports multi-GPU captioning: GPU IDs (comma-separated, e.g., 0,1,2)
Batch size for batch captioning
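For the curious, below is a minimal sketch of the 4-bit loading step using transformers and bitsandbytes. It assumes only the Llama model id listed above and omits the SigLIP vision encoder and captioning head, so treat it as illustrative rather than the app's actual code.

```python
# Minimal sketch: load Meta-Llama-3.1-8B in 4-bit via bitsandbytes.
# Illustrative only; the real JoyCaption app also wires this LLM to a
# SigLIP vision encoder and a fine-tuned projection layer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit weights keep total VRAM low
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for quality
)

# Weights are downloaded into the Hugging Face cache on first run.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B")
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B",
    quantization_config=quant_config,
    device_map="auto",  # place layers on the available GPU(s)
)
```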
Full article here : https://medium.com/@furkangozukara/multi-gpu-flux-fu
Image 1
Image 1 shows that just the first part of the Kohya GUI installation took 30 minutes on such a powerful machine, on a very expensive Secure Cloud pod (3.28 USD per hour)
There was also a part 2, so the installation alone took a very long time
On Massed Compute, it would take around 2-3 minutes
This is why I suggest using Massed Compute over RunPod; RunPod machines have terrible hard disk speeds, and getting a good one is a lottery
Image 2, 3 and 4
Image 2 shows the speed of our very best FLUX Fine Tuning config (shared below) when doing 2x multi-GPU training
https://www.patreon.com/posts/kohya-flux-fine-112099700
Used config name: Quality_1_27500MB_6_26_Second_IT.json
Image 3 shows the VRAM usage of this config when doing 2x multi-GPU training
Image 4 shows the GPUs of the Pod
Image 5 and 6
Image 5 shows the speed of our very best FLUX Fine Tuning config (shared below) when doing single-GPU training
https://www.patreon.com/posts/kohya-flux-fine-112099700
Used config name: Quality_1_27500MB_6_26_Second_IT.json
Image 6 shows the VRAM usage of this setup
Image 7 and 8
Image 7 shows the speed of our very best FLUX Fine Tuning config (shared below) when doing single-GPU training with Gradient Checkpointing disabled
https://www.patreon.com/posts/kohya-flux-fine-112099700
Used config name: Quality_1_27500MB_6_26_Second_IT.json
Image 8 shows the VRAM usage of this setup
....
Full article posted here : https://medium.com/@furkangozukara/single-block-layer-flux-lora-training-research-results-and-lora-network-alpha-change-impact-with-e713cc89c567
Conclusions
As expected, the fewer parameters you train (e.g., LoRA vs. full Fine Tuning, or single-block LoRA vs. all-blocks LoRA), the more quality is reduced
Of course, you gain some extra VRAM savings and a smaller size on disk
Moreover, fewer parameters reduce both the overfitting and the realism of the FLUX model, so if you are into stylized outputs like comics, it may work better
Furthermore, when you reduce the LoRA Network Rank, keep the original Network Alpha unless you are going to do new Learning Rate research
Finally, the very best quality and the least overfitting are achieved with full Fine Tuning
Full fine tuning configs and instructions > https://www.patreon.com/posts/112099700
Second best is extracting a LoRA from the Fine Tuned model, if you need a LoRA
Check the last columns of Figure 3 and Figure 4: I set the extracted LoRA Strength / Weight to 1.1 instead of 1.0
Extract LoRA guide (public article) : https://www.patreon.com/posts/112335162
Third best is doing an all-layers regular LoRA training
Full guide, configs and instructions > https://www.patreon.com/posts/110879657
And the worst quality comes from training fewer blocks / layers with LoRA
Full configs are included in > https://www.patreon.com/posts/110879657
So how much VRAM and speed does single-block LoRA training save?
All layers, 16-bit: 27700 MB (4.85 second / it); 1 single block: 25800 MB (3.7 second / it)
All layers, 8-bit: 17250 MB (4.85 second / it); 1 single block: 15700 MB (3.8 second / it)
Image Raw Links
Figure 0 : MonsterMMORPG/FLUX-Fine-Tuning-Grid-Tests
Full article is here public post : https://www.patreon.com/posts/112335162
This one was short, so check out the full article (public post)
Conclusions
With the same training dataset (15 images used), the same number of steps (all compared trainings are 150 epochs, thus 2250 steps), and almost the same training duration, Fine Tuning / DreamBooth training of FLUX yields the very best results
So yes, Fine Tuning is much better than LoRA training itself
Amazing resemblance and quality with the least amount of overfitting
Moreover, extracting a LoRA from the Fine Tuned full checkpoint yields way better results than LoRA training itself
Extracting a LoRA from fully trained checkpoints yielded way better results in SD 1.5 and SDXL as well
A comparison of these 3 is made in Image 5 (check the very top of the images)
640 Network Dimension (Rank) FP16 LoRA takes 6.1 GB disk space
You can also try 128 Network Dimension (Rank) FP16 and different LoRA strengths during inference to get closer to the Fine Tuned model
Moreover, you can try the Resize LoRA feature of Kohya GUI, but hopefully that will be another research topic and article later
Image Raw Links
Image 1 : MonsterMMORPG/FLUX-Fine-Tuning-Grid-Tests
Image 2 : MonsterMMORPG/FLUX-Fine-Tuning-Grid-Tests
Image 3 : MonsterMMORPG/FLUX-Fine-Tuning-Grid-Tests
Image 4 : MonsterMMORPG/FLUX-Fine-Tuning-Grid-Tests
Image 5 : MonsterMMORPG/FLUX-Fine-Tuning-Grid-Tests
To a much lesser degree, but I can't say it fully fixes it yet :/
Configs and Full Experiments
Full configs and grid files shared here : https://www.patreon.com/posts/kohya-flux-fine-112099700
Details
I am still rigorously testing different hyperparameters and comparing the impact of each one to find the best workflow
So far I have done 16 different full trainings and am completing 8 more at the moment
I am using my poor, overfit 15-image dataset for experimentation (4th image)
I have already proven that when I use a better dataset it becomes many times better and generates expressions perfectly
Here example case : https://www.reddit.com/r/FluxAI/comments/1ffz9uc/tried_expressions_with_flux_lora_training_with_my/
Conclusions
When the results are analyzed, Fine Tuning is far less overfit, more generalized, and better quality
In the first 2 images, it is able to change hair color and add a beard much better, which means less overfitting
In the third image, you will notice that the armor is much better, thus less overfitting
I noticed that the environment and clothing are much less overfit and better quality
Disadvantages
Kohya still doesn't have FP8 training, thus 24 GB GPUs get a huge speed drop
Moreover, 48 GB GPUs have to use the Fused Back Pass optimization, thus they take some speed drop as well
16 GB GPUs get a far more aggressive speed drop due to the lack of FP8
Clip-L and T5 training is still not supported
Speeds
Rank 1 Fast Config — uses 27.5 GB VRAM, 6.28 second / it (LoRA is 4.85 second / it)
Rank 1 Slower Config — uses 23.1 GB VRAM, 14.12 second / it (LoRA is 4.85 second / it)
Rank 1 Slowest Config — uses 15.5 GB VRAM, 39 second / it (LoRA is 6.05 second / it)
Final Info
Saved checkpoints are FP16 and thus 23.8 GB (no Clip-L or T5 trained)
According to Kohya, the applied optimizations don't change quality, so all configs are ranked as Rank 1 at the moment
I am still testing whether these optimizations have any impact on quality
Detailed Full Workflow
Medium article : https://medium.com/@furkangozukara/ultimate-flux-lora-training-tutorial-windows-and-cloud-deployment-abb72f21cbf8
Windows main tutorial : https://youtu.be/nySGu12Y05k
Cloud tutorial for GPU poor or scaling : https://youtu.be/-uhL2nW7Ddw
Full detailed results and conclusions : https://www.patreon.com/posts/111891669
Full config files and details to train : https://www.patreon.com/posts/110879657
SUPIR Upscaling (default settings are now perfect) : https://youtu.be/OYxVEvDf284
I used my Poco X6 camera phone and solo-taken images
My dataset is far from ready, thus I used many repeating and almost identical images, but this was rather experimental
Hopefully I will continue taking more shots, improve the dataset, and reduce its size in the future
I trained the Clip-L and T5-XXL Text Encoders as well
Since there was so much pushback from the community claiming my workflow wouldn't work with expressions, I had to take a break from research and use whatever I had
I used my own researched workflow for training with Kohya GUI, and also my own self-developed SUPIR app for batch upscaling with face upscaling and automatic LLaVA caption improvement
Download the images to see them in full size; the last provided grid is 50% downscaled
Workflow
Gather a dataset that has the expressions and perspectives you want after training; this is crucial, because whatever you include, it can generate perfectly
Follow one of the LoRA training tutorials / guides
After training your LoRA, use your favorite UI to generate images
I prefer SwarmUI; here are the prompts I used (you can add specific expressions to prompts), including face inpainting:
https://gist.github.com/FurkanGozukara/ce72861e52806c5ea4e8b9c7f4409672
After generating images, use SUPIR to upscale 2x with maximum resemblance
Short Conclusions
Using 256 images certainly caused more overfitting than necessary
...
I have done a total of 104 different LoRA trainings and compared each one to find the very best hyperparameters and workflow for FLUX LoRA training using the Kohya GUI training script.
You can see all the done experiments’ checkpoint names and their repo links in following public post: https://www.patreon.com/posts/110838414
After completing all these FLUX LoRA trainings using the most VRAM-optimal and performant optimizer, Adafactor, I came up with all of the following ranked, ready-to-use configurations.
You can download all the configurations, all research data, installers and instructions at the following link : https://www.patreon.com/posts/110879657
Tutorials
I have also prepared 2 full tutorials. The first tutorial covers how to train and use the best FLUX LoRA locally on your Windows computer: https://youtu.be/nySGu12Y05k
This is the main tutorial that you have to watch without skipping to learn everything. It has a total of 74 chapters and manually written English captions. It is a perfect resource to go from zero to hero in FLUX LoRA training.
The second tutorial I have prepared covers how to train FLUX LoRA on the cloud. This tutorial is extremely important for several reasons. If you don't have a powerful GPU, you can rent a very powerful and very cheap GPU on Massed Compute and RunPod. I prefer Massed Compute since it is faster and cheaper with our special coupon SECourses. Another reason is that in this tutorial video, I have shown in full detail how to train on a multi-GPU setup to scale your training speed. Moreover, I have shown how to upload your checkpoints and files ultra fast to Hugging Face for free saving and transferring. Still, watch the Windows tutorial above first to be able to follow this cloud tutorial: https://youtu.be/-uhL2nW7Ddw
SUPIR was used for upscaling: https://youtu.be/OYxVEvDf284
Experimenting with captions vs. no captions, so we will see which yields the best results for style training on FLUX.
Captions were generated with the multi-GPU batch JoyCaption app.
I am showing 5 examples of what JoyCaption generates on FLUX dev. The left images are the original style images from the dataset.
I used my multi-GPU JoyCaption APP (8x A6000 for ultra-fast captioning): https://www.patreon.com/posts/110613301
I used my Gradio batch caption editor to edit some words and add the activation token ohwx 3d render: https://www.patreon.com/posts/108992085
The no-caption dataset uses only ohwx 3d render as the caption
I am using my newest 4x_GPU_Rank_1_SLOW_Better_Quality.json on 4x A6000 GPUs and training 500 epochs on 114 images: https://www.patreon.com/posts/110879657
The total step count is 500 * 114 / 4 (4x GPU, batch size 1) = 14250 (see the sketch below)
It is currently on track to take 37 hours if I don't terminate early
A checkpoint will be saved once every 25 epochs
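The step-count arithmetic above, as a tiny sketch (a hypothetical helper, not part of Kohya):

```python
# Hypothetical helper reproducing the multi-GPU step-count arithmetic above.
def total_steps(epochs: int, num_images: int, num_gpus: int = 1, batch_size: int = 1) -> int:
    # Each optimizer step consumes num_gpus * batch_size images.
    return (epochs * num_images) // (num_gpus * batch_size)

print(total_steps(500, 114, num_gpus=4, batch_size=1))  # -> 14250
```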
Full Windows Kohya LoRA training tutorial : https://youtu.be/nySGu12Y05k
I am still editing the full cloud tutorial
Hopefully I will share the trained LoRA on Hugging Face and CivitAI along with the full dataset, including captions.
I got permission to share the dataset, but it can't be used commercially.
I will also hopefully share the full workflow on the CivitAI and Hugging Face LoRA pages.
Multi-GPU batch captioning with JoyCaption. JoyCaption uses Meta-Llama-3.1-8B, google/siglip-so400m-patch14-384, and a fine-tuned image-captioning neural network.
Link : https://www.patreon.com/posts/110613301
Link for batch caption editor : https://www.patreon.com/posts/108992085
Coding multi-GPU support in Python with Torch and bitsandbytes was truly a challenge.
Our APP uses the JoyCaption fine-tuned image-captioning model.
Our APP supports bitsandbytes 4-bit model loading as well, even in multi-GPU mode (9.5 GB VRAM)
Tested on 8x RTX A6000 (cloud) and RTX 3090 TI + RTX 3060 (my PC)
1-click to install on Windows, RunPod and Massed Compute
Excellent caption quality; it automatically distributes images to each GPU (see the sketch below) and has lots of features. You can resume captioning with the skip-captioned-images option.
For full details, check out the screenshots
10 GB, 16 GB, 24 GB and 48 GB GPU configs added; sadly, the 10 GB config is around 3x to 5x slower
Massed Compute, RunPod and Windows Kohya SS GUI LoRA installers added to the zip file
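Here is the sketch mentioned above: a minimal illustration of round-robin image distribution across user-selected GPU IDs. caption_image() is a hypothetical stand-in for the real per-image model call, so this shows the distribution idea only, not the app's code.

```python
# Round-robin multi-GPU captioning sketch (illustrative, not the app's code).
from pathlib import Path
import torch.multiprocessing as mp

def worker(gpu_id: int, image_paths: list) -> None:
    # Each process owns one GPU and captions only its shard of the images.
    for path in image_paths:
        caption = caption_image(path, device=f"cuda:{gpu_id}")  # hypothetical helper
        Path(path).with_suffix(".txt").write_text(caption)

if __name__ == "__main__":
    gpu_ids = [0, 1, 2]  # the "GPU IDs (comma-separated)" setting
    images = sorted(str(p) for p in Path("input_folder").glob("*.jpg"))
    shards = [images[i::len(gpu_ids)] for i in range(len(gpu_ids))]  # round-robin split
    ctx = mp.get_context("spawn")  # spawn is required for CUDA in subprocesses
    procs = [ctx.Process(target=worker, args=(g, s)) for g, s in zip(gpu_ids, shards)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```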
Also, right now I am testing a new 16 GB FLUX LoRA training config and a new way of using regularization images. Moreover, I am testing Apply T5 Attention Mask too. Let's see if the Kohya FLUX LoRA workflow will become even better.
Also massive grids comparisons shared here : https://www.reddit.com/r/StableDiffusion/comments/1eyj4b8/kohya_ss_gui_very_easy_flux_lora_trainings_full/
There is this Space, but it looks like it's giving an error: https://huggingface.co./spaces/yuhj95/resshift
Thanks for the comment
Official Repo : https://github.com/zsyOAOA/ResShift
I have developed a very advanced Gradio APP.
Developed APP Scripts and Installers : https://www.patreon.com/posts/110331752
Features
It supports following tasks:
Real-world image super-resolution
Bicubic (resize by Matlab) image super-resolution
Blind Face Restoration
Automatically saves all generated images with the same name + numbering if necessary (see the sketch after this list)
Randomize seed feature for each generation
Batch image processing: give input and output folder paths and it batch processes and saves all images
1-Click to install on Windows, RunPod, Massed Compute and Kaggle (free account)
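To illustrate the "same name + numbering" saving behavior, here is a minimal sketch (illustrative only, not the app's actual code):

```python
# Sketch of "same name + numbering" saving (illustrative, not the app's code).
import os
from PIL import Image

def save_with_numbering(image: Image.Image, out_dir: str, name: str) -> str:
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"{name}.png")
    counter = 1
    while os.path.exists(path):  # add a counter only when the name is taken
        path = os.path.join(out_dir, f"{name}_{counter:04d}.png")
        counter += 1
    image.save(path)
    return path
```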
Windows Requirements
Python 3.10, FFmpeg, Cuda 11.8, C++ tools and Git
If it doesn't work, follow the tutorial below and install everything exactly as shown
https://youtu.be/-NjNy7afOQ0
How to Install on Windows
Make sure that you have the above requirements
Extract the files into a folder like c:/reshift_v1
Double-click Windows_Install.bat and it will automatically install everything for you in an isolated virtual environment folder (VENV)
After that, double-click Windows_Start_app.bat to start the app
The first time you use a task, it will download the necessary models (all under 500 MB) into the correct folders
If a download fails, the file gets corrupted; sadly the app doesn't verify this, so delete the files inside the weights folder and restart
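Since the installer does not verify downloads, a quick manual integrity check like the sketch below can help; the expected hash is an assumption you would fill in per model file.

```python
# Manual integrity check sketch; compare against a hash you trust
# (hypothetical EXPECTED_SHA256 value, not shipped with the app).
import hashlib

def sha256_of(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MB chunks
            h.update(chunk)
    return h.hexdigest()

# print(sha256_of("weights/model.pth") == EXPECTED_SHA256)
```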
How to Install on RunPod, Massed Compute, Kaggle
Follow the Massed_Compute_Instructions_READ.txt and Runpod_Instructions_READ.txt
For Kaggle follow the notebook written steps
An example video of how to use my RunPod and Massed Compute scripts and the Kaggle notebook can be seen here:
https://youtu.be/wG7oPp01COg
AuraSR is a 600M-parameter upscaler model derived from the GigaGAN paper. It works super fast and uses very limited VRAM, below 5 GB. It is a deterministic upscaler. It works perfectly on some images but fails on others, so it is worth giving it a shot.
GitHub official repo : https://github.com/fal-ai/aura-sr
I have developed 1-click installers and a batch upscaler App.
You can download the installers and the advanced batch App from the link below:
https://www.patreon.com/posts/110060645
Check the screenshots and examples below
Windows Requirements
Python 3.10, FFmpeg, Cuda 11.8, C++ tools and Git
If it doesn't work, follow the tutorial below and install everything exactly as shown
https://youtu.be/-NjNy7afOQ0
How to Install and Use on Windows
Extract the attached GigaGAN_Upscaler_v1.zip into a folder like c:/giga_upscale
Then double-click the Windows_Install.bat file to install
It will generate an isolated virtual environment (venv) folder and install the requirements
Then double-click the Windows_Start_App.bat file to start the Gradio App
On first run, it will download the models into your Hugging Face cache folder
Hugging Face cache folder setup is explained below:
https://www.patreon.com/posts/108419878
All upscaled images will be saved into the outputs folder automatically, with the same name plus numbering if necessary
You can also batch upscale a folder
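For reference, a hedged sketch of a folder batch upscale with the aura_sr package; the model id and the from_pretrained / upscale_4x API follow the official README as I recall it, so verify both before relying on this.

```python
# Folder batch-upscale sketch with aura_sr; model id and API are assumptions
# based on the official README, so verify before use.
from pathlib import Path
from PIL import Image
from aura_sr import AuraSR

model = AuraSR.from_pretrained("fal/AuraSR")  # assumed model id; downloads to the HF cache
out_dir = Path("outputs")
out_dir.mkdir(exist_ok=True)
for path in Path("input_folder").glob("*.png"):
    upscaled = model.upscale_4x(Image.open(path))  # deterministic 4x upscale
    upscaled.save(out_dir / path.name)
```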
How to Install and Use on Cloud
Follow Massed Compute and RunPod instructions
Usage is the same as on Windows
For Kaggle, start a Kaggle notebook, import our Kaggle notebook, and follow the instructions
App Screenshots and Examples below
Official repo : https://github.com/ZhengPeng7/BiRefNet
Download APP and installers from : https://www.patreon.com/posts/109913645
Hugging Face Demo : ZhengPeng7/BiRefNet_demo
I have developed a very advanced Gradio APP for this with full, proper file saving and batch processing. My version also removes the background and saves the result with a transparent background.
The APP uses a huge amount of VRAM for high-resolution images. However, it still works uber fast even when using shared VRAM. So make sure that you have plenty of RAM or set virtual RAM.
Click below to see how to set virtual RAM on Windows:
https://www.windowscentral.com/how-change-virtual-memory-size-windows-10
On a Massed Compute A6000 GPU (31 cents per hour) you can remove the backgrounds of even very high-resolution images very fast.
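Under the hood, transparent-background saving boils down to predicting a mask and attaching it as the alpha channel. The sketch below follows the commonly shown BiRefNet usage via transformers remote code; the 1024x1024 size and ImageNet normalization are assumptions to verify against the model card.

```python
# Transparent-background sketch following commonly shown BiRefNet usage;
# resolution and normalization values are assumptions to verify.
import torch
from PIL import Image
from torchvision import transforms
from transformers import AutoModelForImageSegmentation

model = AutoModelForImageSegmentation.from_pretrained(
    "ZhengPeng7/BiRefNet", trust_remote_code=True
).eval().to("cuda")

prep = transforms.Compose([
    transforms.Resize((1024, 1024)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

image = Image.open("photo.jpg").convert("RGB")
with torch.no_grad():
    preds = model(prep(image).unsqueeze(0).to("cuda"))[-1].sigmoid().cpu()
mask = transforms.ToPILImage()(preds[0].squeeze()).resize(image.size)
image.putalpha(mask)           # the predicted mask becomes the alpha channel
image.save("photo_no_bg.png")  # PNG preserves transparency
```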
Currently we have 1 click installers for RunPod, Massed Compute, Kaggle and Windows.
Windows Requirements
Python 3.10, FFmpeg, Cuda 11.8, C++ tools and Git
If it doesn't work, follow the tutorial below and install everything exactly as shown
https://youtu.be/-NjNy7afOQ0
How To Use On Windows
Just extract the files into a folder like c:/BiRefNet_v1
Double-click the Windows_Install.bat file; it will generate an isolated virtual environment and install the requirements
It will automatically download the models into your Hugging Face cache (the best model is under 1 GB)
Then start and use the Gradio APP with Windows_Start_App.bat
How To Use On Cloud
Massed Compute and RunPod have instruction txt files; follow them
Kaggle has all the instructions step by step
On Kaggle, set the resolution to 1024x1024 or you will get an out-of-memory error
Amazing, I was planning to make a Gradio app for this
Nice, thanks!
Awesome, added to my scripts-to-make list
Animals Live animation added
All of the main repo changes and improvements have been added to our modified and improved app
Link : https://patreon.com/posts/107609670
Works perfect on Massed Compute, RunPod, free Kaggle account and Windows
1-Click to install with instructions
All tested and verified
Windows tutorial : https://youtu.be/FPtpNrmuwXk
Cloud (RunPod, Massed Compute & free Kaggle account) tutorial : https://youtu.be/wG7oPp01COg
Getting the XPose / UniPose ops library to compile was a challenge on Massed Compute and Kaggle.
Why is this not a single file? I was going to test it in SwarmUI :/
Nice, but still reduced quality :/ I should compare with dev at 20 steps
🔗 Comprehensive Tutorial Video Link ▶️ https://youtu.be/bupRePUOA18
FLUX represents a milestone in open-source txt2img technology, delivering superior quality and more accurate prompt adherence than #Midjourney, Adobe Firefly, Leonardo AI, Playground AI, Stable Diffusion, SDXL, SD3, and DALL-E 3. #FLUX, a creation of Black Forest Labs, boasts a team largely composed of #StableDiffusion's original developers, and its output quality is truly remarkable. This statement is not hyperbole; you'll witness its capabilities in the tutorial. This guide will demonstrate how to effortlessly install and utilize FLUX models on your personal computer and on cloud platforms like Massed Compute, RunPod, and a complimentary Kaggle account.
🔗 FLUX Setup Guide (publicly accessible) ⤵️
▶️ https://www.patreon.com/posts/106135985
🔗 FLUX Models One-Click Robust Automatic Downloader Scripts ⤵️
▶️ https://www.patreon.com/posts/109289967
🔗 Primary Windows SwarmUI Tutorial (Essential for Usage Instructions) ⤵️
▶️ https://youtu.be/HKX8_F1Er_w
🔗 Cloud-based SwarmUI Tutorial (Massed Compute - RunPod - Kaggle) ⤵️
▶️ https://youtu.be/XFUZof6Skkw
🔗 SECourses Discord Server for Comprehensive Support ⤵️
▶️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388
🔗 SECourses Reddit Community ⤵️
▶️ https://www.reddit.com/r/SECourses/
🔗 SECourses GitHub Repository ⤵️
▶️ https://github.com/FurkanGozukara/Stable-Diffusion
🔗 Official FLUX 1 Launch Announcement Blog Post ⤵️
▶️ https://blackforestlabs.ai/announcing-black-forest-labs/
Video Segments
0:00 Introduction to the state-of-the-art open source txt2img model FLUX
5:01 Process for integrating FLUX model into SwarmUI
....
Check the file names in the imgsli link below to see all details
SwarmUI on an L40S is used to compare: 1.82 it/s step speed for 1024x1024
imgsli link that compares all : https://imgsli.com/MjgzNzM1
SwarmUI full tutorial public post : https://www.patreon.com/posts/106135985
1-Click FLUX models downloader scripts for Windows, RunPod and Massed Compute are in below post
https://www.patreon.com/posts/109289967
Free Kaggle account notebook that already supports FLUX; download from here: https://www.patreon.com/posts/106650931
prompt :
(medium full shot) of (awe-inspiring snake) with muscular body, amber eyes, bronze brown armored scales, venomous fangs, coiling tail, gemstone-studded scales frills, set in a barren desert wasteland, with cracked earth and the remains of ancient structures, a place of mystery and danger, at dawn, ,Masterpiece,best quality, raw photo, realistic, very aesthetic, dark
CFG 1 - seed 1 - FLUX CFG is default : 3.5
Full public SwarmUI tutorial
Zero to Hero Stable Diffusion 3 Tutorial with Amazing SwarmUI SD Web UI that Utilizes ComfyUI
https://youtu.be/HKX8_F1Er_w
Full public Cloud SwarmUI tutorial
How to Use SwarmUI & Stable Diffusion 3 on Cloud Services Kaggle (free), Massed Compute & RunPod
https://youtu.be/XFUZof6Skkw
It's true: it also happened to a video I created from a photo of my wife. But besides becoming a little more Asian-looking, she also grew vampire teeth
Totally related to the dataset; their dataset is very likely unbalanced
haha this is interesting :D
You have probably seen those mind-blowing AI-made videos. And the day has arrived: the famous Kling AI is now available worldwide for free. In this tutorial video I will show you how to register for Kling AI for free with just an email and use its mind-blowing text-to-video animation, image-to-video animation, text-to-image, and image-to-image capabilities. This video shows non-cherry-picked results so you will know the actual quality and capability of the model, unlike those extremely cherry-picked example demos. Still, #KlingAI is the only #AI model that competes with OpenAI's #SORA, and it is actually available to use.
🔗 Kling AI Official Website ⤵️
▶️ https://www.klingai.com/
🔗 SECourses Discord Channel to Get Full Support ⤵️
▶️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388
🔗 Our GitHub Repository ⤵️
▶️ https://github.com/FurkanGozukara/Stable-Diffusion
🔗 Our Reddit ⤵️
▶️ https://www.reddit.com/r/SECourses/
Gradio is amazing for developing such apps in a short time. I used Claude 3.5 to develop it :)
Scripts are available here : https://www.patreon.com/posts/108992085
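As a flavor of how little Gradio code such a tool needs, here is a toy sketch of a batch caption prefixer; it is illustrative only, and the real editor has many more features.

```python
# Toy Gradio sketch: prepend a token (e.g. an activation token) to every
# .txt caption in a folder. Illustrative only, not the real app.
from pathlib import Path
import gradio as gr

def add_prefix(folder: str, prefix: str) -> str:
    files = list(Path(folder).glob("*.txt"))
    for f in files:
        f.write_text(f"{prefix} {f.read_text()}")
    return f"Updated {len(files)} caption files"

gr.Interface(add_prefix, inputs=["text", "text"], outputs="text").launch()
```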
The Massed Compute coupon we have is available indefinitely; that is what they told me, and it is still working
For this task, LivePortrait is currently the best
A new tutorial is anticipated to showcase the latest changes and features in V3, including Video-to-Video capabilities and additional enhancements.
This post provides information for both Windows (local) and Cloud installations (Massed Compute, RunPod, and free Kaggle Account).
🔗 Windows Local Installation Tutorial ️⤵️
▶️ https://youtu.be/FPtpNrmuwXk
🔗 Cloud (no-GPU) Installations Tutorial for Massed Compute, RunPod and free Kaggle Account ️⤵️
▶️ https://youtu.be/wG7oPp01COg
The V3 update introduces video-to-video functionality. If you're seeking a one-click installation method for LivePortrait, an open-source zero-shot image-to-animation application on Windows, for local use, this tutorial is essential. It introduces the cutting-edge image-to-animation open-source generator Live Portrait. Simply provide a static image and a driving video to create an impressive animation in seconds. LivePortrait is incredibly fast and adept at preserving facial expressions from the input video. The results are truly astonishing.
With the V3 update adding video-to-video functionality, those interested in using LivePortrait but lacking a powerful GPU, using a Mac, or preferring cloud-based solutions will find this tutorial invaluable. It guides you through the one-click installation and usage of LivePortrait on #MassedCompute, #RunPod, and even a free #Kaggle account. After following this tutorial, you'll find running LivePortrait on cloud services as straightforward as running it locally. LivePortrait is the latest state-of-the-art static image to talking animation generator, surpassing even paid services in both speed and quality.
This is a kind of announcement share. I think the offer looks decent.
For those who don't know, our channel is here: https://discord.com/servers/software-engineering-courses-secourses-772774097734074388
The job offer is in the ai-related-job-offers sub-channel
I know there are a lot of experts around here who can install and use it easily. But I have prepared solid tutorials for newbies and shown how to use this amazing top-quality app, LivePortrait. I have to say congrats to the developers.
I am also a researcher (you can see my LinkedIn profile here), but recently I have shifted into AI lectures: https://www.linkedin.com/in/furkangozukara/
Both the Windows and Cloud tutorials have manually written (100% accurate) captions / subtitles. Both also have very detailed video chapters, manually written by me.
Windows LivePortrait Tutorial : https://youtu.be/FPtpNrmuwXk
Cloud LivePortrait Tutorial : Massed Compute, RunPod & Kaggle : https://youtu.be/wG7oPp01COg
## Windows LivePortrait Tutorial Video Chapters
- 0:00 Introduction to LivePortrait: A cutting-edge open-source application for image-to-animation conversion
- 2:20 Step-by-step guide for downloading and installing the LivePortrait Gradio application on your device
- 3:27 System requirements and installation process for LivePortrait
- 4:07 Verifying the successful installation of required components
- 5:02 Confirming installation completion and preserving installation logs
- 5:37 Initiating the LivePortrait application post-installation
....
Tutorial link : https://youtu.be/XFUZof6Skkw
It has manually written captions / subtitles and also video chapters.
If you are GPU poor, this is the video you need
In this video, I demonstrate how to install and use #SwarmUI on cloud services. If you lack a powerful GPU or wish to harness more GPU power, this video is essential. You'll learn how to install and utilize SwarmUI, one of the most powerful Generative AI interfaces, on Massed Compute, RunPod, and Kaggle (which offers free dual T4 GPU access for 30 hours weekly). This tutorial will enable you to use SwarmUI on cloud GPU providers as easily and efficiently as on your local PC. Moreover, I will show how to use Stable Diffusion 3 (#SD3) on cloud. SwarmUI uses #ComfyUI backend.
🔗 The Public Post (no login or account required) Shown In The Video With The Links ➡️ https://www.patreon.com/posts/stableswarmui-3-106135985
🔗 Windows Tutorial to Learn How to Use SwarmUI ➡️ https://youtu.be/HKX8_F1Er_w
🔗 How to download models very fast to Massed Compute, RunPod and Kaggle and how to upload models or files to Hugging Face very fast tutorial ➡️ https://youtu.be/X5WVZ0NMaTg
🔗 SECourses Discord ➡️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388
🔗 Stable Diffusion GitHub Repo (Please Star, Fork and Watch) ➡️ https://github.com/FurkanGozukara/Stable-Diffusion
Coupon Code for Massed Compute : SECourses
Coupon works on Alt Config RTX A6000 and also RTX A6000 GPUs
https://youtu.be/HKX8_F1Er_w
Do not skip any part of this tutorial to master how to use Stable Diffusion 3 (SD3) with the most advanced open-source generative AI APP, SwarmUI. Automatic1111 SD Web UI and Fooocus do not support #SD3 yet. Therefore, I am starting to make tutorials for SwarmUI as well. #StableSwarmUI is officially developed by StabilityAI, and your mind will be blown after you watch this tutorial and learn its amazing features. StableSwarmUI uses #ComfyUI as the back end, thus it has all the good features of ComfyUI while bringing you the easy-to-use features of the Automatic1111 #StableDiffusion Web UI. I really like SwarmUI and am planning to do more tutorials for it.
🔗 The Public Post (no login or account required) Shown In The Video With The Links ➡️ https://www.patreon.com/posts/stableswarmui-3-106135985
0:00 Introduction to the Stable Diffusion 3 (SD3) and SwarmUI and what is in the tutorial
4:12 Architecture and features of SD3
5:05 What each of the different Stable Diffusion 3 model files means
6:26 How to download and install SwarmUI on Windows for SD3 and all other Stable Diffusion models
8:42 What kind of folder path you should use when installing SwarmUI
10:28 How to notice and fix an installation error if you get one
11:49 Installation has been completed and now how to start using SwarmUI
12:29 Which settings I change before starting to use SwarmUI and how to change your theme (dark, white, gray)
12:56 How to make SwarmUI save generated images as PNG
13:08 How to find the description of each setting and configuration
13:28 How to download the SD3 model and start using it on Windows
13:38 How to use the model downloader utility of SwarmUI
14:17 How to set models folder paths and link your existing models folders in SwarmUI
14:35 Explanation of Root folder path in SwarmUI
14:52 Do we need to download the VAE of SD3?
Full Windows YouTube Tutorial : https://youtu.be/xLqDTVWUSec
Ever wished your static images could talk like magic? Meet V-Express, the groundbreaking open-source and free tool that breathes life into your photos! Whether you have an audio clip or a video, V-Express animates your images to create stunning talking avatars. Just like the acclaimed D-ID Avatar, Wav2Lip, and Avatarify, V-Express turns your still photos into dynamic, speaking personas, but with a twist—it's completely open-source and free to use! With seamless audio integration and the ability to mimic video expressions, V-Express offers an unparalleled experience without any cost or restrictions. Experience the future of digital avatars today—let's dive into how you can get started with V-Express and watch your images come alive!
1-Click V-Express Installers Scripts ⤵️
https://www.patreon.com/posts/105251204
Requirements Step by Step Tutorial ⤵️
https://youtu.be/-NjNy7afOQ0
Official V-Express GitHub Repository, Free To Install and Use ⤵️
https://github.com/tencent-ailab/V-Express
SECourses Discord Channel to Get Full Support ⤵️
https://discord.com/servers/software-engineering-courses-secourses-772774097734074388
It is open source; you can easily install it by following the GitHub instructions
1-Click Rope Installer Scripts (covers both Windows, in an isolated Python VENV, and Massed Compute cloud, no GPU needed) ⤵️
https://www.patreon.com/posts/most-advanced-1-105123768
Tutorials are made only for educational purposes. On a cloud Massed Compute machine, you can run with a staggering 20 threads and face-swap entire movies. It fully supports face tracking and multiple face changes.
Mind-Blowing Deepfake Tutorial: Turn Anyone into Your Fav Movie Star! Better than Roop & Face Fusion ⤵️
https://youtu.be/RdWKOUlenaY
Best Deepfake Open Source App ROPE — So Easy To Use Full HD FaceSwap DeepFace, No GPU Required Cloud ⤵️
https://youtu.be/HLWLSszHwEc
It generates a VENV and installs everything inside it. Works with Python 3.10.x; I suggest 3.10.11
You also need C++ tools and Git. You can follow this tutorial to install everything: https://youtu.be/-NjNy7afOQ0
Updated 27 May 2024 : https://www.patreon.com/posts/95759342
21 January 2024 Update
SDXL model upgraded to ip-adapter-faceid-plusv2_sd15
Kaggle Notebook upgraded to V3 and supports SDXL now
First of all I want to thank you so much for this amazing model.
I spent over 1 week coding the Gradio app and preparing the video. I hope you let this thread remain and even add it to the Readme file.
After the video was published I even added a face embedding caching mechanism. Now it calculates the face embedding vector only once for each image, which greatly speeds up image generation.
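The caching idea is simple; here is an illustrative sketch where get_face_embedding() is a hypothetical stand-in for the actual InsightFace embedding call:

```python
# Illustrative cache sketch; get_face_embedding() is a hypothetical stand-in
# for the actual InsightFace embedding call used by the app.
import hashlib

_embedding_cache = {}

def cached_embedding(image_path: str):
    with open(image_path, "rb") as f:
        key = hashlib.md5(f.read()).hexdigest()  # content-based cache key
    if key not in _embedding_cache:              # compute only once per image
        _embedding_cache[key] = get_face_embedding(image_path)  # hypothetical
    return _embedding_cache[key]
```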
Instantly Transfer Face By Using IP-Adapter-FaceID: Full Tutorial & GUI For Windows, RunPod & Kaggle : https://youtu.be/rjXsJ24kQQg
chapters are like below
0:00 Introduction to IP-Adapter-FaceID full tutorial
2:19 Requirements to use IP-Adapter-FaceID gradio Web APP
2:45 Where the Hugging Face models are downloaded by default on Windows
3:12 How to change folder path where the Hugging Face models are downloaded and cached
3:39 How to install IP-Adapter-FaceID Gradio Web APP and use on Windows
5:35 How to start the IP-Adapter-FaceID Web UI after the installation
5:46 How to use Stable Diffusion XL (SDXL) models with IP-Adapter-FaceID
5:56 How to select your input face and start generating 0-shot face transferred new amazing images
6:06 Explanation of what each option on the Web UI does
It works, @MonsterMMORPG !
https://huggingface.co./spaces/Fabrice-TIERCELIN/SUPIR
SUPIR is now available on HuggingFace 🙂 I have disabled LLaVa because there is still an error with it. I will try to fix it in the future. Add links everywhere!
congrats
OK, so I have created a template Space. Of course it doesn't work by itself because it runs on a CPU, but people can duplicate it on a GPU. It should work, but I can only test the interface. I say that they need 60 GB VRAM; correct me if that's wrong. I will wait for feedback.
Our app works with 29 GB RAM on Kaggle
Can't tell others
@Fabrice-TIERCELIN
We have a working Kaggle notebook
We also have installers for RunPod and Massed Compute
Stable Cascade is another amazing model from Stability AI
Weights are published
Stable Cascade Full Tutorial for Windows — Predecessor of SD3 — 1-Click Install Amazing Gradio APP : https://youtu.be/q0cYhalUUsc
Stable Cascade Full Tutorial for Cloud — Predecessor of SD3 — Massed Compute, RunPod & Kaggle : https://youtu.be/PKDeMdEObNo
Sadly I can't for this one. I also don't know it, and it requires a good GPU
I have prepared installer scripts and full tutorials for Windows (requires a GPU with at least 8 GB VRAM), Massed Compute (I suggest this if you don't have a strong GPU), RunPod, and a free Kaggle account (works perfectly as well, but slow).
Windows Tutorial : https://youtu.be/m4pcIeAVQD0
Cloud (Massed Compute, RunPod & Kaggle) Tutorial : https://youtu.be/LeHfgq_lAXU
In this video, I explain how to 1-click install and use the most advanced image upscaler / enhancer in the world that is available both commercially and as open source. The upscaler I am going to introduce to you is the open-source #SUPIR, and the model is free to use. The SUPIR upscaler is many times better than both the paid Topaz AI and Magnific AI, and you can use this upscaler on your computer for free forever. The difference between SUPIR and #Topaz / #Magnific is like ages. So in this tutorial you are going to learn everything about how to install, update, and use the SUPIR upscaler on your personal computer. The video shows Windows, but it works perfectly fine on Linux as well.
Scripts Download Link ⤵️
https://www.patreon.com/posts/99176057
Samplers and Text CFG (Text Guidance Scale) Comparison Link ⤵️
https://imgsli.com/MjU2ODQz/2/1
How to install accurate Python, Git and FFmpeg on Windows Tutorial ⤵️
https://youtu.be/-NjNy7afOQ0
Full DreamBooth / Fine-tuning Tutorial ⤵️
https://youtu.be/0t5l6CP9eBg
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild : https://arxiv.org/abs/2401.13627
The authors introduce SUPIR (Scaling-UP Image Restoration), a groundbreaking image restoration method that harnesses generative prior and the power of model scaling. Leveraging multi-modal techniques and an advanced generative prior, SUPIR marks a significant advance in intelligent and realistic image restoration. As a pivotal catalyst within SUPIR, model scaling dramatically enhances its capabilities and demonstrates new potential for image restoration. The authors collected a dataset comprising 20 million high-resolution, high-quality images for model training, each enriched with descriptive text annotations. SUPIR can restore images guided by textual prompts, broadening its application scope and potential.
The tutorial is literally over 2 hours, with manually fixed captions and perfect video chapters.
Most Awaited Full Fine Tuning (with DreamBooth effect) Tutorial Generated Images - Full Workflow Shared In The Comments - NO Paywall This Time - Explained OneTrainer - Cumulative Experience of 16 Months Stable Diffusion
In this tutorial, I am going to show you how to install OneTrainer from scratch on your computer and do Stable Diffusion SDXL (full fine-tuning, 10.3 GB VRAM) and SD 1.5 (full fine-tuning, 7 GB VRAM) based model training on your computer, and also do the same training on a very cheap cloud machine from MassedCompute if you don't have such a computer.
Tutorial Readme File ⤵️
https://github.com/FurkanGozukara/Stable-Diffusion/blob/main/Tutorials/OneTrainer-Master-SD-1_5-SDXL-Windows-Cloud-Tutorial.md
Register Massed Compute From Below Link (could be necessary to use our Special Coupon for A6000 GPU for 31 cents per hour) ⤵️
https://bit.ly/Furkan-Gözükara
Coupon Code for A6000 GPU is : SECourses
0:00 Introduction to Zero-to-Hero Stable Diffusion (SD) Fine-Tuning with OneTrainer (OT) tutorial
3:54 Intro to instructions GitHub readme
4:32 How to register Massed Compute (MC) and start virtual machine (VM)
5:48 Which template to choose on MC
6:36 How to apply MC coupon
8:41 How to install OT on your computer to train
9:15 How to verify your Python, Git, FFmpeg and Git installation
12:00 How to install ThinLinc and start using your MC VM
12:26 How to setup folder synchronization and file sharing between your computer and MC VM
13:56 End existing session in ThinClient
14:06 How to turn off MC VM
14:24 How to connect and start using VM
14:41 When use end existing session
16:38 How to download very best OT preset training configuration for SD 1.5 & SDXL models
18:00 How to load configuration preset
18:38 Full explanation of OT configuration and best hyper parameters for SDXL
...
thanks a lot
Sadly, the post character count is limited, so please read the full info on Medium here:
https://medium.com/@furkangozukara/compared-effect-of-image-captioning-for-sdxl-fine-tuning-dreambooth-training-for-a-single-person-961087e42334
I did over 100 trainings empirically to find the best hyperparameters. Training U-NET + Text Encoder 1 yields better results than training only the U-NET @researcher171473
Full config and instructions are shared here : https://www.patreon.com/posts/96028218
Used SG161222/RealVisXL_V4.0 as a base model and OneTrainer to train on Windows 10 : https://github.com/Nerogar/OneTrainer
The posted example x/y/z checkpoint comparison images are not cherry-picked. Still, with multiple tries I can get perfect images.
Trained for 150 epochs with 15 images, using my ground-truth 5200 regularization images: https://www.patreon.com/posts/massive-4k-woman-87700469
In each epoch, only 15 of the regularization images are used to create the DreamBooth training effect
As the caption, only “ohwx man” is used; for regularization images, just “man”
You can download configs and full instructions here : https://www.patreon.com/posts/96028218
Hopefully a full public tutorial is coming within 2 weeks. I will show all the configuration as well
The tutorial will be on our channel : https://www.youtube.com/SECourses
Training speeds, and thus durations, are as below (see the sketch after this list):
RTX 3060, slow preset: 3.72 second/it; 15 train images × 150 epochs × 2 (reg images concept) = 4500 steps, so 4500 * 3.72 / 3600 ≈ 4.6 hours
RTX 3090 TI, slow preset: 1.58 second/it, thus 4500 * 1.58 / 3600 ≈ 2 hours
RTX 3090 TI, fast preset: 1.45 second/it, thus 4500 * 1.45 / 3600 ≈ 1.8 hours
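The same duration arithmetic as a tiny sketch (a hypothetical helper, not part of OneTrainer):

```python
# Hypothetical helper reproducing the duration arithmetic above.
def train_hours(sec_per_it: float, images: int = 15, epochs: int = 150, concepts: int = 2) -> float:
    steps = images * epochs * concepts  # 15 * 150 * 2 = 4500 steps here
    return steps * sec_per_it / 3600

print(f"{train_hours(3.72):.2f} h")  # RTX 3060, slow preset -> 4.65 h
```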
A quick tutorial for how to use concepts in OneTrainer : https://youtu.be/yPOadldf6bI
100%, this is next level. Thanks for the comment @ajibawa-2023
@ameerazam08 100%. I am talking with the original developers about CPU offloading too; hopefully they will add it.
This model is simply mind-blowing. At the bottom of this post, you will see side-by-side comparisons of SUPIR versus the extremely expensive online service Magnific AI. Magnific is known to be the best in the community. However, SUPIR is by far superior. SUPIR also significantly outperforms Topaz AI upscaling. SUPIR manages to remain almost 100% faithful to the original image while adding details and achieving super upscaling with the best realism.
You can read the full blog post here : https://huggingface.co./blog/MonsterMMORPG/supir-sota-image-upscale-better-than-magnific-ai
SD3 can follow prompts many times better than SD 1.5 or SDXL. It is even better than DALL-E 3 at following text / spelling prompts.
The realism of SD3 can't even be compared with DALL-E 3, since every DALL-E 3 output looks like a digital render.
I can't wait to get approved for the Stability AI early preview program to do more intensive testing.
Some people say to be skeptical about cherry-picking. I agree, but I hope these images released by Stability AI are not that heavily cherry-picked.
You can see an SD3 vs DALL-E 3 comparison here: https://youtu.be/DJxodszsERo
You can read on:
Patreon (public) : https://www.patreon.com/posts/how-to-deploy-on-97919576
Medium (public) : https://medium.com/@furkangozukara/how-to-deploy-a-pod-on-runpod-and-verify-it-is-working-20e47031c0b5
CivitAI (public) : https://civitai.com/articles/3994/how-to-deploy-a-pod-on-runpod-and-verify-it-is-working
LinkedIn (public) : https://www.linkedin.com/pulse/how-deploy-pod-runpod-verify-working-furkan-g%2525C3%2525B6z%2525C3%2525BCkara-lgplf%3FtrackingId=EuNOjpKCSQ%252BVfpiQV3D6KQ%253D%253D/?trackingId=EuNOjpKCSQ%2BVfpiQV3D6KQ%3D%3D
Dev.to (public) : https://dev.to/furkangozukara/how-to-deploy-a-pod-on-runpod-and-verify-it-is-working-3pop