Summary
Project Name | Purpose | Type | Diffusers? | ComfyUI? |
---|---|---|---|---|
Instant Family | Identity-picture to picture | ID Prompt | ||
Stylus Diffusion | Selects LoRAs from a database | Meta-System | ||
Align Your Steps | Optimal sampling schedule; better images and better prompt adherence | Scheduler | ✅ | ✅ |
HiDiffusion | Faster inference, better images at 2048px and 4096px | Model-Modifier | ✅ | ✅ |
Hyper-SD | Runs in only 1 to 8 steps; uses the TCD scheduler | LoRA Scheduler | ✅ | ✅ |
VideoGigaGAN | Upscale videos | Model | ||
PanFusion | 360-degree panoramic image generation | Model | ||
TCD (Trajectory Consistent Distillation) | Turbo Scheduler | Scheduler | ✅ | ✅ |
Stable Diffusion 3 | Text-To-Image Model | Model | ||
PhotoMaker | Uses its own model to process images; adds identity to existing models | Model, Model-Modifier | ||
GigaGAN | Text to image | Model | ||
Instant Family
Description: Specify faces of people, and generate images with all of their faces present.
- Date: May 2024
- Authors: Chanran Kim, Jeongin Lee, Shichang Joung, Bongmo Kim, Yeul-Min Baek
- Paper: https://arxiv.org/abs/2404.19427
- Code: No weights or code available yet; possibly coming soon.
Story Diffusion
Description: Uses a consistent self-attention technique to keep characters consistent across a sequence of generated images.
- Date: May 2024
- Authors: ByteDance
- Paper: https://arxiv.org/abs/2405.01434
- Info: https://storydiffusion.github.io/
- Code: Not yet evaluated; I'm not sure how to run it.
Stylus
Description: When the user types a prompt, this meta-expert-system selects from a list of available LoRAs to help it better fulfill the user's request. It automatically mixes these in.
- Authors: Carnegie Mellon, UC Berkeley, Google DeepMind
- Info and Paper: https://stylus-diffusion.github.io
- Code: Nothing yet
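Since no code is available, the select-and-mix idea can only be illustrated with a toy. The sketch below ranks LoRAs by simple keyword overlap with the prompt and turns the scores into mixing weights. This is not the method from the Stylus paper; the function `select_loras`, the catalog entries, and the tag sets are all hypothetical and exist only for illustration.

```python
def select_loras(prompt, catalog, top_k=2):
    """Pick up to top_k LoRAs whose tags overlap the prompt's words.

    Returns a dict mapping LoRA name to a mixing weight; weights sum to 1.
    Toy keyword matching only, NOT the retrieval method used by Stylus.
    """
    words = set(prompt.lower().split())
    scored = []
    for name, tags in catalog.items():
        overlap = len(words & set(tags))
        if overlap:
            scored.append((overlap, name))
    # Highest overlap first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    chosen = scored[:top_k]
    total = sum(score for score, _ in chosen)
    return {name: score / total for score, name in chosen}


# Hypothetical LoRA catalog: name -> descriptive tags.
CATALOG = {
    "watercolor-style": {"watercolor", "painting"},
    "pixel-art": {"pixel", "retro", "8-bit"},
    "castle-detail": {"castle", "fortress"},
}

weights = select_loras("a watercolor painting of a castle", CATALOG)
```

In a real system the resulting weights would be passed to whatever mechanism loads and blends the LoRAs; that part is left out here because the source does not describe it.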
Align Your Steps
Description: Nvidia's mathematical/theoretical analysis to find the optimal denoising schedule for diffusion models, resulting in better-quality images and prompt adherence.
- Date: April 2024
- Authors: Nvidia
- Info: https://research.nvidia.com/labs/toronto-ai/AlignYourSteps
- Library Used: None
- Code: None
- ComfyUI Implementations: Built into ComfyUI core. An 'align your steps' node outputs the schedule as an array of numbers (the noise levels to use at each step).
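Because the schedule is just an array of decreasing noise levels, it can be resampled when a different step count is wanted. One common approach is to interpolate linearly in log space. The sketch below shows that idea in plain Python; the function name `loglinear_resample` and the schedule values are hypothetical placeholders, not the published Align Your Steps numbers.

```python
import math


def loglinear_resample(schedule, num_points):
    """Resample a list of positive noise levels to num_points values,
    interpolating linearly between neighbors in log space."""
    if num_points < 2 or len(schedule) < 2:
        raise ValueError("need at least two points in and out")
    logs = [math.log(s) for s in schedule]
    last = len(logs) - 1
    out = []
    for i in range(num_points):
        pos = i * last / (num_points - 1)  # fractional index into schedule
        lo = min(int(pos), last - 1)       # left neighbor, clamped at the end
        frac = pos - lo
        out.append(math.exp(logs[lo] + frac * (logs[lo + 1] - logs[lo])))
    return out


# Placeholder values for illustration only (NOT the real AYS schedule).
EXAMPLE_SCHEDULE = [10.0, 5.0, 2.0, 1.0, 0.5, 0.1]

resampled = loglinear_resample(EXAMPLE_SCHEDULE, 11)
```

The endpoints of the original schedule are preserved, and every original value reappears at every second position when going from 6 to 11 points.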
HiDiffusion
Description: Modifies existing Stable Diffusion models to generate higher-resolution images (2048px or 4096px) directly, without duplication artifacts, and provides a speed improvement.
- Date: April 2024
- Authors: MEGVII Technology
- Info and Paper: https://hidiffusion.github.io/
- Library Used: PyPI package for diffusers
- Code: Takes a diffusers pipeline and modifies it with one line of code. It works on SDXL, SDXL Turbo, SD2, and SD1.
- ComfyUI Implementations:
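For the diffusers route described under Code above, a minimal sketch might look like the following. It assumes the `hidiffusion` package is installed alongside `torch` and `diffusers`, that a CUDA GPU is available, and that the SDXL base checkpoint id shown is the one you want; the wrapper names `build_hidiffusion_pipeline` and `generate_highres` are my own. The heavy imports sit inside the function so the file can be imported on a machine without those packages.

```python
def build_hidiffusion_pipeline(
    model_id="stabilityai/stable-diffusion-xl-base-1.0",  # assumed checkpoint
    device="cuda",
):
    """Load an SDXL pipeline and patch it with HiDiffusion."""
    # Imported lazily: these need torch, diffusers, and hidiffusion installed.
    import torch
    from diffusers import DDIMScheduler, StableDiffusionXLPipeline
    from hidiffusion import apply_hidiffusion

    scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")
    pipe = StableDiffusionXLPipeline.from_pretrained(
        model_id,
        scheduler=scheduler,
        torch_dtype=torch.float16,
        variant="fp16",
    ).to(device)

    # The single line that modifies the pipeline for high-resolution output.
    apply_hidiffusion(pipe)
    return pipe


def generate_highres(pipe, prompt, size=2048):
    """Generate one square image at the requested resolution."""
    return pipe(prompt, height=size, width=size, guidance_scale=7.5).images[0]


# Usage (requires a GPU and downloads several GB of weights):
#   pipe = build_hidiffusion_pipeline()
#   generate_highres(pipe, "a lighthouse at sunset").save("lighthouse.png")
```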
Hyper-SD
Description: Provided as a LoRA add-on for SD1 and SDXL; enables these models to run in 1 to 8 steps, greatly reducing inference time.
- Date: April 2024
- Authors: ByteDance
- Info and Paper: https://hyper-sd.github.io/
- Model Weights: https://huggingface.co/ByteDance/Hyper-SD/tree/main
- Library: Easily implemented in diffusers: load the LoRA weights, fuse them into the pipeline, then switch to the TCD scheduler.
- ComfyUI Implementations: Use the TCD (Trajectory Consistent Distillation) custom node as a scheduler, and adjust the eta parameter.
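The diffusers steps listed under Library (load the LoRA, fuse it, switch to TCD) can be sketched as below. This assumes `torch`, `diffusers`, and `huggingface_hub` are installed and a CUDA GPU is available. The LoRA file name is an assumption on my part and should be checked against the file listing in the Model Weights repository; the wrapper names `build_hyper_sd_pipeline` and `generate_fast` are also my own.

```python
def build_hyper_sd_pipeline(
    base_model_id="stabilityai/stable-diffusion-xl-base-1.0",  # assumed base
    lora_repo="ByteDance/Hyper-SD",
    lora_file="Hyper-SDXL-1step-lora.safetensors",  # assumed name; verify in repo
    device="cuda",
):
    """Load SDXL, fuse a Hyper-SD LoRA into it, and switch to TCD."""
    # Imported lazily so this file loads without the heavy dependencies.
    import torch
    from diffusers import DiffusionPipeline, TCDScheduler
    from huggingface_hub import hf_hub_download

    pipe = DiffusionPipeline.from_pretrained(
        base_model_id, torch_dtype=torch.float16, variant="fp16"
    ).to(device)

    # Step 1: load the LoRA. Step 2: fuse it into the base weights.
    pipe.load_lora_weights(hf_hub_download(lora_repo, lora_file))
    pipe.fuse_lora()

    # Step 3: replace the default scheduler with TCD.
    pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
    return pipe


def generate_fast(pipe, prompt, steps=1, eta=1.0):
    """Run few-step inference; eta is the TCD parameter mentioned above."""
    return pipe(
        prompt=prompt,
        num_inference_steps=steps,
        guidance_scale=0,  # classifier-free guidance switched off
        eta=eta,
    ).images[0]


# Usage (requires a GPU and downloads the model weights):
#   pipe = build_hyper_sd_pipeline()
#   generate_fast(pipe, "a cat wearing a hat", steps=1).save("cat.png")
```

Lower eta values make sampling more deterministic, so it is worth trying a few values when changing the step count.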
VideoGigaGAN
Description: A video-super-resolution model from Adobe that upscales videos from 128px to 1024px.
- Date: April 2024
- Authors: Adobe
- Info and Paper: https://videogigagan.github.io/
- Code: Adobe has not released any code. However, it's being recreated by lucidrains:
PanFusion
Description: 360-degree panoramic image generation, trained on Matterport3D data. Works well for generating skyboxes.
- Date: April 2024
- Authors: ???
- Info and Paper: https://chengzhag.github.io/publication/panfusion/
- Code: https://github.com/chengzhag/PanFusion
- Library: Uses diffusers as a dependency, but is not part of the diffusers library yet. It ships its own custom Python scripts.
- ComfyUI Implementation: None that I know of yet
- Note: An older paper on panoramic image generation (MultiDiffusion, February 2023) is already included in the diffusers library: https://huggingface.co/papers/2302.08113
Trajectory Consistent Distillation (TCD)
Description: A new scheduler used with turbo-diffusion models. Replaces the LCM (latent consistency model) scheduler.
- Date: March 2024
- Authors: Jianbin Zheng, South China University of Technology
- Paper: https://arxiv.org/abs/2402